Starts a coordinated failover from a server to one of its replicas.
FAILOVER
[TO host port
[FORCE]]
[ABORT]
[TIMEOUT milliseconds]
This command starts a coordinated failover between the primary you are
connected to and one of its replicas. The failover is not synchronous;
instead, a background task coordinates it. The procedure is designed to
limit data loss and unavailability of the cluster during the failover.
This command is analogous to the CLUSTER FAILOVER command
for non-clustered Valkey and is similar to the failover support provided
by Sentinel.
The specific details of the default failover flow are as follows:
1. The primary internally starts a CLIENT PAUSE WRITE, which will
pause incoming writes and prevent the accumulation of new data in the
replication stream.
2. The primary waits for a replica to catch up to its replication
offset.
3. The primary demotes itself to a replica.
4. The previous primary sends a special PSYNC request, PSYNC FAILOVER,
instructing the target replica to become a primary.
5. Once the previous primary receives acknowledgement that the
PSYNC FAILOVER was accepted, it will unpause its clients. If
the PSYNC request is rejected, the primary will abort the failover and
return to normal.
The field master_failover_state in INFO replication can be used to track
the current state of the failover, which has the following values:
no-failover: There is no ongoing coordinated failover.
waiting-for-sync: The primary is waiting for the
replica to catch up to its replication offset.
failover-in-progress: The primary has demoted itself,
and is attempting to hand off ownership to a target replica.
NOTE: During the failover-in-progress phase, the primary
first demotes itself to a replica and then notifies the target replica
to promote itself to primary. These two steps are asynchronous, so both
nodes may briefly be replicas at the same time. In this scenario,
clients that support REDIRECT (that is, clients that have explicitly
executed CLIENT CAPA REDIRECT) could be redirected back and forth
between the two replicas until the target replica finishes promoting
itself to primary. To avoid this, clients that need to be redirected
are temporarily suspended during the failover-in-progress phase and
resume execution once the target replica has become the primary.
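As a minimal sketch of tracking the failover state, the following Python reads master_failover_state out of the text returned by INFO replication. It assumes the usual INFO layout of "key:value" lines with "#" section headers; the helper names (parse_info, failover_state) and the SAMPLE text are hypothetical and written by hand for illustration, not captured from a real server.

```python
# Hypothetical helpers: extract master_failover_state from INFO replication text.
KNOWN_STATES = ("no-failover", "waiting-for-sync", "failover-in-progress")


def parse_info(text):
    """Parse INFO output ("key:value" lines, "#" section headers) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def failover_state(info_text):
    """Return the failover state, rejecting values this sketch does not know."""
    state = parse_info(info_text).get("master_failover_state")
    if state not in KNOWN_STATES:
        raise ValueError("unexpected master_failover_state: %r" % (state,))
    return state


# Hand-written excerpt for illustration; real output contains many more fields.
SAMPLE = (
    "# Replication\r\n"
    "master_failover_state:waiting-for-sync\r\n"
    "master_repl_offset:1024\r\n"
)

if __name__ == "__main__":
    print(failover_state(SAMPLE))
```

How the INFO text is fetched is left to whichever client library is in use; the parsing does not depend on it.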
If the previous primary had additional replicas attached to it, they
will continue replicating from it as chained replicas. You will need to
manually execute a REPLICAOF
on these replicas to start replicating directly from the new
primary.
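The manual re-pointing step can be scripted. The sketch below only builds the REPLICAOF argument lists, one per chained replica; the function name and the addresses are hypothetical, and sending each command to the matching replica is left to the client library in use. The commands should be sent only after the failover has finished, since REPLICAOF is disabled while one is in progress.

```python
# Hypothetical helper: build one REPLICAOF command per chained replica.
def replicaof_commands(new_primary, replicas):
    """Map each replica address to the command that re-points it.

    new_primary and every entry in replicas are (host, port) tuples.
    """
    host, port = new_primary
    return {replica: ["REPLICAOF", host, str(port)] for replica in replicas}


if __name__ == "__main__":
    # Example addresses are made up for illustration.
    commands = replicaof_commands(
        ("10.0.0.2", 6379),
        [("10.0.0.3", 6379), ("10.0.0.4", 6379)],
    )
    for replica, args in commands.items():
        print(replica, " ".join(args))
```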
The following optional arguments exist to modify the behavior of the failover flow:
TIMEOUT milliseconds – This option allows
specifying a maximum time a primary will wait in the
waiting-for-sync state before aborting the failover attempt
and rolling back. This is intended to set an upper bound on the write
outage the Valkey cluster can experience. Failovers typically happen in
less than a second, but could take longer if there is a large amount of
write traffic or the replica is already behind in consuming the
replication stream. If this value is not specified, the timeout can be
considered to be “infinite”.
TO host port – This option designates a specific replica, by its host
and port, to fail over to. The primary waits specifically for this
replica to catch up to its replication offset and then fails over to
it.
FORCE – If both the TIMEOUT and
TO options are set, the FORCE flag can also be used to
designate that once the timeout has elapsed, the primary should
fail over to the target replica instead of rolling back. This can be
used for a best-effort attempt at a failover without data loss while
limiting the write outage.
NOTE: The primary will always roll back if the
PSYNC FAILOVER request is rejected by the target
replica.
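The way these options combine can be illustrated with a small argument builder. This is a sketch, not part of any client library: build_failover_args is a hypothetical name, and the checks are client-side conveniences. The FORCE check mirrors the rule above that FORCE needs both TO and TIMEOUT; rejecting a non-positive timeout is an assumption of this sketch rather than documented behavior.

```python
# Hypothetical helper: assemble FAILOVER arguments in the documented order.
def build_failover_args(to=None, timeout_ms=None, force=False):
    """Return the FAILOVER command as a list of string arguments.

    to is an optional (host, port) tuple; timeout_ms is an optional integer.
    """
    if force and (to is None or timeout_ms is None):
        raise ValueError("FORCE requires both TO and TIMEOUT")
    if timeout_ms is not None and timeout_ms <= 0:
        # Assumption of this sketch: treat non-positive timeouts as a mistake.
        raise ValueError("TIMEOUT must be a positive number of milliseconds")

    args = ["FAILOVER"]
    if to is not None:
        host, port = to
        args += ["TO", host, str(port)]
        if force:
            args.append("FORCE")
    if timeout_ms is not None:
        args += ["TIMEOUT", str(timeout_ms)]
    return args


if __name__ == "__main__":
    # Example address is made up for illustration.
    print(build_failover_args(to=("10.0.0.2", 6379), timeout_ms=5000, force=True))
```

Validating on the client side gives an early, readable error, but the server remains the authority on which combinations it accepts.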
The failover command is intended to be safe from data loss and
corruption, but it can encounter scenarios that it cannot recover from
automatically and may get stuck. For this purpose, the
FAILOVER ABORT command exists, which aborts an ongoing
failover and returns the primary to its normal state. The command has no
side effects if issued in the waiting-for-sync state but
can introduce multi-primary scenarios in the
failover-in-progress state. If a multi-primary scenario is
encountered, you will need to manually identify which primary has the
latest data, designate it as the primary, and make the others its
replicas.
NOTE: REPLICAOF is disabled
while a failover is in progress. This prevents unintended
interactions with the failover that might cause data loss.
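One cautious way to act on a stuck failover is sketched below. It polls the failover state and issues an abort only while the state is waiting-for-sync, where the abort has no side effects. If the deadline passes during failover-in-progress, it hands the decision to an operator rather than risk a multi-primary scenario. The callables get_state and abort are hypothetical stand-ins for reading master_failover_state and sending FAILOVER ABORT, and the returned labels are this sketch's own.

```python
import time


# Hypothetical watchdog: get_state() returns the current master_failover_state,
# abort() sends FAILOVER ABORT. Both are supplied by the caller.
def wait_for_failover(get_state, abort, deadline_s, poll_s=0.1,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll until the failover ends or the deadline passes.

    Returns "finished" when no failover is ongoing, "aborted" when the
    deadline passed in waiting-for-sync and abort() was called, and
    "needs-operator" when the deadline passed in failover-in-progress.
    """
    start = clock()
    while True:
        state = get_state()
        if state == "no-failover":
            # The state alone does not say whether the failover succeeded
            # or was rolled back; the caller should check the node's role.
            return "finished"
        if clock() - start >= deadline_s:
            if state == "waiting-for-sync":
                abort()
                return "aborted"
            return "needs-operator"
        sleep(poll_s)


if __name__ == "__main__":
    # Simulated state sequence for illustration only.
    states = iter(["waiting-for-sync", "failover-in-progress", "no-failover"])
    print(wait_for_failover(lambda: next(states), lambda: None, deadline_s=5))
```

Setting TIMEOUT on the FAILOVER command itself bounds the waiting-for-sync phase on the server side, so a client-side deadline like this one is a complement rather than a replacement.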
Simple string reply: OK if the command was accepted and a coordinated
failover is in progress. An error if the operation cannot be
executed.
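At the wire level, the reply is either a simple string (a line starting with "+") or an error (a line starting with "-"). The sketch below encodes a command as an array of bulk strings and classifies the first reply line. The function names are hypothetical and the error text in the example is made up. Note that OK only means the failover was accepted and started, not that it has completed.

```python
# Hypothetical helpers: encode a command and interpret the FAILOVER reply.
def encode_command(args):
    """Encode a command as an array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def failover_accepted(reply):
    """Return True for +OK; raise RuntimeError carrying the error text."""
    line = reply.split(b"\r\n", 1)[0]
    if line.startswith(b"+"):
        return line == b"+OK"
    if line.startswith(b"-"):
        raise RuntimeError(line[1:].decode("utf-8"))
    raise ValueError("unexpected reply type: %r" % (line,))


if __name__ == "__main__":
    print(encode_command(["FAILOVER", "ABORT"]))
    print(failover_accepted(b"+OK\r\n"))
```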
Time complexity: O(1)
ACL categories: @admin @dangerous @slow
ACL, ACL CAT, ACL DELUSER, ACL DRYRUN, ACL GENPASS, ACL GETUSER, ACL HELP, ACL LIST, ACL LOAD, ACL LOG, ACL SAVE, ACL SETUSER, ACL USERS, ACL WHOAMI, BGREWRITEAOF, BGSAVE, COMMAND, COMMAND COUNT, COMMAND DOCS, COMMAND GETKEYS, COMMAND GETKEYSANDFLAGS, COMMAND HELP, COMMAND INFO, COMMAND LIST, COMMANDLOG, COMMANDLOG GET, COMMANDLOG HELP, COMMANDLOG LEN, COMMANDLOG RESET, CONFIG, CONFIG GET, CONFIG HELP, CONFIG RESETSTAT, CONFIG REWRITE, CONFIG SET, DBSIZE, DEBUG, FLUSHALL, FLUSHDB, INFO, LASTSAVE, LATENCY, LATENCY DOCTOR, LATENCY GRAPH, LATENCY HELP, LATENCY HISTOGRAM, LATENCY HISTORY, LATENCY LATEST, LATENCY RESET, LOLWUT, MEMORY, MEMORY DOCTOR, MEMORY HELP, MEMORY MALLOC-STATS, MEMORY PURGE, MEMORY STATS, MEMORY USAGE, MODULE, MODULE HELP, MODULE LIST, MODULE LOAD, MODULE LOADEX, MODULE UNLOAD, MONITOR, PSYNC, REPLCONF, REPLICAOF, RESTORE-ASKING, ROLE, SAVE, SHUTDOWN, SLOWLOG, SLOWLOG GET, SLOWLOG HELP, SLOWLOG LEN, SLOWLOG RESET, SWAPDB, SYNC, TIME.