This document provides information about how Valkey reacts to different POSIX signals such as `SIGTERM` and `SIGSEGV`.
The `SIGTERM` and `SIGINT` signals tell Valkey to shut down gracefully. When the server receives either signal, it does not exit immediately. Instead, it schedules a shutdown similar to the one performed by the `SHUTDOWN` command. The scheduled shutdown starts as soon as the command currently being executed (if any) terminates, with a possible additional delay of 0.1 seconds or less.
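
The deferred shutdown described above follows a common pattern: the signal handler only records the request, and the main loop acts on it at a safe point between commands. The sketch below illustrates that pattern in Python. It is not Valkey's implementation, and the names `ShutdownFlag` and `run_commands` are hypothetical.

```python
import signal


class ShutdownFlag:
    """Records that a shutdown was requested; does no work in the handler."""

    def __init__(self):
        self.requested = False

    def install(self, signums=(signal.SIGTERM, signal.SIGINT)):
        # Register the same handler for both graceful-shutdown signals.
        for signum in signums:
            signal.signal(signum, self._handle)

    def _handle(self, signum, frame):
        # Only set a flag; the main loop decides when it is safe to stop.
        self.requested = True


def run_commands(commands, flag):
    """Run commands in order, checking for a pending shutdown between them.

    A command that is already executing always runs to completion.
    Returns the results of the commands that ran before the shutdown
    request took effect.
    """
    results = []
    for command in commands:
        if flag.requested:
            break
        results.append(command())
    return results
```

If a signal arrives while a command is running, that command still finishes and its result is kept; only the commands after it are skipped.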
If the server is blocked by a long-running Lua script, kill the script with `SCRIPT KILL` if possible. The scheduled shutdown will run just after the script is killed or terminates spontaneously.
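
As a rough illustration of that sequence, the following Python sketch sends `SCRIPT KILL` over a plain TCP connection and then delivers `SIGTERM` to the server process. It makes several assumptions: the server listens on 127.0.0.1:6379 without authentication or TLS, the caller already knows the server PID, and the helper names `encode_command` and `kill_script_then_terminate` are hypothetical.

```python
import os
import signal
import socket


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode() if isinstance(arg, str) else arg
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def kill_script_then_terminate(pid, host="127.0.0.1", port=6379, timeout=2.0):
    """Try SCRIPT KILL, then send SIGTERM to the server process.

    Returns the first line of the SCRIPT KILL reply so the caller can see
    whether the script was killed (a reply starting with "+") or the
    server answered with an error (a reply starting with "-").
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command("SCRIPT", "KILL"))
        reply = conn.recv(4096).split(b"\r\n", 1)[0].decode()
    # The shutdown is only scheduled; it runs once the script has stopped.
    os.kill(pid, signal.SIGTERM)
    return reply
```

The reply is returned rather than interpreted, since what to do when the script cannot be killed is a policy decision for the operator.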
This shutdown process includes the following actions:
- If any replicas are lagging behind in replication, Valkey pauses clients attempting to write with `CLIENT PAUSE` and the `WRITE` option, then waits up to the configured `shutdown-timeout` (default 10 seconds) for replicas to catch up with the primary’s replication offset.
- If AOF is active, Valkey calls the `fsync` system call on the AOF file descriptor to flush the buffers to disk.

If the RDB file can’t be saved, the shutdown fails, and the server continues to run in order to ensure no data loss. Likewise, if the user just turned on AOF, and the server triggered the first AOF rewrite in order to create the initial AOF file but this file can’t be saved, the shutdown fails and the server continues to run. No further attempt to shut down will be made unless a new `SIGTERM` is received or the `SHUTDOWN` command is issued.
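
Because a failed shutdown leaves the server running, a supervising script should not assume that sending `SIGTERM` stops the process. The sketch below shows one way to check, assuming a POSIX system; `pid_exists` and `terminate_and_wait` are hypothetical helper names, and the 30-second default timeout is an arbitrary choice rather than a Valkey setting.

```python
import os
import signal
import time


def pid_exists(pid):
    """Return True if a process with this PID exists (signal 0 only probes)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def terminate_and_wait(pid, timeout=30.0, poll_interval=0.1, is_alive=pid_exists):
    """Send SIGTERM, then wait until the process is gone or the timeout expires.

    Returns True if the process exited and False if it is still running,
    for example because the shutdown failed and the server kept going.
    """
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not is_alive(pid):
            return True
        time.sleep(poll_interval)
    return not is_alive(pid)
```

When `terminate_and_wait` returns `False`, the server log is the place to look for the reason before sending another signal.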
Since Redis OSS 7.0, the server waits for lagging replicas up to a configurable `shutdown-timeout`, 10 seconds by default, before shutting down. This is a best-effort measure to minimize the risk of data loss in a situation where no save points are configured and AOF is deactivated. Before version 7.0, shutting down a heavily loaded primary node in a diskless setup was more likely to result in data loss. To minimize the risk of data loss in such setups, trigger a manual `FAILOVER` (or `CLUSTER FAILOVER`) to demote the primary to a replica and promote one of the replicas to a new primary before shutting down a primary node.
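
The bounded wait for lagging replicas can be modeled as a polling loop with a deadline. The sketch below is a simplified model rather than Valkey's code: `wait_for_replicas` and `get_replica_offsets` are hypothetical names, and writes are assumed to be paused so the primary's offset stays fixed while waiting.

```python
import time


def wait_for_replicas(primary_offset, get_replica_offsets, timeout=10.0,
                      poll_interval=0.1):
    """Wait until every replica has reached primary_offset, or time out.

    get_replica_offsets is a caller-supplied function that returns the
    current replication offsets of all replicas. Returns True if every
    replica caught up before the deadline, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if all(offset >= primary_offset for offset in get_replica_offsets()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A `False` result corresponds to the case where the timeout expires with replicas still behind, which is why the wait is only a best effort.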
The following signals are handled as a Valkey crash: `SIGSEGV`, `SIGBUS`, `SIGFPE`, and `SIGILL`.
Once one of these signals is trapped, Valkey stops any current operation and performs the following actions:
When the child performing the Append Only File rewrite gets killed by a signal, Valkey handles this as an error and discards the (probably partial or corrupted) AOF file. It will attempt the rewrite again later.
When the child performing an RDB save is killed, Valkey handles the condition as a more severe error. While the failure of an AOF file rewrite can cause AOF file enlargement, failed RDB file creation reduces durability.
If the child producing the RDB file is killed by a signal, or exits with an error (non-zero exit code), Valkey enters a special error condition in which no further write commands are accepted: it replies to write commands with a `MISCONFIG` error. This error condition will persist until it becomes possible to create an RDB file successfully.
Sometimes the user may want to kill the RDB-saving child process without generating an error. This can be done using the signal `SIGUSR1`. This signal is handled in a special way: it kills the child process like any other signal, but the parent process will not detect this as a critical error and will continue to serve write requests.
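
The rules for an RDB-saving child can be summarized as a small state model. The Python sketch below is an illustration only: `classify_rdb_child_exit` and `WriteGate` are hypothetical names, and the return code follows the convention of Python's `subprocess` module, where a negative value `-N` means the child was killed by signal `N`. It assumes a POSIX system, since `SIGUSR1` is not available on Windows.

```python
import signal


def classify_rdb_child_exit(returncode):
    """Classify how an RDB-saving child ended.

    0 is a successful save, -SIGUSR1 is a deliberate cancellation, and
    anything else (another signal or a non-zero exit code) is an error.
    """
    if returncode == 0:
        return "saved"
    if returncode == -signal.SIGUSR1:
        return "cancelled"
    return "error"


class WriteGate:
    """Tracks whether writes are accepted, based on RDB child outcomes."""

    def __init__(self):
        self.accept_writes = True

    def on_rdb_child_exit(self, returncode):
        outcome = classify_rdb_child_exit(returncode)
        if outcome == "saved":
            self.accept_writes = True   # a successful save clears the error
        elif outcome == "error":
            self.accept_writes = False  # writes are refused from now on
        # "cancelled" leaves the current state unchanged
        return outcome
```

In this model a cancellation neither raises nor clears the error condition, so a server that is already refusing writes keeps doing so until a save succeeds.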