SIGTERM is a signal that Unix-based operating systems, including Linux, issue when they want to terminate a running process. Under normal circumstances, your application should respond to a SIGTERM by running cleanup procedures that facilitate graceful termination. If a process isn’t ready to terminate, it may elect to ignore or block the signal.
An understanding of SIGTERM can help you work out why containers in Docker and Kubernetes are being stopped. The signal is intended to let your application detect and respond to impending terminations initiated outside the running process. It’ll be issued before the harsher SIGKILL signal, which is used to forcefully terminate a process after it’s ignored a SIGTERM.
In this article, you’ll learn how SIGTERM signals impact your Kubernetes containers, and how you can handle them so you can implement graceful terminations. First, you’ll explore the purpose of SIGTERM and how it’s used by the kernel, giving you a full understanding of what happens when a container is terminated.
What Is SIGTERM?
SIGTERM (signal 15) is the Unix process termination signal. It’s the first method the kernel uses when it needs to kill a process. That might be in response to a user request, such as the <terminal inline>kill<terminal inline> command, or external pressures, like a low-memory situation.
SIGTERM’s purpose is to let processes gracefully terminate of their own accord, instead of being forcefully killed by the kernel. A forced termination can lead to data corruption, so it’s used as the option of last resort. Applications with long-lived database connections, open file handles, or active network communications can intercept SIGTERM signals to finish their current activity and safely close stateful resources.
A process that exits after a SIGTERM will exit with status code 143. This is also what you’ll see in Docker and Kubernetes when a container is terminated due to the SIGTERM signal.
Issuing a SIGTERM
To see SIGTERM in action, open two terminals. In the first terminal, run <terminal inline>sleep<terminal inline> to create a long-running command:
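Any long-lived process will do; a five-minute sleep gives you plenty of time to work with:

```shell
# Start a long-running process in the first terminal
sleep 300
```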
This will block the terminal window while the command runs for five minutes. Switch to your second terminal, and run <terminal inline>ps aux<terminal inline> to discover the process ID (PID) of the <terminal inline>sleep<terminal inline> command:
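Piping through <terminal inline>grep<terminal inline> narrows the output down; the PID shown in the comment is from this walkthrough, and yours will differ:

```shell
# List running processes; the PID appears in the second column
ps aux | grep "[s]leep"
# Example output (your PID will differ):
#   user  3074856  ...  sleep 300
```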
In this example, we can see the <terminal inline>sleep<terminal inline> command is executing as PID 3074856.
Pass the PID to the <terminal inline>kill<terminal inline> command to issue a SIGTERM to the process. Despite its name, <terminal inline>kill<terminal inline> issues a SIGTERM by default, allowing the process to stop gracefully.
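Substituting the PID found above for your own:

```shell
# Send a SIGTERM (kill's default signal) to the sleep process
kill 3074856
```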
In your first terminal window, where you ran the <terminal inline>sleep<terminal inline> command, you should see the process terminate and drop back to the shell:
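The exact message can vary between shells, but in Bash it will resemble the following:

```
$ sleep 300
Terminated
```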
Inspecting the exit code in this terminal will reveal it to be 143, indicating the process exited in response to a SIGTERM signal.
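In Bash, the exit code of the last command is held in <terminal inline>$?<terminal inline>:

```shell
echo $?
# 143 (128 + 15, SIGTERM's signal number)
```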
What About SIGKILL?
SIGKILL (signal 9, exit code 137) is issued later in the process termination sequence. While SIGTERM can be seen as a “please stop when possible,” SIGKILL is an urgent “stop now.”
Processes aren’t able to handle, block, or ignore SIGKILL. Once one’s been issued, the process will be terminated by the kernel. It has almost immediate effect, killing the process without giving it a chance to delay.
You can issue a SIGKILL using <terminal inline>kill<terminal inline> with the <terminal inline>-9<terminal inline> flag. This instructs the command to send a SIGKILL instead of SIGTERM. Create two new terminal windows to repeat the example from above. Run <terminal inline>sleep 300<terminal inline> in the first and then kill the process using the second window:
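As before, find the new PID with <terminal inline>ps<terminal inline>; the value reported on your machine will differ:

```shell
# First terminal: start another long-running process
sleep 300

# Second terminal: find its PID
ps aux | grep "[s]leep"
```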
The process is PID <terminal inline>3084295<terminal inline>; now it can be sent a SIGKILL with <terminal inline>kill -9<terminal inline>:
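Again, substitute the PID from your own <terminal inline>ps<terminal inline> output:

```shell
# Send a SIGKILL, which the process cannot handle or ignore
kill -9 3084295
```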
Back in the first window, you’ll see <terminal inline>sleep<terminal inline> exit and a <terminal inline>Killed<terminal inline> message appear in the terminal output:
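In Bash, the output will look similar to this:

```
$ sleep 300
Killed
```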
Retrieving the exit code will confirm it’s 137, meaning a SIGKILL signal was received.
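As before, check <terminal inline>$?<terminal inline> in the first terminal:

```shell
echo $?
# 137 (128 + 9, SIGKILL's signal number)
```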
The Importance of Understanding SIGTERM
Handling SIGTERM ensures your applications terminate properly without risking data corruption. Understanding the differences between SIGTERM and SIGKILL can also help you identify the reason why a process has been stopped. If it received a SIGTERM, it’s an indication you could have reacted to the signal to prevent a bad state occurring.
On the other hand, applications receiving SIGKILL signals can indicate bigger problems with your environment. The kernel should only issue a SIGKILL when it needs to immediately cull its process list. This is normally due to the out-of-memory (OOM) killer intervening to prevent RAM exhaustion. Regular unexpected SIGKILLs should be investigated by checking whether your host has enough physical memory to reliably support its workloads.
SIGTERM In Kubernetes
The Kubernetes pod termination process is based on the SIGTERM and SIGKILL mechanism. When a pod is terminated, the foreground processes of its containers will receive a SIGTERM signal. The containers continue running, offering them an opportunity to gracefully terminate.
If the container hasn’t stopped thirty seconds after the SIGTERM was sent, Kubernetes gives up waiting and uses SIGKILL to forcefully terminate it. The SIGKILL stops all the running processes associated with the pod’s containers. Kubernetes also removes the pod itself and any other related objects. The thirty second delay is configurable; you’ll see how to change this below.
All this occurs each time you delete a Kubernetes pod. To issue a SIGTERM to a pod, simply use the <terminal inline>kubectl delete<terminal inline> command. This always defaults to graceful termination.
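Assuming a pod named <terminal inline>my-pod<terminal inline> (substitute the name of your own pod):

```shell
# Graceful deletion: the pod's containers receive a SIGTERM first
kubectl delete pod my-pod
```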
The command returns once all the containers in the pod have actually terminated. If there’s a delay after running the command, it’s often because one or more of the container processes are handling the SIGTERM signal to gracefully terminate themselves. (Object finalizers are the other common source of delays when removing pods.)
kubectl does have a way to force an immediate deletion. Adding the <terminal inline>--force<terminal inline> flag to a <terminal inline>kubectl delete<terminal inline> command will send a SIGKILL to the container processes immediately after the SIGTERM is issued. This permits instantaneous deletion of the pod.
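With the same hypothetical <terminal inline>my-pod<terminal inline>, pairing <terminal inline>--force<terminal inline> with a zero-second grace period skips the waiting period entirely:

```shell
# Forced deletion: don't wait for the grace period to elapse
kubectl delete pod my-pod --grace-period=0 --force
```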
Changing the Grace Period
As mentioned above, Kubernetes defaults to allowing thirty seconds for container SIGTERM handlers to complete. When that time expires, a SIGKILL will be issued to force the container’s termination.
This value can be changed by setting the <terminal inline>spec.terminationGracePeriodSeconds<terminal inline> field on your pods. It defines the maximum time Kubernetes will wait after issuing a SIGTERM to terminate a container within the pod.
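Here’s a minimal manifest, saved as <terminal inline>pod.yaml<terminal inline>, that sets a ninety-second grace period; the <terminal inline>nginx<terminal inline> image is purely illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  terminationGracePeriodSeconds: 90
  containers:
    - name: app
      image: nginx:latest
```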
Applying this pod to your cluster (<terminal inline>kubectl apply -f pod.yaml<terminal inline>) allows its containers a longer period in which they can gracefully terminate. If a container used all the available time, a <terminal inline>kubectl delete pod my-pod<terminal inline> command would seem to hang for ninety seconds before a SIGKILL is issued.
It’s important to note that it’s still possible for a container to be killed immediately, without receiving a SIGTERM or any grace period. When Kubernetes detects an out-of-memory (OOM) condition, the offending process is stopped instantly with an unavoidable SIGKILL, just as a standard Linux process would be.
Handling SIGTERM in Your Code
Now that you know what SIGTERM does and when it’s used, it’s a good idea to modify your own applications to properly support it. Each programming language provides its own mechanism to listen for and handle operating system signals. Attaching a handler to SIGTERM will let you run code just before the process terminates.
Here’s a simple example in Python:
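This sketch just logs a message before exiting; a real handler would close the application’s own resources in place of the print statement:

```python
import signal
import sys
import time

def handle_sigterm(signum, frame):
    # Cleanup goes here: close connections, flush buffers, etc.
    print("SIGTERM received, cleaning up before exit")
    sys.exit(0)

# Register handle_sigterm() as the SIGTERM handler
signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate a long-running workload
while True:
    time.sleep(1)
```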
The code configures a signal handler that calls the <terminal inline>handle_sigterm()<terminal inline> function when a SIGTERM is received.
Here’s the same code implemented in Node.js:
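As with the Python version, the cleanup body is a placeholder for your own shutdown logic:

```javascript
process.on("SIGTERM", () => {
  console.log("SIGTERM received, cleaning up before exit");
  // Close servers and database connections here, then exit.
  process.exit(0);
});

// Keep the process alive so the handler can be observed.
setInterval(() => {}, 1000);
```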
A SIGTERM handler makes sense in any program which could be interrupted during a long-lived operation that needs to run to completion.
Using ContainIQ to Monitor Kubernetes SIGTERM Signals
ContainIQ’s monitoring platform can be used to track and issue alerts when containers are terminated due to SIGTERM and SIGKILL signals. It includes an events dashboard that logs pod restarts, and lets you filter by timeframes and Kubernetes namespaces.
You can use ContainIQ to spot when a container receives a SIGTERM signal because its controlling pod has been deleted. It also shows SIGKILL events when Kubernetes forcefully evicts pods because of a resource contention condition.
The platform’s alerting capabilities can be configured to send instant notifications when a container restart occurs. These can keep you abreast of unexpected terminations due to SIGTERM signals and other external factors.
Linux sends SIGTERM signals to processes when they’re about to be terminated. The process can handle the signal to implement graceful cleanup procedures, such as ending network activity and closing file handles. It should then exit in a timely manner to fulfill the termination request. A process that doesn’t terminate in response to a SIGTERM may be forcefully killed by a later SIGKILL signal.
Kubernetes uses SIGTERM and SIGKILL within its own container termination process. Deleting a pod first issues a SIGTERM to the pod’s containers, giving them the configured grace period in which to clean up. Containers that don’t quit in time will receive a SIGKILL that enacts an instant termination.
When it comes to watching for SIGTERM and SIGKILL signals, ContainIQ’s solution helps you keep tabs on container terminations and associated events. The platform’s integrated alerts and filterable reporting screens remove the burden of manually sifting through logs to uncover a pod’s signal history.