Graceful Shutdown — Handling SIGTERM in Python Services
Imagine your Python program as a person in a house. KeyboardInterrupt (Ctrl+C) is like tapping them on the shoulder, asking them to leave. SIGTERM (from an orchestrator) is like a formal eviction notice. If the person only responds to shoulder taps, they'll miss the eviction notice and be forcibly removed, leaving things messy.
The Setup
You write a daemon to process background jobs. You handle KeyboardInterrupt so you can shut down cleanly on your machine, but when deploying to production, Kubernetes terminates your process abruptly, leaving jobs half-finished.
What Does This Print?
import sys
import time
def cleanup():
    print("Releasing locks, flushing buffers, exiting cleanly!")
print("Daemon running...")
try:
    # Simulate a processing loop
    for tick in range(1, 3):
        print(f"Saving step {tick}")
        time.sleep(1)
except KeyboardInterrupt:
    cleanup()
    sys.exit(0)
# Note: if the OS sends SIGTERM instead of SIGINT (Ctrl+C), does the except block run?
The Output
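Left alone, the script prints "Daemon running...", "Saving step 1", and "Saving step 2", then exits without calling cleanup(), because nothing interrupted the loop. Press Ctrl+C mid-loop and cleanup() runs. Send SIGTERM mid-loop and nothing more is printed. When a container orchestrator stops a container, it sends SIGTERM, and a Python application that only catches KeyboardInterrupt misses it entirely, because CPython does not turn SIGTERM into a Python exception. The process dies on the spot: no except block, no finally block, and no atexit handler runs, which can lose buffered logs or leave database writes incomplete.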
Why Python Does This
At the OS level, each signal has its own disposition. At startup, CPython installs a handler for SIGINT (signal 2, sent by Ctrl+C) that raises KeyboardInterrupt in the main thread, so ordinary try-except handlers can respond. SIGTERM (signal 15) keeps its default disposition: the kernel terminates the process immediately, and no further Python code runs. The process never picks an exit code of its own; shells report a death by signal 15 as status 143 (128 + 15). To run cleanup or finalize operations, you must register a custom handler with the standard library signal module that either starts a clean exit sequence or raises SystemExit.
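To see the default disposition in action, the following sketch starts a child process that handles only KeyboardInterrupt, sends it SIGTERM, and inspects what happened. It assumes a POSIX system (Linux or macOS); the names CHILD_SOURCE and run_and_terminate are made up for this illustration. On POSIX, subprocess reports a child killed by a signal as a negative return code, so the expected result is -15 rather than the 143 a shell would show.

```python
import signal
import subprocess
import sys

# Child program: handles KeyboardInterrupt only, like the daemon above.
CHILD_SOURCE = """
import time
print("ready", flush=True)
try:
    time.sleep(30)
except KeyboardInterrupt:
    print("cleanup ran", flush=True)
finally:
    print("finally ran", flush=True)
"""


def run_and_terminate():
    """Start the child, send it SIGTERM, return (returncode, leftover output)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()            # block until the child has printed "ready"
    proc.send_signal(signal.SIGTERM)  # what an orchestrator sends first
    leftover, _ = proc.communicate(timeout=10)
    return proc.returncode, leftover


returncode, leftover_output = run_and_terminate()
print(returncode, repr(leftover_output))
```

If the child had run either its except or its finally block, the leftover output would contain "cleanup ran" or "finally ran". An empty string means the process was terminated before any of that code executed.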
The Fix
import sys
import time
import signal
def cleanup(signum, frame):
    print("SIGTERM received! Releasing locks, flushing buffers, exiting cleanly!")
    sys.exit(0)  # Raises SystemExit in the main thread, so finally blocks run
# Fix: explicitly register the custom cleanup handler for SIGTERM
signal.signal(signal.SIGTERM, cleanup)
print("Daemon running...")
try:
    for tick in range(1, 3):
        print(f"Saving step {tick}")
        time.sleep(1)
finally:
    # Runs on normal completion and now also when SIGTERM arrives
    print("Finalizing thread context resources.")
Registering a signal handler for SIGTERM allows the Python application to explicitly intercept the shutdown signal sent by container orchestrators. This provides a designated entry point to execute cleanup routines (e.g., flushing data, releasing resources) before exiting, ensuring data integrity and a clean shutdown.
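Raising SystemExit from the handler can interrupt a job at an arbitrary point. A common alternative, sketched below, is to have the handler only set a flag and let the loop stop at the next job boundary. This is one possible design rather than the only correct one, and ShutdownFlag, process_jobs, and simulate_sigterm are hypothetical names for this illustration. The demo sends SIGTERM to itself with signal.raise_signal (Python 3.8+) after the second job, and it assumes the code runs in the main thread, where Python signal handlers are registered and executed.

```python
import signal


class ShutdownFlag:
    """Records that a shutdown was requested; the handler does no other work."""

    def __init__(self):
        self.requested = False

    def request(self, signum, frame):
        self.requested = True


def process_jobs(jobs, flag, on_job_done=None):
    """Process jobs one at a time, stopping between jobs once shutdown is requested."""
    completed = []
    try:
        for job in jobs:
            if flag.requested:
                break                  # stop at a job boundary, never mid-job
            completed.append(job)      # stand-in for the real work
            if on_job_done is not None:
                on_job_done(job)
    finally:
        print(f"Finished {len(completed)} job(s); releasing resources.")
    return completed


flag = ShutdownFlag()
previous_handler = signal.signal(signal.SIGTERM, flag.request)


def simulate_sigterm(job):
    # Demo only: pretend the orchestrator sends SIGTERM right after job 2.
    if job == 2:
        signal.raise_signal(signal.SIGTERM)


completed_jobs = process_jobs([1, 2, 3, 4], flag, on_job_done=simulate_sigterm)
signal.signal(signal.SIGTERM, previous_handler)  # restore the earlier handler
print(completed_jobs)
```

Jobs 1 and 2 complete, and jobs 3 and 4 are never started. The trade-off is latency: shutdown waits for the current job to finish, so each job must be short enough to fit inside whatever grace period the orchestrator allows before it forcibly kills the process.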
How This Fails in Real Systems
A high-volume data pipeline daemon suffered from database index corruption twice a week. Operators discovered that deployment updates on Kubernetes sent SIGTERM signals to pods, which immediately exited mid-transaction. Once a signal handler was registered to intercept SIGTERM, corruption incidents dropped to zero.
Key Takeaway
Handling KeyboardInterrupt alone is not sufficient for graceful shutdown in containerized applications. Orchestration systems send SIGTERM, which Python does not handle by default, so without a registered handler the process terminates abruptly and can lose data.