Docker for Python — Writing a Minimal Dockerfile

Mental Model

Imagine Python's stdout as a pipeline. In an interactive terminal it is line-buffered: each completed line goes straight to your screen. In a Docker container, stdout is usually a pipe to the log driver, so Python block-buffers it and holds output until the buffer fills. Setting PYTHONUNBUFFERED=1 makes Python write output immediately instead.

Rule: Always set PYTHONUNBUFFERED=1 in your Dockerfiles to ensure that log records are emitted immediately to the container runtime.

The Setup

You deploy a lightweight Python background worker to production inside a minimal Docker container. Locally, it prints updates every second, but in production, running docker logs displays absolutely nothing for minutes.

What Does This Print?

Broken code
Python
import time
import sys

print("Background service initialized...", flush=False)
for idx in range(1, 4):
    print(f"Processing database transaction log #{idx}")
    # Intentionally sleeping to simulate persistent background activity
    time.sleep(1)

Predict what happens when you run this script inside a standard Docker container and inspect the logs using docker logs while it runs.

The Output

What actually happens
(Nothing is displayed for several seconds, then all logs appear at once after the buffer fills or the process terminates)

When you run Python inside a Docker container, its output may seem frozen. This is because CPython detects that its standard output is connected to a non-interactive stream (the container log driver) rather than an interactive terminal device (a TTY). Consequently, Python switches its stdout buffering mode from line-buffered to block-buffered, holding output in memory (typically around 8 KB) before writing anything.
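To see which mode a stream is in, a small helper can inspect the attributes that io.TextIOWrapper exposes. This is a minimal sketch: the function name describe_buffering is hypothetical, and it assumes Python 3.7 or later, where the write_through attribute is available.

```python
import sys


def describe_buffering(stream) -> str:
    """Classify a text stream's buffering mode (hypothetical helper)."""
    # write_through is True when Python runs with -u or PYTHONUNBUFFERED set.
    if getattr(stream, "write_through", False):
        return "unbuffered (write-through)"
    # By default, line_buffering is True when stdout is attached to a TTY.
    if getattr(stream, "line_buffering", False):
        return "line-buffered"
    return "block-buffered"


if __name__ == "__main__":
    # Report on stderr, which Python 3.9+ line-buffers even without a TTY.
    print(f"stdout is a TTY: {sys.stdout.isatty()}", file=sys.stderr)
    print(f"stdout mode: {describe_buffering(sys.stdout)}", file=sys.stderr)
```

Running this once in a terminal and once with stdout piped to another command (for example, through cat) should show the mode changing.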

Why Python Does This

At startup, CPython checks whether stdout is attached to a terminal, the equivalent of os.isatty(sys.stdout.fileno()). If the check returns False, Python block-buffers standard output. This optimization reduces costly write system calls when output is redirected to a file. In a container runtime like Docker, however, the standard streams are pipes read by the log driver, so block buffering makes messages accumulate in memory instead of flowing out as they are produced. This hides application status messages during deployment, and if the process is killed before the buffer is flushed, those messages are lost entirely. (Since Python 3.9, stderr is line-buffered even when it is not a TTY, so tracebacks are mainly affected on older versions or when they are routed through stdout.) To override this behavior, set the PYTHONUNBUFFERED environment variable to a non-empty value such as 1, which forces stdout and stderr to be unbuffered.
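The effect of the environment variable can be observed without Docker by launching a child interpreter whose stdout is a pipe, which roughly approximates how a container runtime captures output. The sketch below is an illustration under that assumption; PROBE and probe_stdout are hypothetical names.

```python
import os
import subprocess
import sys

# Child script: report how its own stdout was configured at startup.
PROBE = (
    "import sys; s = sys.stdout; "
    "print(s.isatty(), s.line_buffering, s.write_through)"
)


def probe_stdout(unbuffered: bool) -> str:
    """Run PROBE in a child interpreter with stdout connected to a pipe."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # start from a known state
    if unbuffered:
        env["PYTHONUNBUFFERED"] = "1"
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        stdout=subprocess.PIPE,
        env=env,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Columns: isatty, line_buffering, write_through
    print("default:           ", probe_stdout(unbuffered=False))
    print("PYTHONUNBUFFERED=1:", probe_stdout(unbuffered=True))
```

In both runs isatty is False because stdout is a pipe; only the write_through flag changes when the variable is set.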

The Fix

Corrected pattern
Python
import time

# Fix 1: set ENV PYTHONUNBUFFERED=1 in the Dockerfile (equivalent to running python -u)
# Fix 2: pass flush=True so each line is written as soon as it is printed
print("Background service initialized...", flush=True)
for idx in range(1, 4):
    print(f"Processing database transaction log #{idx}", flush=True)
    # Sleep to simulate ongoing background work
    time.sleep(1)

Setting PYTHONUNBUFFERED=1 forces CPython to operate in unbuffered mode for its standard streams. This overrides the default behavior where Python buffers output when stdout is not an interactive TTY, ensuring logs are emitted immediately to the container runtime.
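When the environment cannot be changed and adding flush=True to every call is impractical, one possible alternative is to switch stdout to line buffering from inside the program. The sketch below assumes Python 3.7 or later, where text streams provide reconfigure(); the helper name enable_line_buffering is hypothetical, and this is an illustration rather than a replacement for the Dockerfile setting.

```python
import sys


def enable_line_buffering(stream) -> bool:
    """Switch a text stream to line buffering if it supports reconfigure().

    Returns True when the stream was reconfigured, False otherwise.
    """
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        # Not an io.TextIOWrapper (for example, a replaced sys.stdout).
        return False
    reconfigure(line_buffering=True)
    return True


if __name__ == "__main__":
    enable_line_buffering(sys.stdout)
    # From here on, every newline-terminated print is flushed immediately.
    print("Background service initialized...")
```

Output written before the call is still subject to the original buffering, so the call belongs at the very start of the program.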

How This Fails in Real Systems

A high-throughput messaging consumer was deployed to Amazon ECS. When memory exhaustion killed the task, the logging dashboard showed absolutely nothing, leaving the on-call team blind. It was later found that the process was crashing silently on boot, but because of block buffering, 3KB of fatal traceback logs were sitting unwritten in CPython's buffer when the OS terminated the task. The outage persisted for 4 hours until PYTHONUNBUFFERED was enabled.
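The kind of data loss described above can be reproduced locally. In this sketch, os._exit stands in for an abrupt kill: it ends the process without flushing Python's buffers. The child script and the run_child helper are hypothetical, and a pipe stands in for the container log driver.

```python
import os
import subprocess
import sys

# Child script: print one message, then die without normal interpreter shutdown.
CHILD = 'import os; print("fatal: shutting down"); os._exit(1)'


def run_child(unbuffered: bool) -> str:
    """Return whatever the child managed to write to its piped stdout."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # start from a known state
    if unbuffered:
        env["PYTHONUNBUFFERED"] = "1"
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        env=env,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("default:           ", repr(run_child(unbuffered=False)))
    print("PYTHONUNBUFFERED=1:", repr(run_child(unbuffered=True)))
```

With buffering left at its default, the message never reaches the pipe; with PYTHONUNBUFFERED=1 it is written before the process exits.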

Key Takeaway

Always set PYTHONUNBUFFERED=1 in your Dockerfiles to ensure that log records are emitted immediately to the container runtime.
Common mistake: Developers assume Python's default output buffering behavior is consistent across all execution environments, including Docker containers, not realizing that CPython changes its buffering strategy when stdout is not a TTY.