threading vs asyncio vs multiprocessing — When to Use Which
Picture three tools in a toolbox. multiprocessing is a heavy-duty excavator for large, independent CPU-intensive jobs. asyncio is a finely tuned racing car for fast, cooperative I/O-bound tasks. threading is an everyday car, best for simple concurrent I/O or background work within a single process, with the caveat that the GIL keeps it from speeding up CPU-bound code.
The Setup
A systems engineer builds a performance-critical service that scrapes a large number of pages over the network and then tokenizes the HTML payloads for a machine learning pipeline, attempting to use Python's standard threading module for both stages.
What Does This Print?
import threading
import time
import hashlib

# Simulating CPU-bound HTML processing (hashing payload)
def process_payload(data):
    for _ in range(1_000_000):
        hashlib.sha256(data.encode()).hexdigest()

def run_threaded_pipeline():
    start = time.perf_counter()
    threads = []
    # Creating 4 threads to do heavy CPU tasks
    for i in range(4):
        t = threading.Thread(target=process_payload, args=(f"payload_data_{i}",))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    print(f"Pipeline executed in {time.perf_counter() - start:.2f} seconds")

run_threaded_pipeline()
The Output
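The exact time depends on the machine, but it is roughly what running the four jobs one after another would take: the threads bring no speedup. Hashing a short string is CPU-bound work that holds the GIL, so only one thread executes Python bytecode at a time and the four threads take turns instead of running in parallel. For CPU-bound tasks, use multiprocessing. Conversely, if the task were I/O-bound (like requesting websites), spawning a process per task would add substantial memory overhead, and asyncio or threading would be far more efficient.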
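To illustrate the I/O-bound side of that contrast, here is a minimal sketch in which threads do help. It assumes a hypothetical fake_fetch function that uses time.sleep as a stand-in for a blocking network call (like a blocking socket read, time.sleep releases the GIL while it waits). The URLs and the 0.2 second delay are illustrative only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url: str, delay: float = 0.2) -> str:
    # Hypothetical stand-in for a blocking network request.
    # time.sleep releases the GIL, so other threads can run while this one waits.
    time.sleep(delay)
    return f"<html>{url}</html>"


def fetch_all(urls: list[str]) -> tuple[list[str], float]:
    # One thread per URL; pool.map returns results in input order.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        pages = list(pool.map(fake_fetch, urls))
    return pages, time.perf_counter() - start


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(4)]
    pages, elapsed = fetch_all(urls)
    print(f"Fetched {len(pages)} pages in {elapsed:.2f} seconds")
```

Because the four waits overlap, the total should land close to a single delay and well under the sum of all four, which is the opposite of what the hashing example shows.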
Why Python Does This
CPython offers three primary concurrency models, each suited to a different bottleneck:
1. asyncio: Best for cooperative I/O (very large numbers of slow sockets). It runs in a single thread and uses an OS multiplexer (like epoll or kqueue) to suspend and resume tasks, so each pending connection costs far less memory than a dedicated thread or process.
2. threading: Best for preemptive I/O (older blocking APIs, library compatibility). It uses real OS threads, but the GIL allows only one of them to run Python bytecode at a time, making it a poor fit for CPU-bound tasks.
3. multiprocessing: Best for CPU-bound tasks (math, data processing). It sidesteps the GIL by starting separate Python interpreter processes, which the OS can schedule on independent CPU cores, at the cost of IPC overhead and a larger memory footprint.
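The single-threaded nature of asyncio can be checked directly. The sketch below assumes a hypothetical fake_request coroutine that uses asyncio.sleep in place of an awaitable socket read, and records which OS thread each task ran on.

```python
import asyncio
import threading


async def fake_request(i: int, delay: float = 0.1) -> tuple[int, int]:
    # Hypothetical stand-in for an awaitable network read.
    # Awaiting hands control back to the event loop so other tasks can proceed.
    await asyncio.sleep(delay)
    return i, threading.get_ident()


async def main(n: int = 100) -> set[int]:
    # Schedule n tasks concurrently and collect the thread id each one ran on.
    results = await asyncio.gather(*(fake_request(i) for i in range(n)))
    return {thread_id for _, thread_id in results}


if __name__ == "__main__":
    thread_ids = asyncio.run(main())
    print(f"100 tasks ran on {len(thread_ids)} thread(s)")
```

All tasks report the same thread id: the concurrency comes from the event loop switching between tasks while they wait, not from extra threads.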
The Fix
from concurrent.futures import ProcessPoolExecutor
import time
import hashlib

# CPU-bound work, defined at module level so worker processes can import it
def process_payload(data):
    for _ in range(1_000_000):
        hashlib.sha256(data.encode()).hexdigest()

def run_parallel_pipeline():
    start = time.perf_counter()
    # Each worker is a separate process with its own interpreter and GIL
    with ProcessPoolExecutor(max_workers=4) as executor:
        futures = [
            executor.submit(process_payload, f"payload_data_{i}")
            for i in range(4)
        ]
        # Block until every job finishes; result() re-raises worker exceptions
        for f in futures:
            f.result()
    print(f"Pipeline executed in {time.perf_counter() - start:.2f} seconds")

# The guard stops worker processes from re-running the pipeline on import
if __name__ == "__main__":
    run_parallel_pipeline()
multiprocessing creates separate processes, each with its own Python interpreter and GIL, enabling true parallel execution on multiple CPU cores for CPU-bound tasks. asyncio uses a single thread and an event loop to manage many I/O-bound operations efficiently, switching between tasks at await points so that no task blocks the others while it waits. threading remains suitable for I/O-bound tasks despite the GIL, because a thread releases the GIL while it is blocked on I/O, letting other threads run.
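For a service like the one in The Setup, which has both an I/O-bound stage and a CPU-bound stage, the two models can be combined. The following is a rough sketch, not the only way to structure it: fetch, digest, and scrape_and_digest are hypothetical names, asyncio.sleep stands in for the HTTP request, and a single SHA-256 digest stands in for tokenization. The CPU stage is handed to whatever executor the caller supplies through loop.run_in_executor.

```python
import asyncio
import hashlib
from concurrent.futures import Executor
from typing import Optional


async def fetch(url: str) -> str:
    # Hypothetical stand-in for an awaitable HTTP request (I/O-bound stage).
    await asyncio.sleep(0.05)
    return f"<html>{url}</html>"


def digest(html: str) -> str:
    # Stand-in for the CPU-bound stage. It is a plain module-level function
    # so that a process pool can pickle it and run it in a worker process.
    return hashlib.sha256(html.encode()).hexdigest()


async def scrape_and_digest(
    urls: list[str], executor: Optional[Executor] = None
) -> list[str]:
    loop = asyncio.get_running_loop()
    # Stage 1: overlap all the network waits on the event loop.
    pages = await asyncio.gather(*(fetch(u) for u in urls))
    # Stage 2: run the CPU work in the executor so the event loop stays free.
    # With executor=None, asyncio falls back to its default thread pool.
    jobs = [loop.run_in_executor(executor, digest, page) for page in pages]
    return await asyncio.gather(*jobs)
```

To get real CPU parallelism, the caller would pass a ProcessPoolExecutor, created under an if __name__ == "__main__": guard, and call asyncio.run(scrape_and_digest(urls, pool)). Leaving executor as None keeps everything in one process, which is only reasonable when the CPU stage is light.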
How This Fails in Real Systems
An ad-tech server parsed user clickstreams and concurrently made external bidding requests. The developers originally selected multiprocessing for everything. Under high traffic, the containerized application exceeded its memory limits (OOM) within minutes because each spawned process duplicated the massive memory footprint of the ad metadata dictionary. The service was rescued by refactoring the external network bidding to use asyncio, reducing memory consumption by 90% and stabilizing the container.
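The memory argument in this story can be sketched as follows. This is not the team's actual code: AD_METADATA, request_bid, and run_auction are hypothetical names, the dictionary contents are made up, and asyncio.sleep stands in for the external bidding call. The point is that every task in a single asyncio process reads the same dictionary object, and a semaphore caps how many requests are in flight at once.

```python
import asyncio

# Hypothetical stand-in for the large ad metadata dictionary.
AD_METADATA = {f"ad_{i}": {"floor_price": i % 7 + 1} for i in range(10_000)}


async def request_bid(
    ad_id: str, limiter: asyncio.Semaphore
) -> tuple[str, int, int]:
    # The semaphore bounds the number of concurrent outbound requests.
    async with limiter:
        await asyncio.sleep(0.01)  # stand-in for the external bidding call
        # Every task looks up the same in-process dict; nothing is copied.
        return ad_id, AD_METADATA[ad_id]["floor_price"], id(AD_METADATA)


async def run_auction(
    ad_ids: list[str], max_in_flight: int = 50
) -> list[tuple[str, int, int]]:
    # Create the semaphore inside the running event loop.
    limiter = asyncio.Semaphore(max_in_flight)
    return await asyncio.gather(*(request_bid(a, limiter) for a in ad_ids))


if __name__ == "__main__":
    bids = asyncio.run(run_auction([f"ad_{i}" for i in range(200)]))
    shared_ids = {object_id for _, _, object_id in bids}
    print(f"{len(bids)} bids used {len(shared_ids)} copy of the metadata")
```

A process-per-request design would instead hold one copy of the dictionary per worker once each worker loads or modifies it, which is the footprint that pushed the container over its limit.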
Key Takeaway
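Match the concurrency model to the bottleneck: multiprocessing for CPU-bound work, asyncio or threading for I/O-bound work. Choosing the wrong one (for example, threading for CPU-bound tasks, or asyncio for tasks that don't primarily involve awaitable I/O) leads to performance bottlenecks or unnecessary complexity.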