collections.deque vs list for Queues
Imagine a line of people. With a list, removing the person at the front means everyone behind them has to step forward. With a deque, it's like a magical revolving door where the front person simply steps out, and the next person is instantly at the front, with no one else needing to move.
collections.deque instead of a list to avoid O(N) shift costs.The Setup
You are implementing an in-memory queue to process logs from a high-throughput websocket endpoint. You store data in a list and pop old messages as new ones arrive to maintain a sliding execution window.
What Does This Print?
import time
queue = list(range(100000))
start = time.perf_counter()
# Dequeue 20,000 items from the front of the list
for _ in range(20000):
queue.pop(0)
duration = time.perf_counter() - start
print(f"Time taken using list: {duration:.5f}s")
print(f"Remaining items: {len(queue)}")
The Output
While 0.18 seconds is fast for a single call, this scales quadratically. If the queue had 1,000,000 items, popping from the front would take upwards of 10-20 seconds. This is because every pop(0) triggers a massive memory reallocation cycle.
Why Python Does This
Python lists are dynamic arrays. When you remove the first element (pop(0)), the underlying C array must shift all remaining pointers one index to the left to keep memory contiguous. This makes popping from the left an O(N) operation. collections.deque is implemented as a doubly-linked list of blocks. Appending or popping from either end is an O(1) operation because it only involves re-pointing block references without shifting any adjacent elements.
The Fix
import time
from collections import deque
# Initialize deque instead of list for efficient O(1) left popping
queue = deque(range(100000))
start = time.perf_counter()
# Dequeue 20,000 items from the front
for _ in range(20000):
queue.popleft() # O(1) operation
duration = time.perf_counter() - start
print(f"Time taken using deque: {duration:.5f}s")
print(f"Remaining items: {len(queue)}")
collections.deque is implemented as a doubly linked list of fixed-size blocks. This structure allows constant-time (O(1)) additions and removals from both ends because only pointers need to be updated, not the underlying array data.
How This Fails in Real Systems
A background task runner used a list to queue system jobs. Under light load, the queue stayed short and healthy. When a traffic spike grew the queue to 300,000 items, the task processor hung indefinitely, spending over 95% of its CPU time shifting array elements in memory.
Key Takeaway
collections.deque instead of a list to avoid O(N) shift costs.pop(0) or insert(0) on large lists.