__slots__ for Memory-Efficient Classes
Imagine a regular Python object as a house with a dedicated, potentially large, storage closet (__dict__) for all its belongings (attributes). When you use __slots__, it's like building a house with fixed-size shelves directly embedded into the walls for specific items, eliminating the need for a separate, bulky closet.
The Setup
A service creates millions of small, similar objects representing telemetry data points in memory. Over time, the service's memory usage grows unexpectedly high, leading to OOM (Out Of Memory) errors in containerized environments. The team suspects the sheer number of objects is the problem.
What Does This Print?
import sys

class DataPoint:
    def __init__(self, x: float, y: float, timestamp: int, sensor_id: str):
        self.x = x
        self.y = y
        self.timestamp = timestamp
        self.sensor_id = sensor_id

# Create many instances
num_points = 100000
points = [DataPoint(i * 1.0, i * 2.0, i, f"sensor_{i % 10}") for i in range(num_points)]

# Calculate approximate memory usage for one instance
print(f"Number of DataPoint instances: {num_points}")
print(f"Size of one DataPoint instance (sys.getsizeof): {sys.getsizeof(points[0])} bytes")
print(f"Approximate total memory for instances (sys.getsizeof * N): {sys.getsizeof(points[0]) * num_points / (1024 * 1024):.2f} MB")
The Output
Each DataPoint instance carries a __dict__ by default. This dictionary stores the instance's attributes (x, y, timestamp, sensor_id) and adds per-instance overhead, often around a hundred bytes in CPython, though the exact figure varies with the interpreter version and the number of attributes. sys.getsizeof is shallow: for an instance it reports only the object's own header and pointers, not the __dict__ it refers to and not the attribute values themselves.
The figure printed (56 bytes in the run described here; the exact value depends on the CPython version and platform) is therefore misleading. The real memory used per instance is considerably higher once the __dict__ behind each DataPoint is counted, so the overall footprint is much larger than sys.getsizeof * N suggests.
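As a rough illustration of how shallow sys.getsizeof is, the sketch below adds the size of the instance's __dict__ to the size of the instance itself. The helper name shallow_plus_dict is hypothetical, and the result is still an underestimate because the attribute values (floats, ints, strings) are not counted.

```python
import sys


class DataPoint:
    def __init__(self, x, y, timestamp, sensor_id):
        self.x = x
        self.y = y
        self.timestamp = timestamp
        self.sensor_id = sensor_id


def shallow_plus_dict(obj):
    """Hypothetical helper: size of the object plus its __dict__, if it has one."""
    total = sys.getsizeof(obj)
    instance_dict = getattr(obj, "__dict__", None)
    if instance_dict is not None:
        total += sys.getsizeof(instance_dict)
    return total


point = DataPoint(1.0, 2.0, 1, "sensor_1")
shallow_size = sys.getsizeof(point)
combined_size = shallow_plus_dict(point)

# Exact numbers vary by CPython version and platform, so none are shown here.
print(f"sys.getsizeof(point):    {shallow_size} bytes")
print(f"point plus its __dict__: {combined_size} bytes")
```

The gap between the two numbers is the part of the cost that the original listing never reports.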
Why Python Does This
By default, Python classes allow dynamic attribute assignment: you can add new attributes to an instance at any time. This flexibility comes from storing instance attributes in a dictionary (__dict__) attached to each object. A dictionary is a hash table, and even a small one carries memory overhead for its internal structure. CPython softens this with optimizations such as key-sharing dictionaries, but the cost per instance remains. When you have many small objects, the per-instance __dict__ overhead can dwarf the memory needed for the data itself, leading to disproportionately high memory consumption. Declaring __slots__ tells Python to reserve a fixed set of attribute slots inside each object instead of attaching a __dict__, which removes the dictionary overhead.
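The dynamic behavior described above can be seen in a minimal sketch. The two-field DataPoint and the attribute name unit are illustrative only and are not part of the service's code.

```python
class DataPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y


point = DataPoint(1.0, 2.0)
attrs_before = dict(vars(point))  # snapshot of the instance __dict__

# No declaration is needed: the new name simply becomes another key in __dict__.
point.unit = "celsius"
attrs_after = dict(vars(point))

print(attrs_before)
print(attrs_after)
```

Every instance pays for this flexibility, whether or not any code ever adds an attribute after construction.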
The Fix
import sys

class DataPoint:
    __slots__ = ('x', 'y', 'timestamp', 'sensor_id')  # FIX: Define __slots__ to prevent __dict__ creation

    def __init__(self, x: float, y: float, timestamp: int, sensor_id: str):
        self.x = x
        self.y = y
        self.timestamp = timestamp
        self.sensor_id = sensor_id

num_points = 100000
points = [DataPoint(i * 1.0, i * 2.0, i, f"sensor_{i % 10}") for i in range(num_points)]

print(f"Number of DataPoint instances: {num_points}")
print(f"Size of one DataPoint instance (sys.getsizeof with __slots__): {sys.getsizeof(points[0])} bytes")
print(f"Approximate total memory for instances (sys.getsizeof * N): {sys.getsizeof(points[0]) * num_points / (1024 * 1024):.2f} MB")

# Attempting to add a new attribute not in __slots__ will raise an AttributeError
try:
    points[0].new_attribute = "test"
except AttributeError as e:
    print(f"\nAttempting to add new_attribute failed as expected: {e}")
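Declaring __slots__ tells Python not to create a __dict__ for each instance. Instead, it reserves space for the named attributes in a compact, C-struct-like layout inside the object, which significantly reduces the memory footprint per instance by removing the hash table behind __dict__. Note that sys.getsizeof for a slotted instance can be the same as, or even larger than, the value for the unslotted class, because the attribute pointers now live in the object itself. The saving comes from the dictionary that no longer exists, and sys.getsizeof never counted that dictionary in the first place.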
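Because sys.getsizeof cannot show the saving, one way to compare the two layouts is to measure actual allocations with the standard-library tracemalloc module. The sketch below is a rough comparison under stated assumptions: the class names PlainPoint and SlottedPoint, the helper bytes_allocated, and the instance count are all hypothetical, and the absolute numbers depend on the CPython version.

```python
import tracemalloc


class PlainPoint:
    def __init__(self, x, y, timestamp, sensor_id):
        self.x = x
        self.y = y
        self.timestamp = timestamp
        self.sensor_id = sensor_id


class SlottedPoint:
    __slots__ = ("x", "y", "timestamp", "sensor_id")

    def __init__(self, x, y, timestamp, sensor_id):
        self.x = x
        self.y = y
        self.timestamp = timestamp
        self.sensor_id = sensor_id


def bytes_allocated(cls, count):
    """Hypothetical helper: bytes traced while building `count` instances of cls."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    points = [cls(i * 1.0, i * 2.0, i, "sensor") for i in range(count)]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del points
    return after - before


COUNT = 20_000
plain_bytes = bytes_allocated(PlainPoint, COUNT)
slotted_bytes = bytes_allocated(SlottedPoint, COUNT)

# Both figures include the list and the attribute values, which are identical
# for the two classes, so the difference reflects the per-instance layout.
print(f"plain:   {plain_bytes / COUNT:.1f} bytes per instance")
print(f"slotted: {slotted_bytes / COUNT:.1f} bytes per instance")
```

Measuring allocations this way reflects what the process actually requested from the allocator, which is closer to what a container memory limit sees than a shallow per-object size.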
How This Fails in Real Systems
A financial analytics service processing real-time market data used simple Python objects to represent individual trades. During peaks in trading volume, the Kubernetes pods running the service consumed 4-5 GB of RAM and were frequently restarted due to OOM errors. Profiling with memory_profiler showed that each Trade object, despite holding only a few numerical fields, occupied over 300 bytes because of its __dict__. Implementing __slots__ reduced the memory footprint per object by more than 70%, which resolved the OOM issues and allowed the service to handle peak loads on existing hardware.
Key Takeaway
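When a class will have a very large number of instances, declare __slots__ to avoid the per-instance __dict__ overhead, which otherwise leads to excessive memory consumption, especially in data-intensive applications.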