Everything Is an Object — id(), Reference Counting, GC

Mental Model

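Think of Python objects as living independently in memory, not tied to the variables that point to them. Variables are just labels. In CPython, an object is freed as soon as its last reference disappears, but objects that reference each other in a cycle keep one another's reference counts above zero. They therefore linger after every label is gone, until the cyclic garbage collector finds them.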
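The labels-versus-objects idea can be checked directly. The sketch below is a minimal illustration that assumes CPython, where `sys.getrefcount` exposes the reference count (it always reports one extra reference for its own argument). The variable names are illustrative only.

```python
import sys

a = [1, 2, 3]
b = a                      # a second label for the same list; no copy is made

same_object = a is b       # identity check
same_id = id(a) == id(b)   # id() is equal for labels bound to the same object

count_with_two_labels = sys.getrefcount(a)
del b                      # removes a label, not the object
count_with_one_label = sys.getrefcount(a)

print(same_object, same_id)
print(count_with_two_labels - count_with_one_label)
```

Deleting `b` lowers the count by one, and the list itself stays alive because `a` still refers to it.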

Rule: When designing structures with back-references or mutual dependencies, use weak references to prevent cycle leaks.

The Setup

You are optimizing a long-running batch processing engine that loads configuration elements and constructs large graph dependencies. Memory usage keeps climbing, and you notice objects are persisting even after the parent engine scope is closed.

What Does This Print?

Broken code
Python
import ctypes

class Node:
    def __init__(self, value):
        self.value = value
        self.edges = []

def build_cycle():
    n1 = Node(1)
    n2 = Node(2)
    n1.edges.append(n2)
    n2.edges.append(n1)
    return id(n1), id(n2)

id1, id2 = build_cycle()
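# We check if the memory addresses are still holding live data.
# CPython-specific and unsafe: id() is the object's address here, and reading
# an address whose object has been freed is undefined behavior.
print(f"Address 1 holding data: {ctypes.cast(id1, ctypes.py_object).value.value}")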
Predict what happens when you run this code. Does accessing the reference-cycle addresses after the function exits print the node values, or does it trigger a segmentation fault?

The Output

What actually happens
Address 1 holding data: 1

It prints the node value successfully. Even though the function has returned and the variables n1 and n2 are no longer accessible, the raw memory address still holds a valid Node object. Reference counting alone could not free the nodes, because each one is still referenced from the other's edges list. The cyclic garbage collector has not run yet, so the memory stays allocated until the next collection. Once that collection happens, the same ctypes read is no longer safe and may crash or return unrelated data.
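A safer way to observe the same behavior is to watch the node through a weak reference instead of a raw address. The following is a minimal sketch, assuming CPython. `build_cycle_probe` is a hypothetical variant of `build_cycle`, and automatic collection is switched off briefly so that a background collection cannot interfere with the demonstration.

```python
import gc
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.edges = []

def build_cycle_probe():
    n1 = Node(1)
    n2 = Node(2)
    n1.edges.append(n2)
    n2.edges.append(n1)
    return weakref.ref(n1)   # a probe that does not keep n1 alive

was_enabled = gc.isenabled()
gc.disable()                          # avoid an automatic collection mid-demo
probe = build_cycle_probe()
alive_before = probe() is not None    # the cycle still holds both nodes
unreachable = gc.collect()            # force a full collection
alive_after = probe() is not None     # the weak reference is now dead
if was_enabled:
    gc.enable()

print(alive_before, alive_after)
```

`gc.collect()` returns the number of unreachable objects it found, which gives a rough signal of how much cyclic garbage had piled up.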

Why Python Does This

Under CPython, every object is represented by a PyObject structure containing ob_refcnt and ob_type. When a reference disappears, for example when a local variable goes out of scope, Python decrements ob_refcnt. If the count hits zero, Python frees the object immediately. Circular references prevent the count from ever reaching zero, because each node has an edge pointing to the other. To handle this, Python relies on a generational cyclic garbage collector that detects unreachable cycles and breaks them. That collector is not triggered the moment a cycle becomes unreachable. It runs when the number of tracked allocations crosses a threshold, or when gc.collect() is called, so the objects stay in memory until a collection occurs.
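Both mechanisms can be inspected from Python. The sketch below assumes CPython: `sys.getrefcount` shows the effect of a single strong edge on `ob_refcnt`, while `gc.get_threshold()` and `gc.get_count()` expose the collector's trigger thresholds and current counters. The actual values vary between interpreter versions, so none are shown here.

```python
import gc
import sys

class Node:
    def __init__(self, value):
        self.value = value
        self.edges = []

n1 = Node(1)
n2 = Node(2)

baseline = sys.getrefcount(n1)   # includes the call's own temporary reference
n2.edges.append(n1)              # one strong edge pointing at n1
after_edge = sys.getrefcount(n1)

thresholds = gc.get_threshold()  # thresholds that trigger automatic collection
counts = gc.get_count()          # current collection counters

print(after_edge - baseline)
print(thresholds, counts)
```

The single strong edge raises the count by exactly one. That extra reference is what remains when the local variables disappear.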

The Fix

Corrected pattern
Python
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.edges = []

def build_weak_cycle():
    n1 = Node(1)
    n2 = Node(2)
    n1.edges.append(weakref.ref(n2))  # weak ref: does not increment n2's refcount
    n2.edges.append(weakref.ref(n1))
    # Return weak refs to observe collection from outside the function
    return weakref.ref(n1), weakref.ref(n2)

ref1, ref2 = build_weak_cycle()
# n1 and n2 went out of scope; only weak refs remain, so on CPython the refcounts hit 0 immediately
print(f"Node 1 still alive: {ref1() is not None}")  # False: freed at once
print(f"Node 2 still alive: {ref2() is not None}")  # False: freed at once

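Weak references do not increment ob_refcnt. When n1 and n2 go out of scope, no strong references to them remain, so both counts hit zero immediately and the nodes are freed by normal reference counting, with no cyclic GC pass required. Calling a weakref.ref object returns None once its target has been collected, signalling that the memory has been reclaimed. In a real graph, something must still own each node strongly (typically the forward edges or a container of nodes), and only the back-references should be weak. Otherwise the nodes vanish as soon as the builder returns, exactly as they do here.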
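A more typical layout keeps forward links strong and makes only the back-link weak. The sketch below is one way to do that, assuming CPython's immediate reference-count cleanup. `TreeNode`, `add_child`, and the `parent` property are hypothetical names chosen for illustration, not part of the example above.

```python
import weakref

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []      # strong: a parent owns its children
        self._parent = None     # weak back-reference, set by add_child

    @property
    def parent(self):
        # Dereference the weak reference; None if unset or already collected.
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        child._parent = weakref.ref(self)
        self.children.append(child)

root = TreeNode("root")
leaf = TreeNode("leaf")
root.add_child(leaf)

parent_while_alive = leaf.parent.value
del root                         # drops the last strong reference to the root
parent_after_del = leaf.parent   # the back-link did not keep the root alive

print(parent_while_alive, parent_after_del)
```

Because the only path from child to parent is weak, no strong cycle ever forms, and the root is released by reference counting as soon as its owner lets go of it.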

How This Fails in Real Systems

A distributed microservice using an in-memory graph to cache route topologies ran out of memory every 4 hours. The Node structures used ordinary strong references for their back-links, so discarded topologies were left behind as unreachable cycles. The GC couldn't keep up with high-throughput updates, leading to a slow memory leak that triggered container OOM kills during peak hours.
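When a leak like this is suspected, counting live instances of a class before and after a forced collection can help confirm that cycles are the cause. The sketch below is a rough diagnostic, not a reconstruction of the incident. `count_live` and `churn` are hypothetical helpers, and disabling the collector is an assumption that stands in for a collector that lags behind the update rate.

```python
import gc

class Node:
    def __init__(self, value):
        self.value = value
        self.edges = []

def count_live(cls):
    # gc.get_objects() lists the objects the collector currently tracks.
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)

def churn(updates):
    # Each simulated update builds a two-node cycle and then drops it.
    for i in range(updates):
        a, b = Node(i), Node(i)
        a.edges.append(b)
        b.edges.append(a)

was_enabled = gc.isenabled()
gc.disable()                                # assumption: models a lagging collector
baseline = count_live(Node)
churn(100)
stranded = count_live(Node) - baseline      # nodes kept alive only by their cycles
gc.collect()
remaining = count_live(Node) - baseline     # what survives a full collection
if was_enabled:
    gc.enable()

print(stranded, remaining)
```

If the stranded count drops to zero after `gc.collect()`, the growth came from collectable cycles. If it does not, something is still holding strong references.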

Key Takeaway

When designing structures with back-references or mutual dependencies, use weak references to prevent cycle leaks.
Common mistake: Developers assume that once a variable goes out of scope, its object is freed immediately, especially when no other strong references are directly visible. That holds only when nothing else refers to the object; members of a reference cycle must wait for the cyclic garbage collector.