← Python Code Databases
Browse Python Concepts

The N+1 Query Problem — How Lazy Loading Creates It

Mental Model

Imagine fetching a list of books and then, for each book, running to the library's "author" section to get the author's details. Lazy loading is like this: one trip for books, then 'N' individual trips for 'N' authors. Eager loading is like asking for all books and their authors in one efficient trip.

Rule: Always specify eager loading options like selectinload or joinedload on relationships when querying parent objects if you plan to access their children.

The Setup

You are building an API endpoint to list user posts. You write a query to fetch all users and iterate over them to compile their posts, completely unaware that SQLAlchemy is hitting the database with a separate SQL query for every single row.

What Does This Print?

Broken code
Python
import logging
from sqlalchemy import create_engine, select, ForeignKey, Column, Integer, String
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

# Configure SQL logging to visualize database hits
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

Base = declarative_base()
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    username = Column(String)
    posts = relationship("Post", back_populates="author")

class Post(Base):
    __tablename__ = 'posts'
    id = Column(Integer, primary_key=True)
    title = Column(String)
    author_id = Column(Integer, ForeignKey('users.id'))
    author = relationship("User", back_populates="posts")

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()

u1, u2, u3 = User(username="u1"), User(username="u2"), User(username="u3")
session.add_all([u1, u2, u3, Post(title="p1", author=u1), Post(title="p2", author=u2), Post(title="p3", author=u3)])
session.commit()

print("--- Executing query ---")
users = session.scalars(select(User)).all()

print("--- Accessing posts ---")
for user in users:
    # Accessing user.posts triggers an implicit database query per iteration
    print(f"User: {user.username} has posts: {[p.title for p in user.posts]}")
Count the exact number of SQL queries generated when fetching three users and accessing their posts. Is it one query or four?

The Output

What actually happens
--- Executing query --- INFO:sqlalchemy.engine.Engine:SELECT users.id AS users_id, users.username AS users_username FROM users --- Accessing posts --- INFO:sqlalchemy.engine.Engine:SELECT posts.id AS posts_id, posts.title AS posts_title, posts.author_id AS posts_author_id FROM posts WHERE ? = posts.author_id User: u1 has posts: ['p1'] INFO:sqlalchemy.engine.Engine:SELECT posts.id AS posts_id, posts.title AS posts_title, posts.author_id AS posts_author_id FROM posts WHERE ? = posts.author_id User: u2 has posts: ['p2'] INFO:sqlalchemy.engine.Engine:SELECT posts.id AS posts_id, posts.title AS posts_title, posts.author_id AS posts_author_id FROM posts WHERE ? = posts.author_id User: u3 has posts: ['p3']

The code generates 4 queries in total (1 to load the users, and then 3 separate queries to load each user's posts). This is the classic N+1 query problem. By default, SQLAlchemy relationships use lazy="select". This means the related collection is not loaded until the attribute (user.posts) is accessed inside Python.

Why Python Does This

SQLAlchemy utilizes dynamic Python descriptors (InstrumentedAttribute) to override attribute access (user.posts). When you access posts, the descriptor triggers a lookup. If the collection is not marked as loaded inside the object's InstanceState, SQLAlchemy invokes the current session's state engine, compiles an ad-hoc SQL select statement on the fly using the target object's primary key, and executes it immediately. This happens dynamically at runtime on every iteration.

The Fix

Corrected pattern
Python
from sqlalchemy import create_engine, select, ForeignKey, Column, Integer, String
from sqlalchemy.orm import declarative_base, relationship, sessionmaker, selectinload

Base = declarative_base()
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    username = Column(String)
    posts = relationship("Post", back_populates="author")

class Post(Base):
    __tablename__ = 'posts'
    id = Column(Integer, primary_key=True)
    title = Column(String)
    author_id = Column(Integer, ForeignKey('users.id'))
    author = relationship("User", back_populates="posts")

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()

u1, u2 = User(username="u1"), User(username="u2")
session.add_all([u1, u2, Post(title="p1", author=u1), Post(title="p2", author=u2)])
session.commit()

# FIX: Use selectinload to eagerly fetch relationships in exactly 2 optimized queries
stmt = select(User).options(selectinload(User.posts))
users = session.scalars(stmt).all()

for user in users:
    print(f"User: {user.username} posts: {[p.title for p in user.posts]}")

Using selectinload(User.posts) tells SQLAlchemy to fetch all User objects in one query, and then fetch all associated Post objects for those users in a second, single optimized query (using an IN clause), eliminating the N individual queries.

How This Fails in Real Systems

An e-commerce platform displayed products along with reviews on its home page. When traffic spiked, the site began hanging. Profiling revealed a default lazy relationship loading reviews inside a list of 100 products, resulting in 101 queries per page render, crashing the database connection pool in under 10 minutes.

Key Takeaway

Always specify eager loading options like selectinload or joinedload on relationships when querying parent objects if you plan to access their children.
Common mistake: Developers assume that accessing related objects in a loop after querying parent objects will efficiently load all children in one go, unaware that SQLAlchemy's default lazy loading performs a separate query for each access.