← Python Code Loops, Functions & Scopes
Browse Python Concepts

yield and Generators — Lazy Evaluation in Python

Mental Model

Think of calling a generator function like ordering a custom-made meal. You place the order, and the chef (the generator function) acknowledges it by handing you a ticket (the generator object). The actual cooking (the code inside the generator) only begins when you try to eat the meal (iterate the generator).

Rule: Always split generator functions into an eager validation stage and a lazy generator stage if prerequisites must be verified at invocation time.

The Setup

You write a log-parsing service that opens a large file, filters lines with a generator, and measures execution times. You expect the file reading and validation exceptions to raise immediately when the generator function is called.

What Does This Print?

Broken code
Python
import os

def log_reader(filepath):
    if not os.path.exists(filepath):
        raise FileNotFoundError(f"Missing log file: {filepath}")
    with open(filepath, 'r') as f:
        for line in f:
            yield line.strip()

# We pass a non-existent file path
try:
    stream = log_reader("non_existent_system_log.log")
    print("Generator object successfully created.")
except FileNotFoundError:
    print("Caught file not found error immediately!")
Predict what the code prints. Will the FileNotFoundError be caught in the try-except block when calling log_reader?

The Output

What actually happens
Generator object successfully created.

The code prints "Generator object successfully created." and does not catch the exception. This is because calling a generator function does not execute any code inside the function body. Instead, it only returns a generator object. The code inside log_reader, including the file existence check and the FileNotFoundError raise, will not execute until the generator is iterated over using next() or a loop.

Why Python Does This

In CPython, a function definition containing the yield keyword is compiled with a specific flag in its code object: CO_GENERATOR. When called, the interpreter checks for this flag and bypasses standard frame execution. Instead of executing the bytecode, it instantiates a PyGenObject which wraps the frame, arguments, and local variables, then returns it. The frame remains suspended in memory at bytecode offset 0. Execution only begins when the frame's generator is advanced via tp_iternext (triggered by next()), making error-checking lazy and temporally uncoupled from instantiation.

The Fix

Corrected pattern
Python
import os

def log_reader(filepath):
    # Eagerly validate prerequisites before entering the generator frame
    if not os.path.exists(filepath):
        raise FileNotFoundError(f"Missing log file: {filepath}")
    
    # Inner generator function to stream values
    def _generator():
        with open(filepath, 'r') as f:
            for line in f:
                yield line.strip()
    return _generator()  # Return the generator after eager checks pass

Splitting the generator function into an eager setup phase and a lazy yield phase ensures that critical preconditions, like file existence, are verified at the moment the generator object is created, allowing immediate feedback on invalid inputs, rather than deferring the error until the first iteration.

How This Fails in Real Systems

An asynchronous backup worker initialized file generators for critical configuration directories. Because exceptions were deferred until downstream iteration, the microservice started successfully, but silently failed during scheduled backup execution hours later when the missing directories caused uncaught runtime exceptions.

Key Takeaway

Always split generator functions into an eager validation stage and a lazy generator stage if prerequisites must be verified at invocation time.
Common mistake: Developers expect generator functions to execute their initial setup logic (like file existence checks) immediately upon being called, similar to regular functions.