Python dataclasses — Replacing __init__ Boilerplate
Imagine dataclasses building a single __init__ method by concatenating the fields of the parent and child in declaration order. If the parent introduces defaults, any new non-default field in the child breaks the generated signature, unless that field is made keyword-only with the kw_only=True parameter on the decorator.
The Setup
You are building a configuration layer for a distributed application. You define a base class with global default settings, then subclass it to add database settings that the caller must supply. When you try to start the service, Python raises an error as soon as the module is imported.
What Does This Print?
from dataclasses import dataclass

@dataclass
class BaseConfig:
    environment: str = "production"
    timeout: int = 30

@dataclass
class DatabaseConfig(BaseConfig):
    connection_string: str
The Output
When this module is imported, Python raises a TypeError at class definition time, before a single instance is created. The message names connection_string as a non-default argument that follows a default argument. The @dataclass decorator builds an __init__ method for DatabaseConfig by combining the fields of the parent class with those of the child class. The signature it would need is equivalent to def __init__(self, environment: str = "production", timeout: int = 30, connection_string: str). In Python, a positional parameter without a default cannot follow one that has a default. Because connection_string has no default and comes after two defaulted fields, it breaks this rule.
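To observe the error without stopping the interpreter, the class definitions can be wrapped in a try block. This is a minimal sketch: the helper name define_broken_config is made up for illustration, and the exact wording of the message can differ between Python versions.

```python
from dataclasses import dataclass


def define_broken_config():
    """Define the classes from the example and return the error text, if any."""

    @dataclass
    class BaseConfig:
        environment: str = "production"
        timeout: int = 30

    try:
        # The decorator runs as soon as the class body finishes,
        # so the error surfaces here, not when an instance is created.
        @dataclass
        class DatabaseConfig(BaseConfig):
            connection_string: str
    except TypeError as exc:
        return str(exc)
    return None


message = define_broken_config()
print(message)
```

No DatabaseConfig instance is ever constructed in this sketch, which confirms that the failure belongs to class creation.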
Why Python Does This
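The dataclasses module builds the field list by walking the class's __mro__ (Method Resolution Order) in reverse, from the most distant base class down to the class being decorated. It copies the fields of every base class that is itself a dataclass, then adds the annotated fields declared in the new class. The result is one ordered mapping that runs from parent to child, and a field redefined in a subclass keeps its original position. Because inherited fields with defaults come first, they occupy the leading positions in the synthesized argument list. Reordering the parameters to put required ones first would change the meaning of positional calls that already work for the parent, so dataclasses does not attempt it. Instead, before generating the __init__ method, the decorator checks the positional fields and raises TypeError itself when a field without a default follows one with a default. This is a runtime check performed at class creation, not a compile-time SyntaxError.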
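The parent-to-child ordering can be inspected with dataclasses.fields(). The sketch below uses a hypothetical ServiceConfig subclass (not part of the original example) whose fields all have defaults, so it defines cleanly.

```python
from dataclasses import dataclass, fields


@dataclass
class BaseConfig:
    environment: str = "production"
    timeout: int = 30


@dataclass
class ServiceConfig(BaseConfig):  # hypothetical subclass for illustration
    # Redefining an inherited field keeps its original slot in the order.
    timeout: int = 5
    # A new field is appended after every inherited field.
    retries: int = 3


field_names = [f.name for f in fields(ServiceConfig)]
print(field_names)  # ['environment', 'timeout', 'retries']
```

The subclass cannot move timeout ahead of environment, and retries can only land at the end, which is why a new required positional field always ends up behind the parent's defaults.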
The Fix
from dataclasses import dataclass, field

@dataclass
class BaseConfig:
    environment: str = "production"
    timeout: int = 30

@dataclass
class DatabaseConfig(BaseConfig):
    # By providing a default factory or default value, we respect the inherited parameter order.
    # Alternatively, we can use kw_only=True on the dataclass to allow any parameter order.
    connection_string: str = field(default="postgresql://localhost:5432/db")
Giving connection_string a default value means every field in the generated __init__ signature now has a default, so the ordering rule is satisfied. The cost is that the field becomes optional, which may not suit a value the caller is meant to supply. An alternative, available since Python 3.10, is @dataclass(kw_only=True) on the subclass, which marks the fields declared in that subclass as keyword-only. Keyword-only arguments are exempt from the positional ordering constraint, so a required one can appear after defaulted positional arguments.
How This Fails in Real Systems
In a fast-growing fintech pipeline, a developer refactored a shared configuration library to use dataclasses and introduced an inherited BaseConfig with default retry logic. Subclasses that declared fields without defaults were imported dynamically by worker processes, so the error did not appear until deployment. Every worker in the Celery task fleet crashed at startup, and the rollback took 35 minutes.
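Because the error fires at import time, one possible safeguard is a smoke test that imports every dynamically loaded module before deployment. The sketch below is an assumption about how such a check could look, not a description of what the team in the story did. The function name find_import_failures and the module names are hypothetical stand-ins.

```python
import importlib


def find_import_failures(module_names):
    """Import each module and return {name: error text} for those that fail."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # a dataclass TypeError would land here too
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Stand-in names: "json" imports cleanly, the second module does not exist.
failures = find_import_failures(["json", "no_such_config_module_for_demo"])
print(failures)
```

Run in CI against the real list of worker modules, a non-empty result would flag a broken class definition before any worker tries to load it.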
Key Takeaway
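Python's rule that parameters without defaults cannot follow parameters with defaults applies to the __init__ signature, which dataclasses automatically generate. When a parent dataclass defines defaults, every new field in a subclass needs either a default of its own or the kw_only=True parameter on the decorator.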