← Python Code AI Agents & LLM Apps
Browse Python Concepts

Structured Output — Getting Reliable JSON from an LLM

Mental Model

Imagine an LLM as a very enthusiastic but sometimes messy chef. You ask for a specific dish (JSON), but it might present it on a fancy, decorated plate (markdown fences), or with extra garnish (conversational text). Before you can use the dish, you need to clean it off the plate and remove any non-food items.

Rule: Never assume LLMs return clean JSON strings; always use regex cleansing or an orchestration SDK before validation.

The Setup

You are building a microservice that extracts metadata from customer emails. You instruct the LLM to return data matching a Pydantic structure, then immediately run standard Pydantic validation on the response text.

What Does This Print?

Broken code
Python
from pydantic import BaseModel, ValidationError

class LeadMetadata(BaseModel):
    deal_value: float
    contact_email: str

# Simulated typical raw LLM response wrapping JSON in markdown code blocks
raw_llm_response = """```json
{
  "deal_value": 15000.00,
  "contact_email": "sales@enterprise.com"
}
```"""

try:
    # Attempting to load the markdown response directly
    lead = LeadMetadata.model_validate_json(raw_llm_response)
    print("Success! Deal value:", lead.deal_value)
except ValidationError as e:
    print("Validation crashed:", e)
Predict what happens when Pydantic parses the markdown-wrapped JSON output.

The Output

What actually happens
Validation crashed: 1 validation error for LeadMetadata Invalid JSON: json error: expected value at line 1 column 1 [type=invalid_json, input_value='```json\n{\n "deal_val...}\n```', input_type=str]

Pydantic fails to parse the JSON and throws a ValidationError. Despite the content containing perfectly formatted JSON, the model's output helper appended the markdown wrapping code blocks ( `json ... ` ). Pydantic's model_validate_json expects a clean JSON string; the presence of prepended and appended markdown characters breaks the rust-backed JSON deserializer.

Why Python Does This

Under the hood, Pydantic V2's deserialization is written in Rust (pydantic-core) to maximize speed. The Rust JSON parser is highly strict about compliance with the JSON specification. It expects the very first byte of the string to be the start of a valid JSON structure ({ or [). It does not scan, ignore, or strip leading non-whitespace characters like markdown fences. Therefore, passing an LLM string that contains markdown styling results in an instant structural parsing failure before any fields are even evaluated. You must programmatically extract the raw JSON substring or use structured-output integration tools like instructor that handle underlying regex extraction, clean-up, and retry policies before Pydantic receives the payload.

The Fix

Corrected pattern
Python
import re
from pydantic import BaseModel

class LeadMetadata(BaseModel):
    deal_value: float
    contact_email: str

def parse_llm_json(raw_input: str) -> LeadMetadata:
    # Regex to extract content from markdown fences or grab raw curly braces
    match = re.search(r"(?:```json)?\s*(\{.*?\})\s*(?:```)?", raw_input, re.DOTALL)
    if not match:
        raise ValueError("No valid JSON structure found in output")
        
    clean_json = match.group(1)
    # Safely feed only the clean JSON string to Pydantic
    return LeadMetadata.model_validate_json(clean_json)

The regex r"(?:`json)?\s*(\{.*?\})\s*(?:`)?" with re.DOTALL targets the actual {...} JSON object directly, stripping any surrounding markdown fences. By extracting only the raw JSON string before passing it to model_validate_json, Pydantic's Rust parser receives input that starts with { — exactly what the JSON spec requires — and validation succeeds regardless of how the LLM wrapped its output.

How This Fails in Real Systems

An automated invoice pipeline extracted purchase order values using LLMs. When the model provider updated its default completion weights, the model began wrapping JSON outputs in markdown formatting. This triggered instant parsing crashes in production, blocking over 2,000 invoices and stalling supply chains for 36 hours.

Key Takeaway

Never assume LLMs return clean JSON strings; always use regex cleansing or an orchestration SDK before validation.
Common mistake: Developers assume LLMs will always return perfectly clean, raw JSON when instructed, neglecting the common tendency of models to wrap output in markdown or other conversational elements.