Instructor: Structured LLM Outputs
Purpose
Implementation patterns for using Instructor to extract validated, type-safe structured data from LLM responses. Instructor wraps any LLM client (OpenAI, Anthropic, Ollama, etc.) and uses Pydantic models as the return type schema — automatically retrying with the validation error as feedback when the model produces malformed output. It is a standard tool for building reliable data extraction, classification, and structured generation pipelines.
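Concretely, the retry loop is powered by ordinary Pydantic validation: when a response fails to validate, the validation error text is what gets fed back to the model on the next attempt. A minimal local sketch of that feedback, with no LLM call (the `Profile` model here is illustrative):

```python
from pydantic import BaseModel, Field, ValidationError

class Profile(BaseModel):
    name: str
    age: int = Field(ge=0, le=120)

# Simulate a malformed model response failing validation locally
feedback = ""
try:
    Profile(name="Jane Doe", age=999)
except ValidationError as exc:
    # This error text is what Instructor appends to the retry prompt
    feedback = str(exc)
print(feedback)
```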
Examples
- Entity extraction from unstructured text
- Document classification with confidence scores
- Structured reasoning outputs (chain-of-thought → typed answer)
- Multi-entity batch extraction pipelines
Architecture
Installation:
```shell
pip install instructor pydantic
# Provider extras:
pip install "instructor[anthropic]"  # Claude
pip install "instructor[openai]"     # OpenAI
```

Basic extraction pattern:
```python
import instructor
from pydantic import BaseModel, Field
from anthropic import Anthropic

client = instructor.from_anthropic(Anthropic())

class UserProfile(BaseModel):
    name: str = Field(description="Full name")
    age: int = Field(ge=0, le=120, description="Age in years")
    email: str = Field(description="Email address")

profile = client.messages.create(
    model="claude-3-5-haiku-latest",
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": "Jane Doe is 28 years old, her email is jane@example.com"
    }],
    response_model=UserProfile
)
print(profile.name)  # "Jane Doe"
print(profile.age)   # 28
```

With OpenAI:
```python
from openai import OpenAI

client = instructor.from_openai(OpenAI())
result = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserProfile,
    messages=[{"role": "user", "content": "Extract: Bob Smith, 35, bob@co.com"}]
)
```

Nested models and enums:
```python
from enum import Enum
from typing import Optional

class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"

class Address(BaseModel):
    city: str
    country: str

class ReviewAnalysis(BaseModel):
    sentiment: Sentiment
    confidence: float = Field(ge=0.0, le=1.0)
    key_points: list[str] = Field(description="Main points, max 5 items")
    address_mention: Optional[Address] = None

analysis = client.messages.create(
    model="claude-3-5-haiku-latest",
    max_tokens=512,
    messages=[{"role": "user", "content": "Analyze: 'Great product, shipped fast from London!'"}],
    response_model=ReviewAnalysis
)
```

Custom validators with automatic retry:
```python
from pydantic import field_validator

class StructuredCode(BaseModel):
    language: str
    code: str

    @field_validator("code")
    @classmethod
    def no_placeholder_comments(cls, v: str) -> str:
        if "# TODO" in v or v.strip() == "pass":
            raise ValueError("Code must be complete, not a stub")
        return v

# Instructor sends the ValueError message back to the LLM and retries
snippet = client.messages.create(
    model="claude-3-5-haiku-latest",
    max_tokens=512,
    max_retries=3,
    messages=[{"role": "user", "content": "Write a Python hello world function"}],
    response_model=StructuredCode
)
```

Streaming partial results:
```python
from pydantic import BaseModel

class Report(BaseModel):
    title: str
    sections: list[str]
    summary: str

# Stream as fields populate — good for real-time UI updates
for partial in client.messages.create_partial(
    model="claude-3-5-haiku-latest",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short report on transformers"}],
    response_model=Report
):
    print(f"\rTitle: {partial.title or '...'}", end="")
```

Streaming iterable of items:
```python
from typing import Literal

class Task(BaseModel):
    title: str
    priority: Literal["high", "medium", "low"]

# Returns items one by one as the LLM generates them
for task in client.messages.create_iterable(
    model="claude-3-5-haiku-latest",
    max_tokens=512,
    messages=[{"role": "user", "content": "List 5 ML project tasks"}],
    response_model=Task
):
    print(f"- [{task.priority}] {task.title}")
```

Batch extraction pipeline:
```python
from concurrent.futures import ThreadPoolExecutor

def extract_one(text: str) -> UserProfile:
    return client.messages.create(
        model="claude-3-5-haiku-latest",
        max_tokens=256,
        messages=[{"role": "user", "content": text}],
        response_model=UserProfile
    )

texts = ["Alice, 30, alice@co.com", "Bob Smith 25 bob@co.com"]
with ThreadPoolExecutor(max_workers=4) as pool:
    profiles = list(pool.map(extract_one, texts))
```

Using local models via Ollama:
```python
from openai import OpenAI

client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON
)
result = client.chat.completions.create(
    model="llama3.1",
    response_model=UserProfile,
    messages=[{"role": "user", "content": "Extract: ..."}]
)
```

Key design guidelines:
- Use `Field(description=...)` — the description is injected into the JSON schema the LLM sees
- Use `Field(ge=..., le=..., min_length=...)` to constrain values and reduce retries
- Keep models flat when possible; deeply nested models increase generation difficulty
- For long batches, set `max_retries=1` to fail fast and log failures separately
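The chain-of-thought → typed answer pattern from the Examples list is not shown above; a minimal sketch is a model whose reasoning field is declared before the answer field, so the model writes its reasoning before committing to a final value (JSON fields are typically generated in declaration order; the field names and prompt here are illustrative):

```python
from pydantic import BaseModel, Field

class ReasonedAnswer(BaseModel):
    # Declared first so the reasoning is generated before the answer
    reasoning: str = Field(description="Step-by-step reasoning before the final answer")
    answer: int = Field(description="Final numeric answer")

# Usage sketch (assumes the Anthropic client from the basic pattern):
# result = client.messages.create(
#     model="claude-3-5-haiku-latest",
#     max_tokens=512,
#     messages=[{"role": "user", "content": "What is 17 * 23? Think step by step."}],
#     response_model=ReasonedAnswer,
# )
# print(result.answer)
```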