Pydantic AI + Ertas
Build production agents with Pydantic AI — the type-safe Python agent framework from the Pydantic team, with first-class support for fine-tuned local models via OpenAI-compatible endpoints.
Overview
Pydantic AI is the agent framework from the team behind Pydantic and Logfire. Pydantic is the validation library that became Python's de facto standard for runtime data integrity, used by virtually every modern Python web framework, and Pydantic AI extends that same type-safety discipline to LLM agents: every tool, every input, every output is a typed Pydantic model, validated at runtime, with the full Pydantic error reporting and IDE autocompletion that Python developers already rely on. The framework reached its v1 release in April 2026 and rapidly became the rising-star alternative to LangGraph for teams that prioritize type safety and developer ergonomics over graph-based orchestration features. The early-May 2026 v1.90.x and v1.93.x point releases added an explicit `tool_choice` setting on `Agent`, dedicated `OutputToolCallEvent` and `OutputToolResultEvent` events for fine-grained streaming, and OpenAI Conversations API state support, improvements that make production usage more controllable and observable.
The framework's design borrows directly from FastAPI: dependencies are declared via type hints, tools are functions decorated with `@agent.tool`, and validation is automatic. Unlike older agent frameworks that handle structured output through prompt engineering and post-hoc parsing, Pydantic AI uses the underlying model's native function-calling capabilities and surfaces validation failures as Python exceptions you can catch and retry. This produces dramatically more reliable agents — when something goes wrong, you get a clear stack trace, not a model that hallucinates fields and silently fails downstream.
Pydantic AI is model-agnostic. The framework supports OpenAI, Anthropic, Google, and any OpenAI-compatible endpoint through its provider abstraction, which makes it a natural fit for self-hosted local models served via Ollama, vLLM, or LM Studio. The combination of Pydantic AI's type safety and a fine-tuned local model is particularly compelling: the framework enforces schema compliance at the boundary, the fine-tuned model produces high-quality outputs in the trained format, and together they deliver agents that are both reliable and economical.
How Ertas Integrates
Ertas-trained models work with Pydantic AI through its OpenAI provider configured against a local endpoint. After fine-tuning your model in Ertas Studio and exporting to GGUF, you serve it via Ollama, vLLM, or Ertas Cloud, then point Pydantic AI's `OpenAIModel` provider at the endpoint. Every agent in your codebase — whether it makes a single LLM call or coordinates a complex multi-step workflow — uses your Ertas-trained model as the underlying engine.
The integration is particularly powerful because of how Pydantic AI handles structured outputs. The framework's `result_type` parameter accepts any Pydantic model, and the agent will validate the LLM's output against that schema, raising a `ValidationError` if the output doesn't conform. For fine-tuned models specifically — where Ertas Studio's training process can include schema-conformance examples — this becomes a self-reinforcing loop: the model produces outputs that match the schema, Pydantic AI validates them, and any failures (which become rare after fine-tuning) surface as catchable exceptions you can retry or log.
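This validation boundary can be sketched with plain Pydantic, independent of the agent loop; the `TicketTriage` fields and the raw model outputs below are illustrative:

```python
from pydantic import BaseModel, ValidationError

class TicketTriage(BaseModel):
    category: str
    priority: int  # 1-5
    requires_human: bool

# A conforming output (what a fine-tuned model is trained to emit) validates cleanly
good = TicketTriage.model_validate_json(
    '{"category": "outage", "priority": 1, "requires_human": true}'
)
print(good.priority)  # 1

# A non-conforming output fails loudly instead of propagating bad data downstream
try:
    TicketTriage.model_validate_json('{"category": "outage", "priority": "urgent"}')
except ValidationError as exc:
    print(exc.error_count(), "validation error(s)")  # bad type + missing field
```

Pydantic AI runs this same check for you against `result_type` after each agent run; the sketch just shows what "failures surface as catchable exceptions" means in practice.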
For mobile deployments via the Ertas Deployment CLI, Pydantic AI's lightweight runtime is a strong fit. The framework adds minimal overhead beyond the LLM call itself, which matters when running an agent loop on-device through llama.cpp. Type safety on the host application's side (whether the host is a Python backend or a mobile app calling out to a local inference server) makes contracts explicit and reduces the integration surface where errors typically hide.
Getting Started
1. Fine-tune your domain model in Ertas Studio
Train on data that includes the structured outputs and tool calls your Pydantic AI agent will use. Studio's JSONL format maps directly to Pydantic AI's `messages` and `result_type` patterns.
2. Deploy to an OpenAI-compatible endpoint
Export to GGUF and serve via Ollama, vLLM, or Ertas Cloud. Pydantic AI's OpenAI provider connects to any endpoint that exposes the standard chat-completion API.
3. Install Pydantic AI and configure the model provider
Install `pydantic-ai`. Create an `OpenAIModel` (or `OpenAIProvider`) instance pointed at your Ertas inference endpoint with your model name.
4. Define typed tools and result schemas
Declare tools as decorated Python functions with type hints. Define result schemas as Pydantic models. Pydantic AI handles validation, retries on schema failures, and clear error reporting automatically.
5. Run the agent and handle validation
Call `agent.run()` or `agent.run_sync()` with your input. Pydantic AI returns a typed result. Catch `ValidationError` for structured retry logic, or rely on the framework's built-in retry behavior.
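Step 2's Ollama path is essentially a one-line Modelfile plus two commands; the file path and model name here are illustrative:

```
# Modelfile: register the exported GGUF with Ollama
FROM ./ertas-support-agent-7b.gguf
```

```shell
ollama create ertas-support-agent-7b -f Modelfile
ollama serve  # exposes an OpenAI-compatible API at http://localhost:11434/v1
```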
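The JSONL training data referenced in step 1 might look like the following; the field names assume an OpenAI-style `messages` schema, so check Studio's export documentation for the exact shape:

```python
import json

# One chat-style training example per line of the .jsonl file.
# The assistant turn carries the schema-conformant output the model
# should learn to produce.
example = {
    "messages": [
        {"role": "system", "content": "You triage customer-support tickets."},
        {"role": "user", "content": "Dashboard is down for all users."},
        {
            "role": "assistant",
            "content": (
                '{"category": "outage", "priority": 1, '
                '"suggested_team": "infra", "requires_human": true}'
            ),
        },
    ]
}

line = json.dumps(example)  # one line of the training file
roundtrip = json.loads(line)
print([m["role"] for m in roundtrip["messages"]])  # ['system', 'user', 'assistant']
```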
```python
from pydantic import BaseModel
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel

# Point Pydantic AI at your Ertas-trained model served via Ollama
model = OpenAIModel(
    "ertas-support-agent-7b",
    base_url="http://localhost:11434/v1",
    api_key="not-needed",  # Ollama ignores the key, but the client requires one
)

class TicketTriage(BaseModel):
    category: str
    priority: int  # 1-5
    suggested_team: str
    requires_human: bool

agent = Agent(
    model,
    result_type=TicketTriage,
    system_prompt="You triage customer-support tickets for a SaaS product.",
)

@agent.tool
async def lookup_customer(ctx: RunContext, customer_id: str) -> dict:
    """Fetch customer record from the CRM."""
    return await crm.get_customer(customer_id)  # `crm` is your own CRM client

# The agent returns a validated TicketTriage object — or raises ValidationError
result = agent.run_sync(
    "Customer 12345 says the dashboard hasn't loaded for 30 minutes."
)
print(result.data.priority, result.data.suggested_team)
```
Benefits
- Type safety at every boundary — tools, inputs, outputs are all typed Pydantic models
- FastAPI-like ergonomics that Python developers already know
- Native function-calling support — no prompt-engineered JSON parsing
- Automatic schema validation with catchable Python exceptions on failure
- Lightweight runtime — adds minimal overhead beyond the LLM call itself
- Pairs naturally with fine-tuned models that produce schema-conformant outputs
- Integration with Logfire for production observability
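The "catchable exceptions" benefit enables a simple feedback-and-retry pattern. This sketch stubs the model call with canned responses (`fake_model_call` is not part of Pydantic AI, which has its own built-in retry support); the point is the shape of the loop:

```python
from pydantic import BaseModel, ValidationError

class TicketTriage(BaseModel):
    category: str
    priority: int  # 1-5

# Stub standing in for an LLM call; the first reply is malformed, the second conforms.
_replies = iter([
    '{"category": "outage", "priority": "high"}',  # bad: priority is not an int
    '{"category": "outage", "priority": 1}',
])

def fake_model_call(prompt: str) -> str:
    return next(_replies)

def triage_with_retry(prompt: str, attempts: int = 3) -> TicketTriage:
    last_error = None
    for _ in range(attempts):
        raw = fake_model_call(prompt)
        try:
            return TicketTriage.model_validate_json(raw)
        except ValidationError as exc:
            # Feed the validation errors back so the model can self-correct
            prompt = f"{prompt}\nFix these errors and respond again: {exc}"
            last_error = exc
    raise last_error

result = triage_with_retry("Dashboard is down.")
print(result.priority)  # 1
```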
Related Resources
Fine-Tuning
Function Calling
Inference
Structured Output
Fine-Tuning for Structured Output: Beyond JSON Mode to Guaranteed Schemas
Fine-Tuning for Tool Calling: How to Build Reliable AI Agents with Small Models
Fine-Tuning for Better JSON Output: Why Small Models Struggle and How to Fix It
LangGraph
Ollama
OpenAI Agents SDK
smolagents
vLLM
Ertas for SaaS Product Teams
Ertas for Customer Support
Ertas for Data Extraction
Ship AI that runs on your users' devices.
Early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.