OpenAI Agents SDK + Ertas
Build agents with OpenAI's official Agents SDK — a minimal, composable framework that works equally well with OpenAI APIs and self-hosted Ertas-trained models via OpenAI-compatible endpoints.
Overview
The OpenAI Agents SDK is OpenAI's official Python framework for building agents, released as the successor to the experimental Swarm project. It is deliberately minimal — a few core primitives (Agent, Tool, Handoff, Runner) that compose into arbitrarily complex agentic workflows without the conceptual overhead of older frameworks. The SDK is documented and supported as a first-class part of the OpenAI ecosystem and ships with built-in integrations for tracing, evaluation, and the OpenAI Responses API. The May 7, 2026 v0.17.0 release bumped the default RealtimeAgent model to `gpt-realtime-2` and continued the SDK's roughly biweekly cadence — a useful baseline when pinning the SDK version in production.
The framework's defining design choice is that it does *not* lock you into OpenAI's models. The Agents SDK accepts any model accessible through an OpenAI-compatible API, which includes Ollama, vLLM, LM Studio Server, llama.cpp's HTTP server, and most modern self-hosted inference runtimes. This makes the SDK a particularly attractive entry point for teams that want OpenAI's developer experience and tooling but plan to run on local or fine-tuned models in production.
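As a minimal sketch of what that looks like in practice (the base URL assumes Ollama's default local port; the tracing choice is optional), you can point the whole SDK at a self-hosted endpoint in a few lines:

from openai import AsyncOpenAI
from agents import set_default_openai_api, set_default_openai_client, set_tracing_disabled

# Any OpenAI-compatible server works; this assumes Ollama's default local port.
local_client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")
set_default_openai_client(local_client)

# Most self-hosted runtimes expose Chat Completions rather than the Responses API.
set_default_openai_api("chat_completions")

# Optional: skip uploading traces to the OpenAI platform if you have no OpenAI API key.
set_tracing_disabled(True)

Per-agent configuration, shown under Getting Started below, works too if you want to mix hosted and self-hosted models in one workflow.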
The SDK is also TypeScript-friendly via its companion `@openai/agents` package, which mirrors the Python API and is designed to plug into the same backends. For teams building on top of Vercel AI SDK, Next.js, or React Native — the JavaScript ecosystem most mobile-app builders already work in — the OpenAI Agents SDK is one of the most accessible Python-or-TypeScript agent frameworks in 2026.
How Ertas Integrates
Ertas-trained models work with the OpenAI Agents SDK through the framework's standard model configuration. After fine-tuning in Studio and exporting to GGUF, you serve the model via Ollama, vLLM, or Ertas Cloud, then point an `AsyncOpenAI` client at your endpoint URL with a placeholder API key and pass it to the SDK's `OpenAIChatCompletionsModel` (or `OpenAIResponsesModel`, if your server implements the Responses API). Every agent, every handoff, and every tool call now runs against your fine-tuned model.
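The same pattern holds for any backend. As a sketch, here is the per-agent version against a vLLM server (the port, model name, and instructions are placeholder assumptions):

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel

# vLLM serves an OpenAI-compatible API on port 8000 by default.
vllm_client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

support_agent = Agent(
    name="Support Agent",
    instructions="Answer questions about the user's account.",
    model=OpenAIChatCompletionsModel(model="ertas-support-8b", openai_client=vllm_client),
)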
The SDK's lightweight design pairs particularly well with the cost-control story Ertas customers care about. The Agents SDK was designed under the assumption that agents make many model calls per task — which is precisely why the agentic cost cliff bites mobile app builders so hard. By swapping the underlying model from a frontier API call to a fine-tuned local model, the per-task economics flip from API costs that scale with users to fixed inference costs that don't. The SDK doesn't change shape when you swap models; only the bill changes.
For production observability, the SDK's tracing system (which logs every model call, tool invocation, and handoff) works transparently regardless of where the model is served. Teams can develop against OpenAI's hosted models in development, ship Ertas-trained models in production, and inspect both with the same tools — useful both for debugging and for building the production-trace datasets that feed back into Studio for incremental fine-tuning.
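A minimal sketch of that workflow, assuming an already-configured agent: the SDK's `trace()` context manager groups several runs under one named workflow, so the spans show up together no matter which backend served the model.

from agents import Runner, trace

async def handle_request(agent, user_message: str) -> str:
    # Every model call, tool invocation, and handoff inside this block is captured in one trace.
    with trace("scheduling-workflow"):
        result = await Runner.run(agent, user_message)
    return result.final_output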
Getting Started
1. Fine-tune your task-specific model in Ertas Studio
Train on data that captures your agent's expected reasoning patterns and tool-use traces. The OpenAI Agents SDK calls models the same way OpenAI's API does, so any data format that works for OpenAI fine-tuning works for Studio.
2. Deploy to an OpenAI-compatible endpoint
Export to GGUF and serve via Ollama, vLLM, or Ertas Cloud. The SDK calls any endpoint that exposes the standard chat-completion or responses API.
3. Install the OpenAI Agents SDK and configure the model
Install `openai-agents` (Python, via pip) or `@openai/agents` (TypeScript, via npm). Create an Agent whose `model` parameter points at your Ertas inference endpoint, typically an `OpenAIChatCompletionsModel` wrapping an `AsyncOpenAI` client.
4. Define tools and handoffs
Add Python or TypeScript functions as tools using the SDK's `@function_tool` decorator. Define handoffs to other agents for multi-agent workflows; a minimal handoff sketch follows these steps. The SDK handles the orchestration loop automatically.
5. Run the agent with built-in tracing
Call `Runner.run_sync()` (or its async/streaming variants) with your input. The SDK's tracing system logs every step. Use the trace data both for debugging and as input to Studio for continuous model improvement.
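Handoffs are simply agents listed on another agent. Here is a minimal sketch, assuming the specialist agents' models and tools are configured elsewhere (the agent names are placeholders):

from agents import Agent

# Hypothetical specialist agents; their models and tools are configured elsewhere.
billing_agent = Agent(name="Billing Agent", instructions="Handle billing questions.")
booking_agent = Agent(name="Booking Agent", instructions="Handle scheduling requests.")

# The triage agent can hand the conversation off to either specialist mid-run.
triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the user to the right specialist.",
    handoffs=[billing_agent, booking_agent],
)

The complete example below ties the steps together: an Ertas-trained model served by Ollama, two function tools, and a single agent run with built-in tracing.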
from agents import Agent, Runner, OpenAIChatCompletionsModel, function_tool
from openai import AsyncOpenAI
# Point the SDK at your Ertas-trained model served via Ollama
client = AsyncOpenAI(
base_url="http://localhost:11434/v1",
api_key="not-needed",
)
model = OpenAIChatCompletionsModel(
model="ertas-mobile-assistant-4b",
openai_client=client,
)
# calendar_api is a placeholder for your own backend client.
@function_tool
async def search_user_calendar(user_id: str, date: str) -> list[dict]:
    """Look up calendar events for a user on a given date."""
    return await calendar_api.events(user_id, date)

@function_tool
async def book_meeting(user_id: str, time: str, duration_min: int) -> dict:
    """Book a meeting at the requested time."""
    return await calendar_api.create(user_id, time, duration_min)
scheduling_agent = Agent(
name="Scheduling Assistant",
model=model,
instructions="Help the user check availability and book meetings.",
tools=[search_user_calendar, book_meeting],
)
# Run with built-in tracing
result = Runner.run_sync(
scheduling_agent,
input="Find me a 30-minute slot tomorrow afternoon and book it with the design team.",
)
print(result.final_output)

Benefits
- Lightweight, composable design — minimal conceptual overhead
- Model-agnostic by default — works with any OpenAI-compatible endpoint
- First-class TypeScript companion (`@openai/agents`) for JavaScript projects
- Built-in tracing, evaluation, and handoff primitives
- Drop-in replacement for OpenAI hosted models with self-hosted Ertas-trained alternatives
- Suited to the agentic-cost-cliff problem: the code works the same whether you're paying per token or running locally
- Production-grade with OpenAI's tooling support behind it
Related Resources
Fine-Tuning
Function Calling
GGUF
Inference
Fine-Tuning for Tool Calling: How to Build Reliable AI Agents with Small Models
Stop Paying GPT-4 to Call Your APIs: Fine-Tune a Local Tool-Calling Model
Building Reliable AI Agents with Fine-Tuned Local Models: Complete Guide
LangGraph
Ollama
Pydantic AI
Vercel AI SDK
vLLM
Ertas for Customer Support
Ertas for AI Automation Agencies
Ertas for Voice Agent Fine-Tuning
Ship AI that runs on your users' devices.
Early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.