Agno + Ertas
Build agents with Agno, the modern rebrand of Phidata: a clean SDK with built-in memory, tools, and reasoning that runs equally well against OpenAI APIs or Ertas-trained local models.
Overview
Agno is the 2025 rebrand of Phidata, a Python agent framework that gained early traction for its clean API surface and pragmatic handling of memory, tool use, and reasoning loops. The rebrand coincided with a tighter focus on production deployment and the launch of an optional managed platform layer (agno.ai) for teams that want hosted agent execution alongside the open-source SDK. The SDK itself remains Apache 2.0 licensed, and the framework's design philosophy hasn't changed: keep the abstractions simple, make tools easy to add, and stay out of the way of teams that want to compose their own orchestration. The early-May 2026 v2.6.3 release added Team HITL (human-in-the-loop) approvals in AgentOS chat, Gmail and Calendar context providers, a Mongo scheduler, multimodal Gemini file search, and an experimental multi-framework mode, expanding Agno from a clean SDK into a more complete platform without disrupting the core programming model.
The framework occupies a thoughtful middle ground in the 2026 agent ecosystem. It's lighter-weight than LangGraph (no graph state machines), more opinionated than Smolagents (memory, knowledge, and reasoning patterns come out of the box), and more idiomatic than the OpenAI Agents SDK (it leans into Python conventions rather than mirroring the OpenAI API surface). For teams that want a clean Python agent SDK with batteries included (chat memory, vector knowledge stores, structured reasoning, multi-agent teams), Agno hits a productive sweet spot.
Agno is model-agnostic by design. The framework supports OpenAI, Anthropic, Google, AWS Bedrock, Cohere, and any OpenAI-compatible endpoint through its provider abstraction. For teams running fine-tuned models on Ollama, vLLM, or self-hosted inference, the integration is a few lines of configuration.
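As a sketch of what that abstraction looks like in practice, swapping between a hosted API and a self-hosted fine-tune is a one-object change. The model ids below are hypothetical placeholders:

from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.models.openai import OpenAIChat

# Hosted provider: one model object
hosted_agent = Agent(model=Claude(id="claude-3-5-sonnet-20241022"))

# Self-hosted fine-tune: same Agent, an OpenAI-compatible endpoint instead
local_agent = Agent(
    model=OpenAIChat(
        id="my-finetuned-7b",                 # hypothetical model name
        base_url="http://localhost:8000/v1",  # e.g. a vLLM server
        api_key="not-needed",                 # local endpoints often ignore keys
    )
)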
How Ertas Integrates
Ertas-trained models work with Agno through its `OpenAIChat` provider configured against a local endpoint. After fine-tuning in Studio and exporting to GGUF, you serve the model via Ollama, vLLM, or Ertas Cloud, then point Agno's model provider at your endpoint. Every agent — whether it has memory, knowledge retrieval, or multi-agent orchestration — uses your fine-tuned model as the underlying engine.
Agno's batteries-included design is a particularly strong fit for fine-tuned models. The framework's built-in memory, knowledge retrieval, and reasoning patterns assume the model is competent at structured tool use and consistent output formatting: exactly the qualities fine-tuning produces and that general-purpose 7B models often lack. Combined with an Ertas-trained model, Agno's higher-level abstractions hold up in production instead of turning fragile.
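One concrete place this shows up is Agno's `response_model` parameter, which parses the model's output into a Pydantic schema; a fine-tuned model that emits the schema consistently makes this path dependable. A minimal sketch, assuming the v1.x-style `response_model` parameter (the schema and model name here are hypothetical, and later releases have renamed some of these knobs):

from pydantic import BaseModel

from agno.agent import Agent
from agno.models.openai import OpenAIChat

# Hypothetical schema for a ticket-triage task
class Triage(BaseModel):
    category: str
    priority: int
    summary: str

agent = Agent(
    model=OpenAIChat(
        id="ertas-triage-7b",                  # hypothetical fine-tuned model
        base_url="http://localhost:11434/v1",  # e.g. Ollama
        api_key="not-needed",
    ),
    response_model=Triage,  # Agno parses the response into this schema
)

run = agent.run("Checkout page times out after entering payment details.")
print(run.content)  # a Triage instance, not raw text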
For teams that want the Agno managed platform's hosted agent execution but with self-hosted inference, the framework supports a hybrid configuration: agent code runs on Agno's platform, model calls go to your own inference endpoint. This pattern lets teams get the platform's operational features (deployments, traces, evaluation) while keeping inference costs fixed and data sovereignty intact.
Getting Started
1. Fine-tune your task-specific model in Ertas Studio. Train on data that includes the structured patterns Agno relies on: tool calls, memory references, knowledge-retrieval traces. Studio's JSONL format maps cleanly to Agno's message conventions (an illustrative record follows this list).
2. Deploy to an OpenAI-compatible endpoint. Export to GGUF and serve via Ollama, vLLM, or Ertas Cloud. Agno calls any endpoint that exposes the standard chat-completion API.
3. Install Agno and configure the model. Install the `agno` package (`pip install agno`), then create an `OpenAIChat` provider pointed at your Ertas inference endpoint with your model name.
4. Build agents with built-in memory, knowledge, and tools. Use Agno's batteries-included primitives: persistent memory backed by SQLite or PostgreSQL (a sketch follows this list), vector knowledge stores, tool integrations, and structured reasoning. Compose agents into multi-agent teams as needed.
5. Optionally deploy to the Agno managed platform. Push your agent code to agno.ai for hosted execution while keeping inference on your own endpoint, or self-host the entire stack with Agno's reference deployment.
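A Studio training record for step 1 might look like the following. This is an illustrative sample in the OpenAI-style chat format; the exact field names are an assumption about Studio's export schema:

{"messages": [{"role": "user", "content": "How many PTO days do new hires get?"}, {"role": "assistant", "tool_calls": [{"type": "function", "function": {"name": "search_knowledge_base", "arguments": "{\"query\": \"PTO policy new hires\"}"}}]}, {"role": "tool", "content": "New hires accrue 20 PTO days per year."}, {"role": "assistant", "content": "New hires accrue 20 PTO days per year."}]}

For step 4, persistent memory is a storage backend attached to the agent, which Agno uses to replay recent session history into each prompt. A minimal sketch, assuming the v1.x-style `SqliteStorage` API (module paths and parameter names have moved between Agno releases, so check your installed version):

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.storage.sqlite import SqliteStorage

# The same fine-tuned model served locally (hypothetical model name)
model = OpenAIChat(
    id="ertas-research-assistant-7b",
    base_url="http://localhost:11434/v1",
    api_key="not-needed",
)

agent = Agent(
    model=model,
    # Session history persists in a local SQLite file across restarts
    storage=SqliteStorage(table_name="agent_sessions", db_file="./data/agent.db"),
    session_id="user-42",          # one conversation thread per session id
    add_history_to_messages=True,  # replay prior turns into each prompt
    num_history_runs=3,            # how many previous runs to include
)

agent.print_response("Remind me where we left off on the handbook review.")

Putting the pieces together, the complete example below wires a fine-tuned local model to a knowledge store and a web-search tool: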
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.knowledge.url import UrlKnowledge
from agno.vectordb.lancedb import LanceDb
# Point Agno at your Ertas-trained model served via Ollama
model = OpenAIChat(
    id="ertas-research-assistant-7b",
    base_url="http://localhost:11434/v1",
    api_key="not-needed",
)

# Knowledge retrieval over a curated source set
knowledge = UrlKnowledge(
    urls=["https://docs.your-company.com/handbook"],
    vector_db=LanceDb(table_name="company_handbook", uri="./data/lance"),
)

agent = Agent(
    name="Research Assistant",
    model=model,
    knowledge=knowledge,
    tools=[DuckDuckGoTools()],
    instructions="Answer questions using internal handbook knowledge first, web search second.",
    show_tool_calls=True,
    markdown=True,
)
# Index the handbook on first run (set recreate=True to re-embed the sources)
agent.knowledge.load(recreate=False)

agent.print_response(
    "What's our policy on remote work and how does it compare to industry norms?"
)
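The same model object composes into multi-agent teams. A minimal sketch, assuming the v1.x-style `Team` API with coordinate mode (the agent roles here are hypothetical, and module paths vary between releases):

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.team import Team

model = OpenAIChat(
    id="ertas-research-assistant-7b",
    base_url="http://localhost:11434/v1",
    api_key="not-needed",
)

researcher = Agent(name="Researcher", role="Gather sources and facts", model=model)
writer = Agent(name="Writer", role="Draft the final answer", model=model)

team = Team(
    mode="coordinate",             # the coordinator delegates to members
    members=[researcher, writer],
    model=model,                   # coordination runs on the same fine-tune
    instructions="Research first, then write a concise summary.",
)

team.print_response("Summarize our remote-work policy for a new hire.")

Benefits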
- Clean Pythonic SDK — minimal conceptual overhead
- Batteries included — memory, knowledge stores, structured reasoning, multi-agent teams
- Model-agnostic by default — works with any OpenAI-compatible endpoint
- Optional managed platform (agno.ai) for hosted execution with self-hosted inference
- Apache 2.0 licensed — clean commercial use story
- Pairs naturally with fine-tuned models that produce reliable structured outputs
- Active development with strong community traction in 2026
Related Resources
Fine-Tuning
Function Calling
Vector Database
Agentic RAG: How to Build a Retrieval Tool Your AI Agent Discovers and Calls Automatically
Fine-Tuning for Tool Calling: How to Build Reliable AI Agents with Small Models
Building Reliable AI Agents with Fine-Tuned Local Models: Complete Guide
LangGraph
Ollama
OpenAI Agents SDK
Pydantic AI
vLLM
Ertas for Customer Support
Ertas for Data Extraction
Ertas for Internal Knowledge Bases