DSPy + Ertas

    Build optimizable LLM pipelines with DSPy — Stanford's declarative framework that programs language models like a compiler programs hardware, working seamlessly with Ertas-trained local models.

    Overview

    DSPy is the declarative LLM programming framework from Stanford NLP, designed around a fundamentally different philosophy from most agent frameworks: rather than writing prompts and chains by hand, you describe the *signature* of each step (input types, output types, task description), and DSPy's optimizer figures out the prompts, demonstrations, and few-shot examples that make the system perform well. The metaphor the project uses is that DSPy is to language models what a compiler is to hardware — it programs the model to satisfy a declared specification rather than asking the developer to hand-tune every prompt.

    The practical implication is that DSPy systems tend to be more maintainable, and often more performant, than equivalent hand-written prompt chains. When a model changes (different size, different vendor, different fine-tune), DSPy's optimizer recompiles the prompts for the new model rather than requiring the developer to hand-tune each one. When the task specification changes, the same recompilation produces a new system without the prompt-engineering churn that defines most LLM application maintenance work.

    DSPy is fully model-agnostic. The framework supports any chat-completion-style endpoint via its LM abstraction, including OpenAI, Anthropic, Together, local servers (Ollama, vLLM, llama.cpp), and any custom backend that implements the simple LM interface. For teams running fine-tuned models on Ertas, DSPy's optimization layer becomes a powerful complement: the fine-tuned model handles domain-specific competence, DSPy handles the prompt optimization on top.

    How Ertas Integrates

    Ertas-trained models work with DSPy through its `LM` abstraction. After fine-tuning in Studio and exporting to GGUF, you serve the model via Ollama, vLLM, or Ertas Cloud, then configure `dspy.LM` (or the OpenAI provider) to point at your endpoint. DSPy's optimizers — BootstrapFewShot, MIPROv2, BootstrapFewShotWithRandomSearch — then compile prompts that get the most out of your fine-tuned model.

    The combination of fine-tuning and DSPy optimization is unusually powerful. Fine-tuning teaches the model the domain's vocabulary, output formats, and patterns. DSPy optimization then finds the prompt structure and demonstration set that elicit the model's best performance on the specific task. Empirically, the two improve different dimensions and stack: a fine-tuned model with DSPy-compiled prompts often outperforms either approach alone, sometimes by significant margins on benchmark tasks.

    For teams iterating on agent systems, DSPy's declarative model has a particularly nice property: when you collect new training data and fine-tune a new model in Studio, you can recompile your DSPy program against the new model in a single step. The system automatically adapts its prompts and demonstrations to the new model's behavior, rather than requiring manual re-tuning of every prompt in the pipeline.
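The recompile-on-new-model step can be sketched as a small helper. This is illustrative, not a prescribed Ertas or DSPy API: `recompile_for_new_model` and its arguments are hypothetical names, and it assumes a DSPy 2.x-style `dspy.LM` / `dspy.configure` interface and an OpenAI-compatible endpoint.

```python
def recompile_for_new_model(program, optimizer, trainset, endpoint, model_name, save_path):
    """Recompile a DSPy program after swapping in a newly fine-tuned model.

    Hypothetical helper: assumes a DSPy 2.x-style API and an
    OpenAI-compatible endpoint (e.g. Ollama at http://localhost:11434/v1).
    """
    import dspy  # deferred so the sketch can be read without a running server

    # Point DSPy at the newly fine-tuned model
    lm = dspy.LM(f"openai/{model_name}", api_base=endpoint, api_key="not-needed")
    dspy.configure(lm=lm)

    # Re-run optimization so prompts and demonstrations adapt to the new model
    compiled = optimizer.compile(program, trainset=trainset)

    # Persist the optimized prompts/demos so they can be reloaded later
    compiled.save(save_path)
    return compiled
```

The key design point is that the program and metric stay fixed; only the LM binding and the compiled artifacts change between model versions.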

    Getting Started

    1.

      Fine-tune your domain model in Ertas Studio

      Train on data that captures your domain's terminology, output patterns, and reasoning style. DSPy will optimize prompts to elicit the best behavior from this fine-tuned base.

    2.

      Deploy to an OpenAI-compatible endpoint

      Export to GGUF and serve via Ollama, vLLM, or Ertas Cloud. DSPy calls any compatible endpoint through its LM abstraction.

    3.

      Install DSPy and configure the LM

      Install DSPy with `pip install dspy`. Configure `dspy.LM(model='openai/ertas-...', api_base='http://localhost:11434/v1')` and set it as the default LM with `dspy.configure(lm=...)`.

    4.

      Define signatures and modules declaratively

      Describe each step as a DSPy signature: input fields, output fields, task description. Compose into modules (Predict, ChainOfThought, ProgramOfThought, ReAct). The framework handles prompt construction.

    5.

      Compile with an optimizer and evaluate

      Use `BootstrapFewShot`, `MIPROv2`, or another DSPy optimizer with a small training set and evaluation metric. The optimizer compiles prompts that maximize your metric on the fine-tuned model.
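Before pointing DSPy at the endpoint (steps 2 and 3), it can help to smoke-test the server with a raw chat-completion call. This stdlib-only sketch assumes an OpenAI-compatible `/chat/completions` route (as served by Ollama, vLLM, and similar); the base URL and model name are placeholders for your deployment.

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    # Minimal OpenAI-style chat-completions payload
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,
    }

def chat_once(base_url: str, model: str, prompt: str) -> str:
    # POST to the /chat/completions route of any OpenAI-compatible server
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer not-needed",  # local servers ignore the key
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires a running server at this address):
# print(chat_once("http://localhost:11434/v1", "ertas-financial-analyst-7b", "Hello"))
```

If this round-trips, `dspy.LM` pointed at the same base URL will work as well, since DSPy speaks the same protocol.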

    ```python
    import dspy

    # Point DSPy at your Ertas-trained model served via Ollama
    lm = dspy.LM(
        "openai/ertas-financial-analyst-7b",
        api_base="http://localhost:11434/v1",
        api_key="not-needed",
    )
    dspy.configure(lm=lm)

    # Declarative signature: describe the task, not the prompt
    class FinancialQuestion(dspy.Signature):
        """Answer questions about a company's financial filings with citations."""
        filing: str = dspy.InputField(desc="Excerpt from the company's 10-K filing")
        question: str = dspy.InputField()
        answer: str = dspy.OutputField(desc="A precise answer with section citations")

    # Module: chain-of-thought reasoning over the signature
    analyst = dspy.ChainOfThought(FinancialQuestion)

    # Compile with an optimizer using a small labeled training set.
    # citation_quality_metric and labeled_train are assumed to be defined:
    # the metric takes (example, pred, trace=None) and returns a score;
    # the trainset is a list of dspy.Example objects with inputs marked
    # via .with_inputs("filing", "question").
    optimizer = dspy.BootstrapFewShot(metric=citation_quality_metric)
    compiled_analyst = optimizer.compile(analyst, trainset=labeled_train)

    # Run the compiled program — DSPy handles all the prompt construction
    result = compiled_analyst(
        filing="In Q4 we recognized $42M in deferred revenue...",
        question="What was the change in deferred revenue and why?",
    )
    print(result.answer)
    ```
    Build an optimizable financial-analysis program with DSPy and an Ertas-trained 7B model. DSPy compiles prompts; the fine-tuned model provides domain competence; the two combined outperform either alone.
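The example above leaves `citation_quality_metric` undefined. A minimal, purely illustrative implementation following DSPy's metric convention (positional `example` and `pred`, optional `trace`) might look like this; the citation regex is an assumption for this hypothetical task, not part of DSPy.

```python
import re
from types import SimpleNamespace

def citation_quality_metric(example, pred, trace=None):
    """Illustrative metric in DSPy's (example, pred, trace) shape.

    Scores 1.0 only when the predicted answer cites a filing section,
    e.g. "Item 7" or "Note 12". A real metric would also check
    answer correctness against example.answer.
    """
    answer = getattr(pred, "answer", "") or ""
    has_citation = bool(re.search(r"\b(Item|Note|Section)\s+\d+", answer))
    return 1.0 if has_citation else 0.0

# Quick check with stand-in objects (no model call needed):
pred = SimpleNamespace(answer="Deferred revenue rose $42M (see Item 7).")
print(citation_quality_metric(None, pred))  # → 1.0
```

Because optimizers like `BootstrapFewShot` bootstrap demonstrations only from traces the metric accepts, the quality of this function directly shapes what the compiled program learns to imitate.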

    Benefits

    • Declarative — describe signatures, not prompts
    • Automatic prompt optimization — compiler-style approach to LLM engineering
    • Recompile when models change rather than hand-tuning prompts
    • Stacks with fine-tuning — different optimization dimensions, additive gains
    • Fully model-agnostic — works with any LM endpoint including Ertas-trained local models
    • Strong research lineage from Stanford NLP with active community development
