Cookbook

Five worked recipes for fine-tuning Ertas models on real on-device use cases: support bots, summarisers, code completion, transcript cleanup, and structured extraction.

The Cookbook is the answer to "OK, but what would I actually build with this?" Each recipe walks an end-to-end on-device fine-tune from scratch: pick the use case, decide what data to collect or synthesise, choose a base model, set training hyperparameters, run a probe set, and ship into a real app shape (iOS, Android, desktop, or web). The recipes are not lifted from any single customer; they take real-world product surfaces from companies you have heard of and use them as anchors so the trade-offs feel concrete.

Customer Support Bot

Read the customer support bot guide.

Document Summariser

Read the document summariser guide.

Structured Data Extraction

Read the structured data extraction guide.

Voice Transcript Cleanup

Read the voice transcript cleanup guide.

Code Completion

Read the code completion guide.

How to read these recipes

Every recipe has the same backbone:

The problem: who needs this, why a fine-tune is the right answer, and why on-device beats a hosted API for this specific shape.
The dataset: what rows look like, where to source or synthesise them, and how many you need before the fine-tune starts paying back.
The base model: which model from the Ertas catalogue fits the task, with the reasoning behind the pick.
Training config: the hyperparameters that move the needle, plus an honest read on cost and wall-clock time.
Integration: at least one of the Ship paths with adapted code samples.
Probe set: 8 to 10 sample prompts you can run by hand to confirm the model learned what you wanted, before you wire up a full eval.
Limits: where the model will quietly fail and what to do about it.
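The probe-set step in the backbone above can be sketched as a tiny hand-run harness. This is a minimal sketch, not Ertas tooling: `Probe`, `run_probes`, `summarise`, and the `generate` callable are hypothetical names, and `generate` stands in for whatever call returns text from your fine-tuned model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Probe:
    """One hand-written prompt plus a cheap pass/fail check on the reply."""
    prompt: str
    check: Callable[[str], bool]
    note: str = ""


def run_probes(
    generate: Callable[[str], str], probes: List[Probe]
) -> List[Tuple[Probe, str, bool]]:
    """Run every probe through `generate` and record whether its check passed."""
    results = []
    for probe in probes:
        reply = generate(probe.prompt)
        results.append((probe, reply, probe.check(reply)))
    return results


def summarise(results: List[Tuple[Probe, str, bool]]) -> str:
    """One-line tally to eyeball before wiring up a full eval."""
    passed = sum(1 for _, _, ok in results if ok)
    return f"{passed}/{len(results)} probes passed"


if __name__ == "__main__":
    # Stand-in model (assumption): replace with a call into your own model.
    def fake_generate(prompt: str) -> str:
        return "I don't know." if "teleport" in prompt else "Open Settings > Billing."

    probes = [
        Probe("How do I update my card?", lambda r: "Billing" in r),
        Probe("Does the app support teleporting?", lambda r: "don't know" in r.lower(),
              note="refuse-unknown behaviour"),
    ]
    for probe, reply, ok in run_probes(fake_generate, probes):
        print("PASS" if ok else "FAIL", "|", probe.prompt, "->", reply)
```

Keeping each check as a plain predicate makes it easy to write 8 to 10 probes quickly; anything that needs graded scoring belongs in the full eval rather than here.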

You can read a recipe in two ways: end-to-end if you are picking a use case for the first time and want to feel the shape of the work, or skim-to-section if you already know what you are building and want a specific answer (typically the dataset section, since that is where most projects under-invest).

The five recipes

Recipe	Anchor scenario	Base model	Hardest part
Customer support bot	A SaaS company ships a desktop helper trained on its public docs and past support tickets	Gemma 4 E2B (3B class)	Authoring refuse-unknown behaviour
Document summariser	A browser adds "TL;DR for any open tab" that runs without sending the page to a server	Gemma 4 E2B (3B class)	Holding the summary length and tone steady across genres
Structured data extraction	An expense card extracts line items from photographed receipts on-device	Qwen 2.5 3B Instruct	Producing strict JSON and recovering when OCR is half-wrong
Voice transcript cleanup	A meeting-notes app adds offline cleanup so interviews and standups work in airplane mode	Llama 3.2 3B Instruct	Sourcing realistic ASR noise without breaking diarisation
Code completion	A game studio adds completions for its scripting language inside its creator app	Qwen 2.5 Coder 3B	Authoring a fill-in-the-middle dataset that does not leak the answer into the prefix

The order runs roughly from easiest to hardest in terms of data work. Support is the gentlest first project (the dataset is largely already written: your docs, your tickets). Code completion is the hardest: it needs the largest dataset, a fill-in-the-middle authoring pipeline with prefix-leak gotchas, and the broadest task surface, where subtle failures (wrong API, wrong dialect) hide in plausible-looking output. Extraction and transcript cleanup sit between them: both have well-defined output shapes and paired-data sources that travel from public corpora to your project with reasonable effort.
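To make the prefix-leak gotcha concrete, here is a rough sketch of one way to build a fill-in-the-middle example and flag the most obvious kind of leak, where the target text already appears verbatim in the prefix (for instance in a comment). The `prefix` / `middle` / `suffix` field names and both function names are hypothetical and are not the Ertas schema; the substring check is a crude heuristic that will miss paraphrased leaks.

```python
from typing import Dict


def make_fim_example(source: str, start: int, end: int) -> Dict[str, str]:
    """Split `source` into prefix / middle / suffix around the span [start, end)."""
    if not 0 <= start < end <= len(source):
        raise ValueError("span must be non-empty and inside the source")
    return {
        "prefix": source[:start],
        "middle": source[start:end],
        "suffix": source[end:],
    }


def leaks_into_prefix(example: Dict[str, str], min_len: int = 8) -> bool:
    """Flag examples whose target text already appears verbatim in the prefix.

    Whitespace is collapsed first so reformatting does not hide a leak.
    Middles shorter than `min_len` are skipped because they match by chance.
    """
    middle = " ".join(example["middle"].split())
    prefix = " ".join(example["prefix"].split())
    return len(middle) >= min_len and middle in prefix


if __name__ == "__main__":
    target = "clamp(x, 0, 1)"
    leaky = "# returns clamp(x, 0, 1)\ndef unit(x):\n    return clamp(x, 0, 1)\n"
    start = leaky.rindex(target)  # mask the call in the body, not the comment
    example = make_fim_example(leaky, start, start + len(target))
    print(leaks_into_prefix(example))
```

In the usage above the comment on the first line spells out the masked expression, so the example is flagged; dropping or rewriting that comment is the kind of fix the authoring pipeline would need to apply.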

Before you start

Two prerequisites the recipes assume:

You have a clear picture of where the model will run. Browse the Ship section first if you do not. The choice among iOS-only, Android-only, desktop via Ollama, and web via WebAssembly changes the base model you pick (and sometimes the dataset shape, e.g. you may want shorter outputs for web because of memory ceilings).
You can describe the success criteria in one sentence. "The bot answers 80% of frequently-asked product questions correctly without making up features," "the summariser fits in 60 words and does not invent numbers," and so on. Recipes without a one-sentence success criterion tend to produce models that no one trusts to ship.
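A one-sentence criterion is most useful when it can be turned into a check. As an illustration only, the summariser example above ("fits in 60 words and does not invent numbers") might be approximated like this; `meets_summary_criterion` is a hypothetical helper, and "does not invent numbers" is interpreted narrowly as "every digit-based number in the summary also appears in the source".

```python
import re

# Matches digit-based numbers such as 12, 2024, 3.5 or 1,200.
_NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def meets_summary_criterion(source: str, summary: str, max_words: int = 60) -> bool:
    """True if the summary fits the word budget and introduces no new numbers."""
    if len(summary.split()) > max_words:
        return False
    source_numbers = set(_NUMBER.findall(source))
    return all(n in source_numbers for n in _NUMBER.findall(summary))


if __name__ == "__main__":
    source = "Revenue grew 12% to 1,200 units in 2024."
    print(meets_summary_criterion(source, "Revenue rose 12% in 2024."))
    print(meets_summary_criterion(source, "Revenue rose 15% in 2024."))
```

This sketch will not catch numbers written as words or numbers that are reformatted (for example "1200" versus "1,200"), so treat it as a first filter rather than the whole criterion.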

If either is missing, work through Concepts and Picking a base model before you start, then come back.

How to adapt a recipe

The recipes use specific companies as worked examples because abstract advice doesn't stick. None of them are Ertas customers; they are stand-ins picked for shape, not identity. Read the customer-support-bot recipe as "any SaaS with public docs and a support team," the code-completion recipe as "any vertical that has its own scripting language or framework," and so on.

The hyperparameters in the recipes are starting points that work for the anchor scenario. Move them when your situation differs: a smaller dataset wants more epochs and a lower learning rate; a much bigger dataset wants fewer epochs and a slightly higher learning rate. The Training tips page covers the heuristics.
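The direction of those adjustments can be sketched in code. Everything numeric below is an illustrative placeholder: the 0.5x and 2x dataset-size thresholds, the one-epoch step, and the learning-rate multipliers are assumptions chosen for the example, not values from the recipes or the Training tips page. `TrainConfig` and `adapt_config` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainConfig:
    epochs: int
    learning_rate: float


def adapt_config(base: TrainConfig, anchor_rows: int, your_rows: int) -> TrainConfig:
    """Nudge a recipe's starting config toward a different dataset size.

    Thresholds and step sizes are placeholders (assumptions), not tuned values.
    """
    if anchor_rows <= 0 or your_rows <= 0:
        raise ValueError("row counts must be positive")
    ratio = your_rows / anchor_rows
    if ratio < 0.5:
        # Much smaller dataset: more epochs, lower learning rate.
        return replace(base, epochs=base.epochs + 1,
                       learning_rate=base.learning_rate * 0.5)
    if ratio > 2.0:
        # Much bigger dataset: fewer epochs, slightly higher learning rate.
        return replace(base, epochs=max(1, base.epochs - 1),
                       learning_rate=base.learning_rate * 1.25)
    return base  # close enough to the anchor scenario: keep the recipe's values


if __name__ == "__main__":
    recipe = TrainConfig(epochs=3, learning_rate=2e-4)
    print(adapt_config(recipe, anchor_rows=1000, your_rows=300))
    print(adapt_config(recipe, anchor_rows=1000, your_rows=5000))
```

Whatever values you settle on, rerun the probe set after each change so you can tell whether the adjustment helped.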

What's next

Customer support bot

Start here. The most-requested use case and the gentlest first project.

Picking a base model

If you haven't picked one, do that before opening a recipe.

JSONL format

The five Ertas schemas. Every recipe assumes you can author one.

Ship

Where each recipe lands, after training.