Instruction tuning

    What instruction tuning is, when to use it, and how to design instructions that train cleanly.

    Most fine-tunes in Studio are instruction tunings: you teach the model to follow a specific kind of directive in a specific style. This page covers what instruction tuning means in practice, when it is the right approach, and how to design the instruction side of your dataset so the model picks up the right behaviour.

    What instruction tuning means

    An instruction-tuned model takes a directive (the instruction), optionally combined with context (the input), and produces a response (the output). The general shape is:

    Instruction: Summarise this article in two sentences.
    Input: [the article text]
    Output: [the summary]
    

    The model learns three things at once:

    • The format: instruction goes here, input goes here, output goes here.
    • The style: what summaries look like in your dataset.
    • The behaviour: what makes a "good" summary in your context.

    Almost every common fine-tuning use case (customer support, summarisation, classification, structured generation, style transfer, code completion with a hint) is some flavour of instruction tuning.

    When instruction tuning is the right tool

    It is the right tool when:

    • You want the model to do a clear, specific kind of task.
    • You have (or can generate) examples of the task being done well.
    • The task has a clear input and a clear desired output.

    It is not the right tool when:

    • You want the model to absorb a body of factual knowledge with no particular task attached. That is continued pretraining, not instruction tuning, and it sits closer to the Train action module than to the Fine-Tune module.
    • You want the model to prefer one kind of response over another, and your data is pairs of better and worse answers rather than single correct ones. That is preference optimisation (see SFT vs DPO).
    • You want to expand the model's context window or change its tokenizer. Those are architectural changes, not fine-tuning.

    For everything else, instruction tuning is the default.

    Designing good instructions

    The instruction is half the training signal. A vague or inconsistent instruction set produces a vague model.

    Be specific

    Each instruction should describe a clear task. Compare:

    • Bad: "Help the user."
    • Good: "Respond to the customer's question about their order. Be concise and apologise once if the customer is upset."

    The good instruction tells the model what to do, how long to be, and how to handle tone.

    Be consistent across the dataset

    If 80% of your rows say "Respond to the customer's question" and 20% say "Help the customer with their issue," the model can learn to treat those two phrasings as different tasks and respond to each differently. Pick one phrasing and stick with it for the same task.

    If you have multiple task types in one dataset, make sure each task has its own consistent instruction phrasing. Don't blur the boundaries.
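
    One way to check phrasing consistency before training is to count the distinct instruction strings in the dataset. The sketch below is an illustration only, not part of Studio: normalise and phrasing_report are hypothetical helpers, and it assumes you can extract the directive text for each row on its own, without any per-row input appended.

```python
from collections import Counter


def normalise(instruction: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are not counted as variants."""
    return " ".join(instruction.lower().split())


def phrasing_report(instructions: list[str]) -> list[tuple[str, int, float]]:
    """Return (phrasing, count, share) tuples, most common phrasing first."""
    counts = Counter(normalise(text) for text in instructions)
    total = sum(counts.values())
    return [(text, n, n / total) for text, n in counts.most_common()]


# Hypothetical sample: one task phrased two ways, split 80/20.
sample = (
    ["Respond to the customer's question."] * 8
    + ["Help the customer with their issue."] * 2
)

report = phrasing_report(sample)
for text, n, share in report:
    print(f"{share:4.0%}  {n:3d}  {text}")
```

    A single task should normally show one dominant phrasing. A long tail of near-duplicates is a sign that instructions were paraphrased row by row.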

    Include the rules

    If your task has constraints, put them in the instruction:

    • "Respond in JSON with fields severity, category, next_step."
    • "Limit response to 50 words."
    • "Do not include the customer's name."

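    When the training instructions state the constraints and the training outputs obey them, the model learns the constraints. If you only state them at inference time (in a system prompt), the fine-tune has taught the model nothing about them.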
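
    Constraints only train cleanly if the outputs in the dataset obey them, so it can help to validate outputs against the rules before training. The following is a minimal sketch for the first two example rules above. The function names and the constants are hypothetical, and Studio may offer its own validation.

```python
import json

REQUIRED_FIELDS = {"severity", "category", "next_step"}  # fields named in the JSON rule
MAX_WORDS = 50  # limit named in the word-count rule


def json_rule_violations(output: str) -> list[str]:
    """Check one training output against the 'respond in JSON with fields ...' rule."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(parsed, dict):
        return ["output is JSON but not an object"]
    missing = REQUIRED_FIELDS - set(parsed)
    return [f"missing field: {name}" for name in sorted(missing)]


def word_limit_violations(output: str) -> list[str]:
    """Check one training output against the word limit (whitespace-separated words)."""
    n_words = len(output.split())
    if n_words > MAX_WORDS:
        return [f"{n_words} words, limit is {MAX_WORDS}"]
    return []


# Hypothetical output that breaks the JSON rule by omitting two fields.
print(json_rule_violations('{"severity": "high"}'))
```

    Rows that fail a check are worth fixing or dropping, since each one teaches the model that the rule is optional.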

    Vary inputs, not instructions

    For a given task, vary the input widely and keep the instruction roughly constant. This teaches the model that the instruction names the task, and the input is the data to apply it to.

    A common mistake is paraphrasing the instruction in every row. The model then learns that the instruction is also data, and you weaken the connection between "this instruction" and "this kind of response."

    Single-turn vs multi-turn schemas

    Ertas's five JSONL schemas (see JSONL format) split into two conceptual groups:

    • Single-turn: instruction/output, input/output (plus metadata), or text for corpus-style data. The user signals the task once and the model answers once. These are easier to author and have a smaller per-row payload.
    • Multi-turn: conversations (ShareGPT-style from/value) or messages (ChatML-style role/content). Use these for customer support, dialogue agents, and anything else where context accumulates over multiple turns.
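
    The two groups can be told apart from a row's top-level keys. The sketch below is a hypothetical helper, not an Ertas API. It assumes the top-level key names match the schema names in the list above (messages, conversations, text, and the instruction/output and input/output pairs); the JSONL format page is the authority on the exact fields.

```python
def detect_schema(row: dict) -> str:
    """Guess which schema a parsed JSONL row uses, based on its top-level keys."""
    if "messages" in row:
        return "messages"            # multi-turn, role/content entries
    if "conversations" in row:
        return "conversations"       # multi-turn, from/value entries
    if "instruction" in row and "output" in row:
        return "instruction/output"  # single-turn
    if "input" in row and "output" in row:
        return "input/output"        # single-turn
    if "text" in row:
        return "text"                # corpus-style
    raise ValueError(f"unrecognised row keys: {sorted(row)}")


def is_multi_turn(row: dict) -> bool:
    """True for the two conversation-shaped schemas."""
    return detect_schema(row) in {"messages", "conversations"}


print(detect_schema({"instruction": "Summarise this article.", "output": "A summary."}))
```

    Running a check like this over a file is a quick way to confirm that a dataset does not mix schemas by accident.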

    A common pattern is to train on a single-turn instruction dataset first to teach the base task, then on a smaller multi-turn dataset for the follow-up cases. This sometimes outperforms one big multi-turn-only dataset because the model learns the core behaviour first and then adds the multi-turn polish on top.

    System prompts

    If you plan to use a system prompt at inference time, include it in some (not necessarily all) of your training rows. This teaches the model to respect the system prompt rather than treat it as noise.

    For single-turn schemas, the system directive lives inside the instruction field:

    {"instruction": "[System: You are a polite customer support agent.] Respond to: I want a refund.", "output": "I am sorry to hear that, could you share your order number?"}

    For the multi-turn messages schema, use the system role explicitly (shown here across several lines for readability; in a JSONL file each row sits on a single line):

    {"messages": [
      {"role": "system", "content": "You are a polite customer support agent."},
      {"role": "user", "content": "..."},
      {"role": "assistant", "content": "..."}
    ]}

    Mixing rows with and without a system prompt is fine. About half-and-half is a reasonable default.
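
    The half-and-half mix can be produced mechanically from a dataset that has no system prompts yet. The sketch below is one possible approach for messages-schema rows. mix_system_prompt, its share parameter, and the fixed seed are all hypothetical choices made for illustration.

```python
import random


def mix_system_prompt(rows: list[dict], system_prompt: str,
                      share: float = 0.5, seed: int = 0) -> list[dict]:
    """Return copies of messages-schema rows, adding a system message to roughly `share` of them.

    Rows that already start with a system message are left as they are.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    mixed = []
    for row in rows:
        messages = list(row["messages"])  # copy so the caller's rows are not mutated
        has_system = bool(messages) and messages[0]["role"] == "system"
        if not has_system and rng.random() < share:
            messages.insert(0, {"role": "system", "content": system_prompt})
        mixed.append({**row, "messages": messages})
    return mixed


# Hypothetical ten-row dataset without system prompts.
rows = [
    {"messages": [
        {"role": "user", "content": f"question {i}"},
        {"role": "assistant", "content": f"answer {i}"},
    ]}
    for i in range(10)
]
mixed = mix_system_prompt(rows, "You are a polite customer support agent.")
with_system = sum(r["messages"][0]["role"] == "system" for r in mixed)
print(f"{with_system} of {len(mixed)} rows carry the system prompt")
```

    The selection is random per row, so on a small dataset the share will only be approximately one half.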

    Where the model attaches its loss

    Ertas's trainer computes loss only on the output portion of each row (the output field for single-turn schemas, the assistant content in the messages schema, or the gpt value in the conversations schema). The user-side content is treated as context.

    Consequences:

    • The model is graded on producing good outputs, not on memorising inputs. Good.
    • Long inputs add nothing to the loss but do add compute, because the forward pass still runs over the full sequence. Keep inputs reasonable.
    • The model still conditions on the instruction even though no loss is computed on it, so the phrasing of the instruction matters.
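
    Output-only loss is commonly implemented by masking the labels for the context tokens. The sketch below shows that general technique with made-up token ids. It is not confirmed to be how Ertas's trainer does it; build_labels is a hypothetical helper, and the -100 value is borrowed from the default ignore_index of PyTorch's CrossEntropyLoss.

```python
IGNORE_INDEX = -100  # label value the loss function is told to skip (PyTorch's default)


def build_labels(prompt_ids: list[int], output_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and output tokens, masking the prompt positions in the labels.

    The model still sees every token in input_ids during the forward pass,
    but only positions with a real label contribute to the loss.
    """
    input_ids = prompt_ids + output_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(output_ids)
    return input_ids, labels


# Made-up token ids: four context tokens, three output tokens.
input_ids, labels = build_labels([101, 7, 8, 9], [42, 43, 2])
graded = sum(label != IGNORE_INDEX for label in labels)
print(f"sequence length {len(input_ids)}, tokens contributing to loss {graded}")
```

    This makes the trade-off in the list above concrete: a longer prompt lengthens input_ids, and so the forward pass, without adding any graded positions.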

    Pitfalls specific to instruction tuning

    A handful of failure modes that show up repeatedly:

    • The model echoes the instruction in its output. Usually means too few rows or rows where the output accidentally restates the instruction. Fix: drop rows where the output starts with a paraphrase of the instruction.
    • The model produces correct outputs but in the wrong field. Happens when multi-turn rows have malformed speaker labels (from values in the conversations schema, role values in the messages schema). Check that every assistant message in your data is in fact the assistant turn.
    • The model refuses tasks the base would have handled. Often because some rows in your dataset have refusal-flavoured outputs ("I cannot help with that"). Audit for these.
    • The model adds preamble. "Sure, here is the summary you asked for:" before every output. Caused by training rows whose outputs include this preamble. Strip preambles from the data.
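
    Several of these pitfalls can be caught with a rough audit of the dataset before training. The sketch below is heuristic and hypothetical: the regular expressions and the audit_row helper are illustrative, they will miss cases, and the echo check only catches outputs that repeat the opening words of the instruction verbatim, not paraphrases.

```python
import re

# Illustrative patterns only; extend them to match what appears in your own data.
PREAMBLE = re.compile(r"^\s*(sure|certainly|of course|here is|here's)\b", re.IGNORECASE)
REFUSAL = re.compile(r"\bI (cannot|can't|am unable to|won't) (help|assist)\b", re.IGNORECASE)


def audit_row(instruction: str, output: str) -> list[str]:
    """Return the names of the pitfalls that a single training row appears to trigger."""
    flags = []
    if PREAMBLE.search(output):
        flags.append("preamble")
    if REFUSAL.search(output):
        flags.append("refusal")
    # Echo check: does the output open with the first four words of the instruction?
    head = " ".join(instruction.lower().split()[:4])
    if head and " ".join(output.lower().split()).startswith(head):
        flags.append("echo")
    return flags


instruction = "Summarise this article in two sentences."
for output in [
    "Sure, here is the summary you asked for: the council approved the budget.",
    "I cannot help with that.",
    "The council approved the budget. Construction starts in May.",
]:
    print(audit_row(instruction, output), "<-", output)
```

    Flagged rows are candidates for manual review rather than automatic deletion, since a refusal or an opening "Sure" is occasionally the intended behaviour.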

    What's next