    Fine-Tune a Tutoring AI for EdTech: Subject-Specific Models That Don't Hallucinate Curriculum


    Generic AI tutors hallucinate curriculum and use inconsistent methodology. A fine-tuned model trained on your course content tutors in your pedagogy, at your difficulty level, without inventing facts.

    Ertas Team

    A student using GPT-4 to help with your online algebra course may get an explanation that uses different notation than your course, references concepts from a later unit, or, worse, gives a subtly wrong answer that the student accepts as correct because it sounds authoritative.

    Generic AI has no concept of "your course." It knows math globally, not your course's specific scope and sequence. A fine-tuned tutoring model trained on your curriculum content knows exactly what concepts have been covered, what notation you use, what difficulty level is appropriate, and how to explain things the way your instructors do.

    What Makes a Good EdTech Tutoring Model

    Curriculum awareness: The model only references concepts introduced at or before the current module. No spoilers, no advanced concepts a student has not seen yet.

    Pedagogical consistency: The model uses the same problem-solving approach as the course. If your algebra course teaches elimination before substitution for solving systems of equations, the tutor uses elimination first.

    Non-hallucinating on specific facts: For history, science, and other fact-heavy subjects, the model should be calibrated to what is in the curriculum rather than pulling from its general training data.

    Appropriate Socratic style: Instead of just giving answers, a good tutor asks guiding questions: "What happens if you apply the distributive property to the left side?"

    Data Collection

    Source: Your course content team

    What to collect:

    • Course transcripts (video lectures, written explanations)
    • Worked examples with step-by-step solutions
    • Historical student questions + instructor/TA answers (from discussion forums, support tickets, live session recordings)
    • Common misconceptions and their correct explanations

    Format each as a tutoring interaction: The goal is (student question, tutoring response) pairs. The tutoring response should model good pedagogy — guided question + explanation, not just the answer.

    {"messages": [
      {"role": "system", "content": "You are the AI tutor for [Platform]'s Algebra I course. You are helping students through Module 4: Solving Linear Equations. Students have completed: variables, expressions, and simple one-step equations. They have NOT yet covered: systems of equations, quadratics, or functions. Use the same notation as the course: variables in italics, 'solve for x' phrasing. Guide with questions before giving answers. Maximum response length: 200 words."},
      {"role": "user", "content": "I don't understand how to solve 3x + 7 = 22. Can you help?"},
      {"role": "assistant", "content": "Of course! Let's work through it step by step.\n\nFirst, remember the goal when we 'solve for x': we want *x* alone on one side of the equation.\n\nLook at the left side: 3*x* + 7. Two things are happening to *x*: it is multiplied by 3 AND 7 is added to the result. To 'undo' those, we go in reverse order.\n\nStep 1: Undo the addition. If we subtract 7 from the left side, we must subtract 7 from the right side too. What do you get?\n\n(Try it, then I'll confirm your answer before we move to step 2.)"}
    ]}
    

    Collect 600-1,500 tutoring pairs across your course topics. More examples help, but quality matters more than quantity.
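
    If your question-and-answer pairs live in a spreadsheet or forum export, a small script can wrap each one in the chat format shown above. The following is a minimal sketch, not a definitive pipeline. The module list, the function names, and the platform name are all hypothetical, and it assumes your fine-tuning tool expects one JSON object per line (JSONL), which is worth confirming before you export.

```javascript
// Hypothetical curriculum map; replace with your real scope and sequence.
const exampleModules = [
  { number: 3, name: 'One-Step Equations', topics: ['variables', 'expressions', 'simple one-step equations'] },
  { number: 4, name: 'Solving Linear Equations', topics: ['linear equations'] },
  { number: 5, name: 'Systems of Equations', topics: ['systems of equations'] },
];

// Build a system prompt that states what has and has not been covered yet.
function buildSystemPrompt(platform, course, modules, currentNumber) {
  const current = modules.find((m) => m.number === currentNumber);
  if (!current) throw new Error(`Unknown module: ${currentNumber}`);
  const covered = modules.filter((m) => m.number < currentNumber).flatMap((m) => m.topics);
  const notYet = modules.filter((m) => m.number > currentNumber).flatMap((m) => m.topics);
  return [
    `You are the AI tutor for ${platform}'s ${course} course.`,
    `You are helping students through Module ${current.number}: ${current.name}.`,
    `Students have completed: ${covered.join(', ')}.`,
    `They have NOT yet covered: ${notYet.join(', ')}.`,
    'Guide with questions before giving answers. Maximum response length: 200 words.',
  ].join(' ');
}

// Wrap one (student question, tutoring response) pair in the chat format.
function toTrainingRecord(systemPrompt, question, answer) {
  return {
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: question.trim() },
      { role: 'assistant', content: answer.trim() },
    ],
  };
}

// Serialize records as JSONL: one JSON object per line.
function toJsonl(records) {
  return records.map((record) => JSON.stringify(record)).join('\n') + '\n';
}
```

    Generating the system prompt from the curriculum map, rather than typing it by hand for every pair, keeps the "completed" and "NOT yet covered" lists consistent across all examples for a module.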

    Handling "I Don't Know" Gracefully

    Train the model to acknowledge when a question is outside the course scope or beyond what it should answer:

    {"messages": [
      ...,
      {"role": "user", "content": "Can you just solve all 20 homework problems for me?"},
      {"role": "assistant", "content": "I can't solve your homework for you — that would mean you miss the practice that actually makes the concepts stick. But I'm here to help you work through any specific problem you're stuck on.\n\nWhich problem is giving you the most trouble? Show me what you've tried so far and I'll help you figure out where to go next."}
    ]}
    

    And for out-of-scope questions:

    {"messages": [
      ...,
      {"role": "user", "content": "Can you explain how to solve quadratic equations?"},
      {"role": "assistant", "content": "Quadratic equations are coming up in Module 7 — that's a great thing to look forward to! For now, we're focused on linear equations in Module 4. Is there anything from this module I can help clarify? If you're curious about where quadratics fit in, I'm happy to give a sneak peek without spoiling the full lesson."}
    ]}
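
    Before training, it can help to sanity-check every record, including boundary examples like the ones in this section. The sketch below is one possible check, assuming complete records in the chat format used in this post. The function names and the exact rules (system prompt first, alternating user and assistant turns, assistant replies within the 200-word limit from the example system prompt) are illustrative choices, not requirements of any particular training tool.

```javascript
// Count words by splitting on whitespace.
function wordCount(text) {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Return a list of problems found in one training record (empty list = OK).
function validateRecord(record, maxWords = 200) {
  const problems = [];
  const messages = record && Array.isArray(record.messages) ? record.messages : [];
  if (messages.length < 3) {
    problems.push('expected at least system, user, and assistant messages');
    return problems;
  }
  if (messages[0].role !== 'system') {
    problems.push('first message should be the system prompt');
  }
  const turns = messages.slice(1);
  turns.forEach((message, i) => {
    // After the system prompt, turns should alternate user, assistant, user, ...
    const expected = i % 2 === 0 ? 'user' : 'assistant';
    if (message.role !== expected) {
      problems.push(`turn ${i + 1}: expected ${expected}, got ${message.role}`);
    }
    if (typeof message.content !== 'string' || message.content.trim() === '') {
      problems.push(`turn ${i + 1}: empty content`);
    } else if (message.role === 'assistant' && wordCount(message.content) > maxWords) {
      problems.push(`turn ${i + 1}: assistant reply exceeds ${maxWords} words`);
    }
  });
  if (turns[turns.length - 1].role !== 'assistant') {
    problems.push('record should end with an assistant message');
  }
  return problems;
}
```

    Running this over the whole dataset and reviewing anything with a non-empty problem list is a cheap way to catch truncated exports and overlong answers before they reach the model.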
    

    Evaluation

    Hold out 10-15% of your tutoring pairs for evaluation. After training, measure:

    Curriculum containment: Does the model reference concepts outside the current module's scope? Run a set of questions that would normally elicit out-of-scope responses and measure the rate of appropriate boundary-setting.

    Pedagogical quality: Have a sample of tutoring responses rated by an instructor on: (1) correctness, (2) age/level appropriateness, (3) quality of Socratic guidance. Target: 85%+ rated "good" or "excellent."

    Direct-answer rate: How often does the model just give the answer instead of guiding the student? Target: fewer than 20% of responses are pure answers; the rest should involve a guiding question.
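
    Instructor ratings need humans, but the hold-out split and rough versions of the other two measurements can be scripted. The sketch below is a set of heuristics under stated assumptions, not a validated evaluation suite. All function names are hypothetical. The containment check treats a reply as a leak only if it contains a "leak term" you supply (for example "quadratic formula"), because a good deferral may still name the later topic. The pure-answer check simply looks for a question mark, which is a crude proxy for "involves a guiding question".

```javascript
// Deterministic hold-out: every k-th record goes to the evaluation set.
// A fraction of 0.1 gives k = 10; a seeded random shuffle would work equally well.
function splitHoldOut(records, evalFraction = 0.1) {
  const stride = Math.round(1 / evalFraction);
  const train = [];
  const evaluation = [];
  records.forEach((record, i) => {
    (i % stride === stride - 1 ? evaluation : train).push(record);
  });
  return { train, evaluation };
}

// Share of out-of-scope probes whose reply contains none of its leak terms.
// Each probe is { reply, leakTerms } where leakTerms are strings that would
// only appear if the model were actually teaching the later topic.
function containmentRate(probes) {
  if (probes.length === 0) return 0;
  const contained = probes.filter(({ reply, leakTerms }) => {
    const lower = reply.toLowerCase();
    return !leakTerms.some((term) => lower.includes(term.toLowerCase()));
  });
  return contained.length / probes.length;
}

// Share of replies with no question at all (treated here as pure answers).
function directAnswerRate(replies) {
  if (replies.length === 0) return 0;
  const direct = replies.filter((reply) => !reply.includes('?'));
  return direct.length / replies.length;
}
```

    Treat these numbers as a first filter. A reply can contain a question mark and still hand over the answer, so a sample of responses should still go to an instructor for the pedagogical quality rating.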

    Integration Into the Learning Platform

    Contextual tutoring: The tutor should know which module the student is in. Pass module context in every API call:

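    // Assumes moduleNumber, moduleName, coveredTopics, conversationHistory,
    // and studentMessage are already in scope.
    const tutorResponse = await fetch('http://your-ollama-server:11434/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'algebra-tutor-v2',
        stream: false, // Ollama streams by default; ask for a single JSON reply
        messages: [
          {
            role: 'system',
            content: `Student is in Module ${moduleNumber}: ${moduleName}. Topics covered so far: ${coveredTopics.join(', ')}.`
          },
          // Previous conversation turns
          ...conversationHistory,
          { role: 'user', content: studentMessage }
        ]
      })
    });
    if (!tutorResponse.ok) throw new Error(`Tutor request failed: ${tutorResponse.status}`);
    // The tutor's reply text is in message.content of the response body
    const tutorReply = (await tutorResponse.json()).message.content;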
    

    Session monitoring: Log all tutoring sessions. Flag sessions where the model gives a potentially incorrect answer and route them to an instructor review queue. Once an instructor has written the correct response, these examples become your next retraining batch.
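
    What counts as "potentially incorrect" depends on the signals your platform collects. As one illustration, the sketch below flags a logged session if the student rated the reply down or the reply contains an out-of-scope term, then turns an instructor-corrected session into a training record. The session fields (`systemPrompt`, `studentMessage`, `reply`, `studentRating`), the leak-term list, and the function names are all hypothetical; swap in whatever your logging actually records.

```javascript
// Collect the reasons, if any, that a logged session needs instructor review.
function flagReasons(session, leakTerms) {
  const reasons = [];
  if (session.studentRating === 'down') {
    reasons.push('student marked the reply unhelpful');
  }
  const lower = session.reply.toLowerCase();
  for (const term of leakTerms) {
    if (lower.includes(term.toLowerCase())) {
      reasons.push(`mentions out-of-scope term: ${term}`);
    }
  }
  return reasons;
}

// Keep only the sessions that have at least one reason to be reviewed.
function buildReviewQueue(sessions, leakTerms) {
  return sessions
    .map((session) => ({ session, reasons: flagReasons(session, leakTerms) }))
    .filter((entry) => entry.reasons.length > 0);
}

// After an instructor supplies a corrected reply, build a retraining record
// that pairs the original question with the corrected answer.
function toRetrainingRecord(session, correctedReply) {
  return {
    messages: [
      { role: 'system', content: session.systemPrompt },
      { role: 'user', content: session.studentMessage },
      { role: 'assistant', content: correctedReply },
    ],
  };
}
```

    The retraining record deliberately uses the instructor's corrected reply, never the model's original one, so the next training batch teaches the fix rather than reinforcing the mistake.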

    The Retainer Case

    Tutoring models need ongoing maintenance because:

    • New course modules → new content → model needs to know new material
    • Common misconceptions identified in sessions → targeted training examples
    • Curriculum updates → model should reflect current content, not old versions
    • New student cohorts may have different question patterns

    Retainer: $700-1,200/month (for a single subject), covering quarterly retraining and session monitoring.

