
结构化输出微调:超越 JSON Mode 实现有保障的 Schema
JSON mode 给您有效的 JSON。微调给您有保障的 Schema 合规——每个字段、每个类型、每次。以下是如何训练模型输出您应用期望的精确结构。
您的应用期望一个有 8 个字段的 JSON 对象。GPT-4 大多数时候给您想要的。但在 95% Schema 合规率下,每 20 次 API 调用有 1 次产生解析器无法处理的输出。每天 10,000 次调用意味着每天 500 次失败。
微调后 Schema 合规从 95% 提示到 99.5%+ 微调。
结构化输出层级
| 级别 | 方案 | 合规率 |
|---|---|---|
| 1 | 基于提示 | 80-90% |
| 2 | JSON Mode | 95-98% |
| 3 | Function Calling API | 99%+ |
| 4 | 微调模型 | 99.5-99.9% |
| 5 | 微调 + 约束解码 | 100% |
构建 Schema 合规训练数据集
- 定义正式 JSON Schema
- 生成 500-1,000 个验证通过的训练示例
- 变化输入而非仅输出
- 显式包含边缘情况(空数组、null、长字符串)
- 格式化为训练对——助手回复必须是纯 JSON,无 markdown
常见错误
- 训练数据不一致——每个格式约定必须 100% 一致
- 未训练空/null 情况
- 输入过于同质
- 在美化 JSON 上训练(浪费 token)
- 未验证训练数据
对比指标
| 设置 | Schema 合规 | 平均延迟 | 每千次费用 |
|---|---|---|---|
| GPT-4o + 提示 | 93-96% | 1.8s | $2.50-8.00 |
| 微调 8B + 语法约束 | 100% | 0.35s | $0(本地) |
Ship AI that runs on your users' devices.
Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
延伸阅读
Ship AI that runs on your users' devices.
Early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
Keep reading

Fine-Tuning for Better JSON Output: Why Small Models Struggle and How to Fix It
How fine-tuning dramatically improves JSON output reliability in small models — from 60% valid JSON to 99%+ compliance, with practical techniques for structured output tasks.

Fine-Tuned Models for LangGraph Agents: Replace GPT-4 in Your Agent Stack
LangGraph agents default to GPT-4, but most agent tasks — routing, tool selection, response generation — work better with fine-tuned models trained on your specific workflows.

Fine-Tuned Models for CrewAI: Multi-Agent Workflows Without API Costs
A CrewAI workflow with 4 agents making 20+ LLM calls per task can cost $2-5 per execution on GPT-4. Fine-tuned local models make multi-agent workflows economically viable.