
从提示词工程到微调:迁移实战手册
从提示词工程迁移到微调的实用手册——何时做出切换、如何将提示词转化为训练数据,以及分步迁移过程。
你有一个花了数周打磨的系统提示词。它有 2,000 个 token,塞满了示例、边缘情况指令和格式规则。它能用——基本上。但它脆弱、昂贵,且以持续消耗时间的方式不一致。
这是将该提示词迁移到微调模型的实战手册。分五步,团队已用此方法降低 60-80% 的成本同时提升输出一致性。
迁移过程:五个步骤
步骤 1:记录当前提示词和预期行为
冻结当前系统。记录 50-100 个代表性输入及其实际输出。
步骤 2:从提示词中提取训练数据
提示词中的每个示例都是等待提取的训练样本。逐行审查提示词,为每条指令创建 10-20 个输入-输出对。
步骤 3:生成 1,000-2,000 个额外示例
使用当前提示词 + API 组合生成 3,000-5,000 个输出。严格过滤——只保留质量达标的输出。
步骤 4:微调较小的模型
Llama 3.1 8B 或 Qwen 2.5 7B,LoRA rank 16-32,2-4 个 epoch。在 Ertas Studio 上 30-90 分钟。
步骤 5:严格对比质量
对比准确率(通常提高 5-15 个百分点)、一致性(微调行为更稳定)、延迟(减少 30-50%)和成本(通常降低 10-50 倍)。
成本对比实例
| 指标 | 提示词 + GPT-4o | 微调 Llama 8B |
|---|---|---|
| 系统提示词 | 1,800 token | 0 token |
| 平均请求成本 | $0.024 | $0.001 |
| 月度成本(3,000 请求/天) | $2,160 | $90(自托管) |
| 准确率 | 83% | 91% |
| 一致性(重试相同输出) | 78% | 97% |
微调模型训练成本 $40,不到两天就收回成本。
Ship AI that runs on your users' devices.
Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
Ship AI that runs on your users' devices.
Early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
Keep reading

When NOT to Fine-Tune: 5 Cases Where RAG, Prompting, or APIs Are Better
An honest guide to when fine-tuning is the wrong approach — covering five common scenarios where RAG, prompt engineering, or API calls deliver better results with less effort.

Synthetic Data for Fine-Tuning: How to Generate Training Data That Actually Works
A practical guide to generating synthetic training data for fine-tuning — covering prompt strategies, quality filtering, distribution matching, and the 80/20 rule for mixing real and synthetic data.

Prompt Engineering Has a Ceiling. Here's What Comes After.
Prompt engineering can take you far — but every agency and developer hits the wall eventually. Here's what the ceiling looks like, why it exists, and what techniques come after.