
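Fine-Tuning Qwen 2.5 for Multilingual Applications
Qwen 2.5 covers 29 languages and was trained on 18 trillion tokens. Here's how to fine-tune it for multilingual classification, support, and content generation, without maintaining a separate model for each language.
Most open-source language models are English-first. Qwen 2.5 is different: Alibaba trained it on 18 trillion tokens spanning 29 languages, with real investment in non-Latin scripts and right-to-left languages.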
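Why Qwen Wins at Multilingual
- Approximate language mix of the training data: English ~40%, Chinese ~25%, European languages ~15%, Asian languages ~10%, Arabic, Hindi, and others ~7%
- 152K-vocabulary tokenizer: Chinese takes about 1.5 tokens per character (vs. 2-3 for Llama)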
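The tokens-per-character figure above is straightforward to check on your own text. Here is a minimal sketch, assuming you can supply any function that maps a string to a list of token IDs. The `tokens_per_char` helper and the byte-level stand-in encoder are hypothetical names for illustration, not part of any library. With Hugging Face `transformers` installed, you could pass a real tokenizer's `encode` method instead (for example, one loaded with `AutoTokenizer.from_pretrained`, calling `encode` with `add_special_tokens=False`).

```python
from typing import Callable, Sequence


def tokens_per_char(encode: Callable[[str], Sequence[int]], text: str) -> float:
    """Return the average number of tokens the encoder emits per character."""
    if not text:
        raise ValueError("text must be non-empty")
    return len(encode(text)) / len(text)


def utf8_byte_encode(text: str) -> list:
    """Stand-in encoder for illustration only: one token per UTF-8 byte.

    This is NOT the Qwen or Llama tokenizer; swap in a real tokenizer's
    encode function to measure an actual model.
    """
    return list(text.encode("utf-8"))


if __name__ == "__main__":
    sample = "今天天气很好"  # six Chinese characters
    ratio = tokens_per_char(utf8_byte_encode, sample)
    print(f"{ratio:.2f} tokens per character")
```

Running the same sample through two real tokenizers gives a like-for-like comparison. A lower ratio means more text fits in the same context window and costs fewer tokens at inference time.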
Multilingual Benchmarks
| Benchmark | Qwen 2.5 7B | Llama 3.3 8B |
|---|---|---|
| MGSM (multilingual math) | 72.4% | 61.2% |
| XNLI (cross-lingual NLI) | 78.6% | 69.4% |
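Multilingual Fine-Tuning Configuration
Use LoRA rank 32 (higher than the 16 used for monolingual fine-tuning), a learning rate of 1.5e-4, and 4-5 epochs. Mix examples from different languages within every batch.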
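One way to picture this configuration is the sketch below. It records the hyperparameters from the text in a small settings object and shows one possible way to build mixed-language batches by drawing from each language in turn. The names `MultilingualLoraSettings` and `mixed_language_batches` are hypothetical, and the batch size is an assumption, not a value from the article. In a real training script, the rank would typically be passed to your LoRA library (for example, the `r` field of `LoraConfig` in Hugging Face PEFT).

```python
import random
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class MultilingualLoraSettings:
    lora_rank: int = 32            # from the article (16 for monolingual)
    learning_rate: float = 1.5e-4  # from the article
    num_epochs: int = 4            # the article suggests 4-5
    batch_size: int = 8            # assumption, not stated in the article


def mixed_language_batches(
    examples_by_lang: Dict[str, List[str]],
    batch_size: int,
    seed: int = 0,
) -> Iterator[List[Tuple[str, str]]]:
    """Yield batches of (language, example) pairs, interleaving languages.

    Examples are shuffled within each language, then drawn round-robin so
    that consecutive items in a batch come from different languages.
    """
    rng = random.Random(seed)
    queues = {
        lang: rng.sample(items, len(items))
        for lang, items in examples_by_lang.items()
    }
    batch: List[Tuple[str, str]] = []
    while any(queues.values()):
        for lang in queues:
            if queues[lang]:
                batch.append((lang, queues[lang].pop()))
                if len(batch) == batch_size:
                    yield batch
                    batch = []
    if batch:  # final partial batch
        yield batch


if __name__ == "__main__":
    settings = MultilingualLoraSettings()
    data = {
        "de": ["Wo ist meine Bestellung?", "Bitte Passwort zurücksetzen."],
        "zh": ["我的订单在哪里?", "请重置我的密码。"],
        "ar": ["أين طلبي؟", "يرجى إعادة تعيين كلمة المرور."],
    }
    for batch in mixed_language_batches(data, batch_size=3):
        print([lang for lang, _ in batch])
```

Round-robin sampling keeps every batch mixed only while all languages still have examples left. With heavily imbalanced data, the tail batches drift toward the largest language, so you may want to upsample smaller languages first.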
Qwen 2.5 7B vs Llama 3.3 8B on Non-English Tasks
| Language | Qwen 2.5 7B | Llama 3.3 8B |
|---|---|---|
| English | 95% | 96% |
| German | 93% | 84% |
| Chinese | 92% | 71% |
| Arabic | 89% | 63% |
| Japanese | 91% | 68% |
| Hindi | 87% | 58% |
| Average | 91.9% | 76.6% |
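To see where the difference concentrates, the per-language gap can be computed directly from the rows above. The short sketch below does only that arithmetic. The scores are copied from the table, and the function name `gap_by_language` is a hypothetical helper.

```python
# Per-language scores copied from the table: (Qwen 2.5 7B, Llama 3.3 8B)
SCORES = {
    "English": (95, 96),
    "German": (93, 84),
    "Chinese": (92, 71),
    "Arabic": (89, 63),
    "Japanese": (91, 68),
    "Hindi": (87, 58),
}


def gap_by_language(scores):
    """Return the Qwen-minus-Llama difference in percentage points."""
    return {lang: qwen - llama for lang, (qwen, llama) in scores.items()}


gaps = gap_by_language(SCORES)
largest = max(gaps, key=gaps.get)

if __name__ == "__main__":
    # List languages from the widest gap to the narrowest.
    for lang, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{lang}: {gap:+d} points")
```

On these figures the two models are within a point of each other on English, while the gap widens to 29 points on Hindi, the largest in the table.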
If your application involves any non-English language, Qwen 2.5 is the obvious choice.