
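Fine-Tuning a Model for E-Commerce Product Recommendations: A Complete Hands-On Guide
Generic recommendation engines lack any understanding of the semantic relationships between products. Here is how to fine-tune a model on your product catalog and purchase history to build a recommendation system that raises average order value.
Collaborative filtering recommends "other customers also bought X". It works for popular products but fails on long-tail products, and it has no semantic understanding at all. A customer buying a camping stove probably needs to see a fire starter and waterproof matches. Collaborative filtering only knows this if enough other customers happen to have bought those items together.
A fine-tuned language model understands product relationships semantically. You can train it on your product catalog, purchase history, and customer interactions to build a recommendation engine that knows what your products are actually used for.
What You're Building
A model that takes a product (or a cart) as input and returns ranked recommendations. Example:
Input: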
Customer is viewing: Jetboil Flash Camping Stove
Cart contains: Sleeping bag, tent stakes
Recommend 5 complementary products from our catalog.
Output:
1. MSR Fuel Canister 110g — Required fuel for Jetboil Flash operation
2. UCO Stormproof Matches — Backup ignition, waterproof
3. Sea to Summit Alpha Light Spork — Lightweight utensil for camp cooking
4. Nalgene Wide Mouth Bottle 32oz — Water container for cooking
5. Bear Canister — Required food storage for backcountry camping
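This is not collaborative filtering. The model knows that a camping stove needs fuel, that waterproof matches are backup ignition, and that cooking utensils are contextually relevant. It produces these recommendations from a semantic understanding of the product catalog.
Data Requirements
**Minimum viable dataset:** 500 (product/cart, recommendation) pairs
**Recommended dataset size:** 2,000-5,000 examples
Data you need:
- Product catalog with descriptions (SKU, name, description, category, attributes)
- Purchase history (order ID, product IDs per order)
- Existing recommendations that converted (if you have recommendation data)
How to build training examples from purchase history:
- Take orders containing 3 or more items
- Use some of the purchased items as the "cart" input
- Use the remaining items from the same order as the "correct recommendation" output
- Filter out orders whose recommended items are not a clear category match (this reduces noise)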
Step 1: Prepare the Dataset
Export the product catalog as JSON:
{
  "sku": "JB-FLASH-BLK",
  "name": "Jetboil Flash Camping Stove",
  "description": "Fast, lightweight backpacking stove. Boils 1L water in 100 seconds. Requires Jetboil fuel canister.",
  "category": "Camping Stoves",
  "attributes": {
    "weight": "13.1 oz",
    "compatible_fuel": "Jetboil fuel canisters",
    "use_case": "backpacking, ultralight camping"
  }
}
Export the purchase history and build the recommendation pairs as JSONL, one chat example per line:
{"messages": [{"role": "user", "content": "Customer viewing: Jetboil Flash Camping Stove. Recommend 3 complementary products."}, {"role": "assistant", "content": "1. Jetboil JetPower Fuel 100g — Required fuel for Jetboil Flash\n2. UCO Stormproof Match Kit — Backup ignition\n3. Sea to Summit Alpha Light Spork Set — Lightweight eating utensil for camp cooking"}]}
{"messages": [{"role": "user", "content": "Cart: 2-person backpacking tent, sleeping bag. Recommend 3 complementary products."}, {"role": "assistant", "content": "1. Sleeping bag liner — Adds warmth and extends temperature range\n2. Tent footprint — Protects tent floor from abrasion\n3. Headlamp — Essential for camp use at night"}]}
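A small helper can produce lines in this shape and sanity-check them before upload. The sketch below is illustrative only: the function names are hypothetical, and the `reason` field on each recommendation is an assumption (purchase history does not contain reasons, so they would have to be hand-written or derived from catalog descriptions).

```javascript
// Sketch: serialize one training pair as a chat-format JSONL line.
// `recs` items carry a hypothetical `reason` field; it is not part of
// purchase history and has to come from somewhere else.
function toChatJsonlLine(cartNames, recs) {
  if (cartNames.length === 0 || recs.length === 0) {
    throw new Error("Each example needs at least one cart item and one recommendation");
  }
  const user = `Cart: ${cartNames.join(", ")}. Recommend ${recs.length} complementary products.`;
  // "\u2014" is the dash separator used between name and reason in the examples.
  const assistant = recs
    .map((r, i) => `${i + 1}. ${r.name} \u2014 ${r.reason}`)
    .join("\n");
  return JSON.stringify({
    messages: [
      { role: "user", content: user },
      { role: "assistant", content: assistant },
    ],
  });
}

// Sketch: check that a line parses and has the user/assistant shape.
function isValidChatLine(line) {
  try {
    const { messages } = JSON.parse(line);
    return (
      Array.isArray(messages) &&
      messages.length === 2 &&
      messages[0].role === "user" &&
      messages[1].role === "assistant" &&
      messages.every((m) => typeof m.content === "string" && m.content.length > 0)
    );
  } catch {
    return false; // not valid JSON, or not an object with `messages`
  }
}
```

Because `JSON.stringify` escapes the newlines inside the assistant message, each example stays on a single physical line, which is what the JSONL format requires.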
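Step 2: Train with Ertas
- Upload the JSONL dataset to Ertas
- Validate the dataset (Ertas flags formatting errors)
- Choose a base model (Mistral 7B or Llama 3 8B are suited to recommendation tasks)
- Start the training job
For a dataset of 2,000 examples, training typically takes 30-60 minutes. Use the default LoRA settings unless you have a specific reason to adjust them.
Step 3: Evaluate
Hold out the evaluation set before training: reserve 10-15% of the dataset for evaluation.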
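One way to make the hold-out reproducible is a seeded shuffle followed by a fixed-fraction split. The sketch below assumes the examples are already loaded into an array; the function names, the default 15% fraction, and the seed are illustrative choices, and `mulberry32` is a small public-domain PRNG used here only so the split is repeatable.

```javascript
// Small seeded PRNG (mulberry32) so the split is repeatable across runs.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Sketch: shuffle a copy of the examples, then reserve a fraction for evaluation.
function splitDataset(examples, evalFraction = 0.15, seed = 42) {
  const rand = mulberry32(seed);
  const shuffled = examples.slice();
  // Fisher-Yates shuffle driven by the seeded generator.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [shuffled[i], shuffled[j]] = [shuffled[j], shuffled[i]];
  }
  const evalCount = Math.round(shuffled.length * evalFraction);
  return {
    evalSet: shuffled.slice(0, evalCount),
    trainSet: shuffled.slice(evalCount),
  };
}
```

If several examples come from the same order, it may be safer to split by order ID rather than by example, so that near-duplicates do not end up on both sides of the split.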
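Key metrics:
- Does the model recommend real products from the catalog (hallucination rate)?
- Are the recommendations semantically relevant?
- Are the recommendations specific enough?
Evaluation method: Run 50-100 products from the evaluation set through the model. Manually score each recommendation set: 3 = all relevant, 2 = mostly relevant, 1 = mostly irrelevant. Target average score: ≥2.5.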
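The two quantitative checks, hallucination rate and average manual score, are simple to compute once the model outputs are collected. The sketch below is one possible way to do it; the function names are hypothetical, and it assumes that "in the catalog" means an exact name match after trimming and lowercasing, which may be too strict if the model paraphrases product names.

```javascript
// Normalize names so trivial differences in case or spacing do not count.
function normalizeName(name) {
  return name.trim().toLowerCase();
}

// Sketch: fraction of recommended names that do not exist in the catalog.
// `recommendationSets` is an array of arrays of product names.
function hallucinationRate(recommendationSets, catalogNames) {
  const known = new Set(catalogNames.map(normalizeName));
  let total = 0;
  let unknown = 0;
  for (const recs of recommendationSets) {
    for (const name of recs) {
      total += 1;
      if (!known.has(normalizeName(name))) unknown += 1;
    }
  }
  return total === 0 ? 0 : unknown / total;
}

// Sketch: average of manual 1-3 scores, compared against the 2.5 target.
function summarizeScores(scores, target = 2.5) {
  if (scores.length === 0) throw new Error("No scores to summarize");
  if (scores.some((s) => ![1, 2, 3].includes(s))) {
    throw new Error("Scores must be 1, 2, or 3");
  }
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  return { mean, meetsTarget: mean >= target };
}
```

For example, `summarizeScores([3, 2, 3, 2])` gives a mean of 2.5, which just meets the target.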
Step 4: Integrate with Shopify / Your E-Commerce Platform
Export the trained model from Ertas as GGUF and deploy it with Ollama.
Shopify integration (via a custom app or Shopify Functions):
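// In your product page component.
// parseRecommendations(text) is your own helper that extracts product
// names from the model's numbered list.
async function getRecommendations(product, cart = []) {
  // Leave the cart clause out entirely when the cart is empty.
  const cartTitles = cart.map(i => i.title).join(', ');
  const cartClause = cartTitles ? ` Cart: ${cartTitles}.` : '';
  const response = await fetch('http://your-ollama-server:11434/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'your-recommendation-model',
      messages: [
        {
          role: 'user',
          content: `Customer viewing: ${product.title}.${cartClause} Recommend 4 complementary products.`
        }
      ],
      stream: false // return one JSON object instead of a token stream
    })
  });
  // fetch does not reject on HTTP errors, so check the status explicitly.
  if (!response.ok) {
    throw new Error(`Ollama request failed with status ${response.status}`);
  }
  const data = await response.json();
  return parseRecommendations(data.message.content);
}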
Parse the model output to extract product names, then match them to your catalog through the Shopify Product API.
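A minimal sketch of that parsing and matching step follows. It assumes the model keeps the numbered "name, dash, reason" format from the training examples, and that the catalog has already been fetched into an array of `{ id, title }` objects (how you fetch it from Shopify is left out). The `matchToCatalog` helper and the exact-title matching rule are illustrative assumptions, not a prescribed method.

```javascript
// Sketch: extract { name, reason } from lines like
// "1. Product Name <dash> Reason". Lines that do not start with a
// number are ignored.
function parseRecommendations(text) {
  // The separator is an em dash, en dash, or hyphen surrounded by spaces,
  // so hyphens inside names such as "2-person" are left alone.
  const pattern = /^\d+[.)]\s*(.+?)(?:\s+[\u2014\u2013-]\s+(.*))?$/;
  const results = [];
  for (const rawLine of text.split("\n")) {
    const match = rawLine.trim().match(pattern);
    if (!match) continue;
    results.push({ name: match[1].trim(), reason: (match[2] ?? "").trim() });
  }
  return results;
}

// Sketch: keep only recommendations whose name matches a catalog title.
// Assumed catalog shape (hypothetical): [{ id, title }, ...].
function matchToCatalog(recommendations, catalog) {
  const byTitle = new Map(
    catalog.map((p) => [p.title.trim().toLowerCase(), p])
  );
  const matched = [];
  for (const rec of recommendations) {
    const product = byTitle.get(rec.name.toLowerCase());
    // Unmatched names are dropped rather than shown to the customer.
    if (product) matched.push({ product, reason: rec.reason });
  }
  return matched;
}
```

Dropping unmatched names acts as a last line of defense against hallucinated products reaching the storefront. A fuzzier match, for example on SKU or normalized tokens, may be worth adding if the model tends to shorten product names.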
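Measuring Improvement
Track before-and-after comparisons:
- Recommendation click-through rate (CTR on the recommendation widget)
- Add-to-cart from recommendations (secondary add rate)
- Average order value (the ultimate metric)
Typical improvements from a well-trained recommendation model relative to generic collaborative filtering:
- CTR: +15-30%
- Secondary add rate: +20-40%
- AOV: +3-8%
For a store with $50,000 in monthly revenue, a 5% AOV uplift means an extra $2,500 per month, or $30,000 per year, assuming order volume stays the same. The model recovers its training cost within weeks.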
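The arithmetic behind that estimate can be written out so you can plug in your own numbers. This is only a back-of-the-envelope sketch: it assumes order volume is unchanged, so revenue scales directly with AOV, and the training cost passed to `paybackMonths` is a hypothetical input that the article does not specify.

```javascript
// Sketch: extra revenue from an AOV uplift, assuming order volume is constant.
function estimateAovGain(monthlyRevenue, upliftPercent) {
  const monthly = (monthlyRevenue * upliftPercent) / 100;
  return { monthly, yearly: monthly * 12 };
}

// Sketch: months needed to recover a one-off cost from the monthly gain.
// `oneOffCost` is a hypothetical figure you supply.
function paybackMonths(oneOffCost, monthlyGain) {
  if (monthlyGain <= 0) throw new Error("Monthly gain must be positive");
  return oneOffCost / monthlyGain;
}

// The article's example: $50,000 monthly revenue with a 5% uplift.
const exampleGain = estimateAovGain(50000, 5);
console.log(exampleGain.monthly); // 2500
console.log(exampleGain.yearly); // 30000
```

At the low end of the quoted range, a 3% uplift on the same store works out to $1,500 per month, so the payback period is sensitive to which end of the range you actually reach.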
Ship AI that runs on your users' devices.
Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
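Further Reading
- The E-Commerce AI Agency Opportunity: a complete overview of the e-commerce vertical
- E-Commerce Customer Service AI: support ticket automation
- Shopify AI Without API Fees: replacing OpenAI calls in Shopify
- Fine-Tune Once, Bill Monthly: the productized service model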