AI Classification for E-commerce Product Catalogs: Fine-Tuning a Category Model

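An e-commerce brand that adds 100 to 500 SKUs a month has a catalog management problem: every new product has to be categorized, tagged, given attributes, and placed in the right spot in the navigation structure. Done by hand, that takes 5 to 15 minutes per product, or roughly 8 to 125 hours of direct labor each month.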
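The labor range is simple arithmetic on the volume and per-product time figures; the sketch below makes the bounds explicit. `monthly_labor_hours` is a hypothetical helper name used only for illustration.

```python
def monthly_labor_hours(skus_per_month: int, minutes_per_sku: float) -> float:
    """Manual classification time per month, in hours."""
    return skus_per_month * minutes_per_sku / 60


# Lowest volume at the fastest pace, highest volume at the slowest pace.
low = monthly_labor_hours(100, 5)
high = monthly_labor_hours(500, 15)
print(f"{low:.1f} to {high:.1f} hours per month")
```

The upper bound only applies when both volume and per-product time sit at their maximum; most catalogs will land somewhere in between.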

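A classifier fine-tuned on your taxonomy does the same work in seconds at around 90% accuracy. It is a straightforward AI agency deliverable: clear before-and-after metrics, a short build time, and an obvious case for a retainer, since new products arrive every month.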

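What the classifier does

Input: product data (name, description, brand, any existing attributes)

Output: classification across several dimensions: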

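Primary category (Clothing > Men's > Jackets)
Secondary tags (waterproof, insulated, packable)
Gender / size range
Material classification
Price tier
Search keywords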

The model outputs structured JSON that your catalog management system ingests directly.

Example:

Input:

Product: Arc'teryx Beta AR Jacket Men's
Description: All-round waterproof shell for mountain activities. GORE-TEX Pro fabric, fully seam-taped, helmet-compatible hood. 485g.

Output:

{
  "primary_category": "Clothing > Men's > Jackets & Coats > Rain Jackets",
  "secondary_categories": ["Hiking", "Mountaineering", "Skiing"],
  "attributes": {
    "waterproof": true,
    "material": "GORE-TEX Pro",
    "insulation": "none",
    "gender": "mens",
    "weight_oz": 17.1,
    "packable": true
  },
  "tags": ["waterproof", "shell", "gore-tex", "mountaineering", "packable", "alpine"],
  "price_tier": "premium",
  "meta_keywords": ["waterproof jacket mens", "gore-tex jacket", "mountain shell", "rain jacket hiking"]
}
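Because the catalog system ingests this JSON directly, it can help to check each output before it is written. The sketch below shows one possible check. `REQUIRED_KEYS`, `ALLOWED_CATEGORIES`, `ALLOWED_PRICE_TIERS`, and `validate_classification` are hypothetical names, and the allowed values are placeholders standing in for your real taxonomy.

```python
REQUIRED_KEYS = {
    "primary_category", "secondary_categories", "attributes",
    "tags", "price_tier", "meta_keywords",
}

# Placeholder values: load these from your real taxonomy export.
ALLOWED_CATEGORIES = {
    "Clothing > Men's > Jackets & Coats > Rain Jackets",
    "Clothing > Women's > Jackets & Coats > Insulated Jackets",
}
ALLOWED_PRICE_TIERS = {"budget", "mid", "premium"}  # assumed tier names


def validate_classification(result: dict) -> list[str]:
    """Return a list of problems; an empty list means the output passed."""
    problems = []
    missing = REQUIRED_KEYS - set(result)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    category = result.get("primary_category")
    if category not in ALLOWED_CATEGORIES:
        problems.append(f"unknown primary_category: {category!r}")
    tier = result.get("price_tier")
    if tier not in ALLOWED_PRICE_TIERS:
        problems.append(f"unknown price_tier: {tier!r}")
    if not isinstance(result.get("tags"), list):
        problems.append("tags must be a list")
    return problems


good = {
    "primary_category": "Clothing > Men's > Jackets & Coats > Rain Jackets",
    "secondary_categories": ["Hiking"],
    "attributes": {"waterproof": True},
    "tags": ["waterproof", "shell"],
    "price_tier": "premium",
    "meta_keywords": ["gore-tex jacket"],
}
# Same record, but with a category that does not exist in the taxonomy.
bad = dict(good, primary_category="Clothing > Men's > Shells")

print(validate_classification(good))  # []
print(validate_classification(bad))
```

Outputs that fail a check like this are natural candidates for manual review rather than automatic ingestion.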

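Why a fine-tuned model beats general-purpose AI

General-purpose GPT-4 with a prompt can classify products at a basic level. The problems:

It doesn't know your taxonomy. Your store has a specific category structure three to four levels deep. A general-purpose model invents categories that don't exist in your navigation.
It doesn't know your attribute vocabulary. Your definition of "price tier," your material classifications, your activity tags: all of these are store-specific. A general-purpose model just guesses.
It isn't calibrated to your edge cases. Which category does a down jacket with a waterproof shell belong to? Men's swimwear sold in an outdoor sports store: is that swimwear or outdoor gear? Your past catalog decisions encode your answers, and a fine-tuned model learns them.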

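Building the dataset

Source: your existing categorized catalog. Every product you have already classified by hand is a training example.

Target size: 1,000 to 5,000 products, covering the full range of your categories

Structure: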

{"messages": [
  {"role": "system", "content": "You are a product classification assistant for [Brand]. Classify products according to our taxonomy. Always output valid JSON matching the schema provided."},
  {"role": "user", "content": "Classify this product:\nName: Patagonia Nano Puff Jacket Womens\nDescription: Lightweight insulated jacket with PrimaLoft Gold Insulation Eco. Wind resistant DWR finish. Packs into chest pocket. 9.5 oz."},
  {"role": "assistant", "content": "{\"primary_category\": \"Clothing > Women's > Jackets & Coats > Insulated Jackets\", \"secondary_categories\": [\"Hiking\", \"Travel\", \"Skiing\"], \"attributes\": {\"waterproof\": false, \"material\": \"PrimaLoft Gold\", \"insulation\": \"synthetic\", \"gender\": \"womens\", \"weight_oz\": 9.5, \"packable\": true}, \"tags\": [\"insulated\", \"packable\", \"lightweight\", \"synthetic-fill\", \"primaloft\"], \"price_tier\": \"premium\", \"meta_keywords\": [\"insulated jacket women\", \"packable down jacket\", \"lightweight insulated jacket\"]}"}
]}

Include examples from every category in your taxonomy. Aim for 20 to 50 examples per top-level category.
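One way to turn an existing catalog into this format and check category coverage is sketched below. The helper names (`to_training_example`, `to_jsonl`, `top_level_counts`) and the two catalog rows are hypothetical; the system prompt and user message template are copied from the example above.

```python
import json
from collections import Counter

SYSTEM_PROMPT = (
    "You are a product classification assistant for [Brand]. "
    "Classify products according to our taxonomy. "
    "Always output valid JSON matching the schema provided."
)


def to_training_example(name: str, description: str, classification: dict) -> dict:
    """Wrap one already-classified product in the chat format shown above."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Classify this product:\nName: {name}\nDescription: {description}"},
            {"role": "assistant", "content": json.dumps(classification)},
        ]
    }


def to_jsonl(examples: list[dict]) -> str:
    """Serialize examples as JSON Lines: one training example per line."""
    return "\n".join(json.dumps(e, ensure_ascii=False) for e in examples)


def top_level_counts(classifications: list[dict]) -> Counter:
    """Count examples per top-level category (the text before the first ' > ')."""
    return Counter(c["primary_category"].split(" > ")[0] for c in classifications)


# Hypothetical catalog rows: (name, description, existing classification).
catalog = [
    ("Patagonia Nano Puff Jacket Womens", "Lightweight insulated jacket.",
     {"primary_category": "Clothing > Women's > Jackets & Coats > Insulated Jackets"}),
    ("Example Headlamp", "Rechargeable headlamp for night hiking.",
     {"primary_category": "Gear > Lighting > Headlamps"}),
]

examples = [to_training_example(n, d, c) for n, d, c in catalog]
counts = top_level_counts([c for _, _, c in catalog])
# Flag top-level categories below the 20-example floor suggested above.
thin = sorted(cat for cat, n in counts.items() if n < 20)
print(to_jsonl(examples))
print("Needs more examples:", thin)
```

Running the coverage check before training makes thin categories visible while there is still time to pull more examples from the catalog.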

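Training configuration

For a classification task with structured JSON output:

Base model: Mistral 7B Instruct performs well on structured-output tasks
LoRA rank: 8 to 16 (a lower rank is sufficient for classification)
Epochs: 3 to 5 (classification tasks converge quickly)

The model has to learn three things: (1) your category structure, (2) your attribute vocabulary, and (3) how to emit valid JSON.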
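To get a feel for what rank 8 versus rank 16 means in practice, the sketch below counts the trainable parameters that LoRA adds. It assumes adapters on the query and value projections only, and uses the published Mistral 7B dimensions (hidden size 4096, 32 layers, 1024-wide key/value projections). Which modules your training framework actually targets is an assumption to verify, and `lora_params` is a hypothetical helper.

```python
HIDDEN = 4096   # Mistral 7B hidden size
KV_DIM = 1024   # value projection width (8 KV heads x 128 dims)
LAYERS = 32     # transformer layers


def lora_params(rank: int, shapes: list[tuple[int, int]], layers: int = LAYERS) -> int:
    """Trainable parameters added by LoRA.

    For a weight with d_in inputs and d_out outputs, LoRA adds two small
    matrices of sizes (rank x d_in) and (d_out x rank).
    """
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes)
    return per_layer * layers


# Assumed targets: query projection and value projection in every layer.
QV_SHAPES = [(HIDDEN, HIDDEN), (HIDDEN, KV_DIM)]

for rank in (8, 16):
    print(f"rank {rank}: {lora_params(rank, QV_SHAPES):,} trainable parameters")
```

Under these assumptions both settings train well under 1% of the base model's roughly 7 billion weights, which is part of why a small rank is a reasonable starting point for a narrow task like classification.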

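Evaluation

Hold out 10% of the dataset. After training, run the model on the held-out set and measure:

Primary metric: correct primary category assignment (exact match)

Secondary metrics:

Tag precision (share of assigned tags that are correct)
Tag recall (share of correct tags that were assigned)
JSON validity (100% of outputs should parse)
Attribute accuracy (per-field accuracy)

Typical results with a well-constructed dataset of 2,000 or more examples: 88% to 94% correct primary category on the held-out set.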
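A minimal sketch of how these metrics could be computed from raw model outputs follows. `evaluate` is a hypothetical helper; it micro-averages tag precision and recall across the whole held-out set and counts unparseable outputs as wrong, both of which are design choices rather than the only option. Attribute accuracy is left out for brevity.

```python
import json


def evaluate(pairs: list[tuple[str, dict]]) -> dict:
    """pairs: (raw model output string, expected classification dict)."""
    if not pairs:
        raise ValueError("evaluation set is empty")
    total = valid = primary_hits = 0
    true_pos = n_pred = n_gold = 0
    for raw, gold in pairs:
        total += 1
        gold_tags = set(gold["tags"])
        n_gold += len(gold_tags)
        try:
            pred = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON: counts against accuracy and recall
        if not isinstance(pred, dict):
            continue
        valid += 1
        primary_hits += pred.get("primary_category") == gold["primary_category"]
        pred_tags = set(pred.get("tags") or [])
        n_pred += len(pred_tags)
        true_pos += len(pred_tags & gold_tags)
    return {
        "json_validity": valid / total,
        "primary_accuracy": primary_hits / total,
        "tag_precision": true_pos / n_pred if n_pred else 0.0,
        "tag_recall": true_pos / n_gold if n_gold else 0.0,
    }


# Tiny illustrative set: one correct category with partly right tags,
# and one output that is not JSON at all.
demo = [
    ('{"primary_category": "A > B", "tags": ["x", "y"]}',
     {"primary_category": "A > B", "tags": ["x", "z"]}),
    ("not json", {"primary_category": "A > C", "tags": ["w"]}),
]
metrics = evaluate(demo)
print(metrics)
```

Reporting JSON validity separately from accuracy helps tell a formatting problem (fixable with more epochs or stricter decoding) apart from a taxonomy problem (fixable with more examples).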

Integration

A batch classification pipeline for new product intake:

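import csv
import json
import re

import requests

OLLAMA_URL = 'http://your-ollama-server:11434/api/chat'


def classify_product(name: str, description: str) -> dict:
    """Send one product to the fine-tuned model and return its parsed JSON."""
    response = requests.post(
        OLLAMA_URL,
        json={
            "model": "product-classifier",
            # If the training system prompt is not baked into the model's
            # Modelfile, add the same system message here before the user turn.
            "messages": [
                {
                    "role": "user",
                    "content": f"Classify this product:\nName: {name}\nDescription: {description}"
                }
            ],
            "stream": False
        },
        timeout=60,
    )
    response.raise_for_status()  # surface HTTP errors instead of a KeyError below

    content = response.json()['message']['content']

    try:
        return json.loads(content)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span if the model wrapped the JSON in text
        json_match = re.search(r'\{.*\}', content, re.DOTALL)
        if json_match:
            try:
                return json.loads(json_match.group())
            except json.JSONDecodeError:
                pass
        raise ValueError(f"Could not parse classification output: {content}")


def update_catalog(sku: str, classification: dict) -> None:
    """Placeholder: replace with a call to your catalog management system."""
    print(sku, json.dumps(classification))


if __name__ == "__main__":
    # Process the new-products CSV (expects sku, name, and description columns)
    with open('new_products.csv', newline='', encoding='utf-8') as f:
        for row in csv.DictReader(f):
            classification = classify_product(row['name'], row['description'])
            # Push to your catalog management system
            update_catalog(row['sku'], classification)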

Run this as a nightly job over new product imports. A staff review step catches the 6% to 12% of products that need manual correction.

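Retainer structure for this use case

The case for a retainer on catalog classification:

New products keep arriving → the model handles them automatically
Taxonomy changes (new categories, reorganized navigation) → the model needs retraining
Accuracy monitoring → catch classification drift before it pollutes the catalog

Retainer package: $300 to $500 per month

Includes: monthly batch processing of new products, quarterly retraining on new examples, an accuracy monitoring dashboard, and an error-correction pipeline fed by staff feedback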
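The accuracy monitoring piece can start very simply: track what share of classifications reviewers had to correct each month and flag months that exceed the expected range. The sketch below assumes a review log of (month, was_corrected) pairs; the function names and the 12% threshold (the top of the correction range cited earlier) are illustrative assumptions.

```python
from collections import defaultdict


def correction_rate_by_month(review_log: list[tuple[str, bool]]) -> dict[str, float]:
    """review_log: ('YYYY-MM', was_corrected) for each reviewed product."""
    totals: dict[str, int] = defaultdict(int)
    corrected: dict[str, int] = defaultdict(int)
    for month, was_corrected in review_log:
        totals[month] += 1
        corrected[month] += bool(was_corrected)
    return {m: corrected[m] / totals[m] for m in sorted(totals)}


def drifting_months(rates: dict[str, float], threshold: float = 0.12) -> list[str]:
    """Months whose correction rate exceeds the threshold."""
    return [m for m, rate in rates.items() if rate > threshold]


# Hypothetical log: 1 of 10 corrected in January, 2 of 10 in February.
log = (
    [("2024-01", False)] * 9 + [("2024-01", True)]
    + [("2024-02", False)] * 8 + [("2024-02", True)] * 2
)
rates = correction_rate_by_month(log)
print(rates)
print("Investigate:", drifting_months(rates))
```

A rising correction rate is often the first sign that the taxonomy has changed or that new product types have appeared, which is the cue to retrain ahead of the quarterly schedule.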
