
    E-Commerce Product Catalog AI Classification: Fine-Tuned Category Models

    Manually categorizing thousands of SKUs is expensive and inconsistent. A fine-tuned classifier trained on your taxonomy reduces categorization time by 80% and improves consistency across your catalog.

Ertas Team

E-commerce brands adding 100-500 new SKUs per month face a catalog management problem: every new product needs to be categorized, tagged, attributed, and placed in the right navigation structure. Done manually, this takes 5-15 minutes per product — roughly 8-125 hours per month in direct labor.

    A fine-tuned classifier trained on your taxonomy does it in seconds per product, at 90%+ accuracy. This is a straightforward AI agency deliverable: clear before/after metrics, fast build time, and an obvious retainer justification (new products come in every month).

    What the Classifier Does

    Input: Product data (name, description, brand, any existing attributes)

    Output: Classification across multiple dimensions:

    • Primary category (Clothing > Men's > Outerwear)
    • Secondary tags (waterproof, insulated, packable)
    • Gender/size range
    • Material classification
    • Price tier
    • Search keywords

    The model outputs structured JSON that your catalog management system ingests directly.

    Example:

    Input:

    Product: Arc'teryx Beta AR Jacket Men's
    Description: All-round waterproof shell for mountain activities. GORE-TEX Pro fabric, fully seam-taped, helmet-compatible hood. 485g.
    

    Output:

    {
      "primary_category": "Clothing > Men's > Jackets & Coats > Rain Jackets",
      "secondary_categories": ["Hiking", "Mountaineering", "Skiing"],
      "attributes": {
        "waterproof": true,
        "material": "GORE-TEX Pro",
        "insulation": "none",
        "gender": "mens",
        "weight_oz": 17.1,
        "packable": true
      },
      "tags": ["waterproof", "shell", "gore-tex", "mountaineering", "packable", "alpine"],
      "price_tier": "premium",
      "meta_keywords": ["waterproof jacket mens", "gore-tex jacket", "mountain shell", "rain jacket hiking"]
    }
    
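
    Before your catalog system ingests an output like the one above, it is worth a structural check. A minimal sketch — the required keys mirror the example schema above, and `validate_classification` is an illustrative helper, not part of any library:

    ```python
    # Keys the example schema above always includes; adjust to your own schema.
    REQUIRED_KEYS = {"primary_category", "secondary_categories", "attributes",
                     "tags", "price_tier", "meta_keywords"}

    def validate_classification(data: dict) -> list[str]:
        """Return a list of problems; an empty list means safe to ingest."""
        problems = [f"missing key: {k}" for k in REQUIRED_KEYS - data.keys()]
        category = data.get("primary_category")
        if not isinstance(category, str) or ">" not in category:
            problems.append("primary_category is not a '>'-delimited path")
        for key in ("secondary_categories", "tags", "meta_keywords"):
            if not isinstance(data.get(key), list):
                problems.append(f"{key} is not a list")
        return problems
    ```

    Anything that fails validation goes to manual review rather than into the catalog.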

    Why a Fine-Tuned Model Outperforms Generic AI

    Generic GPT-4 with a prompt can classify products at a basic level. The problems:

    • It does not know your taxonomy. Your store has a specific category structure with 3-4 levels. Generic AI invents categories that do not exist in your navigation.
    • It does not know your attribute vocabulary. Your "price tier" definitions, your material classifications, your activity tags — these are store-specific. Generic AI guesses.
    • It is not calibrated to your edge cases. Which category does a down jacket with a waterproof shell belong in? Is a men's swimsuit sold in an outdoor sports store swimwear or outdoor gear? Your past catalog decisions encode the answers; a fine-tuned model learns them.

    Building the Dataset

    Source: Your existing classified product catalog — every product you have already categorized manually is a training example.

    Size target: 1,000-5,000 products (covering your category range)

    Construction:

    {"messages": [
      {"role": "system", "content": "You are a product classification assistant for [Brand]. Classify products according to our taxonomy. Always output valid JSON matching the schema provided."},
      {"role": "user", "content": "Classify this product:\nName: Patagonia Nano Puff Jacket Womens\nDescription: Lightweight insulated jacket with PrimaLoft Gold Insulation Eco. Wind resistant DWR finish. Packs into chest pocket. 9.5 oz."},
      {"role": "assistant", "content": "{\"primary_category\": \"Clothing > Women's > Jackets & Coats > Insulated Jackets\", \"secondary_categories\": [\"Hiking\", \"Travel\", \"Skiing\"], \"attributes\": {\"waterproof\": false, \"material\": \"PrimaLoft Gold\", \"insulation\": \"synthetic\", \"gender\": \"womens\", \"weight_oz\": 9.5, \"packable\": true}, \"tags\": [\"insulated\", \"packable\", \"lightweight\", \"synthetic-fill\", \"primaloft\"], \"price_tier\": \"premium\", \"meta_keywords\": [\"insulated jacket women\", \"packable down jacket\", \"lightweight insulated jacket\"]}"}
    ]}
    

    Include examples from every category in your taxonomy. Aim for 20-50 examples per top-level category.
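
    Turning an exported catalog into this JSONL format can be scripted. A minimal sketch, assuming your export has `name`, `description`, and `classification_json` columns (the column names are illustrative — match your own export):

    ```python
    import csv
    import json

    SYSTEM_PROMPT = (
        "You are a product classification assistant for [Brand]. "
        "Classify products according to our taxonomy. "
        "Always output valid JSON matching the schema provided."
    )

    def build_training_file(catalog_csv: str, out_jsonl: str) -> int:
        """Convert an already-classified catalog export into JSONL training
        examples. Returns the number of examples written."""
        count = 0
        with open(catalog_csv, newline='') as f, open(out_jsonl, 'w') as out:
            for row in csv.DictReader(f):
                # Validate the target label before training on it
                json.loads(row['classification_json'])
                example = {"messages": [
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": (
                        f"Classify this product:\n"
                        f"Name: {row['name']}\n"
                        f"Description: {row['description']}"
                    )},
                    {"role": "assistant", "content": row['classification_json']},
                ]}
                out.write(json.dumps(example) + "\n")
                count += 1
        return count
    ```

    Skim the generated file before training: stale rows in the export (retired categories, old tag names) become training signal if you do not filter them out.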

    Training Configuration

    For classification tasks with structured JSON output:

    • Base model: Mistral 7B Instruct performs well on structured output tasks
    • LoRA rank: 8-16 (lower rank is fine for classification)
    • Epochs: 3-5 (classification tasks converge quickly)

    The model needs to learn: (1) your category structure, (2) your attribute vocabulary, (3) how to output valid JSON.
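
    As a concrete illustration, a LoRA setup in this range might look like the following, assuming the Hugging Face `peft` and `transformers` libraries (the target modules follow Mistral's attention projection names; treat this as a starting point, not a tuned recipe):

    ```python
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    # r=8 sits at the low end of the 8-16 range above — usually enough
    # for classification-style structured output.
    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )

    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # a small fraction of the 7B weights
    ```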

    Evaluation

    Hold out 10% of your dataset. After training, run the model on the held-out set and measure:

    Primary metric: Correct primary category assignment (exact match)

    Secondary metrics:

    • Tag precision (fraction of assigned tags that are correct)
    • Tag recall (fraction of correct tags that were assigned)
    • JSON validity (100% of outputs should be parseable)
    • Attribute accuracy (individual field accuracy)

    Typical results with a well-constructed 2,000+ example dataset: 88-94% correct primary category on held-out set.
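
    These metrics are simple to compute once you have the raw model outputs and the gold labels side by side. A minimal sketch — `evaluate` is an illustrative helper, and it scores unparseable outputs against validity, category accuracy, and tag recall:

    ```python
    import json

    def evaluate(pred_strings: list[str], gold: list[dict]) -> dict:
        """Score raw model outputs against held-out gold labels."""
        valid = exact_category = 0
        tp = fp = fn = 0
        for raw, ref in zip(pred_strings, gold):
            try:
                pred = json.loads(raw)
            except json.JSONDecodeError:
                fn += len(ref.get("tags", []))  # all gold tags missed
                continue
            valid += 1
            if pred.get("primary_category") == ref.get("primary_category"):
                exact_category += 1
            pred_tags = set(pred.get("tags", []))
            gold_tags = set(ref.get("tags", []))
            tp += len(pred_tags & gold_tags)
            fp += len(pred_tags - gold_tags)
            fn += len(gold_tags - pred_tags)
        n = len(gold)
        return {
            "json_validity": valid / n,
            "primary_category_accuracy": exact_category / n,
            "tag_precision": tp / (tp + fp) if tp + fp else 0.0,
            "tag_recall": tp / (tp + fn) if tp + fn else 0.0,
        }
    ```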

    Integration

    Batch classification pipeline for new product ingestion:

    import csv
    import json
    import re

    import requests

    OLLAMA_URL = 'http://your-ollama-server:11434/api/chat'

    def classify_product(name: str, description: str) -> dict:
        response = requests.post(
            OLLAMA_URL,
            json={
                "model": "product-classifier",
                "messages": [
                    {
                        "role": "user",
                        "content": f"Classify this product:\nName: {name}\nDescription: {description}"
                    }
                ],
                "stream": False
            },
            timeout=120,
        )
        response.raise_for_status()
        content = response.json()['message']['content']

        try:
            return json.loads(content)
        except json.JSONDecodeError:
            # Extract JSON from the response if the model wrapped it in text
            json_match = re.search(r'\{.*\}', content, re.DOTALL)
            if json_match:
                return json.loads(json_match.group())
            raise ValueError(f"Could not parse classification output: {content}")

    # Process new products CSV
    with open('new_products.csv') as f:
        for row in csv.DictReader(f):
            classification = classify_product(row['name'], row['description'])
            # update_catalog is your integration point: push the result
            # to your catalog management system's API
            update_catalog(row['sku'], classification)
    

    Run this as a nightly job on new product imports. Agent review catches the 6-12% that need manual correction.
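
    The review step can be partly automated by routing results: anything whose primary category is not in the live taxonomy, or that looks degenerate, goes to the correction queue instead of straight into the catalog. A minimal sketch — `valid_categories` would come from your store's navigation export:

    ```python
    def route_classification(classification: dict, valid_categories: set[str]) -> str:
        """Return 'auto' if the result can be applied directly,
        'review' if it belongs in the manual correction queue."""
        category = classification.get("primary_category", "")
        if category not in valid_categories:
            return "review"  # model invented or misspelled a category path
        if not classification.get("tags"):
            return "review"  # an empty tag list usually signals a weak output
        return "auto"
    ```

    Corrections made in the review queue become next quarter's retraining examples, which is how accuracy improves over time.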

    Retainer Structure for This Use Case

    The retainer for catalog classification is justified by:

    1. New products arrive continuously → model processes them automatically
    2. Taxonomy changes (new categories, restructured navigation) → model needs retraining
    3. Accuracy monitoring → catching classification drift before it pollutes your catalog

    Retainer package: $300-500/month

    • Includes: Monthly batch processing of new products, quarterly retraining with new examples, accuracy monitoring dashboard, corrections pipeline for agent feedback

    Ship AI that runs on your users' devices.

    Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
