
# Shopify AI Assistant Without OpenAI API Costs: The Local Model Approach
Shopify stores spending $500-5,000/month on AI API costs can replace those calls with a local fine-tuned model. Here's the architecture, the Shopify integration, and the cost math.
If your Shopify store has AI features — product Q&A, support chat, size recommendations, search — your API invoice grows with every user. At 50,000 monthly sessions with 3 AI interactions each, you are making 150,000 API calls per month. At $0.01-0.03 per call, that is $1,500-4,500/month and climbing.
The alternative: a fine-tuned model that runs on your own server. One flat monthly infrastructure cost regardless of volume. Better performance on your store's specific catalog and policies. No per-token charges.
## The Cost Reality
| Metric | OpenAI API (GPT-4o) | Local Fine-Tuned Model |
|---|---|---|
| 150,000 API calls/month | $1,500-4,500/month | — |
| Infrastructure (VPS) | — | $20-40/month |
| Fine-tuning + setup | — | $8,000-12,000 (one-time) |
| Accuracy on brand-specific questions | 65-75% | 85-93% |
| Payback period | — | 3-6 months |
For a store doing 150,000 AI interactions/month, the model pays back its development cost in roughly 3-6 months, then saves $1,500-4,500/month indefinitely.
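The payback arithmetic can be sketched directly. This uses midpoints of the ranges in the table above; plug in your own volume and per-call price:

```javascript
// Break-even sketch using midpoints of the figures in the table above
const callsPerMonth = 150000;
const costPerCall = 0.02;    // midpoint of $0.01-0.03 per API call
const setupCost = 10000;     // midpoint of the $8,000-12,000 one-time cost
const infraPerMonth = 30;    // midpoint of the $20-40/month VPS cost

const apiCostPerMonth = callsPerMonth * costPerCall;    // $3,000/month
const monthlySavings = apiCostPerMonth - infraPerMonth; // $2,970/month
const paybackMonths = setupCost / monthlySavings;       // ~3.4 months
```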
## Architecture Overview

```
Shopify Storefront
        ↓
Shopify Theme JS (chat widget / product page component)
        ↓ HTTP POST
Your API Gateway (Node.js / Python FastAPI — can host on Railway, Fly.io, or your VPS)
        ↓
Ollama API (OpenAI-compatible) — running your fine-tuned GGUF model
        ↓
Response returned to storefront
```
The critical piece: Ollama exposes an OpenAI-compatible API. Any code written for OpenAI's SDK can point to your Ollama endpoint with a one-line change.
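To make that concrete, here is a minimal sketch. Ollama's OpenAI-compatible endpoints live under `/v1`, so the same chat-completions payload works against either provider; only the base URL (and API key) changes. The helper function is illustrative, not part of any SDK:

```javascript
// Illustrative helper: one request shape, two possible base URLs
const OLLAMA_BASE = 'http://localhost:11434/v1'; // Ollama's OpenAI-compatible prefix
// const OPENAI_BASE = 'https://api.openai.com/v1'; // the hosted equivalent

function buildChatRequest(baseUrl, model, userMessage) {
  return {
    url: `${baseUrl}/chat/completions`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model, // 'shopify-assistant' once the `ollama create` step is done
        messages: [{ role: 'user', content: userMessage }]
      })
    }
  };
}

const req = buildChatRequest(OLLAMA_BASE, 'shopify-assistant', 'Do you ship to Canada?');
// fetch(req.url, req.options), then read choices[0].message.content
```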
## Step 1: Prepare Your Store Data

- **Product catalog:** Export all product data from Shopify Admin → Products → Export CSV. Clean and convert to JSON. Each product becomes training context.
- **Support history:** If you use Gorgias or Zendesk, export resolved tickets from the last 12 months. These become (question, answer) training pairs.
- **Policies:** Write out your return policy, shipping policy, size guide, and warranty terms as structured text. These go in the system message.
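A minimal sketch of the catalog-cleaning step, assuming the standard Shopify export column names (`Title`, `Body (HTML)`, `Tags`); verify these against your own CSV before running it:

```javascript
// Turn one parsed row from the Shopify product CSV export into a compact
// JSON context object. The column names follow Shopify's standard export.
function productRowToContext(row) {
  return {
    title: row['Title'],
    // Strip HTML tags from the product body for cleaner training text
    description: (row['Body (HTML)'] || '').replace(/<[^>]+>/g, '').trim(),
    // Shopify exports tags as one comma-separated string
    tags: (row['Tags'] || '').split(',').map((t) => t.trim()).filter(Boolean)
  };
}

// Usage with a hypothetical row (after CSV parsing)
const ctx = productRowToContext({
  'Title': 'Trail Shoe',
  'Body (HTML)': '<p>Waterproof <b>runner</b></p>',
  'Tags': 'shoes, waterproof'
});
```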
## Step 2: Train With Ertas
Construct your JSONL dataset from the product Q&A pairs and support history. Add examples covering:
- Product questions ("Does this come in size XL?", "Is this waterproof?")
- Shipping questions ("How long will delivery take?", "Do you ship internationally?")
- Return questions ("Can I return this?", "How do I start a return?")
- Size/fit questions (using your actual size guide data)
Upload to Ertas, validate, train. For a typical Shopify store, the dataset is 800-2,000 examples. Training takes 30-60 minutes.
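The dataset construction can be sketched like this. The chat-style JSONL schema shown is a common fine-tuning format; confirm the exact shape Ertas expects before uploading:

```javascript
// Emit one JSONL line per (question, answer) pair, with the store's
// system message attached. The schema is a common chat fine-tuning
// format, not necessarily Ertas's exact one.
const SYSTEM = 'You are the AI shopping assistant for [Brand Store].';

function toJsonlLine(question, answer) {
  return JSON.stringify({
    messages: [
      { role: 'system', content: SYSTEM },
      { role: 'user', content: question },
      { role: 'assistant', content: answer }
    ]
  });
}

// Usage with two hypothetical pairs from support history
const jsonl = [
  ['Does this come in size XL?', 'Yes, XL is in stock in black and navy.'],
  ['Do you ship internationally?', 'We ship to the US, Canada, and the EU.']
].map(([q, a]) => toJsonlLine(q, a)).join('\n');
```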
## Step 3: Deploy the Model on Ollama
On your VPS (Hetzner CX22, DigitalOcean Droplet, or similar):
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Download your fine-tuned GGUF from Ertas exports, then create a Modelfile
cat > Modelfile << 'EOF'
FROM ./your-shopify-model.gguf
SYSTEM "You are the AI shopping assistant for [Brand Store]. Help customers with product questions, sizing, shipping, and returns based on our current catalog and policies."
EOF

# Register the model with Ollama
ollama create shopify-assistant -f Modelfile

# Verify it works
ollama run shopify-assistant "Do you have blue sneakers in size 10?"
```
## Step 4: Build the API Gateway
A simple Express.js API that forwards requests from your Shopify theme to Ollama:
```javascript
// server.js — requires Node 18+ for the built-in fetch
import express from 'express';
import cors from 'cors';

const app = express();
app.use(express.json());
// Only accept requests from your storefront
app.use(cors({ origin: 'https://your-store.myshopify.com' }));

app.post('/chat', async (req, res) => {
  const { message, productContext } = req.body;

  // Prepend product context when the customer is on a product page
  const prompt = productContext
    ? `Customer is viewing: ${productContext.title} (${productContext.description}). Question: ${message}`
    : message;

  try {
    const response = await fetch('http://localhost:11434/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'shopify-assistant',
        messages: [{ role: 'user', content: prompt }],
        stream: false
      })
    });
    if (!response.ok) throw new Error(`Ollama returned ${response.status}`);
    const data = await response.json();
    res.json({ response: data.message.content });
  } catch (error) {
    res.status(500).json({ error: 'Assistant unavailable' });
  }
});

app.listen(3000);
```
Deploy this on the same VPS or on Railway/Fly.io for managed hosting.
## Step 5: Add to Your Shopify Theme
In your theme's product.liquid or in a custom section:
```javascript
// chat-widget.js
class ShopifyAIChat {
  constructor(apiUrl, productHandle) {
    this.apiUrl = apiUrl;
    this.productHandle = productHandle;
    this.productContext = window.ShopifyProductData || null; // Set via Liquid
  }

  async ask(question) {
    const response = await fetch(`${this.apiUrl}/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        message: question,
        productContext: this.productContext
      })
    });
    const data = await response.json();
    return data.response;
  }
}

// Initialize in a Liquid template (not a static .js asset),
// otherwise {{ product.handle }} will not be rendered
const chat = new ShopifyAIChat('https://your-api.fly.dev', '{{ product.handle }}');
```
Pass product context via Liquid. Note that the `json` filter already emits quoted, escaped values, so don't wrap it in extra quotes:

```liquid
<!-- In your product template -->
<script>
  window.ShopifyProductData = {
    title: {{ product.title | json }},
    description: {{ product.description | strip_html | json }},
    tags: {{ product.tags | json }},
    variants: {{ product.variants | json }}
  };
</script>
```
## Maintenance: Keep the Model Current
Your store changes. New products, updated policies, seasonal inventory. The model needs to reflect these changes.
Quarterly update process:
- Export new tickets from last 3 months
- Review new product additions (add Q&A pairs for new product lines)
- Update policy sections if changed
- Retrain with expanded dataset in Ertas
- Download new GGUF, deploy via Ollama update
Most stores need quarterly retraining. High-change stores (active promotions, frequent new product launches) may need monthly.
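The retraining merge can be sketched as a dedupe over the combined JSONL. The helper is illustrative and assumes the chat-format lines described in Step 2; deduplicating on the user question keeps repeated tickets from over-weighting the dataset:

```javascript
// Append a quarter's new (question, answer) lines to the existing dataset,
// deduplicating on the normalized user question. Existing lines win ties.
function mergeDatasets(existingLines, newLines) {
  const seen = new Set();
  const merged = [];
  for (const line of [...existingLines, ...newLines]) {
    const userMsg = JSON.parse(line).messages.find((m) => m.role === 'user');
    const key = userMsg.content.toLowerCase().trim();
    if (seen.has(key)) continue; // skip repeated tickets
    seen.add(key);
    merged.push(line);
  }
  return merged;
}
```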
Ship AI that runs on your users' devices.
Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
## Further Reading
- E-Commerce AI Agency Opportunity — The full e-commerce vertical overview
- E-Commerce Customer Service Fine-Tuned AI — Support automation walkthrough
- Product Catalog AI Classification — Categorizing products automatically
- Bootstrap AI SaaS Without API Costs — The economics of local model inference