
Bubble No-Code App + Local AI: Ship AI Features Without API Bills
Bubble's OpenAI plugin and API connector incur per-token costs at scale. Here's how to replace them with a fine-tuned local model using Ollama's OpenAI-compatible API.
Bubble is where serious no-code apps get built. Multi-sided marketplaces, complex SaaS tools, workflow automation products — Bubble handles it all. When you add AI features to a Bubble app, you are typically using the OpenAI plugin or a custom API connector to call Claude or GPT-4.
These work. The problem is what they cost at scale, and Bubble apps at scale can hit that problem faster than you expect because Bubble workflows trigger on events — not just user actions.
How Bubble Apps Use AI Today
There are three common patterns for AI in Bubble:
The OpenAI Plugin is the easiest entry point. Install it, enter your API key, and you can call GPT models from any Bubble workflow. Each plugin call is a direct OpenAI API request with per-token billing.
The API Connector lets you call any REST API from Bubble workflows. Builders who need more control (custom headers, specific models, streaming) set up a connector to the OpenAI or Anthropic API directly. Still per-token.
Backend workflow triggers are where Bubble's AI costs become non-obvious. Bubble workflows fire on: new record creation, scheduled triggers, user actions, webhook receipts, database changes. If any of these triggers an AI call, the cost accumulates based on trigger frequency, not just user session count.
The Bubble AI Cost Problem
A real-world example: a Bubble CRM app that uses AI to auto-generate follow-up email drafts when a new lead is created. The workflow fires on "New Lead Created" and calls OpenAI to generate a draft.
At 50 new leads per day: 50 calls × 700 tokens = 35,000 tokens/day = 1,050,000 tokens/month = roughly $2-30/month depending on the model (see the table below).
Scale the business to 500 leads/day: $20-300/month. Scale to 5,000 leads/day: $200-3,000/month.
Now add a second AI workflow (qualification scoring on lead entry), a third (meeting summary generation when a note is added), and a fourth (weekly report generation). Each one stacks another per-trigger cost on top.
| Workflows | Triggers/Day | Monthly Cost (gpt-4o-mini) | Monthly Cost (gpt-4o) |
|---|---|---|---|
| 1 AI workflow, 50 triggers | 50 | ~$2 | ~$30 |
| 1 AI workflow, 500 triggers | 500 | ~$20 | ~$300 |
| 4 AI workflows, 500 triggers | 2,000 | ~$80 | ~$1,200 |
| 4 AI workflows, 5,000 triggers | 20,000 | ~$800 | ~$12,000 |
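The arithmetic behind the table is simple enough to sanity-check yourself. A minimal sketch — the 700 tokens per call comes from the example above, but the blended price per million tokens is an assumption you should replace with your actual model's rates:

```python
def monthly_api_cost(triggers_per_day, tokens_per_call, usd_per_million_tokens, days=30):
    """Estimate monthly API spend for a trigger-driven Bubble workflow."""
    monthly_tokens = triggers_per_day * tokens_per_call * days
    return monthly_tokens / 1_000_000 * usd_per_million_tokens

# 50 leads/day at ~700 tokens per draft, at a hypothetical blended $2/1M tokens
print(monthly_api_cost(50, 700, 2.0))   # → 2.1
```

Because cost scales linearly with triggers, a 10× growth in leads is a 10× growth in the bill — which is exactly what makes event-triggered workflows the dangerous case.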
Why Bubble Builders Have an Advantage
Here is the good news specific to Bubble: your app already calls external APIs. Switching from OpenAI's API to a local model's API is a configuration change, not an architecture change.
Bubble's API Connector is generic — it calls any REST endpoint you configure. Ollama (the tool that serves your fine-tuned GGUF model locally) exposes an OpenAI-compatible REST API. Your Bubble AI calls can be redirected from api.openai.com to your Ollama VPS by updating one URL in your API Connector settings.
No code change. No workflow rebuild. One URL swap.
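To see why it is only a URL swap, compare the two requests directly. A sketch, assuming Ollama is running on your VPS and your model is registered under the hypothetical name `your-fine-tuned-model` — the path, headers, and JSON body are identical to what you already send to OpenAI:

```shell
# Before: Bubble's API Connector posts to
#   https://api.openai.com/v1/chat/completions
# After: same path, same body, new root URL
curl http://your-vps-ip:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-fine-tuned-model",
    "messages": [{"role": "user", "content": "Draft a follow-up email."}],
    "temperature": 0.1
  }'
```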
Architecture: Bubble → OpenAI-Compatible Local API
```
Bubble Workflow
   ↓  (API Connector call)
Your VPS (e.g., Hetzner, $14-26/mo)
   └── Ollama (serving fine-tuned GGUF)
         └── OpenAI-compatible endpoint: http://your-vps:11434/v1
   ↓  (response)
Bubble continues workflow with AI output
```
Setting Up the Ollama API Connector in Bubble
1. Create the API Connector in Bubble:
   - Add a new API: "LocalAI"
   - API Root URL: `http://your-vps-ip:11434`
2. Add the chat completions call:
   - Method: POST
   - Path: `/v1/chat/completions`
   - Headers: `Content-Type: application/json`
   - Body (JSON):

```json
{
  "model": "your-fine-tuned-model",
  "messages": [
    {"role": "system", "content": "Your system prompt"},
    {"role": "user", "content": "<dynamic_input>"}
  ],
  "temperature": 0.1
}
```

3. Map the response:
   - Extract: `choices[0].message.content`
This is the same structure as the OpenAI API. If you already have an OpenAI connector in Bubble, duplicate it and change the URL — 10-15 minutes of work.
Fine-Tuning for Bubble Use Cases
Bubble apps commonly use AI for these tasks — all excellent fine-tuning candidates:
Classification and scoring: Lead qualification, ticket routing, content moderation, sentiment classification. These are the highest-ROI fine-tuning tasks: a 7B model trained on 400 labeled examples achieves 90-94% accuracy, at zero marginal cost per classification.
Content generation with templates: Follow-up emails, meeting summaries, report generation, product descriptions. Fine-tuning captures your specific format, tone, and domain vocabulary. Output consistency improves dramatically vs generic prompting.
Data extraction: Pulling structured data from unstructured text inputs (contact forms, support emails, document uploads). Fine-tuning on (input, JSON output) pairs produces highly consistent structured extraction.
Text transformation: Summarization, reformatting, translation within a domain. For tasks with consistent input/output patterns, fine-tuned models match GPT-4 quality.
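For the classification and scoring case, training pairs are especially short. A hypothetical pair of lead-qualification examples in the JSONL format used below (labels and field contents are illustrative, not from a real dataset):

```json
{"instruction": "Classify this lead as hot, warm, or cold:", "input": "Enterprise inquiry, 500 seats, wants a demo this week", "output": "hot"}
{"instruction": "Classify this lead as hot, warm, or cold:", "input": "Student asking about the free tier", "output": "cold"}
```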
Step-by-Step Migration
Step 1: Identify your highest-cost AI workflows. Check your OpenAI usage dashboard. Which workflow generates the most tokens? That is your first migration candidate.
Step 2: Export training data from Bubble's database. Your AI outputs are likely stored in your Bubble database (or should be). Export 400-800 input/output pairs as CSV, convert to JSONL:
```json
{"instruction": "Generate a follow-up email for this lead:", "input": "Name: John Smith, Company: Acme, Inquiry: pricing for enterprise", "output": "Hi John, thank you for your interest in our enterprise plan..."}
```
Step 3: Fine-tune in Ertas. Upload JSONL, select base model (Qwen 2.5 7B for most Bubble use cases), train, export GGUF.
Step 4: Deploy Ollama on a VPS. Hetzner CX32 ($14/month) handles a 7B model fine-tuned for Bubble's typical workflow patterns (short inputs, structured outputs). Load your GGUF file, start Ollama.
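Loading the GGUF into Ollama takes a few commands. A sketch, assuming the exported file is named `model.gguf` and you register it under the hypothetical name `your-fine-tuned-model`:

```shell
# Modelfile tells Ollama which weights to serve (filename is an assumption)
cat > Modelfile <<'EOF'
FROM ./model.gguf
PARAMETER temperature 0.1
EOF

ollama create your-fine-tuned-model -f Modelfile   # register the model
ollama run your-fine-tuned-model "Say hello"       # quick smoke test

# Ollama listens on port 11434 by default; bind to all interfaces
# so Bubble can reach your VPS IP:
OLLAMA_HOST=0.0.0.0 ollama serve
```

In production, put the `OLLAMA_HOST` setting in a systemd unit or equivalent so the server survives reboots, and restrict port 11434 with a firewall rule so only your Bubble app's requests reach it.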
Step 5: Update Bubble API Connector. Change the API root URL. Test with a sample workflow. Deploy.
Cost After Migration
| Daily AI Triggers | Monthly Cost (gpt-4o-mini) | Monthly Cost (Local Fine-Tuned) |
|---|---|---|
| 500 | ~$20 | $40.50 |
| 2,000 | ~$80 | $40.50 |
| 10,000 | ~$400 | $40.50 |
| 50,000 | ~$2,000 | $40.50-66.50 |
Break-even against gpt-4o-mini: around 1,000 daily triggers (per the table above, ~$40/month of API spend). Against gpt-4o: under 100 daily triggers.
Ship AI that runs on your users' devices.
Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
Further Reading
- Vibecoder AI Cost Guide: All Platforms — Every major platform's cost cliff
- Make.com + Local AI — Same approach for Make.com automation workflows
- n8n + Ollama Fine-Tuned Zero-Cost Stack — Self-hosted automation with local models
- Fine-Tune AI Without Code — How the Ertas training workflow works
- Running AI Models Locally — Setting up Ollama on a VPS