    Bubble No-Code App + Local AI: Ship AI Features Without API Bills


    Bubble's OpenAI plugin and API connector generate per-token costs at scale. Here's how to replace them with a fine-tuned local model using Ollama's OpenAI-compatible API.

Ertas Team

    Bubble is where serious no-code apps get built. Multi-sided marketplaces, complex SaaS tools, workflow automation products — Bubble handles it all. When you add AI features to a Bubble app, you are typically using the OpenAI plugin or a custom API connector to call Claude or GPT-4.

    These work. The problem is what they cost at scale, and Bubble apps at scale can hit that problem faster than you expect because Bubble workflows trigger on events — not just user actions.

    How Bubble Apps Use AI Today

    There are three common patterns for AI in Bubble:

    The OpenAI Plugin is the easiest entry point. Install it, enter your API key, and you can call GPT models from any Bubble workflow. Each plugin call is a direct OpenAI API request with per-token billing.

    The API Connector lets you call any REST API from Bubble workflows. Builders who need more control (custom headers, specific models, streaming) set up a connector to the OpenAI or Anthropic API directly. Still per-token.

    Backend workflow triggers are where Bubble's AI costs become non-obvious. Bubble workflows fire on: new record creation, scheduled triggers, user actions, webhook receipts, database changes. If any of these triggers an AI call, the cost accumulates based on trigger frequency, not just user session count.

    The Bubble AI Cost Problem

    A real-world example: a Bubble CRM app that uses AI to auto-generate follow-up email drafts when a new lead is created. The workflow fires on "New Lead Created" and calls OpenAI to generate a draft.

    At 50 new leads per day: 50 calls × 700 tokens = 35,000 tokens/day = 1,050,000 tokens/month = ~$2-20/month (cost depends on model).

    Scale the business to 500 leads/day: $20-200/month. Scale to 5,000 leads/day: $200-2,000/month.
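The arithmetic above is easy to check. A minimal cost model, using the article's assumed 700 tokens per call and a blended per-million-token price (both illustrative, not measured values):

```python
# Rough monthly cost model for one event-triggered AI workflow.
# Token count per call and prices are illustrative assumptions.
TOKENS_PER_CALL = 700   # prompt + completion, assumed average
DAYS_PER_MONTH = 30

def monthly_tokens(calls_per_day: int) -> int:
    """Total tokens one workflow generates per month."""
    return calls_per_day * TOKENS_PER_CALL * DAYS_PER_MONTH

def monthly_cost(calls_per_day: int, usd_per_million_tokens: float) -> float:
    """Monthly USD cost at a flat blended per-token price."""
    return monthly_tokens(calls_per_day) / 1_000_000 * usd_per_million_tokens

print(monthly_tokens(50))                # 1050000 tokens/month at 50 leads/day
print(round(monthly_cost(500, 2.0), 2))  # 21.0 — ~$21/mo at a $2/1M blended rate
```

Note that the cost scales linearly with trigger volume, which is exactly why event-driven workflows surprise people: the trigger count grows with the business, not with the team.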

Now add a second AI workflow (qualification scoring on lead entry), a third (meeting summary generation when a note is added), and a fourth (weekly report generation). Each one stacks its own token bill on top.

| Workflows | Triggers/Day | Monthly Cost (gpt-4o-mini) | Monthly Cost (gpt-4o) |
| --- | --- | --- | --- |
| 1 AI workflow, 50 triggers | 50 | ~$2 | ~$30 |
| 1 AI workflow, 500 triggers | 500 | ~$20 | ~$300 |
| 4 AI workflows, 500 triggers each | 2,000 | ~$80 | ~$1,200 |
| 4 AI workflows, 5,000 triggers each | 20,000 | ~$800 | ~$12,000 |

    Why Bubble Builders Have an Advantage

    Here is the good news specific to Bubble: your app already calls external APIs. Switching from OpenAI's API to a local model's API is a configuration change, not an architecture change.

    Bubble's API Connector is generic — it calls any REST endpoint you configure. Ollama (the tool that serves your fine-tuned GGUF model locally) exposes an OpenAI-compatible REST API. Your Bubble AI calls can be redirected from api.openai.com to your Ollama VPS by updating one URL in your API Connector settings.

    No code change. No workflow rebuild. One URL swap.
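The "one URL swap" claim can be demonstrated outside Bubble. The sketch below (hypothetical helper name, standard library only) builds the request an API Connector call sends; notice that only the base URL differs between the hosted and local setups:

```python
import json

# Only the base URL changes between hosted and local setups;
# the request body is identical (OpenAI-compatible schema).
OPENAI_BASE = "https://api.openai.com"
OLLAMA_BASE = "http://your-vps-ip:11434"  # assumption: your Ollama VPS address

def chat_request(base_url: str, model: str, system: str, user: str):
    """Return the (url, body) pair a Bubble API Connector call would send."""
    url = f"{base_url}/v1/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": 0.1,
    })
    return url, body

# Same payload, different host — this is the whole migration:
hosted = chat_request(OPENAI_BASE, "gpt-4o-mini", "Your system prompt", "Hi")
local = chat_request(OLLAMA_BASE, "your-fine-tuned-model", "Your system prompt", "Hi")
```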

    Architecture: Bubble → OpenAI-Compatible Local API

    Bubble Workflow
         ↓ (API Connector call)
    Your VPS (e.g., Hetzner, $14-26/mo)
         └── Ollama (serving fine-tuned GGUF)
              └── OpenAI-compatible endpoint: http://your-vps:11434/v1
         ↓ (response)
    Bubble continues workflow with AI output
    

    Setting Up the Ollama API Connector in Bubble

    1. Create the API Connector in Bubble:

      • API Root URL: http://your-vps-ip:11434
      • Add a new API: "LocalAI"
    2. Add the chat completions call:

      • Method: POST
      • Path: /v1/chat/completions
      • Headers: Content-Type: application/json
      • Body (JSON):
      {
        "model": "your-fine-tuned-model",
        "messages": [
          {"role": "system", "content": "Your system prompt"},
          {"role": "user", "content": "<dynamic_input>"}
        ],
        "temperature": 0.1
      }
      
    3. Map the response:

      • Extract: choices[0].message.content

    This is the same structure as the OpenAI API. If you already have an OpenAI connector in Bubble, you are duplicating it and changing the URL — 10-15 minutes of work.
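Before pointing Bubble at the new endpoint, it helps to confirm the response shape you are mapping. Ollama's `/v1/chat/completions` returns the same envelope as OpenAI, so the extraction path is identical (the sample response below is abbreviated):

```python
# The response envelope matches OpenAI's, so Bubble's data mapping
# (choices[0].message.content) works unchanged. Sample is abbreviated.
sample_response = {
    "model": "your-fine-tuned-model",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi John, thank you..."},
            "finish_reason": "stop",
        }
    ],
}

def extract_content(response: dict) -> str:
    """Mirror Bubble's response mapping: choices[0].message.content."""
    return response["choices"][0]["message"]["content"]

print(extract_content(sample_response))  # Hi John, thank you...
```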

    Fine-Tuning for Bubble Use Cases

    Bubble apps commonly use AI for these tasks — all excellent fine-tuning candidates:

    Classification and scoring: Lead qualification, ticket routing, content moderation, sentiment classification. These are the highest-ROI fine-tuning tasks: a 7B model trained on 400 labeled examples achieves 90-94% accuracy, at zero marginal cost per classification.

    Content generation with templates: Follow-up emails, meeting summaries, report generation, product descriptions. Fine-tuning captures your specific format, tone, and domain vocabulary. Output consistency improves dramatically vs generic prompting.

    Data extraction: Pulling structured data from unstructured text inputs (contact forms, support emails, document uploads). Fine-tuning on (input, JSON output) pairs produces highly consistent structured extraction.

    Text transformation: Summarization, reformatting, translation within a domain. For tasks with consistent input/output patterns, fine-tuned models match GPT-4 quality.

    Step-by-Step Migration

    Step 1: Identify your highest-cost AI workflows. Check your OpenAI usage dashboard. Which workflow generates the most tokens? That is your first migration candidate.

    Step 2: Export training data from Bubble's database. Your AI outputs are likely stored in your Bubble database (or should be). Export 400-800 input/output pairs as CSV, convert to JSONL:

    {"instruction": "Generate a follow-up email for this lead:", "input": "Name: John Smith, Company: Acme, Inquiry: pricing for enterprise", "output": "Hi John, thank you for your interest in our enterprise plan..."}
    

    Step 3: Fine-tune in Ertas. Upload JSONL, select base model (Qwen 2.5 7B for most Bubble use cases), train, export GGUF.

    Step 4: Deploy Ollama on a VPS. Hetzner CX32 ($14/month) handles a 7B model fine-tuned for Bubble's typical workflow patterns (short inputs, structured outputs). Load your GGUF file, start Ollama.
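Loading the GGUF into Ollama takes a short Modelfile (the file name below is a placeholder):

```
FROM ./your-fine-tuned-model.gguf
PARAMETER temperature 0.1
```

Register it with `ollama create your-fine-tuned-model -f Modelfile`; Ollama then serves it on port 11434, where the `/v1` endpoints live.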

    Step 5: Update Bubble API Connector. Change the API root URL. Test with a sample workflow. Deploy.

    Cost After Migration

| Daily AI Triggers | Monthly Cost (gpt-4o-mini) | Monthly Cost (Local Fine-Tuned) |
| --- | --- | --- |
| 500 | ~$20 | $40.50 |
| 2,000 | ~$80 | $40.50 |
| 10,000 | ~$400 | $40.50 |
| 50,000 | ~$2,000 | $40.50-66.50 |

Break-even against gpt-4o-mini: around 1,000 daily triggers. Against gpt-4o: under 100 daily triggers.
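The break-even points follow directly from the per-trigger prices implied by the table above (the flat $40.50 is the article's VPS-plus-subscription figure):

```python
# Per-daily-trigger monthly prices implied by the cost table above.
LOCAL_MONTHLY = 40.50        # flat local cost, USD/month
GPT4O_MINI = 20.0 / 500      # ~$20/mo at 500 triggers/day
GPT4O = 300.0 / 500          # ~$300/mo at 500 triggers/day

def break_even_triggers(api_cost_per_daily_trigger: float) -> int:
    """Daily trigger count where flat local cost matches API billing."""
    return round(LOCAL_MONTHLY / api_cost_per_daily_trigger)

print(break_even_triggers(GPT4O_MINI))  # ~1,000 daily triggers vs gpt-4o-mini
print(break_even_triggers(GPT4O))       # ~70 daily triggers vs gpt-4o
```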


    Ship AI that runs on your users' devices.

    Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
