On-Device AI Unit Economics: The Math That Makes Mobile AI Profitable

Cloud AI has variable costs. Every user, every request costs money. On-device AI has fixed costs. Fine-tune once, distribute once, run free forever. The financial structures are fundamentally different, and the implications for mobile app businesses are significant.

This article breaks down the complete cost model for both approaches.

Cloud API Cost Structure

Variable Costs (Scale with Users)

Cost Component	Per-User Monthly	At 10K MAU	At 100K MAU
API tokens (GPT-4o-mini)	$0.05-0.10	$500-1,000	$5,000-10,000
API tokens (Gemini Flash)	$0.03-0.06	$300-600	$3,000-6,000
Server infrastructure (proxy/queue)	$0.01-0.02	$100-200	$1,000-2,000
Total variable (GPT-4o-mini + infrastructure)	$0.06-0.12	$600-1,200	$6,000-12,000

Fixed Costs (Do Not Scale)

Cost Component	Monthly
Developer time (prompt engineering, maintenance)	$2,000-5,000
Monitoring and logging	$50-200
Total fixed	$2,050-5,200

Total Cloud AI Cost

At 10K MAU: $2,650-6,400/month
At 100K MAU: $8,050-17,200/month

The variable component dominates at scale. At 100K MAU, variable costs are roughly 70-75% of total AI spend ($6,000 of $8,050 at the low end, $12,000 of $17,200 at the high end).
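As a rough sketch, the totals above follow from one line of arithmetic: fixed cost plus MAU times per-user variable cost. The function and dictionary names below are illustrative, and the inputs are simply the low and high ends of the tables above (GPT-4o-mini tokens plus proxy infrastructure).

```python
def cloud_monthly_cost(mau, per_user, fixed):
    """Total monthly cloud AI cost: fixed overhead plus per-user variable cost."""
    return fixed + mau * per_user


def variable_share(mau, per_user, fixed):
    """Fraction of the monthly total that scales with user count."""
    variable = mau * per_user
    return variable / (fixed + variable)


# Low and high ends of the cost tables above (illustrative names).
LOW = {"per_user": 0.06, "fixed": 2050}
HIGH = {"per_user": 0.12, "fixed": 5200}

for mau in (10_000, 100_000):
    low = cloud_monthly_cost(mau, **LOW)
    high = cloud_monthly_cost(mau, **HIGH)
    print(f"{mau:>7,} MAU: ${low:,.0f}-${high:,.0f}/month, "
          f"variable share {variable_share(mau, **HIGH):.0%}-"
          f"{variable_share(mau, **LOW):.0%}")
```

Swapping in your own per-user figure from a billing dashboard is the only change needed to reuse this for a different provider.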

On-Device Cost Structure

One-Time Costs

Cost Component	Amount	Frequency
Training data preparation	$500-2,000 (developer time)	Once, then incremental
Fine-tuning compute	$5-50	Per training run
llama.cpp integration	$1,000-3,000 (developer time)	Once
Testing across devices	$500-1,500 (developer time)	Per model update
Total one-time	$2,005-6,550

Recurring Fixed Costs

Cost Component	Monthly
CDN for model distribution	$50-200 (at 100K downloads/month)
Model re-training (quarterly)	$5-50 per run = $2-17/month amortized
Developer maintenance	$500-1,000
Total recurring	$552-1,217

Variable Costs

Cost Component	Per-User Monthly
CDN bandwidth per new user	~$0.08-0.15 (one-time model download)
Per-inference cost	$0.00
Total variable	~$0.00 (after initial download)

Total On-Device Cost

At 10K MAU: $552-1,217/month + amortized one-time costs
At 100K MAU: $552-1,217/month + amortized one-time costs

The cost is nearly flat regardless of user count. The CDN cost increases slightly with new user downloads but is minimal compared to API token costs.
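To make "amortized one-time costs" concrete, here is a minimal sketch that spreads the one-time total over a fixed number of months. The 12-month amortization window is an assumption for illustration, not a figure from the tables; the recurring and one-time inputs are the low and high totals above.

```python
def on_device_monthly_cost(recurring, one_time, amortize_months=12):
    """Recurring monthly cost plus one-time costs spread over amortize_months.

    amortize_months=12 is an assumed window; pick whatever matches your planning.
    """
    return recurring + one_time / amortize_months


# Low and high totals from the on-device tables above (illustrative names).
LOW = {"recurring": 552, "one_time": 2005}
HIGH = {"recurring": 1217, "one_time": 6550}

for mau in (10_000, 100_000):
    low = on_device_monthly_cost(**LOW)
    high = on_device_monthly_cost(**HIGH)
    # The monthly total does not depend on MAU, so per-user cost falls as MAU grows.
    print(f"{mau:>7,} MAU: ${low:,.0f}-${high:,.0f}/month, "
          f"${low / mau:.4f}-${high / mau:.4f} per user")
```

Because the monthly total is independent of MAU in this model, the per-user figure shrinks tenfold between the two rows.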

Break-Even Analysis

When does on-device become cheaper than cloud APIs?

vs GPT-4o-mini

MAU	Cloud Monthly	On-Device Monthly	Savings
500	$2,680	$1,052	$1,628 (61%)
1,000	$2,750	$1,052	$1,698 (62%)
5,000	$3,150	$1,052	$2,098 (67%)
10,000	$3,650	$1,102	$2,548 (70%)
50,000	$7,550	$1,152	$6,398 (85%)
100,000	$12,550	$1,217	$11,333 (90%)

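Break-even: Under 500 MAU on a monthly basis. The table compares recurring monthly costs, and on that basis on-device is cheaper from essentially the first month. The fine-tuning compute itself ($5-50) is lower than even a single month of cloud API costs at any meaningful user count; the full one-time investment ($2,005-6,550, mostly developer time) is covered under The Investment Payback.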

vs Gemini Flash (Cheapest Cloud API)

MAU	Cloud Monthly	On-Device Monthly	Savings
1,000	$2,380	$1,052	$1,328 (56%)
10,000	$2,950	$1,102	$1,848 (63%)
100,000	$8,250	$1,217	$7,033 (85%)

Even against the cheapest cloud API, on-device saves money from day one at any non-trivial user count.
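The savings columns in both tables are the difference between the two monthly figures, expressed in dollars and as a share of the cloud cost. A small sketch that recomputes them from the table rows (the function name and row layout are illustrative):

```python
def savings(cloud, on_device):
    """Return (dollars saved per month, fraction of cloud cost saved)."""
    saved = cloud - on_device
    return saved, saved / cloud


# (MAU, cloud monthly, on-device monthly) rows copied from the tables above.
GPT_4O_MINI = [
    (500, 2680, 1052), (1_000, 2750, 1052), (5_000, 3150, 1052),
    (10_000, 3650, 1102), (50_000, 7550, 1152), (100_000, 12550, 1217),
]
GEMINI_FLASH = [(1_000, 2380, 1052), (10_000, 2950, 1102), (100_000, 8250, 1217)]

for label, rows in (("GPT-4o-mini", GPT_4O_MINI), ("Gemini Flash", GEMINI_FLASH)):
    print(label)
    for mau, cloud, device in rows:
        saved, pct = savings(cloud, device)
        print(f"  {mau:>7,} MAU: ${saved:,} ({pct:.0%})")
```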

The Scaling Advantage

The financial advantage of on-device compounds as you grow:

Cloud: Growing from 10K to 100K MAU adds $9,000-10,000/month in variable costs.
On-device: Growing from 10K to 100K MAU adds ~$65-115/month in CDN costs.

This is the core insight. Cloud AI margins compress as you scale. On-device AI margins improve as you scale. The infrastructure cost is distributed across more users, each contributing $0 in variable cost.

Impact on App Business Models

Subscription Apps ($4.99/month)

Model	AI Cost/User	As % of Revenue	Gross Margin Impact
Cloud (GPT-4o-mini)	$0.08	1.6%	-1.6% per user
Cloud (Gemini Flash)	$0.05	1.0%	-1.0% per user
On-device	~$0.01	0.2%	-0.2% per user

On-device reduces AI's margin impact by 5-8x.
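The percentages in the table are per-user AI cost divided by the subscription price. A minimal sketch, using the table's per-user costs and the $4.99 price (the function name is illustrative):

```python
def ai_cost_share(ai_cost_per_user, revenue_per_user):
    """AI cost as a fraction of per-user revenue."""
    return ai_cost_per_user / revenue_per_user


PRICE = 4.99  # monthly subscription price from the heading above
COSTS = {"Cloud (GPT-4o-mini)": 0.08, "Cloud (Gemini Flash)": 0.05, "On-device": 0.01}

for model, cost in COSTS.items():
    print(f"{model}: {ai_cost_share(cost, PRICE):.1%} of revenue")

# Ratio of cloud to on-device cost, which is where the 5-8x figure comes from.
for model in ("Cloud (Gemini Flash)", "Cloud (GPT-4o-mini)"):
    print(f"{model} vs on-device: {COSTS[model] / COSTS['On-device']:.0f}x")
```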

Freemium Apps

Freemium apps are where the difference is starkest. Free users generate cost with zero revenue.

With cloud AI: Every free user costs $0.05-0.10/month in API calls. If 90% of users are free, paying users must cover 10x their own AI costs.

With on-device AI: Free users cost essentially nothing. The model runs on their device. The only cost was the one-time model download (~$0.08-0.15 CDN bandwidth).

This changes the freemium math entirely. You can offer AI features to free users without worrying about cost-per-free-user destroying your margins.
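The "10x" figure for cloud AI comes from dividing the per-user cost by the fraction of users who pay. A short sketch under that reading (the function name is illustrative, and it assumes free and paying users use the AI features equally):

```python
def ai_cost_per_paying_user(cost_per_user, paid_fraction):
    """AI cost each paying user must cover when every user generates cost.

    Assumes free and paying users have the same per-user AI cost.
    """
    if not 0 < paid_fraction <= 1:
        raise ValueError("paid_fraction must be in (0, 1]")
    return cost_per_user / paid_fraction


PAID_FRACTION = 0.10  # 90% of users are free, as in the example above

for cost in (0.05, 0.10):
    burden = ai_cost_per_paying_user(cost, PAID_FRACTION)
    print(f"${cost:.2f}/user -> ${burden:.2f} per paying user "
          f"({burden / cost:.0f}x their own cost)")
```

With on-device inference the recurring per-user cost is close to zero, so the same division yields a burden close to zero regardless of the paid fraction.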

Ad-Supported Apps

Average ad revenue per user: $0.50-2.00/month. Cloud AI at $0.05-0.10/user eats 2.5-20% of ad revenue. On-device AI at ~$0.01/user eats 0.5-2%. The difference can be the margin between a sustainable and unsustainable business.
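Because both the cost and the ad revenue are ranges, the share is a range too: the best case pairs the lowest cost with the highest revenue, and the worst case does the opposite. A minimal sketch of that bound calculation (the function name is illustrative):

```python
def share_range(cost_low, cost_high, revenue_low, revenue_high):
    """Best-case and worst-case AI cost as a fraction of per-user revenue."""
    best = cost_low / revenue_high
    worst = cost_high / revenue_low
    return best, worst


AD_REVENUE = (0.50, 2.00)  # monthly ad revenue per user, from the paragraph above

cloud = share_range(0.05, 0.10, *AD_REVENUE)
on_device = share_range(0.01, 0.01, *AD_REVENUE)

print(f"Cloud:     {cloud[0]:.1%} to {cloud[1]:.1%} of ad revenue")
print(f"On-device: {on_device[0]:.1%} to {on_device[1]:.1%} of ad revenue")
```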

The Investment Payback

Think of on-device AI as a capital investment. The upfront cost ($2,000-6,500 for the full pipeline) pays back quickly:

Cloud Cost Displaced	Payback Period
$500/month	4-13 months
$1,000/month	2-7 months
$3,000/month	About 1-2 months
$10,000/month	Under 1 month

At $3,000/month in cloud API costs (common at 30-50K MAU), the entire on-device investment pays for itself in about two months or less.
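The payback table divides the upfront investment by the monthly cloud cost it displaces. The sketch below reproduces that division and, as an optional extension that is not part of the table, lets you subtract the on-device recurring cost from the monthly saving. The function and parameter names are illustrative.

```python
import math


def payback_months(upfront, cloud_displaced, on_device_recurring=0.0):
    """Months until cumulative savings cover the upfront investment.

    on_device_recurring defaults to 0 to match the table above; passing a
    value gives a more conservative estimate (an extension, not the table's method).
    """
    net_saving = cloud_displaced - on_device_recurring
    if net_saving <= 0:
        return math.inf  # the switch never pays back at this spend level
    return upfront / net_saving


UPFRONT = (2_000, 6_500)  # full-pipeline investment range from the text

for displaced in (500, 1_000, 3_000, 10_000):
    low = payback_months(UPFRONT[0], displaced)
    high = payback_months(UPFRONT[1], displaced)
    print(f"${displaced:>6,}/month displaced: {low:.1f}-{high:.1f} months")
```

Passing a non-zero `on_device_recurring` is worth trying at the low end of the table, where the recurring on-device cost is a large fraction of the cloud spend being displaced.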

Platforms like Ertas reduce the upfront investment by handling the fine-tuning infrastructure. You bring training data. Ertas provides the compute, training pipeline, and GGUF export. The one-time cost drops to the fine-tuning compute ($5-50) plus your time to prepare training data.

What to Model

Before committing to either approach, build a simple spreadsheet:

Current cloud AI cost per user (from your billing dashboard)
Projected user growth (monthly)
Cloud cost curve (cost per user * projected MAU)
On-device fixed cost (fine-tuning + integration + maintenance)
Break-even month (when cumulative cloud costs exceed cumulative on-device costs)

For most mobile apps, the break-even is months, not years. The earlier you make the switch, the more you save over the lifetime of the product.
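The spreadsheet described above can also be sketched in a few lines of code. Everything in the example call is a hypothetical input (starting MAU, growth rate, per-user cost, upfront and monthly on-device costs); replace it with your own numbers. The cloud curve follows the definition in the list: cost per user times projected MAU.

```python
def break_even_month(start_mau, monthly_growth, cloud_cost_per_user,
                     on_device_upfront, on_device_monthly, horizon=60):
    """First month in which cumulative cloud cost exceeds cumulative on-device cost.

    Returns None if that does not happen within `horizon` months.
    """
    cumulative_cloud = 0.0
    cumulative_device = float(on_device_upfront)
    mau = float(start_mau)
    for month in range(1, horizon + 1):
        cumulative_cloud += mau * cloud_cost_per_user
        cumulative_device += on_device_monthly
        if cumulative_cloud > cumulative_device:
            return month
        mau *= 1 + monthly_growth  # compound growth applied after each month
    return None


# Hypothetical inputs: 20K MAU growing 10%/month, $0.08/user cloud cost,
# $4,000 upfront and $800/month for on-device.
month = break_even_month(
    start_mau=20_000,
    monthly_growth=0.10,
    cloud_cost_per_user=0.08,
    on_device_upfront=4_000,
    on_device_monthly=800,
)
print(f"Break-even month: {month}")
```

A `None` result means the cloud bill stays below the on-device cost for the whole horizon, which is the case to watch for at very low user counts.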

