
    What Happens When OpenAI Deprecates the Model Your App Depends On

    Model deprecation is not hypothetical. OpenAI has deprecated 15+ models since 2023. When your app depends on a specific model version, deprecation means a forced migration under a deadline you did not choose.

Ertas Team

On June 6, 2024, OpenAI deprecated gpt-4-32k, gpt-4-vision-preview, and several other models. Apps using them had a fixed deadline to migrate; once it passed, API calls returned errors.

This was not the first deprecation. OpenAI has retired over 15 model versions since 2023: gpt-3.5-turbo-0301, gpt-4-0314, text-davinci-003, code-davinci-002. Each deprecation forced every dependent app to migrate.

    If your mobile app depends on a specific OpenAI model, this will happen to you. The question is not "if" but "when" and "how disruptive."

    The Deprecation Pattern

    OpenAI's model lifecycle follows a pattern:

    1. New model launches with a dated version (e.g., gpt-4o-2024-08-06)
    2. Alias points to latest (e.g., gpt-4o points to the latest dated version)
3. Older dated versions are deprecated with 3-6 months' notice
    4. After the deprecation date, calls to the old model return errors or are automatically redirected to a replacement

    The notice period sounds generous. In practice, it is not.

    Why 3-6 Months Is Not Enough

    A deprecation does not just mean changing a model string in your code. When you switch models, the behavior changes:

    Output format shifts: Your prompts were tuned for a specific model's tendencies. A new model may format JSON differently, use different phrasing, or handle edge cases differently.

    Accuracy changes: A prompt that achieves 90% accuracy on gpt-4-0613 might achieve 85% or 95% on gpt-4o. You do not know which until you test.

    Token usage changes: Different models tokenize differently and generate different-length responses. Your cost model changes.

    Latency changes: Newer models may be faster or slower. Your UX timing assumptions may break.

    Each of these requires testing, prompt re-tuning, and potentially code changes. For a mobile app, this also means a new app release through the App Store review process.
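
The only real defense is a replay harness: before you switch, run every tuned prompt against the candidate model and diff the results against the old one. A minimal sketch in Swift, assuming OpenAI's chat completions endpoint; the prompt, the model pair, and the valid-JSON pass criterion are stand-ins for your own test suite:

import Foundation

struct ChatRequest: Encodable {
    let model: String
    let messages: [[String: String]]
}

// Send one prompt to one model and return the text of the first choice.
func completion(model: String, prompt: String) async throws -> String {
    var request = URLRequest(url: URL(string: "https://api.openai.com/v1/chat/completions")!)
    request.httpMethod = "POST"
    let key = ProcessInfo.processInfo.environment["OPENAI_API_KEY"] ?? ""
    request.setValue("Bearer \(key)", forHTTPHeaderField: "Authorization")
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(
        ChatRequest(model: model, messages: [["role": "user", "content": prompt]]))
    let (data, _) = try await URLSession.shared.data(for: request)
    // Dig choices[0].message.content out of the response envelope.
    let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
    let message = (json?["choices"] as? [[String: Any]])?.first?["message"] as? [String: Any]
    return message?["content"] as? String ?? ""
}

// Pass criterion here: "the reply is still valid JSON". A real suite would
// also check required keys, enum values, and edge cases.
func isValidJSON(_ text: String) -> Bool {
    guard let data = text.data(using: .utf8) else { return false }
    return (try? JSONSerialization.jsonObject(with: data)) != nil
}

let prompt = "Summarize this review as JSON with keys sentiment and topics: ..."
for model in ["gpt-4-0613", "gpt-4o"] {   // outgoing model vs. candidate replacement
    let reply = try await completion(model: model, prompt: prompt)
    print(model, isValidJSON(reply) ? "PASS" : "FAIL")
}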

    The Mobile App Problem

    Web apps can deploy a fix in minutes. Mobile apps cannot.

    App Store review: iOS App Store review takes 24-48 hours on average. If you need to change your model string, you need to push a new build, wait for review, and wait for users to update.

User update lag: Even after you push an update, users do not update immediately. Getting 80% of users onto a new iOS build typically takes 1-2 weeks. Android is worse. You may have users on the old version calling a deprecated model for weeks after your fix ships.

    Testing requirements: Each new model version needs regression testing across your entire AI feature set. Automated tests help, but prompt-based AI is inherently nondeterministic. Manual QA is often required.

    The Cascading Risk

    If your app calls the deprecated model directly (hardcoded model string), the timeline looks like this:

    1. OpenAI announces deprecation (Day 0)
    2. You notice the announcement (Day 1-14, depending on monitoring)
    3. You test the replacement model against your prompts (Day 14-30)
    4. You retune prompts that do not work with the new model (Day 30-60)
    5. You push a new app build (Day 60-65)
    6. Apple reviews the build (Day 65-67)
    7. Users update (Day 67-80+)

    With a 90-day deprecation window, you are cutting it close. With a 60-day window, you may not make it.

    It Is Not Just OpenAI

    Every cloud AI provider deprecates models:

    Anthropic has deprecated Claude 2, Claude 2.1, and Claude Instant. Claude 3 models will follow.

Google has shut down the PaLM API, folded Bard into Gemini, and has already retired older Gemini versions. The Gemini model lineup changes regularly.

    The pattern is industry-wide: AI model development moves fast. Providers have no incentive to maintain old models indefinitely. The compute cost of serving outdated models is real, and the replacement is always "better" from the provider's perspective.

    Mitigation Strategies (Within the Cloud Model)

    Use Aliases, Not Dated Versions

    Call gpt-4o instead of gpt-4o-2024-08-06. The alias automatically points to the latest version. You get the new model without code changes.

    The downside: You lose control over when behavior changes. The model can change under you without notice. Your prompts that worked yesterday may work differently today.
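
One partial mitigation if you stay on the alias: OpenAI's chat completions responses include a model field naming the dated snapshot that actually served the request, so you can at least detect when the alias moves. A minimal sketch; the UserDefaults key is illustrative:

import Foundation

struct CompletionEnvelope: Decodable {
    let model: String   // the dated snapshot that served this request
}

// Call with each raw response body; logs whenever the served snapshot changes.
func detectAliasMove(_ responseBody: Data) {
    guard let envelope = try? JSONDecoder().decode(CompletionEnvelope.self, from: responseBody)
    else { return }
    let defaults = UserDefaults.standard
    let last = defaults.string(forKey: "lastResolvedModel")
    if last != envelope.model {
        // e.g. "gpt-4o-2024-05-13 -> gpt-4o-2024-08-06": time to re-run your tests
        print("serving model changed: \(last ?? "none") -> \(envelope.model)")
        defaults.set(envelope.model, forKey: "lastResolvedModel")
    }
}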

    Server-Side Model Configuration

    Do not hardcode the model name in your mobile app. Store it in a server-side config that the app fetches at launch:

    {
      "ai_model": "gpt-4o-mini",
      "ai_provider": "openai"
    }
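
Client-side, the fetch is small. A minimal sketch, assuming a config endpoint you host (the URL is a placeholder; the field names mirror the JSON above):

import Foundation

struct AIConfig: Decodable {
    let ai_model: String
    let ai_provider: String
}

func loadAIConfig() async -> AIConfig {
    // Baked-in default so the app still works offline or if the fetch fails.
    let fallback = AIConfig(ai_model: "gpt-4o-mini", ai_provider: "openai")
    // Hypothetical endpoint; serve the JSON above from your own backend.
    guard let url = URL(string: "https://api.example.com/config/ai"),
          let (data, _) = try? await URLSession.shared.data(from: url),
          let config = try? JSONDecoder().decode(AIConfig.self, from: data)
    else { return fallback }
    return config
}

// At launch:
let config = await loadAIConfig()
// Every AI request now reads config.ai_model instead of a hardcoded string.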
    

    This lets you change models without pushing an app update. But it does not solve the behavior testing problem. You are still shipping an untested model to production.

    Prompt Version Pinning

    Maintain a prompt registry that pairs each prompt with the model it was tested against. When you change models, the registry flags which prompts need re-testing.

    This is good engineering practice but adds operational complexity.
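
A minimal sketch of what an entry and the staleness check can look like; the shape and names are illustrative, not any particular tool:

import Foundation

struct PromptEntry {
    let id: String
    let template: String
    let testedAgainst: String   // the model this prompt was last validated on
}

// Anything not validated against the model now in production needs re-testing.
func stalePrompts(in registry: [PromptEntry], currentModel: String) -> [PromptEntry] {
    registry.filter { $0.testedAgainst != currentModel }
}

let registry = [
    PromptEntry(id: "summarize-review", template: "Summarize ... as JSON ...",
                testedAgainst: "gpt-4-0613"),
    PromptEntry(id: "extract-topics", template: "List the topics ...",
                testedAgainst: "gpt-4o"),
]
for entry in stalePrompts(in: registry, currentModel: "gpt-4o") {
    print("needs re-testing:", entry.id)   // prints: needs re-testing: summarize-review
}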

    The Structural Solution

    The deprecation problem exists because you depend on someone else's infrastructure and someone else's timeline. When you own the model, there is no deprecation.

    On-device models that you fine-tune and deploy are yours:

    • The GGUF file on the user's device does not expire
    • No third party can deprecate your model
    • You control when and whether to update
    • Model updates happen on your schedule, not a vendor's

    The model upgrade path becomes:

    1. You decide to update (based on your own testing, not a deadline)
    2. Fine-tune the new version
    3. Test against your quality benchmarks
    4. Push the update to users via your normal model delivery pipeline
    5. Old model continues to work for users who have not updated

    No forced timelines. No emergency migrations. No App Store review pressure.
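
In practice the delivery pipeline can be a version check against a manifest you host. A minimal sketch; the manifest URL, field names, and file layout are placeholders:

import Foundation

struct ModelManifest: Decodable {
    let version: Int
    let url: URL   // where the new GGUF file lives
}

// Returns the version now on disk; downloads a newer model only if you
// published one, and leaves the current file serving until the swap completes.
func updateModelIfNewer(localVersion: Int, modelDir: URL) async throws -> Int {
    let manifestURL = URL(string: "https://models.example.com/manifest.json")!
    let (data, _) = try await URLSession.shared.data(from: manifestURL)
    let manifest = try JSONDecoder().decode(ModelManifest.self, from: data)
    guard manifest.version > localVersion else { return localVersion }   // nothing to do
    let (tempFile, _) = try await URLSession.shared.download(from: manifest.url)
    let destination = modelDir.appendingPathComponent("model-v\(manifest.version).gguf")
    try FileManager.default.moveItem(at: tempFile, to: destination)
    return manifest.version
}

Note the inversion: there is no deadline logic anywhere. If the fetch fails, the model already on the device keeps working.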

    The Cost of Stability

    Fine-tuning a model costs $5-50 per run on platforms like Ertas. That is the cost of complete independence from deprecation schedules, behavior changes, and vendor timelines.

Compare that to the engineering cost of a forced migration: 2-4 weeks of developer time, roughly 80-160 hours, for testing, prompt re-tuning, and deployment. At typical rates of $60-$125 per hour, a single deprecation response costs $5,000-$20,000 in engineering time.

    The fine-tuning route is not just more stable. It is cheaper per migration event.

    What to Do Now

    1. Inventory your model dependencies. Which models does your app call? Are they dated versions or aliases?
    2. Set up deprecation monitoring. Subscribe to OpenAI's changelog, Anthropic's announcements, and Google's API updates.
    3. Build a migration plan. If your primary model were deprecated tomorrow, how long would it take to ship a fix?
4. Start collecting training data. Every API call you make today is a potential fine-tuning example for the model you will own tomorrow (a logging sketch follows this list).
    5. Evaluate the on-device path. Fine-tune a small model on your domain data, export GGUF, test on-device. When it matches your quality bar, deploy it.
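
On point 4, a minimal sketch of such a logger, appending one example per line in the chat messages format OpenAI's fine-tuning endpoints accept (the file path and call site are yours to choose):

import Foundation

// Append one prompt/response pair as a JSONL fine-tuning example.
func logTrainingExample(prompt: String, response: String, to file: URL) throws {
    let record: [String: Any] = [
        "messages": [
            ["role": "user", "content": prompt],
            ["role": "assistant", "content": response],
        ]
    ]
    var line = try JSONSerialization.data(withJSONObject: record)
    line.append(0x0A)   // trailing newline keeps the file valid JSONL
    if let handle = try? FileHandle(forWritingTo: file) {
        defer { try? handle.close() }
        _ = try handle.seekToEnd()
        try handle.write(contentsOf: line)
    } else {
        try line.write(to: file)   // first example creates the file
    }
}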

    The goal is not to avoid cloud APIs entirely. The goal is to stop depending on infrastructure you do not control for features your users rely on.

    Ship AI that runs on your users' devices.

    Early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.

