
What Happens When Your AI Provider Cuts You Off? A Survival Guide
Anthropic banned 24,000 accounts overnight. OpenAI deprecated GPT-4o with two weeks' notice. Your AI provider can change the rules at any time. Here's your survival guide for vendor dependency.
On February 23, 2026, Anthropic banned 24,000 accounts in a single action. The accounts belonged to networks used by DeepSeek, Moonshot AI, and MiniMax for distilling Claude's capabilities. The accounts were gone. The API access was gone. Whatever those companies had built on top of that access was, at best, frozen in time.
That's an extreme case. But the underlying dynamic — your AI provider making a unilateral decision that disrupts your business — is something every company using AI APIs faces. The question isn't whether it will happen to you. It's which flavour of disruption you'll get.
Risk 1: Model Deprecation
OpenAI has deprecated or scheduled the retirement of four major models and APIs in 2026 alone. Each one forces thousands of businesses into unplanned engineering work.
The timeline:
- January 2026: GPT-4o deprecated with approximately two weeks' notice. Developers who had spent months optimising prompts for this specific model had to start over.
- March 2026: Realtime API Beta deprecated.
- May 2026: DALL-E-3 scheduled for deprecation. Every tool, product, and workflow built on it needs migration.
- August 2026: Assistants API sunset announced. Thousands of developers who built production systems on this API face a major migration project.
Each deprecation event costs a business 40-80 engineering hours for testing, migration, prompt rewriting, and regression validation. At a market rate of roughly $150 per hour, that's $6,000-$12,000 per event. If you're migrating 3-4 times per year, you're spending $18,000-$48,000 annually on work that doesn't improve your product — it just keeps it running.
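If you want to sanity-check those figures, here's the back-of-envelope maths in a few lines of Python (the hourly rate and event count are assumptions taken from the ranges above, not measurements):

```python
# Rough estimate of annual deprecation overhead.
# Hourly rate and event count are assumptions, not measured data.
HOURLY_RATE = 150            # assumed blended engineering rate, USD/hour
HOURS_PER_EVENT = (40, 80)   # testing, migration, prompt rewriting, regression
EVENTS_PER_YEAR = (3, 4)

low = HOURS_PER_EVENT[0] * HOURLY_RATE * EVENTS_PER_YEAR[0]
high = HOURS_PER_EVENT[1] * HOURLY_RATE * EVENTS_PER_YEAR[1]
print(f"Annual migration overhead: ${low:,} - ${high:,}")  # $18,000 - $48,000
```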
The insidious part: the more you build on a provider's platform, the more each deprecation costs. Your investment in optimisation makes you more vulnerable, not less.
Risk 2: Pricing Changes
Per-token pricing is variable by design. Your provider can adjust rates with minimal notice, and you have no negotiating leverage.
This creates a structural problem for any business that bakes AI costs into its pricing model. If you charge clients a flat monthly fee for AI-powered services, your margins depend on token costs staying predictable. They won't.
Consider a typical agency setup: you charge AU$800/month for AI customer support automation. Your underlying OpenAI cost at moderate volume is AU$200-350/month. Your margin is 56-75%.
Now your client runs a holiday promotion. Support volume spikes 4x for two weeks. Your API bill for that client jumps to AU$1,200 for the month. Your flat fee doesn't change. You just lost money on that client.
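Here's the same margin maths as a quick sketch (the fee and cost figures are the illustrative numbers from above, not benchmarks):

```python
def margin(flat_fee: float, api_cost: float) -> float:
    """Gross margin as a fraction of the flat monthly fee."""
    return (flat_fee - api_cost) / flat_fee

FLAT_FEE = 800  # AU$/month charged to the client

# Normal month: AU$200-350 in API spend gives a 56-75% margin
print(f"{margin(FLAT_FEE, 200):.0%}, {margin(FLAT_FEE, 350):.0%}")  # 75%, 56%

# Promotion month: support volume spikes and API spend hits AU$1,200
print(f"{margin(FLAT_FEE, 1200):.0%}")  # -50%: you paid to serve this client
```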
This isn't a hypothetical. It's the daily reality for agencies running on per-token pricing. And it's compounded by the fact that API providers have raised prices, changed rate limits, and introduced usage tiers multiple times.
Risk 3: Terms of Service Changes
What's permitted today may be prohibited tomorrow. ToS updates are unilateral — you accept or you leave.
OpenAI, Anthropic, and others have all updated their Terms of Service since launch. Anti-distillation clauses, output ownership terms, and acceptable use policies have all been tightened. Each update changes the rules for how you can use the outputs you're paying for.
For businesses that store API outputs for training, analytics, or product improvement, a ToS change can retroactively create compliance risk for data you've already collected. You built a workflow six months ago that was fully compliant. A ToS update changes the terms. Now what?
This risk is particularly acute for companies building competitive AI features. The line between "using" an API and "building a competing product" is drawn by the provider — and they can move that line whenever they want.
Risk 4: Outages and Rate Limits
AI APIs are the single point of failure for thousands of production systems. When they go down, everything that depends on them goes down too.
Major AI API providers have experienced outages lasting 2-6 hours multiple times per year. During these windows:
- Your customer-facing AI features return errors
- Your automated workflows stall
- Your SLA commitments to clients are violated
- Your support team gets flooded with complaints
Rate limits create a quieter version of the same problem. Hit your requests-per-minute ceiling during peak traffic, and your application degrades for everyone — not just the users making the excess requests.
Most businesses don't have a fallback plan. They don't have a local model ready to serve during outages. They don't have a redundant provider configured. They just… wait.
Risk 5: Geographic Restrictions
Anthropic's response to the distillation campaigns included blocking access from specific regions. Chinese companies that had been using Claude through various access methods lost that access overnight.
Geographic restrictions can be applied instantly and without individual notice. If your AI provider decides to restrict access in a particular jurisdiction — due to sanctions, regulatory compliance, or business strategy — your access can evaporate regardless of how you've been using the service.
This risk extends beyond obvious geopolitical situations. GDPR compliance requirements, data residency laws, and emerging AI regulations in the EU, Australia, and other jurisdictions can all trigger access restrictions or usage limitations that affect your business.
Start your exit strategy today. Ertas makes fine-tuning accessible — no ML expertise required. Join the waitlist →
The Mitigation Framework: Three Levels
Vendor dependency mitigation isn't binary. You don't have to go from "fully API-dependent" to "fully self-hosted" overnight. There are three levels, each reducing risk progressively.
Level 1: Redundancy
What it means: Having more than one AI provider configured and tested.
What it looks like:
- Your application can route requests to OpenAI, Anthropic, or a local model (see the routing sketch below)
- You've tested failover behaviour and documented the quality differences
- Your prompt engineering works across providers (not optimised for one model's quirks)
What it costs: Additional API subscriptions, plus the engineering time to build and maintain an abstraction layer.
What it solves: Outage risk, rate limit risk, some pricing risk (you can shift volume to cheaper providers).
What it doesn't solve: Deprecation risk (all providers deprecate models), vendor dependency at a fundamental level (you're still renting).
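A minimal sketch of that routing layer might look like the following. The call_openai, call_anthropic, and call_local_model names are placeholders for whatever SDK wrappers you already have; the point is the ordered fallback, not the specific providers:

```python
import logging
from typing import Callable

log = logging.getLogger("ai_router")

def complete_with_fallback(prompt: str,
                           providers: list[tuple[str, Callable[[str], str]]]) -> str:
    """Try each provider in order; fall through on errors or timeouts."""
    last_error: Exception | None = None
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # rate limit, outage, timeout, etc.
            log.warning("provider %s failed: %s, trying next", name, exc)
            last_error = exc
    raise RuntimeError("all providers failed") from last_error

# Order encodes preference: primary API, secondary API, then local fallback.
# result = complete_with_fallback(prompt, [
#     ("openai", call_openai),        # your wrapper around the OpenAI SDK
#     ("anthropic", call_anthropic),  # your wrapper around the Anthropic SDK
#     ("local", call_local_model),    # e.g. an Ollama or llama.cpp endpoint
# ])
```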
Level 2: Portability
What it means: Your AI components can be moved between providers — or to your own infrastructure — without rebuilding.
What it looks like:
- Abstracted AI layer with clean interfaces
- Prompt templates and configurations that aren't provider-specific
- Data pipelines that produce training-ready datasets from your production logs (see the sketch below)
- Tested deployment of at least one open-source model as a fallback
What it costs: Architecture investment upfront, ongoing maintenance of abstraction layer.
What it solves: Most vendor risks. You can move if conditions change.
What it doesn't solve: You're still paying per-token. Your AI capabilities still live somewhere else.
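As a sketch of the data-pipeline piece, here's one way to turn production logs into a training-ready file. It assumes each log line is a JSON object with prompt, completion, and an optional rating field; the field names are illustrative, so adapt them to your own logging:

```python
import json
from pathlib import Path

def logs_to_training_set(log_path: str, out_path: str, min_rating: int = 4) -> int:
    """Convert production API logs into a chat-style fine-tuning dataset.

    Assumes each log line is a JSON object with "prompt", "completion" and an
    optional quality "rating" field -- adapt the field names to your logs.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for line in Path(log_path).read_text(encoding="utf-8").splitlines():
            record = json.loads(line)
            if record.get("rating", 5) < min_rating:
                continue  # keep only examples you'd be happy to train on
            example = {"messages": [
                {"role": "user", "content": record["prompt"]},
                {"role": "assistant", "content": record["completion"]},
            ]}
            out.write(json.dumps(example, ensure_ascii=False) + "\n")
            written += 1
    return written
```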
Level 3: Ownership
What it means: You own model weights, trained on your data, running on your infrastructure.
What it looks like:
- Fine-tuned models for your highest-volume AI tasks
- GGUF exports deployed on Ollama, llama.cpp, or similar
- Local inference with no API dependency for critical functions (see the Ollama sketch below)
- API usage limited to non-critical or exploratory tasks
What it costs: Fine-tuning investment (one-time per model), inference hardware costs (significantly less than API bills at moderate volume).
What it solves: Every vendor risk in this article. You own the capability. Nobody can take it away.
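For the local inference piece, a minimal sketch against Ollama's local HTTP endpoint might look like this. The model name is a placeholder for whatever you've created in Ollama from your fine-tuned GGUF export:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's local HTTP endpoint

def local_complete(prompt: str, model: str = "my-support-model") -> str:
    """Run inference against a locally hosted model, with no external API involved.

    `model` is a placeholder for a model you've registered in Ollama,
    e.g. built from your fine-tuned GGUF export via a Modelfile.
    """
    resp = requests.post(OLLAMA_URL, json={
        "model": model,
        "prompt": prompt,
        "stream": False,   # return the full completion in one JSON response
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["response"]
```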
Building Your Exit Ramp
You don't have to migrate everything at once. Start with your highest-value targets.
Identify Fine-Tuning Candidates
Your best candidates for migration to owned models share these characteristics (a quick log-ranking sketch follows the lists below):
- High volume — tasks that generate significant API bills
- Predictable format — consistent input/output structure
- Available training data — you have examples of the task done correctly (API logs work)
- Domain-specific — tasks where your data gives you an advantage over generic models
Common first targets:
- Customer support classification and routing
- Content generation in a specific format or voice
- Data extraction from domain-specific documents
- FAQ and knowledge base responses
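One rough way to surface the high-volume candidates is to rank tasks straight from your API logs. The sketch below assumes each log line carries a task label; adapt it to however you actually tag requests:

```python
import json
from collections import Counter
from pathlib import Path

def rank_candidates(log_path: str) -> list[tuple[str, int]]:
    """Rank AI tasks by request volume using production API logs.

    Assumes each log line is a JSON object with a "task" label
    (e.g. "support_routing", "faq_answer") -- an illustrative field name.
    High-volume, consistently formatted tasks are the first fine-tuning targets.
    """
    counts: Counter[str] = Counter()
    for line in Path(log_path).read_text(encoding="utf-8").splitlines():
        counts[json.loads(line).get("task", "untagged")] += 1
    return counts.most_common()
```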
The Migration Sequence
Month 1: Audit your AI touchpoints. Categorise by volume, criticality, and training data availability. Select one task for your pilot migration.
Month 2: Prepare your training dataset from existing API logs. Fine-tune a model. Run shadow evaluation against your current API-based solution (sketched below).
Month 3: A/B test in production. Measure quality, cost, and latency. If the fine-tuned model meets your quality bar, route production traffic to it.
Ongoing: Repeat for the next task. Each migration reduces your API dependency and improves your cost structure.
For a detailed week-by-week breakdown, see the 90-day migration playbook.
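As a sketch of that Month 2 shadow evaluation: run both the incumbent API and the candidate fine-tuned model on the same inputs, keep serving the API output, and log both side by side for offline review. The api_call and local_call functions are placeholders for your existing API wrapper and your fine-tuned model:

```python
import json
import time
from typing import Callable

def shadow_eval(inputs: list[str],
                api_call: Callable[[str], str],
                local_call: Callable[[str], str],
                out_path: str = "shadow_results.jsonl") -> None:
    """Run both models on the same inputs and record outputs plus latency.

    api_call / local_call are placeholders for your existing API wrapper and
    your fine-tuned local model. The API output stays authoritative; the local
    output is logged for offline quality review only.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        for prompt in inputs:
            t0 = time.perf_counter()
            api_out = api_call(prompt)
            t1 = time.perf_counter()
            local_out = local_call(prompt)
            t2 = time.perf_counter()
            out.write(json.dumps({
                "prompt": prompt,
                "api_output": api_out,
                "local_output": local_out,
                "api_latency_s": round(t1 - t0, 3),
                "local_latency_s": round(t2 - t1, 3),
            }, ensure_ascii=False) + "\n")
```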
The End State
Full vendor independence looks like this:
- Critical AI functions run on models you own, deployed on your infrastructure
- Non-critical or experimental functions may still use APIs (with redundancy configured)
- Your costs are predictable — local inference is flat, not per-token
- Your models don't deprecate — you control the lifecycle
- Your data stays on your network — no compliance risk from third-party data flows
- Your competitive advantage is owned — fine-tuned models trained on your data can't be replicated by signing up for an API
This isn't theoretical. Teams are making this transition today. The open-source model ecosystem is production-quality. Fine-tuning tools have matured to the point where non-ML teams can operate them. GGUF export means your models are portable across inference engines.
The only question is whether you start building your exit ramp now — or wait until the next deprecation, pricing change, or account ban forces your hand.
Your models. Your data. Your terms. Pre-subscribe to Ertas at early-bird pricing — Builder tier at $14.50/mo for life. See plans →