    AI Incident Response Playbook: What to Do When Your Model Gets It Wrong

    A complete playbook for responding to AI model failures in production — from detection to root cause analysis, remediation, and disclosure. Adapt for your organization.

    Ertas Team

    AI incidents are not like software incidents. When a software system has a bug, you find the line of code, fix it, and deploy the patch. The failure mode is deterministic: the same input always produces the same wrong output.

    AI failure modes are statistical. A model doesn't "break" in the traditional sense — it produces the wrong output with some probability across some distribution of inputs. The failure may have been occurring for weeks before anyone noticed. The affected population may be identifiable and large, which triggers disclosure obligations. The root cause may be something that happened during training — months before the failure surfaced — making remediation significantly more complicated than a code rollback.

    These differences require a distinct incident response process. This playbook covers the full lifecycle: detection, triage, investigation, remediation, disclosure, and post-incident review.

    Incident Severity Classification

    Not all AI failures are equal. Classify severity immediately upon detection to ensure proportional response.

    | Severity | Definition | Examples |
    | --- | --- | --- |
    | P0 — Critical | AI system caused or contributed to physical harm; financial loss >$100K; regulatory breach; or 1,000+ individuals affected by incorrect decisions | Incorrect medical recommendation acted upon; discriminatory loan decisions at scale; GDPR-notifiable data breach involving AI processing |
    | P1 — High | AI system produced systematically wrong outputs for a defined group; compliance gap discovered; reputational risk if the incident became public | Fraud detection model blocking a demographic group at significantly higher rates; LLM generating factually false claims in customer-facing context |
    | P2 — Medium | AI system producing incorrect outputs for a subset of inputs; no immediate harm to individuals; correctable without notification | Document summarization model failing on a specific document format; recommendation model producing irrelevant results for a specific input category |
    | P3 — Low | Quality degradation noticed; no individual harm; no compliance implication | Model accuracy metrics declining toward but not beyond alert threshold; user-reported reduction in output quality |

    Severity escalation: Start with your best estimate and escalate if investigation reveals broader scope. It is better to over-classify and de-escalate than to under-classify and miss a notification deadline.
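    The classification rules above are mechanical enough to encode as a triage helper. A minimal sketch, assuming the triage facts are collected into a simple record first — the field names and the mapping from table rows to flags are illustrative, not a standard:

```python
from dataclasses import dataclass

@dataclass
class IncidentFacts:
    """Facts gathered during triage. Field names are illustrative."""
    physical_harm: bool = False
    financial_loss_usd: float = 0.0
    regulatory_breach: bool = False
    individuals_affected: int = 0
    systematic_group_error: bool = False  # defined group receiving wrong outputs
    compliance_gap: bool = False
    subset_incorrect: bool = False        # wrong outputs on a subset, no harm

def classify_severity(f: IncidentFacts) -> str:
    """Map the severity table onto triage facts.

    Thresholds ($100K, 1,000 individuals) come from the table above.
    Returns the initial classification; escalate if scope grows.
    """
    if (f.physical_harm or f.financial_loss_usd > 100_000
            or f.regulatory_breach or f.individuals_affected >= 1000):
        return "P0"
    if f.systematic_group_error or f.compliance_gap:
        return "P1"
    if f.subset_incorrect:
        return "P2"
    return "P3"
```

    Encoding the rules keeps the initial call consistent across on-call responders; the escalation judgment stays human.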


    Phase 1: Detection and Triage

    Target: 0-2 hours for P0 and P1 incidents

    Detection Sources

    AI incidents typically surface through one of five channels. Make sure your monitoring covers all of them:

    1. Automated monitoring alerts — threshold breaches in model accuracy metrics, output distribution anomalies, latency or error rate spikes
    2. User reports — customer support tickets, internal reports from employees using AI tools
    3. Downstream metric anomalies — business metrics behaving unexpectedly in systems that depend on AI outputs (e.g., loan approval rates changing without policy changes)
    4. Audit log anomalies — patterns in the audit log that indicate unexpected behavior (e.g., unusually high override rates from human reviewers, unusual input patterns)
    5. Third-party reports — regulatory inquiry, journalist inquiry, partner notification, security researcher disclosure
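    For channel 1, even a simple rolling-window monitor on labeled outcomes catches many failures earlier than user reports do. A sketch, assuming you eventually learn whether each output was correct — the window size and threshold are illustrative defaults, not recommendations:

```python
from collections import deque

class AccuracyAlert:
    """Rolling-window accuracy monitor (detection channel 1).

    Fires when accuracy over the last `window` labeled outcomes
    drops below `threshold`. Both parameters are illustrative.
    """
    def __init__(self, threshold: float = 0.90, window: int = 500):
        self.threshold = threshold
        self.outcomes = deque(maxlen=window)

    def record(self, correct: bool) -> bool:
        """Record one labeled outcome; return True if the alert fires."""
        self.outcomes.append(correct)
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough data to judge yet
        return sum(self.outcomes) / len(self.outcomes) < self.threshold
```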

    Triage Checklist

    Complete within the first 2 hours for P0/P1:

    Step 1: Determine severity

    • What is the nature of the failure? (wrong classification, wrong generation, missing output, model unavailable)
    • Are individual people affected? If yes, how many and how severely?
    • Is there a regulatory reporting obligation (EU AI Act, GDPR Article 33, HIPAA)?
    • Assign initial severity: P0 / P1 / P2 / P3

    Step 2: Identify the affected system

    • Which system is affected? (Reference model inventory ID)
    • What is the current model version in production?
    • When was this version deployed? Has it changed recently?
    • Is the failure limited to one model version, or could it affect other deployments?

    Step 3: Estimate scope

    • How long has the failure likely been occurring? (Check logs from before detection)
    • How many decisions or outputs are potentially affected?
    • Is the failure on all inputs or a specific subset?

    Step 4: Preserve evidence — do this before any remediation

    • Export audit logs covering the incident period (minimum: 48 hours before first detected anomaly to now)
    • Save sample inputs and outputs that demonstrate the failure
    • Record the current model version and configuration
    • Screenshot monitoring dashboards showing the anomaly
    • Do NOT update, rollback, or modify the model before evidence is preserved
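    Evidence preservation is scriptable, which matters when the first hours are chaotic. A sketch that snapshots the relevant files and hashes each one so later tampering is detectable — the paths, arguments, and manifest layout are assumptions to adapt to your stack:

```python
import hashlib
import json
import shutil
import time
from pathlib import Path

def preserve_evidence(audit_log: Path, dashboards: list[Path],
                      model_version: str, out_dir: Path) -> Path:
    """Snapshot incident evidence before any remediation.

    Copies the audit log export and dashboard screenshots into a
    timestamped directory and records a SHA-256 hash of each file
    in a manifest, alongside the model version in production.
    """
    snapshot = out_dir / f"incident-{time.strftime('%Y%m%dT%H%M%S')}"
    snapshot.mkdir(parents=True)
    manifest = {"model_version": model_version, "files": {}}
    for src in [audit_log, *dashboards]:
        dst = snapshot / src.name
        shutil.copy2(src, dst)  # copy2 preserves file timestamps
        manifest["files"][src.name] = hashlib.sha256(
            dst.read_bytes()).hexdigest()
    (snapshot / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return snapshot
```

    Run something like this before anyone touches the deployment; the manifest hashes are what make the snapshot defensible later.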

    Step 5: Immediate notifications

    • P0/P1: AI System Owner → AI Risk Officer → CISO/DPO → Legal (same hour)
    • P2: AI System Owner → AI Risk Officer (within 4 hours)
    • P3: AI System Owner (log and review)

    Immediate Containment

    After evidence is preserved, decide on containment:

    Option A: Traffic rerouting — Route traffic away from the affected model version (to a backup version, fallback logic, or human-only workflow). Use this when a fallback is available and the failure is version-specific.

    Option B: Pause the use case — Suspend AI-assisted processing and route all cases to human review or manual processing. Use this when no safe fallback exists or when human review is required by the incident's severity.

    Document your containment decision and rationale. The choice between Option A and Option B, and the timing of that decision, will be reviewed in the post-incident review and may be examined by regulators.
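    Both containment options reduce to a routing decision that on-call staff should be able to flip without a deploy. A sketch using feature flags, with Option B taking precedence over Option A — the flag names and handler signatures are illustrative:

```python
def route(request, flags: dict, primary, fallback, human_review):
    """Containment switch for an AI-assisted workflow.

    `primary`, `fallback`, and `human_review` are caller-supplied
    handlers; `flags` would come from your feature-flag service.
    """
    if flags.get("pause_use_case"):    # Option B: human-only workflow
        return human_review(request)
    if flags.get("reroute_traffic"):   # Option A: known-good fallback
        return fallback(request)
    return primary(request)
```

    Keeping the switch in configuration rather than code also gives you a timestamped record of exactly when containment took effect.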


    Phase 2: Investigation

    Target: 2-24 hours for P0/P1; up to 5 days for P2

    Root Cause Analysis Framework

    AI incidents typically trace to one of four root causes. Work through each systematically:

    Root Cause Type 1: Model behavior change

    • Has the model version changed recently? Check the Model Change Log.
    • For vendor-API models: did the vendor update the model without notice? Compare current model behavior against your logged baseline.
    • For internal models: was the model retrained recently? On different data?
    • Diagnostic: run your standard evaluation set through the current model version and compare scores to the pre-incident baseline.
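    The Type 1 diagnostic can be a one-function check against your logged baseline. A sketch — the model callable, the (input, expected) eval-set shape, and the tolerance are assumptions:

```python
def eval_regression(model_fn, eval_set, baseline_score: float,
                    tolerance: float = 0.02) -> dict:
    """Re-run the standard eval set and compare to the pre-incident
    baseline (Root Cause Type 1 diagnostic).

    `eval_set` is assumed to be (input, expected_output) pairs;
    `tolerance` absorbs normal run-to-run noise.
    """
    correct = sum(1 for x, expected in eval_set if model_fn(x) == expected)
    score = correct / len(eval_set)
    return {
        "score": score,
        "baseline": baseline_score,
        "regressed": score < baseline_score - tolerance,
    }
```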

    Root Cause Type 2: Data distribution shift

    • Are the failing inputs qualitatively different from the model's training data?
    • Has something changed in your upstream data pipeline that affects what the model receives?
    • Diagnostic: compare the statistical distribution of recent inputs to the training data distribution. Flag inputs that fall outside the training distribution.
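    One common way to quantify the Type 2 diagnostic is the Population Stability Index (PSI) over binned input features. A sketch — the binning, the widely used 0.2 rule of thumb, and the small floor that avoids log(0) are conventions, not requirements:

```python
import math

def psi(expected_counts: list[int], actual_counts: list[int]) -> float:
    """Population Stability Index between training-time (expected)
    and recent (actual) input histograms over the same bins.

    A common rule of thumb treats PSI > 0.2 as significant shift.
    """
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = max(e / e_total, 1e-6)  # floor empty bins to avoid log(0)
        a_pct = max(a / a_total, 1e-6)
        total += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return total
```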

    Root Cause Type 3: Prompt or integration bug

    • For LLM-based systems: has the prompt template changed? Is there a bug in how context is assembled?
    • For pipeline systems: has the preprocessing logic changed in a way that produces malformed inputs?
    • Diagnostic: manually trace a failing case through the integration layer, step by step, before the model receives it.

    Root Cause Type 4: Human oversight failure

    • Were human reviewers approving outputs they should have rejected?
    • Is the override rate unusually low? (Possible rubber-stamping)
    • Did reviewers receive sufficient context to identify the failure?
    • Diagnostic: review the audit log of human review decisions during the incident period. Calculate override rate and time-to-decision. Interview reviewers.
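    The Type 4 diagnostic starts with two numbers from the audit log: override rate and time-to-decision. A sketch, assuming each review record carries an action plus ISO-format start/end timestamps — the field names are illustrative:

```python
from datetime import datetime

def review_stats(decisions: list[dict]) -> dict:
    """Summarize human review activity during the incident window.

    Each record is assumed to have 'action' ('approve'/'override')
    and ISO 8601 'start'/'end' timestamps. A very low override rate
    combined with fast decisions suggests rubber-stamping.
    """
    overrides = sum(1 for d in decisions if d["action"] == "override")
    seconds = sorted(
        (datetime.fromisoformat(d["end"])
         - datetime.fromisoformat(d["start"])).total_seconds()
        for d in decisions
    )
    return {
        "override_rate": overrides / len(decisions),
        "median_seconds_to_decision": seconds[len(seconds) // 2],
    }
```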

    Evidence to Collect

    | Evidence Type | Where to Find It | Why It Matters |
    | --- | --- | --- |
    | Model version at time of incident | Model inventory + deployment logs | Establishes exactly what was running |
    | Sample of affected inputs and outputs | Audit logs | Characterizes the failure pattern |
    | Model performance on eval set before and after | Model validation records | Quantifies performance change |
    | Human review decisions during incident period | Audit logs | Determines if oversight failure contributed |
    | Upstream data statistics | Data pipeline logs | Identifies distribution shift |
    | Vendor change notifications (if applicable) | Email/API changelogs | Establishes if vendor caused change |

    Scope Confirmation

    Once you have a hypothesis for the root cause, run a scope confirmation:

    1. Identify the full population of inputs processed during the affected period
    2. Apply the root cause hypothesis to classify each as likely-affected or not
    3. For a sample of likely-affected cases, verify the failure manually
    4. Produce a confirmed count and percentage of affected decisions

    This number is what you report to regulators and what determines individual notification obligations. Take the time to get it right — over-reporting and under-reporting both have consequences.
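    The four-step scope confirmation can be run as a small sampling analysis. A sketch where the root-cause hypothesis and the manual verification step are caller-supplied predicates — both, along with the sample size, are illustrative:

```python
import random

def confirm_scope(population: list, likely_affected, verify_failure,
                  sample_size: int = 50, seed: int = 0) -> dict:
    """Scope confirmation per the four steps above.

    `likely_affected` applies the root-cause hypothesis to a case;
    `verify_failure` stands in for manual verification of a sampled
    case. The seed makes the sample reproducible for the record.
    """
    flagged = [case for case in population if likely_affected(case)]
    sample = random.Random(seed).sample(flagged, min(sample_size, len(flagged)))
    verified = sum(1 for case in sample if verify_failure(case))
    precision = verified / len(sample) if sample else 0.0
    return {
        "flagged": len(flagged),
        "flagged_pct": len(flagged) / len(population),
        "sample_verified_pct": precision,
        "estimated_affected": round(len(flagged) * precision),
    }
```

    The verified sample rate is what turns "flagged by the hypothesis" into a defensible affected count, with the sample size bounding your uncertainty.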


    Phase 3: Remediation

    Remediation happens in three stages with distinct timelines.

    Immediate Remediation (Day 1-2)

    • Roll back to the last known-good model version, OR suspend the use case if no safe version exists
    • Verify the rollback actually resolves the failure by testing on the affected input types
    • Restore normal operation only after confirming resolution — not before

    Short-Term Remediation (Days 3-30)

    • Identify all cases processed during the incident period that may have been affected
    • Re-process affected cases with the corrected model (or human review, depending on severity)
    • Notify affected individuals if required by regulation or policy (see Phase 4)
    • Implement enhanced monitoring targeting the failure pattern to detect recurrence

    Long-Term Remediation (Days 30+)

    | Root Cause | Long-Term Fix |
    | --- | --- |
    | Model behavior change (vendor) | Negotiate version pinning; evaluate vendor scorecard; consider alternative vendor or owned model |
    | Model behavior change (internal retraining) | Improve evaluation process before promoting retrained models to production; add A/B testing period |
    | Data distribution shift | Implement input distribution monitoring; update training data to include the new distribution |
    | Prompt/integration bug | Add integration tests covering the failing case type; add input validation before model inference |
    | Human oversight failure | Recalibrate reviewers; adjust review interface to surface relevant context; review threshold settings |

    Phase 4: Disclosure

    Internal Disclosure

    • P0: Board-level notification within 24 hours of severity confirmation
    • P1: Executive (C-suite) notification within 24 hours; board notification at next regular meeting or sooner if warranted

    Regulatory Disclosure

    Regulatory disclosure obligations depend on jurisdiction, industry, and what happened:

    EU AI Act (if applicable): Article 73 requires providers of high-risk AI systems to report serious incidents to the market surveillance authority of the Member State where the incident occurred. A "serious incident" includes an incident or malfunction that directly or indirectly leads to death or serious harm to a person's health, serious and irreversible disruption of critical infrastructure, infringement of fundamental-rights obligations, or serious harm to property or the environment. Timeline: immediately after establishing a causal link (or its reasonable likelihood), and no later than 15 days after becoming aware; shorter deadlines apply to the most serious cases.

    GDPR Article 33: Personal data breaches (including those caused by or involving AI processing) must be reported to the supervisory authority within 72 hours. If AI processing caused incorrect decisions affecting individuals whose personal data was involved, assess whether this constitutes a breach.

    HIPAA Breach Notification Rule (US, healthcare): If PHI was involved in the incident, assess breach notification obligations. Business associates must notify covered entities within 60 days.

    SR 11-7 (US banking regulators): Model risk events should be documented and reported through existing model risk management reporting channels. P0 incidents may require direct regulator notification depending on your institution's agreement with its primary regulator.

    Document your disclosure assessment even if you determine no notification is required. The documented analysis showing why you concluded notification wasn't required is itself a compliance artifact.

    Individual Notification

    If individuals received incorrect AI-generated decisions that affected their rights or access to services, Legal will advise on notification obligations (which vary by jurisdiction and industry). The technical investigation should produce: the list of affected individuals, the nature of the incorrect decision they received, and the corrected outcome.


    Phase 5: Post-Incident Review

    Conduct within 10 business days of incident closure. Document the results and store with the incident record.

    Timeline reconstruction

    • When did the failure start? (Not when it was detected — when did it actually begin?)
    • When was it detected?
    • When was it contained?
    • When was it resolved?
    • What was the total time from start to resolution?

    Root cause confirmed

    • What was the confirmed root cause?
    • What evidence confirmed it?

    Lessons learned

    • What controls failed to prevent or detect this incident?
    • What controls worked as intended?
    • Was the response process followed? If not, why not?
    • What would have caught this faster?

    Policy and process updates

    • What changes to monitoring, thresholds, or review processes will prevent recurrence?
    • What changes to the incident response process itself would improve future response?
    • Owner and deadline for each change

    Model governance documentation updates

    • Update the Model Inventory entry (validation status, incident log link)
    • Update the model card if root cause reveals capability limitation
    • Update the Model Change Log if a rollback was performed

    Common AI Incident Pitfalls

    Not preserving logs before remediation. The single most common and damaging mistake. Once you roll back the model or clear processing queues, the evidence of exactly what happened may be gone. Preserve first, remediate second — always.

    Assuming the rollback fixed everything without validation. Test on the specific input types that triggered the failure before declaring the incident resolved. Rollbacks can introduce different problems, or the root cause may not be the model version at all.

    Treating the incident as purely technical and not engaging Legal and Compliance. Even P2 incidents can have regulatory implications that aren't apparent at first. Loop Legal in early and let them determine whether reporting is required — that determination should not be made by the engineering team alone.

    Scope estimation based on gut feel rather than data. "We think a few hundred records were affected" is not an acceptable scope estimate for regulatory reporting. Run the analysis. If you can't run it accurately in time for a regulatory deadline, say so and provide your best estimate with explicit uncertainty bounds.

    Not updating the model inventory after the incident. The inventory entry should reflect what happened: validation status, incident log reference, and any changes to oversight level. Auditors check consistency between incident records and inventory entries.


    Connecting Audit Logs to Investigation

    Root cause analysis for AI incidents depends entirely on the quality and completeness of your audit logs. If your logs don't capture model version at inference time, you can't confirm when a version change occurred. If they don't capture the full input, you can't characterize the failure pattern. If they don't capture human review decisions, you can't assess oversight failure as a contributing factor.

    Ertas Data Suite generates immutable, timestamped audit records for every processing step — who ran it, what inputs were used, what the outputs were, and which operator reviewed it. For incident investigations, this means you have a complete, tamper-evident record to work from rather than reconstructing events from incomplete logs.
