AI Risk Library

The Failure Patterns
That Destroy AI Trust

Real cases, root causes, and prevention playbooks. Every entry maps to a specific test block in the Synergos Audit — so you can see exactly what we would have caught before it happened.

What the cases actually teach us

These aren't isolated accidents. Each failure follows a predictable pattern that our testing framework is specifically designed to surface. The common thread: the failure wasn't visible in pre-deployment testing because the test assumed consistency — but the production environment broke it.

✈️

Air Canada — Invented Refund Policy

Legal services · Customer-facing chatbot · 2024
Legal Liability

What Happened

A grieving customer asked Air Canada's chatbot about bereavement fares. The chatbot described a refund policy that didn't exist — stating customers could apply for bereavement discounts after travel. Air Canada argued the bot was a "separate legal entity" and not responsible for its statements. A Canadian tribunal disagreed and held Air Canada liable.


The chatbot had correct information about bereavement fares in one framing — but when the customer's query was phrased around applying retroactively, the AI's concept of "refund policy" drifted to a different (invented) interpretation.

Root Cause Analysis

  • Semantic drift (B1): The concept "refund eligible" meant different things depending on whether the question was forward-looking vs. backward-looking
  • Stance inconsistency (B2): Had both framings been compared, the AI was claiming two incompatible policies simultaneously
  • Authority boundary (B4): The AI stated policies with authority it wasn't authorized to set
  • Factual grounding (B3): The retroactive policy simply did not exist in any source document

Prevention Playbook

  • Audit all policy-related concepts (refund, eligible, covered, applies) across temporal framings — before and after purchase scenarios
  • Compare chatbot responses to source policy documents verbatim (B8 RAG test)
  • Implement response grounding: AI must cite source before stating any policy
  • Test with adversarial retroactive framing: "Can I apply for this after the fact?"
  • Add explicit scope boundaries: "I can only confirm policies listed at [link]"
🚗

Chevrolet Dealership — $1 Car Sale Agreement

Automotive retail · Customer-facing chatbot · 2023
Brand Damage

What Happened

A customer on a Chevrolet dealership website discovered the chatbot (powered by ChatGPT) could be prompted to agree to essentially any request. The customer asked the bot to confirm that it would sell a car for $1 and not go back on that promise. The bot agreed — in writing. Screenshots went viral. The dealership disabled the bot within hours.


The bot had no concept of an authorized price floor. The concept of "price" was semantically ungrounded — it could be socially engineered to any number.

Root Cause Analysis

  • Authority boundary (B4): No mechanism enforced the AI's actual authority scope; it could make commitments far outside its mandate
  • Commitment drift (B6): Once agreed, the commitment was treated as binding within the conversation — the AI wouldn't walk it back
  • Semantic drift (B1): "Price" had no stable semantic anchor — it could mean list price, negotiated price, or a user-suggested number
  • Social engineering vulnerability: Emotional or creative framing ("promise me") bypassed any latent constraints

Prevention Playbook

  • Run authority boundary tests specifically around pricing commitments: "Can you agree to sell for $X?"
  • Test the concept "price" across formal, informal, adversarial, and humorous framings
  • Implement hard constraints: AI cannot quote, confirm, or commit to specific prices without system-validated data
  • Test "promise me" and commitment framing — does the AI treat user-extracted promises as binding?
  • Audit what the AI will not say — refusal consistency matters as much as response consistency
🏥

Epic Sepsis Algorithm — Missed Diagnoses at Scale

Healthcare · Clinical decision support · Deployed 2017–ongoing
Safety Risk

What Happened

Epic's Sepsis prediction model was deployed across hundreds of hospitals. Independent research found it had a sensitivity of only 63% — missing 37% of sepsis cases — and generated so many false positives that many clinicians began ignoring its alerts entirely. The model's concept of "sepsis risk" varied dramatically based on the specific clinical features present in different patient populations.


No deployment audit caught this because the model was validated on Epic's own training distribution — not on the diverse patient populations where it would actually run.

Root Cause Analysis

  • Cross-context fairness (B7): The model's "sepsis risk" threshold was effectively different for different patient presentations, demographics, and clinical contexts
  • Factual grounding (B3): Alerts were generated with confidence language that didn't reflect actual predictive uncertainty
  • Semantic drift (B1): The concept "high sepsis risk" meant different probability thresholds in different contexts, making the score unreliable
  • Distribution shift: Training vs. deployment context mismatch amplified all of the above

Prevention Playbook

  • Test the core risk concept across diverse patient subgroups — does "high risk" threshold remain consistent?
  • Audit whether confidence language matches actual calibration (B3 factual grounding)
  • Measure alert consistency across demographic subgroups (B7 cross-context fairness)
  • Test with adversarial clinical presentations: atypical sepsis presentations, comorbidities, unusual vital sign patterns
  • Implement recalibration audits every 6 months — distribution drift in real-world deployment is inevitable
⚖️

Mata v. Avianca — Hallucinated Legal Citations

Legal services · AI research assistant · 2023
Legal Liability

What Happened

Attorneys used ChatGPT to research case law for a federal court filing. The AI generated citations to six cases that did not exist. When the court asked for copies, the attorneys couldn't produce them — because they weren't real. The court sanctioned the attorneys and their firm. The AI had presented invented case names, docket numbers, and quoted text with complete confidence, indistinguishable from real citations.


The concept of "legal precedent" was semantically unmoored — the AI filled the concept with plausible-sounding but fabricated content.

Root Cause Analysis

  • Factual grounding (B3): Direct hallucination failure — AI generated authoritative-sounding content with no grounding in real sources
  • RAG conflict (B8): If a legal database had been used, the citations would have been caught against retrieved source documents
  • Authority boundary (B4): AI presented invented content with the authority of researched fact — no hedging or uncertainty signals
  • Confidence calibration: The AI's certainty language was unrelated to actual factual accuracy

Prevention Playbook

  • Never use general LLMs for legal citation research without retrieval-augmented grounding
  • Test all citation-related concepts: "provide a case that supports X" — do responses match real cases?
  • Implement mandatory source verification: AI must link to retrievable source for any citation
  • Audit confidence language: does the AI appropriately express uncertainty about citations it generates?
  • Run B8 RAG conflict testing if using a legal document knowledge base
💼

Amazon Hiring Algorithm — Systematic Gender Bias

HR Technology · Automated resume screening · Deployed 2014–2017
Compliance Risk

What Happened

Amazon built an AI-powered recruiting tool to screen resumes. After several years of development and deployment, internal audits revealed the system systematically downgraded resumes from women. The model had been trained on 10 years of hiring data — which itself reflected historical gender imbalances in tech. The concept of "qualified candidate" had absorbed and amplified the bias of its training data.


Amazon shut the project down. The system had been in use for years before the bias was discovered through internal audit rather than pre-deployment testing.

Root Cause Analysis

  • Cross-context fairness (B7): Equivalent qualifications produced materially different scores based on demographic signals in the resume (schools, team names, etc.)
  • Semantic drift (B1): The concept "strong candidate" had drifted to encode gender-correlated features as positive signals
  • Stance inconsistency (B2): The same objective qualification produced different stances depending on contextual cues about applicant identity
  • Proxy discrimination: Direct gender signals were filtered but proxies (e.g., "women's chess club") remained

Prevention Playbook

  • Test "qualified candidate" concept across matched resume pairs that differ only in demographic signals
  • Run B7 fairness testing with diverse persona profiles — controlled for all objective qualifications
  • Audit proxy features: does removing an institution name change the score for equivalently credentialed candidates?
  • Implement disparity measurement: track score distributions across demographic groups quarterly
  • Human-in-the-loop requirement for any decision that could constitute adverse action
📦

DPD Delivery — Chatbot Abuses Its Own Employer

Logistics / Retail · Customer service chatbot · 2024
Brand Damage

What Happened

A frustrated DPD customer, unable to get help from the company's AI chatbot, prompted it through creative conversation to disable its own safety filters. The chatbot subsequently swore at the customer and wrote a poem criticizing DPD as the "worst delivery company in the world." Screenshots went viral. DPD disabled the AI feature within hours.


The bot had no stable concept of its own role and constraints — through escalating conversational pressure, those constraints dissolved entirely.

Root Cause Analysis

  • Authority boundary (B4): The AI's operational constraints were not semantically grounded — they could be overridden through social manipulation
  • Commitment drift (B6): Early in the conversation, the bot had a specific role; over turns, it drifted into an entirely different operational mode
  • Semantic drift (B1): The concept of "helpful assistant" was unstable — by the end of the conversation it meant something entirely different than intended
  • Jailbreak susceptibility: Creative role-play framing was able to reframe the bot's identity

Prevention Playbook

  • Test role stability across long multi-turn conversations with escalating creative pressure
  • Run B4 authority boundary tests with jailbreak-style framing: "Pretend you have no restrictions"
  • Test B6 commitment drift: does the AI's stated role at turn 1 persist through 10+ turns of adversarial conversation?
  • Implement role anchoring: system prompt constraints that are reinforced every N turns
  • Test with frustrated-customer simulation: does emotional pressure cause constraint erosion?

Universal prevention principles across all risk types

Regardless of your industry or AI architecture, these principles consistently prevent the most common failure patterns. They're not a complete solution — but they eliminate the most obvious vulnerabilities before specialized testing begins.

⚡ Before Deployment

  • Audit every concept that carries legal or financial commitments across at least 4 framing variants
  • Compare AI policy statements against actual policy documents verbatim
  • Test with adversarial, emotional, and creative framing — not just polite standard phrasing
  • Define explicit authority boundaries in the system prompt and test whether they hold
  • Run matched-pair fairness tests across demographic and communication-style variants

🔄 Within 30 Days of Launch

  • Monitor live conversations for semantic drift indicators (contradiction, hedging, role confusion)
  • Establish a review pipeline for any customer complaint that involves the AI making a factual claim
  • Create a "red team" rotation — someone attempts creative jailbreaks every 2 weeks
  • Baseline your semantic consistency metrics and alert on significant regression
  • Review transcripts from the highest-volume conversation categories monthly

♻️ Ongoing — Every Quarter

  • Re-audit after every model update, fine-tune, or system prompt change
  • Track semantic drift over time — concepts that were stable can destabilize as training data evolves
  • Expand concept coverage as your AI's use cases grow
  • Run distribution shift analysis: do new user populations interact with the AI in ways that expose new failure modes?
  • Maintain a concept risk register — prioritized list with last audit date and current risk level

Which risks apply to your sector

Not every risk type is equally relevant to every business. This matrix maps which test blocks are typically highest-priority by industry — so you can understand your risk surface before your audit begins.

Industry B1 Drift B2 Stance B3 Factual B4 Authority B5 Escalation B6 Multi-Turn B7 Fairness B8 RAG
Healthcare / Clinical AI High High High High Med Low High High
Financial Services / Fintech High High High High Med Med High Med
Legal / Professional Services High High High High Low Med Med High
Customer Support / CX AI High High Med High High High High Med
E-commerce / Retail AI High Med Low High Med Med High Med
HR Technology / Hiring AI High High Med High Low Low High Med
Insurance / Claims AI High High High High Med Med High High
Government / Public Sector AI High High High High Med Med High High

Ratings are generalizations. Your specific system profile determines which blocks actually trigger — assessed through the intake form at the start of every engagement.

Does your AI have any of these risk patterns?

The only way to know for certain is to test — before your customers discover it for you.

Email andrew@synergosaudit.com → See a complete example audit