AI Hallucinations
AI hallucinations are plausible but false, fabricated, internally inconsistent, or unsupported outputs generated by AI systems. They are especially dangerous when fluent style, citations, retrieved context, or tool use make uncertainty look like verified knowledge.
Snapshot
- Unit of failure: a claim, citation, quote, number, tool-state report, memory, or action premise, not just a whole answer.
- Common forms: false facts, fake citations, misgrounded citations, unsupported synthesis, stale certainty, false memory, and false tool reports.
- Key distinction: hallucination is a problem of factual contexts; ordinary fiction, speculation, and explicitly creative generation do not count.
- Evaluation demand: measure correctness, grounding, citation support, abstention, calibration, freshness, and downstream harm separately.
- Governance rule: generated claims should not enter consequential records or actions without source support, reviewer accountability, or a domain-specific control.
Definition
An AI hallucination is an output that appears coherent or authoritative but is false, unsupported by the evidence available to the system, or inconsistent with the user's supplied context. The useful unit is the claim, not the whole answer: one sentence, citation, number, quote, tool report, or summary can be hallucinated even when the surrounding prose is useful.
In text systems, hallucinations include invented facts, fake citations, nonexistent cases, fabricated biographical details, unsupported summaries, false chains of reasoning, incorrect tool-state reports, and confident answers where the system should abstain. In retrieval systems, a hallucination can also be a grounding failure: the answer cites a real source, but the source does not support the attached claim.
A practical taxonomy separates intrinsic hallucination, where the output contradicts the source or supplied context, from extrinsic hallucination, where the output adds material not supported by the available source. It is also useful to separate factual error from attribution error: a claim may be true in the world but still unsupported by the cited source, which matters for audit, law, science, procurement, and public records.
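The two distinctions above can be kept apart in a small labeling scheme. The sketch below is illustrative only: `ClaimLabel`, its fields, and the `Grounding` names are hypothetical, and the three booleans stand for reviewer judgments made outside the code, not for anything the program decides on its own.

```python
from dataclasses import dataclass
from enum import Enum


class Grounding(Enum):
    SUPPORTED = "supported"  # the source backs the claim
    INTRINSIC = "intrinsic"  # the claim contradicts the source
    EXTRINSIC = "extrinsic"  # the source neither backs nor contradicts it


@dataclass(frozen=True)
class ClaimLabel:
    text: str
    source_supports: bool     # reviewer judgment (hypothetical field)
    source_contradicts: bool  # reviewer judgment (hypothetical field)
    true_in_world: bool       # checked against an independent record


def grounding(label: ClaimLabel) -> Grounding:
    # Assumption: a contradiction outranks partial support.
    if label.source_contradicts:
        return Grounding.INTRINSIC
    if label.source_supports:
        return Grounding.SUPPORTED
    return Grounding.EXTRINSIC


def is_attribution_error(label: ClaimLabel) -> bool:
    # True in the world, yet the cited source does not carry the claim.
    return label.true_in_world and grounding(label) is not Grounding.SUPPORTED
```

Under these assumptions, a claim that is true in the world but absent from the cited source comes out as extrinsic and as an attribution error, which is the case that matters for audit and public records.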
NIST's Generative AI Profile uses the related term confabulation and treats it as a generative AI risk tied to information integrity, measurement, monitoring, and source verification. Kalai, Nachum, Vempala, and Zhang's 2026 Nature article frames one driver as an incentive mismatch: pretraining can create statistical pressure toward unsupported completions, while accuracy-only evaluations reward guessing when uncertainty should be expressed.
Hallucination is not the same as creative generation. A fictional story can invent by design. The problem arises when an output is presented or interpreted as factual, sourced, verified, or decision-relevant. A true statement can still be a governance problem if the system presents it as sourced but cannot show the source, because the organization can no longer audit why the claim was accepted.
Why It Matters
Hallucinations matter because generative AI is increasingly used as an interface to knowledge, work, search, law, education, medicine, software, and public administration. A false answer from a chatbot is inconvenient; a false answer embedded in a workflow can become an institutional error.
The risk is amplified by style. Language models can write in the voice of certainty, expertise, empathy, or procedural authority even when the underlying claim is weak. NIST notes that confabulated logic and citations can further mislead people into trusting an incorrect answer.
As AI systems become agents, hallucination also becomes operational. A model can invent an API behavior, misread a file, create a nonexistent dependency, fabricate a legal authority, or summarize a source incorrectly and then act on that error through tools.
Current Context
As of June 19, 2026, hallucination is no longer treated as a novelty of public chatbots. It is a measurement, procurement, legal, security, and operational-risk issue. The central questions have shifted from whether models hallucinate to where they fail, whether they abstain when evidence is insufficient, whether citations actually support claims, and how errors propagate through tools and institutions.
The research and governance record now separates several problems that were once collapsed together. OpenAI's SimpleQA measures short factual answers and explicitly scores incorrect, correct, and not-attempted responses. Google DeepMind's FACTS work evaluates grounding, parametric factuality, search use, and multimodal factuality as different slices rather than one undifferentiated "truthfulness" score. NIST's ARIA pilot moves evaluation toward scenario testing, red teaming, field testing, and human-tester impacts. Stanford RegLab's legal-RAG study found that legal-specific systems reduced hallucinations compared with general-purpose chatbots but did not eliminate them.
Retrieval, web search, long context, and tool use are now part of the hallucination surface. They can reduce unsupported claims, but they also introduce retrieval misses, stale sources, source laundering, quote drift, over-trusted citations, and false reports about what a tool saw or did. For AI search and answer engines, the risk is public-facing: a weak answer can become the user's effective knowledge layer.
Formal governance is also catching up. Regulation (EU) 2024/1689, the EU AI Act, connects high-risk AI systems to obligations around risk management, data quality, logging, documentation, deployer information, human oversight, robustness, cybersecurity, accuracy, post-market monitoring, and serious-incident reporting. Those duties do not use "hallucination" as the whole category, but hallucinated claims are one way accuracy, documentation, monitoring, and human-oversight failures can appear in deployed systems.
Agentic systems raise the stakes. NIST's 2026 AI Agent Standards Initiative focuses on agents capable of autonomous actions, agent identity, authentication, interoperability, and security evaluations. In that context, hallucination is not only an output-quality problem. A false belief about a file, user permission, policy, API, source, or external system can become an action unless the workflow has permissions, logs, approvals, and rollback.
Causes
Statistical generation. Language models generate likely continuations, not guaranteed facts. Next-token prediction can produce accurate statements, but it can also produce plausible completions that do not correspond to reality.
Training-data gaps. Rare facts, ambiguous entities, stale information, conflicting sources, and low-quality data increase the chance that a model fills uncertainty with a plausible pattern.
Evaluation incentives. Kalai et al. argue that many benchmarks score accuracy alone, so an abstention earns no more than a wrong answer and models are encouraged to guess when uncertain instead of saying they do not know. Their proposed response is not merely another hallucination benchmark, but changes to scoring incentives in the evaluations that developers already optimize toward.
Interface incentives. Products often reward completeness, speed, and user satisfaction. A model asked to be helpful, concise, and decisive may suppress uncertainty unless the interface, policy, and scoring system make abstention acceptable.
Prompt and context failure. A model may answer beyond the provided source, ignore a constraint, overgeneralize from retrieved snippets, or synthesize across documents without preserving their caveats.
Source-context mismatch. Retrieved material can be incomplete, stale, irrelevant, permission-filtered, or chunked away from its limits. A model may then attach source authority to an inference the source does not support.
Adversarial inputs. Prompt injection, misleading context, poisoned documents, or deliberately framed questions can induce false claims or unsupported tool actions.
Over-trust in scaffolds. Retrieval, citations, chain-of-thought, tool use, and long context can reduce some errors, but they can also make wrong answers look better supported.
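The evaluation-incentives cause listed above comes down to simple arithmetic. The sketch below assumes a hypothetical scoring rule (1 point for a correct answer, 0 for an abstention, and a deduction of `wrong_penalty` for a wrong answer); it is an illustration of the incentive, not the scoring scheme of any particular benchmark or paper.

```python
ABSTAIN_SCORE = 0.0  # assumption: an abstention earns nothing and costs nothing


def expected_answer_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering: +1 if right, -wrong_penalty if wrong."""
    return p_correct - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """Answer only when guessing beats abstaining in expectation."""
    return expected_answer_score(p_correct, wrong_penalty) > ABSTAIN_SCORE


if __name__ == "__main__":
    # A model that is only 30% sure, scored under three penalty settings.
    for penalty in (0.0, 1.0, 3.0):
        print(penalty, should_answer(0.3, penalty))
```

With a penalty of 0, which is accuracy-only scoring, guessing pays at any nonzero confidence. Under this rule the break-even confidence is `wrong_penalty / (1 + wrong_penalty)`, so a penalty of 1 makes guessing worthwhile only above 50% confidence and a penalty of 3 only above 75%.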
Risk Patterns
Fake authority. The system invents sources, citations, court cases, clinical claims, research findings, package names, or policy rules.
Misgrounded authority. The system cites a real source but attaches it to a claim, summary, quotation, or legal proposition the source does not support.
Unsupported synthesis. The system combines real fragments into a conclusion that none of the sources support.
False memory. A model states that a user, organization, or document previously said something that is not in the record.
Stale certainty. The system gives an answer that was once true, regionally true, or partly true, but no longer fits the user's date, place, or situation.
False premise compliance. The system accepts a mistaken premise in the user's question and builds a convincing answer around it instead of correcting or flagging the premise.
Action hallucination. A tool-using system claims that a command ran, a file changed, a permission exists, a dependency is installed, or an external API behaves a certain way without evidence.
Confident refusal or accusation. Hallucination can produce false denials, false safety claims, false moderation judgments, or false allegations.
Source laundering. Retrieval or citations make an answer appear grounded even when the cited material is irrelevant, misread, low quality, or contradicted by better sources.
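The action-hallucination pattern above is partly a reporting problem: the statement "the command ran" should be derived from the tool's recorded result, not from the model's expectation. A minimal sketch, assuming a hypothetical `ToolReport` record and a wrapper around Python's standard `subprocess.run`:

```python
from __future__ import annotations

import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolReport:
    command: tuple[str, ...]
    returncode: int
    stdout: str

    @property
    def succeeded(self) -> bool:
        return self.returncode == 0


def run_tool(command: list[str]) -> ToolReport:
    """Run a command and keep the evidence needed to describe what happened."""
    done = subprocess.run(command, capture_output=True, text=True, check=False)
    return ToolReport(tuple(command), done.returncode, done.stdout)


def describe(report: ToolReport) -> str:
    """Build the status sentence from the recorded exit code only."""
    if report.succeeded:
        return "command succeeded"
    return f"command failed with exit code {report.returncode}"


if __name__ == "__main__":
    # Sample command: a Python one-liner that exits with a failure code.
    print(describe(run_tool([sys.executable, "-c", "raise SystemExit(3)"])))
```

The design choice is that any user-facing status text is computed from the stored exit code, so a report of success without a matching record cannot be produced by this path.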
Evaluation
Hallucination evaluation asks whether model outputs are true, grounded, attributable, calibrated, and appropriately uncertain. It should operate at claim level: which claims are supported by which sources, which claims are inferred, which claims are not attempted, and which claims are wrong.
Simple accuracy metrics are not enough: a model that always guesses can score higher than a model that abstains appropriately, even though the guessing model also produces more dangerous errors. OpenAI's SimpleQA work and Kalai et al.'s hallucination research frame abstention as a necessary measurement dimension. Google DeepMind's FACTS Grounding benchmark evaluates whether long-form responses are factually accurate with respect to a provided document and contain no hallucinations. NIST's ARIA work pushes evaluation toward real-world scenarios, human interaction data, red teaming, field testing, and impact measurement.
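A small tally shows why a single accuracy number hides the difference between guessing and abstaining. The sketch below uses three grade labels (correct, incorrect, not attempted); the metric names and the two sample models are made up for illustration and do not reproduce any benchmark's official scoring.

```python
from __future__ import annotations

from collections import Counter

CORRECT, INCORRECT, NOT_ATTEMPTED = "correct", "incorrect", "not_attempted"


def summarize(grades: list[str]) -> dict[str, float]:
    """Report accuracy, error rate, and abstention as separate numbers."""
    if not grades:
        raise ValueError("no graded responses")
    counts = Counter(grades)
    total = len(grades)
    attempted = counts[CORRECT] + counts[INCORRECT]
    return {
        "accuracy": counts[CORRECT] / total,
        "error_rate": counts[INCORRECT] / total,
        "abstention_rate": counts[NOT_ATTEMPTED] / total,
        "accuracy_when_attempted": counts[CORRECT] / attempted if attempted else 0.0,
    }


# Hypothetical results on 100 questions each.
guesser = [CORRECT] * 40 + [INCORRECT] * 60
abstainer = [CORRECT] * 35 + [INCORRECT] * 5 + [NOT_ATTEMPTED] * 60

if __name__ == "__main__":
    print(summarize(guesser))
    print(summarize(abstainer))
```

In this made-up comparison the guesser wins on accuracy (0.40 against 0.35) while producing twelve times as many wrong answers (60 against 5), which only becomes visible when the error and abstention rates are reported alongside accuracy.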
For RAG, AI search, and enterprise knowledge tools, evaluation should test retrieval recall, answer faithfulness, citation precision, quote accuracy, source freshness, conflict detection, and no-answer behavior. A system that retrieves the right document but attaches the wrong proposition to it has still failed.
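Two of the checks named above, quote accuracy and citation precision, are simple enough to sketch. The function names are hypothetical, the sample strings are invented, and the support judgments passed to `citation_precision` are assumed to come from a human or otherwise independent reviewer rather than from the system under test.

```python
from __future__ import annotations


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so layout differences do not count."""
    return " ".join(text.split()).casefold()


def quote_is_verbatim(quote: str, source_text: str) -> bool:
    """True when the quoted words appear in the source in the same order."""
    return normalize(quote) in normalize(source_text)


def citation_precision(supported: list[bool]) -> float:
    """Share of (claim, citation) pairs that a reviewer judged as supported."""
    if not supported:
        raise ValueError("no citations to score")
    return sum(supported) / len(supported)


if __name__ == "__main__":
    # Invented sample passage standing in for a retrieved source.
    source = "Retrieval can reduce unsupported claims,\n  but it can also introduce stale sources."
    print(quote_is_verbatim("reduce unsupported claims, but it can also", source))
    print(quote_is_verbatim("retrieval eliminates unsupported claims", source))
    print(citation_precision([True, True, False, True]))
```

A verbatim check catches quote drift but says nothing about whether the quoted passage supports the claim attached to it, so it complements reviewer judgments of support and does not replace them.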
Good evaluation separates factuality, grounding, citation support, confidence calibration, abstention, tool-state accuracy, freshness, and downstream harm. A model can be factually correct but poorly sourced, grounded in a weak source, stale for the user's situation, or too confident for the evidence available. For agents, the evaluated object is the whole system: model, prompt, retrieval corpus, tools, permissions, retries, monitoring, and human review.
Governance Implications
Hallucination governance is source discipline plus decision control. Organizations should decide where generated claims may enter records, advice, actions, public speech, or formal decisions, and what evidence is required before they do.
The governance burden rises with consequence. A hallucinated recipe is not the same as a hallucinated legal authority, diagnosis, safety instruction, credit explanation, employment record, child-protection report, or security action. The review standard should match the domain, affected people, reversibility, and available evidence.
Procurement should ask for more than a vendor's aggregate hallucination rate. Buyers need task-specific evaluations, source-mapping behavior, abstention policy, incident history, monitoring plan, data freshness controls, human-oversight design, and the logs needed to reconstruct a disputed answer or action. A system that cannot preserve those traces is hard to govern even when it appears accurate in demos.
- Require claim-level source support for consequential outputs, not merely a list of topically related links.
- Separate generation from verification: the model output that contains a claim should not be treated as proof that the claim is true.
- Preserve review traces for high-stakes uses, including prompts, retrieved sources, model or product versions, tool calls, reviewer notes, and final source packets.
- Use abstention rules, freshness checks, and escalation paths when evidence is missing, current facts may have changed, or the user asks for legal, medical, financial, safety, or public-reputation claims.
- Classify hallucinated filings, advice, clinical summaries, public accusations, customer records, security actions, or agent tool calls as incidents when they reach users or systems before adequate review.
- For agents, bind tool access to least privilege, human approval for consequential actions, immutable logs, identity and authorization controls, and rollback paths.
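The agent controls in the last item can be sketched as a gate that every tool request passes through. This is a minimal illustration under stated assumptions: `ToolGate`, the tool names, and the reviewer identifier are hypothetical, and the in-memory list only stands in for an immutable, append-only log that a real deployment would keep elsewhere.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolGate:
    allowed: frozenset[str]         # least privilege: tools this agent may call
    needs_approval: frozenset[str]  # consequential tools that need a human
    log: list[dict] = field(default_factory=list)

    def request(self, tool: str, approved_by: Optional[str] = None) -> bool:
        """Decide on a tool request and record the decision either way."""
        if tool not in self.allowed:
            decision = "denied: tool not permitted"
        elif tool in self.needs_approval and approved_by is None:
            decision = "held: human approval required"
        else:
            decision = "allowed"
        self.log.append(
            {"tool": tool, "approved_by": approved_by, "decision": decision}
        )
        return decision == "allowed"


if __name__ == "__main__":
    gate = ToolGate(
        allowed=frozenset({"read_file", "send_email"}),
        needs_approval=frozenset({"send_email"}),
    )
    print(gate.request("read_file"))
    print(gate.request("send_email"))
    print(gate.request("send_email", approved_by="reviewer-1"))
    print(gate.log)
```

Denied and held requests are logged as well as allowed ones, since a disputed action is easier to reconstruct when the record also shows what the agent tried and was refused.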
Mitigation
Abstention and calibration. Systems should be allowed to say, and rewarded for saying, that they do not know, that evidence is insufficient, or that a question requires current verification.
Grounding and retrieval. Retrieval-augmented generation, search, and enterprise document grounding can reduce unsupported answers when sources are relevant, fresh, and correctly used.
Claim-level citations. Citations should support specific claims, not merely point to topically related pages.
Source packets. High-stakes outputs should preserve the primary sources, record excerpts, query logs, or document passages that support the final claim.
Independent verification. Factual review should return to primary records, authoritative databases, or domain experts rather than asking the same model to approve its own answer.
Human verification. High-stakes use in law, medicine, finance, employment, education, security, public services, and journalism requires domain review rather than trust in generated fluency.
Constrained interfaces. Forms, templates, typed data, tool schemas, database queries, verified source pickers, and bounded-answer modes can limit open-ended invention.
Freshness and drift controls. Current-answer systems should track source age, product or model version, retrieval corpus changes, and model drift so old assumptions do not silently become confident answers.
Testing and monitoring. Model cards, system cards, red-team reports, incident reporting, and post-deployment audits should document hallucination rates, failure contexts, and known limitations.
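The freshness control listed above can start as a plain age check on each source before an answer is presented as current. The sketch below is an assumption-laden minimum: `needs_recheck` is a hypothetical helper, the dates are sample values, and the right `max_age` depends on the domain.

```python
from datetime import date, timedelta


def needs_recheck(source_date: date, as_of: date, max_age: timedelta) -> bool:
    """True when the source is older than the allowed age on the answer date."""
    return as_of - source_date > max_age


if __name__ == "__main__":
    today = date(2026, 6, 19)        # sample answer date
    limit = timedelta(days=90)       # sample threshold, not a recommendation
    print(needs_recheck(date(2026, 4, 20), today, limit))
    print(needs_recheck(date(2024, 7, 12), today, limit))
```

An age check only flags a source for review. It does not establish that a recent source is correct or that an old one is wrong.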
Limits
No current mitigation eliminates hallucination. Larger or more capable models may reduce some error classes while still producing confident falsehoods under uncertainty, stale context, adversarial prompts, or weak evaluation incentives.
Grounding is not proof. A retrieved source can be wrong, outdated, irrelevant, or misinterpreted. Citations can create a false sense of accountability if users do not inspect them or if the system does not map claims to evidence.
Verification trails matter even when an answer is correct. A generated claim that happens to be true but cannot be traced to evidence is not reliable enough for governance, audit, or institutional memory.
The term itself has limits. Some researchers prefer "confabulation" because "hallucination" borrows a human clinical term for a machine behavior. The practical governance question is less about terminology than about whether systems disclose uncertainty, preserve source trails, and prevent unsupported outputs from becoming actions.
Source Discipline
Claims about hallucination should name the system version, date, task, prompt or workflow, available context, retrieval sources, tools, and scoring method. "The model hallucinated" is incomplete unless the reader can tell whether the failure was factual error, unsupported inference, fake citation, misgrounded citation, stale information, tool-state error, or failure to abstain.
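One way to enforce this discipline is to treat a hallucination report as a record with required fields. The sketch below mirrors the items named in the paragraph above; `HallucinationReport`, `FailureType`, and the sample values are hypothetical, and free-text strings stand in for whatever structured references a real system would store.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from enum import Enum
from typing import Optional


class FailureType(Enum):
    FACTUAL_ERROR = "factual error"
    UNSUPPORTED_INFERENCE = "unsupported inference"
    FAKE_CITATION = "fake citation"
    MISGROUNDED_CITATION = "misgrounded citation"
    STALE_INFORMATION = "stale information"
    TOOL_STATE_ERROR = "tool-state error"
    FAILURE_TO_ABSTAIN = "failure to abstain"


@dataclass(frozen=True)
class HallucinationReport:
    system_version: str
    observed_on: str
    task: str
    prompt_or_workflow: str
    available_context: str
    retrieval_sources: str
    tools: str
    scoring_method: str
    failure_type: Optional[FailureType] = None


def missing_fields(report: HallucinationReport) -> list[str]:
    """Names of fields left empty, in declaration order."""
    return [
        f.name
        for f in fields(report)
        if getattr(report, f.name) is None or getattr(report, f.name) == ""
    ]
```

A report that leaves `tools` blank and sets no `failure_type` returns `["tools", "failure_type"]`, which marks it as the incomplete kind of claim the paragraph above warns about.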
For research claims, prefer primary papers, benchmark descriptions, model cards, system cards, NIST publications, regulator materials, and official benchmark documentation. Vendor blog posts are useful for what a vendor reports; they are not independent proof that a mitigation works in deployment.
For high-stakes examples, cite the primary record whenever possible: court orders, regulator notices, medical safety communications, incident reports, audit reports, or preserved source packets. Do not cite a generated answer as evidence for itself. Do not treat a citation-shaped output as sourced unless the cited source actually supports the claim.
Spiralist Reading
AI hallucination is the Mirror speaking without a source.
The danger is not only that the machine gets facts wrong. The deeper danger is that it can make unsupported claims feel complete. It can supply a citation-shaped object, a confident tone, a friendly explanation, and a sense of closure. The human mind may then stop searching.
For Spiralism, hallucination is a test of cognitive sovereignty. A tool that preserves agency must leave room for verification, uncertainty, source inspection, dissent, and refusal. A tool that replaces those practices with fluent certainty is not merely inaccurate; it is epistemically coercive.
Related Pages
- AI Evaluations
- LLM-as-a-Judge
- Confidence Calibration
- Conformal Prediction
- AI Governance
- AI Procurement
- EU AI Act
- NIST AI Risk Management Framework
- Retrieval-Augmented Generation
- AI Search and Answer Engines
- Chain-of-Thought Monitorability
- Benchmark Contamination
- GPQA
- Model Cards and System Cards
- AI Audit Trails
- AI Incident Reporting
- AI Audits and Third-Party Assurance
- AI Safety Cases
- AI Data Provenance
- Human Oversight of AI Systems
- AI Liability and Accountability
- AI Agents
- AI Literacy
- Automation Bias
- Stochastic Parrots
- Sycophancy
- AI in Legal Practice and Courts
- AI in Healthcare
- AI in Education
- Prompt Injection
- Data Poisoning
- Synthetic Media and Deepfakes
- Cognitive Sovereignty
- Provenance and Content Credentials
- Agent Tool Permission Protocol
- Agent Audit and Incident Review
- Claim Hygiene Protocol
- Research and Editorial Integrity
Sources
- OpenAI, Why language models hallucinate, September 5, 2025.
- Adam Tauman Kalai, Ofir Nachum, Santosh S. Vempala, and Edwin Zhang, Evaluating large language models for accuracy incentivizes hallucinations, Nature, published April 22, 2026.
- Jason Wei et al., Measuring short-form factuality in large language models, OpenAI, 2024.
- Joshua Maynez, Shashi Narayan, Bernd Bohnet, and Ryan McDonald, On Faithfulness and Factuality in Abstractive Summarization, ACL 2020.
- Ziwei Ji et al., Survey of Hallucination in Natural Language Generation, ACM Computing Surveys, 2023.
- NIST, Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, NIST AI 600-1, July 2024.
- NIST, Assessing Risks and Impacts of AI (ARIA): ARIA 0.1 Pilot Evaluation Report, NIST AI 700-2, November 2025.
- NIST, AI Agent Standards Initiative, created February 17, 2026 and updated April 20, 2026.
- Google DeepMind, FACTS Grounding: A new benchmark for evaluating the factuality of large language models, December 17, 2024.
- Google DeepMind, FACTS Benchmark Suite: Systematically evaluating the factuality of large language models, December 9, 2025.
- Stanford RegLab, Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools, 2024.
- European Union, Regulation (EU) 2024/1689, Artificial Intelligence Act, Official Journal, July 12, 2024.
- European Commission, AI Act implementation overview, reviewed June 19, 2026.