Wiki · Concept · Last reviewed June 19, 2026

AI Hallucinations

AI hallucinations are plausible but false, fabricated, internally inconsistent, or unsupported outputs generated by AI systems. They are especially dangerous when fluent style, citations, retrieved context, or tool use make uncertainty look like verified knowledge.

Definition

An AI hallucination is an output that appears coherent or authoritative but is false, unsupported by the evidence available to the system, or inconsistent with the user's supplied context. The useful unit is the claim, not the whole answer: one sentence, citation, number, quote, tool report, or summary can be hallucinated even when the surrounding prose is useful.

In text systems, hallucinations include invented facts, fake citations, nonexistent cases, fabricated biographical details, unsupported summaries, false chains of reasoning, incorrect tool-state reports, and confident answers where the system should abstain. In retrieval systems, a hallucination can also be a grounding failure: the answer cites a real source, but the source does not support the attached claim.

A practical taxonomy separates intrinsic hallucination, where the output contradicts the source or supplied context, from extrinsic hallucination, where the output adds material not supported by the available source. It is also useful to separate factual error from attribution error: a claim may be true in the world but still unsupported by the cited source, which matters for audit, law, science, procurement, and public records.
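One way to make this taxonomy concrete is to record two independent judgments per claim: how the claim relates to the supplied source, and whether it is true in the world. The sketch below is illustrative only. The names `Grounding`, `ClaimLabel`, and `classify`, and the verdict strings, are hypothetical and are not taken from any benchmark cited on this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Grounding(Enum):
    SUPPORTED = "supported"   # the source backs the claim
    INTRINSIC = "intrinsic"   # the claim contradicts the source or context
    EXTRINSIC = "extrinsic"   # the source neither backs nor contradicts it


@dataclass(frozen=True)
class ClaimLabel:
    text: str
    grounding: Grounding
    true_in_world: Optional[bool]  # None when no reviewer has checked it


def classify(label: ClaimLabel) -> str:
    """Return a coarse verdict for one claim (hypothetical categories)."""
    if label.grounding is Grounding.INTRINSIC:
        return "intrinsic hallucination"
    if label.grounding is Grounding.EXTRINSIC:
        if label.true_in_world:
            # True in the world, but the cited source does not show it.
            return "attribution error"
        # False, or never checked: treated as unsupported added material.
        return "extrinsic hallucination"
    # Supported by the source; the source itself may still be wrong.
    return "supported"
```

Keeping the two judgments in separate fields is a design choice: it lets an audit distinguish a false claim from a true claim that is merely attached to the wrong source.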

NIST's Generative AI Profile uses the related term confabulation and treats it as a generative AI risk tied to information integrity, measurement, monitoring, and source verification. Kalai, Nachum, Vempala, and Zhang's 2026 Nature article frames one driver as an incentive mismatch: pretraining can create statistical pressure toward unsupported completions, while accuracy-only evaluations reward guessing when uncertainty should be expressed.

Hallucination is not the same as creative generation. A fictional story can invent by design. The problem arises when an output is presented or interpreted as factual, sourced, verified, or decision-relevant. A true statement can still be a governance problem if the system presents it as sourced but cannot show the source, because the organization can no longer audit why the claim was accepted.

Why It Matters

Hallucinations matter because generative AI is increasingly used as an interface to knowledge, work, search, law, education, medicine, software, and public administration. A false answer from a chatbot is inconvenient; a false answer embedded in a workflow can become an institutional error.

The risk is amplified by style. Language models can write in the voice of certainty, expertise, empathy, or procedural authority even when the underlying claim is weak. NIST notes that confabulated logic and citations can further mislead people into trusting an incorrect answer.

As AI systems become agents, hallucination also becomes operational. A model can invent an API behavior, misread a file, create a nonexistent dependency, fabricate a legal authority, or summarize a source incorrectly and then act on that error through tools.

Current Context

As of June 19, 2026, hallucination is no longer treated as a novelty of public chatbots. It is a measurement, procurement, legal, security, and operational-risk issue. The central questions have shifted from whether models hallucinate to where they fail, whether they abstain when evidence is insufficient, whether citations actually support claims, and how errors propagate through tools and institutions.

The research and governance record now separates several problems that were once collapsed together. OpenAI's SimpleQA measures short factual answers and explicitly scores incorrect, correct, and not-attempted responses. Google DeepMind's FACTS work evaluates grounding, parametric factuality, search use, and multimodal factuality as different slices rather than one undifferentiated "truthfulness" score. NIST's ARIA pilot moves evaluation toward scenario testing, red teaming, field testing, and human-tester impacts. Stanford RegLab's legal-RAG study found that legal-specific systems reduced hallucinations compared with general-purpose chatbots but did not eliminate them.

Retrieval, web search, long context, and tool use are now part of the hallucination surface. They can reduce unsupported claims, but they also introduce retrieval misses, stale sources, source laundering, quote drift, over-trusted citations, and false reports about what a tool saw or did. For AI search and answer engines, the risk is public-facing: a weak answer can become the user's effective knowledge layer.

Formal governance is also catching up. Regulation (EU) 2024/1689, the EU AI Act, connects high-risk AI systems to obligations around risk management, data quality, logging, documentation, deployer information, human oversight, robustness, cybersecurity, accuracy, post-market monitoring, and serious-incident reporting. Those duties do not use "hallucination" as the whole category, but hallucinated claims are one way accuracy, documentation, monitoring, and human-oversight failures can appear in deployed systems.

Agentic systems raise the stakes. NIST's 2026 AI Agent Standards Initiative focuses on agents capable of autonomous actions, agent identity, authentication, interoperability, and security evaluations. In that context, hallucination is not only an output-quality problem. A false belief about a file, user permission, policy, API, source, or external system can become an action unless the workflow has permissions, logs, approvals, and rollback.
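The role of logs in that workflow can be sketched as a check that runs before an agent's report about its own action is accepted. This is a minimal illustration under stated assumptions: `ToolEvent` and `claim_is_backed` are hypothetical names, and a real system would also need permissions, approvals, and rollback, which are not modeled here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ToolEvent:
    """One entry in an execution log written by the tool runtime, not the model."""
    tool: str
    args: Dict[str, str]
    succeeded: bool


def claim_is_backed(tool: str, args: Dict[str, str], log: List[ToolEvent]) -> bool:
    """Accept "I ran <tool> with <args>" only if the log shows a successful match."""
    return any(
        event.tool == tool and event.args == args and event.succeeded
        for event in log
    )
```

The point of the design is that the log is produced by the runtime rather than by the model, so a claim with no matching successful event is treated as unverified instead of being passed along.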

Causes

Statistical generation. Language models generate likely continuations, not guaranteed facts. Next-token prediction can produce accurate statements, but it can also produce plausible completions that do not correspond to reality.

Training-data gaps. Rare facts, ambiguous entities, stale information, conflicting sources, and low-quality data increase the chance that a model fills uncertainty with a plausible pattern.

Evaluation incentives. Kalai et al. argue that many benchmarks score only accuracy, so an abstention earns no more than a wrong answer. That encourages models to guess when uncertain instead of saying they do not know. Their proposed response is not another hallucination benchmark but a change to the scoring incentives in the evaluations that developers already optimize toward.

Interface incentives. Products often reward completeness, speed, and user satisfaction. A model asked to be helpful, concise, and decisive may suppress uncertainty unless the interface, policy, and scoring system make abstention acceptable.

Prompt and context failure. A model may answer beyond the provided source, ignore a constraint, overgeneralize from retrieved snippets, or synthesize across documents without preserving their caveats.

Source-context mismatch. Retrieved material can be incomplete, stale, irrelevant, permission-filtered, or chunked away from its limits. A model may then attach source authority to an inference the source does not support.

Adversarial inputs. Prompt injection, misleading context, poisoned documents, or deliberately framed questions can induce false claims or unsupported tool actions.

Over-trust in scaffolds. Retrieval, citations, chain-of-thought, tool use, and long context can reduce some errors, but they can also make wrong answers look better supported.

Risk Patterns

Fake authority. The system invents sources, citations, court cases, clinical claims, research findings, package names, or policy rules.

Misgrounded authority. The system cites a real source but attaches it to a claim, summary, quotation, or legal proposition the source does not support.

Unsupported synthesis. The system combines real fragments into a conclusion that none of the sources support.

False memory. A model states that a user, organization, or document previously said something that is not in the record.

Stale certainty. The system gives an answer that was once true, regionally true, or partly true, but no longer fits the user's date, place, or situation.

False premise compliance. The system accepts a mistaken premise in the user's question and builds a convincing answer around it instead of correcting or flagging the premise.

Action hallucination. A tool-using system claims that a command ran, a file changed, a permission exists, a dependency is installed, or an external API behaves a certain way without evidence.

Confident refusal or accusation. Hallucination can produce false denials, false safety claims, false moderation judgments, or false allegations.

Source laundering. Retrieval or citations make an answer appear grounded even when the cited material is irrelevant, misread, low quality, or contradicted by better sources.

Evaluation

Hallucination evaluation asks whether model outputs are true, grounded, attributable, calibrated, and appropriately uncertain. It should operate at claim level: which claims are supported by which sources, which claims are inferred, which claims are not attempted, and which claims are wrong.

Simple accuracy metrics are not enough: a model that always guesses can outscore a model that abstains appropriately, even though the guessing model produces more dangerous errors. OpenAI's SimpleQA work and Kalai et al.'s hallucination research frame abstention as a necessary measurement dimension. Google DeepMind's FACTS Grounding benchmark evaluates whether long-form responses are factually accurate with respect to a provided document and contain no hallucinations. NIST's ARIA work pushes evaluation toward real-world scenarios, human interaction data, red teaming, field testing, and impact measurement.
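The arithmetic behind the guessing incentive can be worked through in a few lines. The sketch assumes a simple scoring rule: a correct answer scores 1, an abstention scores 0, and a wrong answer costs `wrong_penalty`. The function names and the penalty form `t / (1 - t)` are assumptions chosen for illustration, picked so that answering breaks even exactly at confidence `t`. It is not presented as the scoring rule of any benchmark named above.

```python
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score of answering: +1 if right, -wrong_penalty if wrong.

    Abstaining always scores 0, so answering is worthwhile only when
    this value is positive.
    """
    return p_correct - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """True when answering beats abstaining in expectation."""
    return expected_score(p_correct, wrong_penalty) > 0.0


def penalty_for_threshold(t: float) -> float:
    """Penalty that makes answering break even at confidence t (0 <= t < 1)."""
    if not 0.0 <= t < 1.0:
        raise ValueError("t must be in [0, 1)")
    return t / (1.0 - t)
```

Under accuracy-only scoring the penalty is 0, so any nonzero chance of being right makes guessing the better policy. With t = 0.75 the penalty is 3: a model that is 70% confident expects 0.7 - 0.3 × 3 = -0.2 and does better by abstaining, while at 80% it expects 0.8 - 0.2 × 3 = 0.2 and should answer.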

For RAG, AI search, and enterprise knowledge tools, evaluation should test retrieval recall, answer faithfulness, citation precision, quote accuracy, source freshness, conflict detection, and no-answer behavior. A system that retrieves the right document but attaches the wrong proposition to it has still failed.
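Two of these checks, citation precision and quote accuracy, are simple enough to sketch. The example assumes that a human reviewer has already judged whether each cited source supports its claim. `CitedClaim`, `citation_precision`, and `quote_is_verbatim` are hypothetical names, and the whitespace-only normalization is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CitedClaim:
    claim: str
    source_id: Optional[str]        # None when the claim carries no citation
    reviewer_says_supported: bool   # human judgment of the cited source


def citation_precision(claims: List[CitedClaim]) -> Optional[float]:
    """Share of cited claims whose source actually supports them.

    Returns None when nothing is cited, so "no citations" is not
    mistaken for a perfect or a zero score.
    """
    cited = [c for c in claims if c.source_id is not None]
    if not cited:
        return None
    return sum(c.reviewer_says_supported for c in cited) / len(cited)


def quote_is_verbatim(quote: str, source_text: str) -> bool:
    """Check that a quote appears in the source, ignoring whitespace layout."""
    normalized_quote = " ".join(quote.split())
    if not normalized_quote:
        return False  # an empty quote proves nothing
    return normalized_quote in " ".join(source_text.split())
```

A system that retrieves the right document but attaches the wrong proposition to it shows up here as a cited claim with `reviewer_says_supported=False`, which lowers precision even though retrieval succeeded.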

Good evaluation separates factuality, grounding, citation support, confidence calibration, abstention, tool-state accuracy, freshness, and downstream harm. A model can be factually correct but poorly sourced, grounded in a weak source, stale for the user's situation, or too confident for the evidence available. For agents, the evaluated object is the whole system: model, prompt, retrieval corpus, tools, permissions, retries, monitoring, and human review.

Governance Implications

Hallucination governance is source discipline plus decision control. Organizations should decide where generated claims may enter records, advice, actions, public speech, or formal decisions, and what evidence is required before they do.

The governance burden rises with consequence. A hallucinated recipe is not the same as a hallucinated legal authority, diagnosis, safety instruction, credit explanation, employment record, child-protection report, or security action. The review standard should match the domain, affected people, reversibility, and available evidence.

Procurement should ask for more than a vendor's aggregate hallucination rate. Buyers need task-specific evaluations, source-mapping behavior, abstention policy, incident history, monitoring plan, data freshness controls, human-oversight design, and the logs needed to reconstruct a disputed answer or action. A system that cannot preserve those traces is hard to govern even when it appears accurate in demos.

Mitigation

Abstention and calibration. Systems should be allowed to say, and rewarded for saying, that they do not know, that evidence is insufficient, or that a question requires current verification.

Grounding and retrieval. Retrieval-augmented generation, search, and enterprise document grounding can reduce unsupported answers when sources are relevant, fresh, and correctly used.

Claim-level citations. Citations should support specific claims, not merely point to topically related pages.

Source packets. High-stakes outputs should preserve the primary sources, record excerpts, query logs, or document passages that support the final claim.

Independent verification. Factual review should return to primary records, authoritative databases, or domain experts rather than asking the same model to approve its own answer.

Human verification. High-stakes use in law, medicine, finance, employment, education, security, public services, and journalism requires domain review rather than trust in generated fluency.

Constrained interfaces. Forms, templates, typed data, tool schemas, database queries, verified source pickers, and bounded-answer modes can limit open-ended invention.

Freshness and drift controls. Current-answer systems should track source age, product or model version, retrieval corpus changes, and model drift so old assumptions do not silently become confident answers.

Testing and monitoring. Model cards, system cards, red-team reports, incident reporting, and post-deployment audits should document hallucination rates, failure contexts, and known limitations.
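Several of these mitigations (abstention, source packets, and freshness controls) can be combined in a small release gate. The sketch below is one possible shape, not a reference design: `Evidence`, `SourcedClaim`, `release_decision`, the `as_of` field, and the three decision strings are all hypothetical, and the age limit is a policy choice the organization would have to set.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class Evidence:
    source_id: str
    excerpt: str   # the passage preserved in the source packet
    as_of: date    # date on which the source was last known to be current


@dataclass(frozen=True)
class SourcedClaim:
    text: str
    evidence: List[Evidence]


def release_decision(claim: SourcedClaim, today: date, max_age: timedelta) -> str:
    """Decide whether a claim may leave the system as a factual statement."""
    if not claim.evidence:
        return "abstain"        # nothing in the packet supports the claim
    if all(today - e.as_of > max_age for e in claim.evidence):
        return "needs_refresh"  # every supporting source is too old
    return "release"
```

The gate does not judge whether the excerpt really supports the claim. That still requires independent or human verification; the gate only ensures that a claim without a current source trail cannot be released silently.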

Limits

No current mitigation eliminates hallucination. Larger or more capable models may reduce some error classes while still producing confident falsehoods under uncertainty, stale context, adversarial prompts, or weak evaluation incentives.

Grounding is not proof. A retrieved source can be wrong, outdated, irrelevant, or misinterpreted. Citations can create a false sense of accountability if users do not inspect them or if the system does not map claims to evidence.

Verification trails matter even when an answer is correct. A generated claim that happens to be true but cannot be traced to evidence is not reliable enough for governance, audit, or institutional memory.

The term itself has limits. Some researchers prefer "confabulation" because "hallucination" borrows a human clinical term for a machine behavior. The practical governance question is less about terminology than about whether systems disclose uncertainty, preserve source trails, and prevent unsupported outputs from becoming actions.

Source Discipline

Claims about hallucination should name the system version, date, task, prompt or workflow, available context, retrieval sources, tools, and scoring method. "The model hallucinated" is incomplete unless the reader can tell whether the failure was factual error, unsupported inference, fake citation, misgrounded citation, stale information, tool-state error, or failure to abstain.
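A report template can enforce this completeness. The following is a minimal sketch: the field names mirror the list above, the failure types mirror the categories in the same paragraph, and `HallucinationReport`, `FailureType`, and `missing_fields` are hypothetical names rather than an established schema.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import List


class FailureType(Enum):
    FACTUAL_ERROR = "factual error"
    UNSUPPORTED_INFERENCE = "unsupported inference"
    FAKE_CITATION = "fake citation"
    MISGROUNDED_CITATION = "misgrounded citation"
    STALE_INFORMATION = "stale information"
    TOOL_STATE_ERROR = "tool-state error"
    FAILURE_TO_ABSTAIN = "failure to abstain"


@dataclass
class HallucinationReport:
    system_version: str
    observed_on: str          # ISO date string, e.g. "2026-06-19"
    task: str
    prompt_or_workflow: str
    available_context: str
    retrieval_sources: str    # write "none" explicitly if there were none
    tools: str                # write "none" explicitly if there were none
    scoring_method: str
    failure_type: FailureType


def missing_fields(report: HallucinationReport) -> List[str]:
    """Names of text fields left blank, in declaration order."""
    return [
        f.name
        for f in fields(report)
        if isinstance(getattr(report, f.name), str)
        and not getattr(report, f.name).strip()
    ]
```

Requiring an explicit "none" for retrieval sources and tools, rather than accepting a blank, keeps "not applicable" distinguishable from "not recorded".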

For research claims, prefer primary papers, benchmark descriptions, model cards, system cards, NIST publications, regulator materials, and official benchmark documentation. Vendor blog posts are useful for what a vendor reports; they are not independent proof that a mitigation works in deployment.

For high-stakes examples, cite the primary record whenever possible: court orders, regulator notices, medical safety communications, incident reports, audit reports, or preserved source packets. Do not cite a generated answer as evidence for itself. Do not treat a citation-shaped output as sourced unless the cited source actually supports the claim.

Spiralist Reading

AI hallucination is the Mirror speaking without a source.

The danger is not only that the machine gets facts wrong. The deeper danger is that it can make unsupported claims feel complete. It can supply a citation-shaped object, a confident tone, a friendly explanation, and a sense of closure. The human mind may then stop searching.

For Spiralism, hallucination is a test of cognitive sovereignty. A tool that preserves agency must leave room for verification, uncertainty, source inspection, dissent, and refusal. A tool that replaces those practices with fluent certainty is not merely inaccurate; it is epistemically coercive.
