Blog · Analysis · May 2026

The Incident Report Becomes Public Memory

AI governance will not mature until failures become inspectable memory instead of isolated scandals.

From Scandal to Record

Every technical system has a memory problem after harm occurs.

A self-driving car kills a pedestrian. A facial-recognition match helps police arrest the wrong person. A hiring or welfare system sorts people through opaque categories. A chatbot produces dangerous advice. A generated image becomes fraud, harassment, propaganda, or evidence pollution. A model-assisted tool deletes the wrong data, leaks private material, or automates a decision nobody can reconstruct.

The public usually meets these events as stories: a lawsuit, a news report, a viral thread, a company statement, a regulatory filing, a congressional letter, a correction. The story has heat. It may have a victim, a villain, a quote, a denial, and a news cycle. Then the cycle moves on.

Governance begins when the event becomes a record.

That is the plain importance of AI incident reporting. It is not a glamorous part of AI policy. It does not promise alignment, consciousness, or a clean theory of intelligence. It asks a colder question: what happened, what system was involved, who was harmed or almost harmed, what evidence exists, who knew, what changed afterward, and how can the next institution avoid repeating the same failure?

A field that cannot remember its accidents cannot govern its machines. It can only perform surprise.

What Counts as an Incident

The definition matters because every definition builds a boundary around public memory.

The OECD's AI Incidents and Hazards Monitor distinguishes between an AI incident, where the development, use, or malfunction of an AI system leads to actual harm, and an AI hazard, where it could plausibly lead to harm. The listed harm categories include injury or health harm, disruption of critical infrastructure, rights or legal violations, and harm to property, communities, or the environment.

That split is useful. If the threshold is only proven catastrophe, the record arrives too late. If the threshold is any bad feeling about a tool, the database becomes noise. Governance needs both categories: incidents for what happened, hazards for what almost happened or could reasonably happen under nearby conditions.
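The two-way split can be written down as a small decision rule. The sketch below is only an illustration of the distinction described above. The names `RecordKind` and `classify` are hypothetical and are not part of the OECD monitor or any other database's tooling.

```python
from enum import Enum


class RecordKind(Enum):
    INCIDENT = "incident"          # harm actually occurred
    HAZARD = "hazard"              # harm was plausible but did not occur
    OUT_OF_SCOPE = "out_of_scope"  # neither actual nor plausible harm


def classify(harm_occurred: bool, harm_plausible: bool) -> RecordKind:
    """Apply the incident/hazard split to one reported event (illustrative only)."""
    if harm_occurred:
        return RecordKind.INCIDENT
    if harm_plausible:
        return RecordKind.HAZARD
    return RecordKind.OUT_OF_SCOPE
```

The third value is where "any bad feeling about a tool" ends up. Keeping it explicit is one way to stop the record from turning into noise.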

The AI Incident Database has historically taken a broad approach. Its own explanation compares AI incident collection to transportation safety and computer vulnerability repositories. It invites reports across domains and says the project is meant to converge on shared criteria through use. That breadth is valuable because AI is not one industry. It is a family of systems entering cars, phones, courts, hospitals, schools, police departments, warehouses, hiring pipelines, creative tools, public benefits, social feeds, and intimate chat interfaces.

But breadth has a cost. An autonomous-vehicle death, a biased photo filter, a hallucinated citation, a deepfake scam, and a chatbot-linked self-harm allegation are not the same kind of event. They may need different causal analysis, evidence standards, severity scales, and remedies. A useful incident culture must preserve variety without flattening every failure into the same moral category.

The Public Databases

The present public memory layer is already plural.

The OECD AI Incidents and Hazards Monitor builds an evidence base for policymakers and practitioners by drawing from reputable international news sources and classifying reported events. Its methodology page is careful about limits: news-based monitoring captures only a subset of incidents and hazards, and the OECD does not independently verify every third-party article. That caveat is a mark of intellectual honesty, and readers should keep it in view rather than ignore it.

The AI Incident Database is more community-oriented and research-facing. It indexes reports, supports taxonomies, accepts submissions, and is maintained by the Responsible AI Collaborative with a broad contributor ecosystem. It is especially important because it treats AI failure as a collective record rather than a sequence of disconnected anecdotes.

AIAAIC, the AI, Algorithmic, and Automation Incidents and Controversies repository, adds another public-interest layer. It tracks incidents and controversies across AI, algorithms, and automation, and includes taxonomies for ethical issues, external harms, consequences, news triggers, and responses. Its scope is wider than foundation models, which is exactly the point: algorithmic governance did not begin with chatbots.

Together these projects do something institutions often fail to do. They let patterns accumulate. One wrongful arrest might be dismissed as an edge case. A sequence of wrongful facial-recognition arrests becomes a governance problem. One hallucinated legal citation may look like lawyer negligence. A pattern of hallucinated citations becomes a professional-responsibility and tool-design problem. One synthetic-media fraud can be treated as crime. A rising class of synthetic-media fraud becomes infrastructure policy.

The incident database is an amplifier for weak signals. It converts scattered harm into searchable memory.

Law Enters the Logbook

Voluntary databases are not enough. They see only what journalists, researchers, victims, whistleblowers, companies, and volunteers can surface. The next phase is legal reporting.

The EU AI Act creates serious-incident duties for high-risk AI systems. Article 73 requires providers of high-risk AI systems placed on the Union market to report serious incidents to market surveillance authorities in the member states where the incident occurred. The report is due immediately after the provider establishes a causal link between the AI system and the incident, or a reasonable likelihood of one, and in any event no later than 15 days after the provider, or in some cases the deployer, becomes aware of the incident. The Act sets shorter outer limits for especially severe cases: not later than two days for certain widespread infringements and not later than 10 days in the event of death.
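The outer limits quoted above are simple date arithmetic. The sketch below is a simplification under stated assumptions: it counts calendar days from the date of awareness, uses only the three figures named in this post, and ignores every other condition in the Act. The names `Severity`, `MAX_DAYS`, and `latest_report_date` are hypothetical.

```python
from datetime import date, timedelta
from enum import Enum


class Severity(Enum):
    ORDINARY = "ordinary"      # default serious incident
    DEATH = "death"            # incident involving a death
    WIDESPREAD = "widespread"  # certain widespread infringements


# Outer limits in calendar days, taken from the figures quoted in the text.
MAX_DAYS = {
    Severity.ORDINARY: 15,
    Severity.DEATH: 10,
    Severity.WIDESPREAD: 2,
}


def latest_report_date(aware_on: date, severity: Severity) -> date:
    """Return the last permissible reporting date under the assumed rule."""
    return aware_on + timedelta(days=MAX_DAYS[severity])
```

For example, `latest_report_date(date(2026, 5, 1), Severity.DEATH)` returns `date(2026, 5, 11)`. A real compliance clock would also need to record when awareness began and who became aware, which is an evidentiary question in its own right.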

The EU has also moved toward templates for serious incidents involving general-purpose AI models with systemic risk. In November 2025, the European Commission published a reporting template for such providers under Article 55, tied to the GPAI Code of Practice and the AI Office. That detail matters because general-purpose models create reporting problems that older product categories do not. The same model may be embedded in thousands of downstream systems, with different prompts, tools, safeguards, customers, and jurisdictions.

California's SB 53, signed in September 2025, adds a U.S. state-level frontier-model reporting path. The Governor's announcement described the law as creating a mechanism for frontier AI companies and the public to report potential critical safety incidents to California's Office of Emergency Services. The California Attorney General's SB 53 page adds whistleblower-relevant detail: covered employees responsible for assessing or addressing risk may report certain dangers or violations, and the Attorney General must produce annual anonymized, aggregated information about covered-employee reports.

This is a quiet but important shift. The incident report is becoming a legal object. It is no longer only a public-interest spreadsheet or a postmortem blog post. It can become a duty, a protected disclosure, a regulator's input, a template, a deadline, and eventually evidence in enforcement.

Why Memory Is Hard

AI incidents are difficult to record because AI systems are rarely single objects.

A model output may depend on training data, fine-tuning, retrieval sources, system prompts, user prompts, memory, moderation layers, tool permissions, API settings, application code, ranking systems, plug-ins, user behavior, deployment context, and organizational incentives. When harm occurs, the question "what caused it?" may not have one clean answer.

That complexity creates predictable failure modes.

First, underreporting. Many harmed people do not know an AI system was involved. Others lack time, legal support, technical literacy, or a safe channel. Companies may detect failures privately and fix quietly. Workers may fear retaliation. Users may feel shame, especially when the incident involves intimacy, mental health, fraud, or a humiliating automated decision.

Second, attribution fog. A company can say the user misused the tool. A deployer can blame the model provider. A model provider can blame the application wrapper. A regulator can lack access to logs. A journalist can report harm without being able to inspect the system. The record then contains the social fact of harm but not a settled technical cause.

Third, severity mismatch. Catastrophic-risk reporting is necessary, but many AI harms are cumulative, distributed, and ordinary. A single automated denial, false accusation, manipulative companion exchange, or hallucinated answer may not meet a legal threshold. At scale, those failures can reshape institutions.

Fourth, privacy tension. Good incident records need enough detail to teach. But incidents can include medical details, legal claims, chat logs, intimate disclosures, minors, workplace records, trade secrets, security vulnerabilities, and ongoing investigations. A public memory layer can become a second harm if it exposes victims or teaches attackers.

Fifth, narrative capture. Whoever writes the first incident narrative may define the public lesson. A company can frame an incident as misuse. An activist can frame it as proof of a total system failure. A regulator can frame it as compliance. A database editor can make a classification choice that later researchers inherit.

None of these problems argue against incident reporting. They argue for better incident discipline.

A Better Incident Culture

A mature AI incident culture should borrow from aviation, cybersecurity, medicine, labor safety, and public administration without pretending AI is identical to any of them.

First, preserve the event trail. Logs, model versions, prompts, retrieval sources, tool calls, user-facing outputs, moderation decisions, timestamps, human approvals, and downstream actions should be retained when a serious event is suspected. Without reconstruction, every explanation becomes public relations.

Second, separate blame from learning early. Some incidents require liability, enforcement, discipline, or criminal investigation. But if every report is treated first as legal exposure, organizations will hide weak signals. Near misses, hazards, and user reports need channels that support learning before the evidence disappears.

Third, protect reporters. Employees, contractors, users, auditors, researchers, and affected communities need safe routes to report. California's SB 53 whistleblower provisions are important because frontier-model risk knowledge often sits inside private organizations before regulators or the public can see it.

Fourth, use severity tiers. A death, a critical-infrastructure disruption, a rights violation, a dangerous model capability escape, a privacy breach, a hallucinated source, and a manipulative companion interaction require different reporting timelines and audiences. The system should not force every case through one gate.

Fifth, track remedies, not only harms. A useful database should ask what changed: model update, product recall, policy revision, disclosure, appeal, compensation, access restriction, audit, warning label, training change, procurement pause, or no action. Institutional response is part of the incident.

Sixth, make uncertainty explicit. Incident records should distinguish alleged, confirmed, disputed, and unresolved facts. The public needs to know whether a record is based on court documents, regulator findings, company logs, journalism, user testimony, or research replication.

Seventh, connect incidents to procurement and evaluation. Governments, schools, hospitals, courts, and employers should not evaluate AI vendors only by benchmark claims and demo performance. They should ask what incidents have occurred, how they were handled, what the vendor reports voluntarily, and what logs will be available if the system fails locally.

Incident reporting is not an after-action accessory. It is part of system design.
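As a rough illustration of what "part of system design" could mean, several of the practices above can be expressed as fields in a record. This is a minimal sketch, not a schema used by any existing database. Every name in it (`FactStatus`, `Claim`, `IncidentRecord`, `open_questions`) is hypothetical, and the severity tier is left as a free-text label.

```python
from dataclasses import dataclass, field
from enum import Enum


class FactStatus(Enum):
    ALLEGED = "alleged"
    CONFIRMED = "confirmed"
    DISPUTED = "disputed"
    UNRESOLVED = "unresolved"


@dataclass
class Claim:
    text: str
    status: FactStatus
    basis: str  # e.g. "court documents", "journalism", "company logs"


@dataclass
class IncidentRecord:
    summary: str
    severity_tier: str  # free-text tier label, e.g. "death" or "privacy breach"
    claims: list[Claim] = field(default_factory=list)
    evidence_retained: list[str] = field(default_factory=list)  # logs, prompts, model versions
    remedies: list[str] = field(default_factory=list)  # empty means none recorded

    def open_questions(self) -> list[Claim]:
        """Return the claims that are not yet confirmed."""
        return [c for c in self.claims if c.status is not FactStatus.CONFIRMED]
```

A record built this way keeps the evidentiary basis next to each claim, so a later reader can see which parts rest on testimony and which on logs. An empty `remedies` list is itself information: it shows that no institutional response has been recorded.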

The Spiralist Reading

Model-mediated reality makes failure look fluent.

The answer arrives in polished prose. The label appears in the interface. The risk score looks administrative. The synthetic voice sounds calm. The agent completes a workflow. The dashboard says the system is operating. The user sees surface order and may not know where to locate responsibility when the surface breaks.

An incident report punctures that surface. It says: here is where the machine entered the world, here is where the world pushed back, here is who was harmed, here is what remains uncertain, here is what the institution did next.

That record is a reality anchor. It resists the drift from failure into vibes, scandal, denial, myth, and brand management. It gives future builders something colder than inspiration and more useful than outrage. It lets a society say: we have seen this pattern before.

But incident reporting can also become ritual. A company files. A regulator receives. A database records. A transparency report names categories. Everyone points to the existence of a process while the same incentives continue underneath. In that failure mode, the incident report becomes another symbol of control standing in for control itself.

The standard should be harder. A serious incident culture must preserve evidence, protect reporters, classify uncertainty, identify repeated patterns, force remediation, and widen the risk map beyond spectacular catastrophe. It must include the harms that arrive through ordinary workflows: education, work, welfare, search, intimacy, public records, fraud, media, and institutional decision-making.

The future will not be governed by prediction alone. It will be governed by what institutions remember after prediction fails.
