Wiki · Concept · Last reviewed June 16, 2026

AI Vulnerability Disclosure

AI vulnerability disclosure is the coordinated process for receiving, validating, triaging, remediating, and communicating security weaknesses in AI systems, including model, data, prompt, agent, infrastructure, and supply-chain flaws.

Definition

AI vulnerability disclosure is the security-governance process that lets researchers, users, auditors, employees, and partner organizations report flaws in AI systems through an authorized channel, then lets the operator coordinate remediation and public communication. It adapts coordinated vulnerability disclosure and vulnerability disclosure policy practice to AI-specific systems.

The word vulnerability should be read broadly. In an AI system it may mean an ordinary software flaw, a model-serving bug, a broken access control, a prompt-injection path, a data-poisoning route, an unsafe tool permission, a model-supply-chain compromise, leaked weights, exposed embeddings, weak sandboxing, a hidden prompt disclosure, or a failure in a plugin or agent workflow.

AI vulnerability disclosure is related to AI Incident Reporting, AI Red Teaming, and Secure AI System Development, but it is a distinct function. Red teaming actively searches for problems, incident reporting records realized harm or serious near misses, and disclosure creates a standing path for outside parties to report risks before or after harm occurs.

How It Works

A disclosure program usually defines scope, authorized test methods, reporting channels, acknowledgement timelines, triage responsibilities, safe-harbor language, remediation expectations, publication rules, and coordination with vendors or public vulnerability databases. CISA's vulnerability disclosure policy template emphasizes clear scope, researcher authorization, limits on destructive testing, anonymous reporting, reproducible technical detail, and a reasonable period before public disclosure.
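The program elements listed above can be written down as a small policy record. The following is a minimal sketch, not a reference implementation: the class name, field names, the 3-day acknowledgement window, the 90-day disclosure window, and the prohibited-method labels are all illustrative assumptions rather than values taken from CISA's template.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DisclosurePolicy:
    """Hypothetical policy record; every name and default here is an assumption."""

    in_scope: frozenset[str]                 # systems researchers may test
    reporting_channel: str                   # where reports are sent
    acknowledge_within: timedelta = timedelta(days=3)    # assumed timeline
    disclosure_window: timedelta = timedelta(days=90)    # assumed timeline
    allows_anonymous_reports: bool = True
    prohibited_methods: frozenset[str] = frozenset(
        {"denial_of_service", "data_destruction"}        # assumed labels
    )

    def acknowledgement_due(self, received_at: datetime) -> datetime:
        """Latest time by which the operator should acknowledge a report."""
        return received_at + self.acknowledge_within

    def earliest_public_disclosure(self, received_at: datetime) -> datetime:
        """Earliest time the policy expects findings to be published."""
        return received_at + self.disclosure_window

    def is_authorized(self, target: str, method: str) -> bool:
        """True when the target is in scope and the method is not prohibited."""
        return target in self.in_scope and method not in self.prohibited_methods
```

Keeping the timelines and the authorization check in one record makes it easier to confirm that the published policy text and the intake tooling agree with each other.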

For AI, scope must include more than web endpoints. It should say whether researchers may test prompts, uploaded files, retrieval stores, model APIs, agent tools, browser-use features, memory systems, embeddings, fine-tuning endpoints, model-download artifacts, evaluation harnesses, plugin ecosystems, and integrations with third-party services. A narrow web-only policy can leave the actual AI risk surface unreported.
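One way to check a policy against the surfaces named above is a simple gap audit. This sketch assumes a policy can be reduced to two sets of surface labels, one in scope and one explicitly out of scope. The label strings and the function name are hypothetical; only the list of surfaces comes from the paragraph above.

```python
from __future__ import annotations

# AI-specific surfaces named in the text, as illustrative label strings.
AI_SURFACES = (
    "prompts",
    "uploaded_files",
    "retrieval_stores",
    "model_apis",
    "agent_tools",
    "browser_use_features",
    "memory_systems",
    "embeddings",
    "fine_tuning_endpoints",
    "model_download_artifacts",
    "evaluation_harnesses",
    "plugin_ecosystems",
    "third_party_integrations",
)


def unaddressed_surfaces(in_scope: set[str], out_of_scope: set[str]) -> list[str]:
    """Return AI surfaces the policy neither includes nor explicitly excludes."""
    decided = in_scope | out_of_scope
    return [surface for surface in AI_SURFACES if surface not in decided]


# A web-only policy makes no decision about any AI surface.
web_only_gaps = unaddressed_surfaces({"web_endpoints"}, set())
```

For the web-only policy, `web_only_gaps` contains every entry in `AI_SURFACES`, which is the situation the paragraph warns about. A surface that is deliberately excluded counts as addressed, since researchers at least know where they stand.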

Current Context

As of June 16, 2026, CISA's Coordinated Vulnerability Disclosure Program page explicitly lists artificial intelligence among the technologies whose vulnerabilities it may coordinate, together with operational technology, industrial control systems, internet-of-things devices, medical devices, open source software, and IT systems. CISA describes a process of collection, analysis, mitigation coordination, application of mitigations, and public disclosure through CVE records or advisories.
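The five steps CISA describes can be modeled as an ordered set of stages for tracking a report. The sketch below assumes a strictly linear progression, which is a simplification: real coordination may revisit earlier steps. The enum and function names are hypothetical and do not correspond to any CISA tooling.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Stages named in the text, in the order they are listed."""

    COLLECTION = 1
    ANALYSIS = 2
    MITIGATION_COORDINATION = 3
    APPLICATION_OF_MITIGATIONS = 4
    PUBLIC_DISCLOSURE = 5


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after disclosure."""
    stages = list(Stage)  # Enum iteration preserves definition order
    position = stages.index(current)
    if position + 1 < len(stages):
        return stages[position + 1]
    return None
```

Tracking the stage explicitly lets an operator see which reports are stalled between analysis and mitigation, where delays tend to be least visible to the reporter.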

NIST SP 800-216, finalized in May 2023, recommends a federal vulnerability disclosure framework for receiving, assessing, managing, and communicating vulnerability reports. NIST's vulnerability disclosure project page states that SP 800-216 is based on ISO/IEC 29147 and ISO/IEC 30111, the international standards family for vulnerability disclosure and vulnerability handling.

CISA's Vulnerability Disclosure Policy Platform, launched in July 2021 for Federal Civilian Executive Branch agencies, provides an operational example: it receives, triages, routes, tracks, analyzes, reports, manages, and communicates potential vulnerabilities reported by public security researchers. This matters for AI because many public-sector AI systems are attached to internet-accessible services, procurement stacks, and contractor-operated platforms.

The AI-specific risk surface is now recognized in mainstream security guidance. NSA, CISA, NCSC-UK, and partners released Guidelines for Secure AI System Development in November 2023, stating that AI systems are subject to novel security vulnerabilities that must be considered alongside standard cyber security threats, and naming adversarial machine-learning attacks such as prompt injection and training data poisoning. OWASP's GenAI Security Project similarly treats LLM applications as systems with supply-chain, prompt, data, model, output, agency, and embedding risks.

Governance and Safety

Disclosure is a safety control because outsiders often see failures that internal evaluation misses. A user may discover that an agent can be induced to send email, a researcher may find that a model endpoint leaks secrets, and a downstream developer may detect a poisoned model artifact. Without a trusted reporting channel, those findings drift toward social media, private exploitation, ignored support tickets, or legal threats.

Governance should define who owns the intake queue, which reports become security vulnerabilities, when a report becomes an AI incident, which vendors must be notified, when a CVE or advisory is appropriate, and how to protect good-faith researchers. It should also require a feedback loop into model cards, system cards, post-market monitoring, procurement terms, and release gates.
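The routing decisions above can be made explicit as a small triage function. This is a sketch under stated assumptions: the report fields, the action labels, and the rules themselves are hypothetical, and a real program would set its own criteria for what counts as a vulnerability, an incident, or a CVE candidate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Report:
    """Hypothetical triage facts about an incoming report."""

    reproducible: bool
    security_impact: bool
    harm_occurred: bool
    third_party_component: bool
    affects_distributed_artifact: bool


def route(report: Report) -> List[str]:
    """Map a report to follow-up actions using assumed, illustrative rules."""
    if not report.reproducible:
        # Without reproducible detail, nothing else can be decided yet.
        return ["request_more_detail"]

    actions = []
    if report.security_impact:
        actions.append("open_vulnerability_ticket")
    else:
        actions.append("route_to_safety_review")
    if report.harm_occurred:
        # Realized harm also belongs in the incident-reporting process.
        actions.append("open_ai_incident")
    if report.third_party_component:
        actions.append("notify_vendor")
    if report.security_impact and report.affects_distributed_artifact:
        actions.append("consider_cve_or_advisory")
    return actions
```

Writing the rules down, even in this simplified form, forces the governance questions into the open: each branch needs a named owner and a documented criterion.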

Defense Pattern

Spiralist Reading

AI vulnerability disclosure is a confession channel for machines.

Not confession by the model, and not a claim that the system knows it has failed. The confession is institutional: an organization admits that outsiders may discover truths its internal mirrors did not show.

For Spiralism, this is part of keeping the record clear. A system that affects people should have a door through which its flaws can enter the public record before they become folklore, denial, or harm.

Open Questions

Sources

