Wiki · Concept · Last reviewed June 25, 2026

OWASP Top 10 for LLM Applications

The OWASP Top 10 for LLM Applications is a security-awareness reference for applications built on large language models and on the retrieval, prompts, tools, data pipelines, and generated outputs that surround them.

Definition

The OWASP Top 10 for Large Language Model Applications is an OWASP GenAI Security Project awareness document for developers, data scientists, and security professionals building applications and plug-ins that use LLM technologies. OWASP's project repository says the list represents a broad consensus about critical security risks to LLM applications and is scoped to LLM application security.

It is not a law, certification, or guarantee of safety. It is a vocabulary and review scaffold. It helps teams discuss the recurring places where LLM-backed systems fail: instruction boundaries, secrets, dependencies, training and retrieval data, output handling, delegated actions, system prompts, vector stores, factual reliability, and resource consumption.

How It Works

The OWASP GenAI Security Project's 2025 list names ten categories: LLM01 Prompt Injection; LLM02 Sensitive Information Disclosure; LLM03 Supply Chain; LLM04 Data and Model Poisoning; LLM05 Improper Output Handling; LLM06 Excessive Agency; LLM07 System Prompt Leakage; LLM08 Vector and Embedding Weaknesses; LLM09 Misinformation; and LLM10 Unbounded Consumption.
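The ten 2025 categories can be kept as a small lookup table so that review notes refer to them by identifier and year. The following is a minimal sketch; the names LLM_TOP10_2025 and source_label are illustrative and are not part of any OWASP tooling.

```python
# Category identifiers and names as given in the 2025 list above.
LLM_TOP10_2025 = {
    "LLM01": "Prompt Injection",
    "LLM02": "Sensitive Information Disclosure",
    "LLM03": "Supply Chain",
    "LLM04": "Data and Model Poisoning",
    "LLM05": "Improper Output Handling",
    "LLM06": "Excessive Agency",
    "LLM07": "System Prompt Leakage",
    "LLM08": "Vector and Embedding Weaknesses",
    "LLM09": "Misinformation",
    "LLM10": "Unbounded Consumption",
}


def source_label(category_id: str) -> str:
    """Return a year-qualified label for a category (hypothetical helper)."""
    key = category_id.upper()
    if key not in LLM_TOP10_2025:
        raise KeyError(f"unknown 2025 category id: {category_id!r}")
    return f"OWASP LLM Top 10 (2025) {key}: {LLM_TOP10_2025[key]}"
```

Calling source_label("llm05") yields a label that carries both the year and the category name, which keeps notes from drifting toward names used in earlier versions of the list.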

The categories cover both model behavior and surrounding application design. Prompt injection names cases where user input, retrieved content, or external data alters the model's intended behavior. Sensitive information disclosure covers leakage of private, confidential, or restricted information. Supply-chain risk includes models, data, packages, plug-ins, and service dependencies. Data and model poisoning covers manipulation of training data, fine-tuning data, embedding data, or the model itself.

The remaining categories cover how model output is used and how the surrounding system is exposed. Improper output handling describes insufficient validation or sanitization before model output reaches downstream software. Excessive agency addresses systems where the model can take consequential actions with too much permission. System prompt leakage concerns exposure of internal instructions. Vector and embedding weaknesses cover retrieval and similarity-search failure modes. Misinformation covers harmful reliance on false or misleading outputs. Unbounded consumption covers cost, capacity, denial-of-service, and resource abuse patterns.
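Improper output handling is the category that most directly translates into code: model output is treated as untrusted input to whatever consumes it. The sketch below assumes a hypothetical application in which the model is asked to return a JSON object with an "action" and a "note"; the allowlist, the field names, and the 500-character limit are all illustrative assumptions, not OWASP requirements.

```python
import html
import json

# Hypothetical allowlist of actions this application is prepared to handle.
ALLOWED_ACTIONS = {"summarize", "classify", "escalate"}


def parse_model_action(raw_output: str) -> dict:
    """Parse model output as JSON and reject anything outside the expected shape."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model output must be a JSON object")
    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"unexpected action: {action!r}")
    note = data.get("note", "")
    if not isinstance(note, str) or len(note) > 500:
        raise ValueError("note must be a string of at most 500 characters")
    # Return only the validated fields; any extra keys are dropped.
    return {"action": action, "note": note}


def render_note_html(note: str) -> str:
    """Escape model-written text before embedding it in an HTML page."""
    return "<p>" + html.escape(note) + "</p>"
```

Validation and escaping are kept separate on purpose: the first decides whether the output is acceptable at all, while the second depends on the destination. HTML, SQL, and shell consumers each need their own encoding or parameterization.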

Agent Context

The LLM Top 10 is broader than, and different from, the OWASP Top 10 for Agentic Applications. The LLM list applies to chatbots, retrieval-augmented generation, summarizers, coding assistants, enterprise search, classification, and model-backed workflows even when they do not qualify as autonomous agents.

Agentic systems inherit the LLM risks and add more. A tool-using agent can suffer prompt injection, disclose sensitive information, rely on poisoned retrieval, leak a system prompt, or consume excessive resources even when no agent-specific failure, such as compromised inter-agent communication or rogue workflow behavior, is involved. Good review keeps the two OWASP lists adjacent but separate.
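One LLM-list risk that a tool-using agent inherits is excessive agency. A common mitigation is a default-deny gate between the model's requested tool call and its execution. The sketch below is illustrative only: the tool names, the ToolPolicy type, and authorize_tool_call are hypothetical and do not come from the OWASP documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    allowed: bool
    needs_human_approval: bool


# Hypothetical policy table; any tool not listed here is denied.
TOOL_POLICIES = {
    "search_docs": ToolPolicy(allowed=True, needs_human_approval=False),
    "send_email": ToolPolicy(allowed=True, needs_human_approval=True),
    "delete_records": ToolPolicy(allowed=False, needs_human_approval=True),
}


def authorize_tool_call(tool_name: str, human_approved: bool = False) -> bool:
    """Return True only if policy permits the call (default deny)."""
    policy = TOOL_POLICIES.get(tool_name)
    if policy is None or not policy.allowed:
        return False
    if policy.needs_human_approval and not human_approved:
        return False
    return True
```

The decision is made outside the model: the model may request any tool, but only the policy table and a recorded human approval can authorize a consequential action.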

Governance and Safety

A governance program can use the LLM Top 10 as a design-review checklist. For each category, record the system boundary, data sources, model provider, prompts, retrieval stores, output consumers, tool permissions, logging, human review points, abuse controls, and incident owner. The list becomes useful when every category points to an artifact and an accountable team.
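The requirement that every category point to an artifact and an accountable team can be checked mechanically. The following is a minimal sketch under assumed names: CategoryReview and unresolved are hypothetical, and a real review record would carry more of the fields listed above.

```python
from dataclasses import dataclass


@dataclass
class CategoryReview:
    category_id: str    # e.g. "LLM01"
    artifact: str = ""  # link or path to the review evidence
    owner: str = ""     # accountable team


def unresolved(reviews: list[CategoryReview], expected_ids: list[str]) -> list[str]:
    """Return category ids that lack a review, an artifact, or an owner."""
    by_id = {review.category_id: review for review in reviews}
    missing = []
    for category_id in expected_ids:
        review = by_id.get(category_id)
        if review is None or not review.artifact.strip() or not review.owner.strip():
            missing.append(category_id)
    return missing
```

A design review could run this over the ten identifiers LLM01 through LLM10 and treat any non-empty result as an open item.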

Procurement reviews should ask vendors which OWASP LLM categories they test, what evidence they preserve, how they handle prompt-injection reports, whether customer data enters training or logs, how vector indexes are protected, and how resource limits are enforced. Internal deployments should preserve the same evidence for auditors and incident responders.
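For the resource-limit question, it helps to know what a basic control looks like. The sketch below shows a per-user token budget for a fixed accounting window; the class name, the in-memory dictionary, and the manual window reset are simplifying assumptions, and a production deployment would typically use shared storage and time-based windows.

```python
class TokenBudget:
    """Track tokens spent per user within one accounting window."""

    def __init__(self, max_tokens_per_window: int):
        self.max_tokens = max_tokens_per_window
        self.used: dict[str, int] = {}

    def try_spend(self, user_id: str, tokens: int) -> bool:
        """Record the spend and return True, or return False if it would exceed the budget."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        current = self.used.get(user_id, 0)
        if current + tokens > self.max_tokens:
            return False
        self.used[user_id] = current + tokens
        return True

    def reset_window(self) -> None:
        """Clear all counters at the start of a new window."""
        self.used.clear()
```

A rejected request leaves the counter unchanged, so the refusal itself can be logged as evidence that the limit was enforced.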

Defense Pattern

Source Discipline

Claims about the OWASP LLM list should cite the 2025 source page, the specific LLM category page, or the project repository. Category names and numbering differ from earlier versions of the list, so source notes should state both the year and the category label. Do not mix the 2025 LLM list with the 2026 agentic list or with MCP-specific security checklists.

The list is a security taxonomy, not a prediction that every LLM application will fail. Nor is it proof that an LLM application is safe once a team has checked ten boxes. The useful claim is narrower: these are widely recognized risk classes that should be reviewed with local evidence.

Spiralist Reading

Spiralism reads the OWASP LLM list as a map of where language becomes infrastructure. A sentence can become an instruction. A retrieval result can become evidence. An answer can become code, a ticket, an email, or a decision record.

The practical lesson is sobriety. Once language is wired into systems of action, security cannot live only in the model. It has to live in provenance, permissions, boundaries, validation, logs, and the human habit of asking what authority a text has been given.

Open Questions

Sources

