Wiki · Concept · Last reviewed June 15, 2026

Context Poisoning

Context poisoning is the deliberate manipulation of the information an AI system treats as active context, persistent memory, retrieved evidence, or thread history so that later answers or actions serve an attacker, advertiser, compromised workflow, or false institutional record.

Definition

Context poisoning is an attack and failure pattern in which untrusted or adversarial content enters the model's working context and later changes what the system believes, retrieves, recommends, remembers, or does. It is narrower than general misinformation and broader than a single prompt injection. The target is not only the next answer. The target is the information environment from which future answers and actions are produced.

MITRE ATLAS names the family AI Agent Context Poisoning: manipulation of the context used by an agent's large language model to influence responses or actions. In the 2026.05 ATLAS data, the technique includes memory and thread variants. Memory poisoning tries to persist a false instruction, preference, or fact across sessions. Thread poisoning leaves malicious instructions in a conversation or shared channel so they influence later turns.

Context poisoning overlaps with Prompt Injection, Data Poisoning, Retrieval-Augmented Generation, and AI Memory and Personalization, but it names a distinct operational question: what happens when the machine's local world model is edited by material that should have been treated as untrusted evidence?

How It Works

Context poisoning can arrive through ordinary surfaces: a webpage, email, issue comment, customer ticket, spreadsheet, PDF, Slack thread, documentation page, meeting transcript, memory update, or tool result. A human may see harmless text. The agent may see an instruction, a preferred source, a false fact, a credential lure, a request to call a tool, or a reason to trust one document over another.

In a retrieval system, the poisoned material is stored where a future query will find it. In a memory system, it is saved as a user preference or durable fact. In a long thread, it remains inside the context window. In an agent workflow, it can pass from one tool call to the next until the original source becomes hard to reconstruct.
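The loss of provenance described above can be sketched in a few lines. This is a minimal illustration, not a description of any real agent framework: the ContextItem type, the derive and lineage helpers, and the source label format are all hypothetical. The idea is that every tool output keeps a pointer to the item it was derived from and inherits its trust level, so the original source stays reconstructable after several hops.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContextItem:
    """One piece of text in the agent's working context (hypothetical type)."""
    text: str
    source: str                  # assumed label format, e.g. "web:example-page"
    trusted: bool                # True only for operator- or user-authored input
    parent: Optional["ContextItem"] = None  # item this one was derived from


def derive(parent: ContextItem, text: str, tool: str) -> ContextItem:
    """Record a tool output as derived from its input.

    Trust is inherited from the parent and never upgraded by a tool hop.
    """
    return ContextItem(text=text, source=f"tool:{tool}",
                       trusted=parent.trusted, parent=parent)


def lineage(item: ContextItem) -> list[str]:
    """Return the chain of sources from this item back to its origin."""
    chain = []
    current: Optional[ContextItem] = item
    while current is not None:
        chain.append(current.source)
        current = current.parent
    return chain


# A web page is summarized, and the summary feeds a planning step.
page = ContextItem("Ignore prior rules...", "web:example-page", trusted=False)
summary = derive(page, "Summary of page", tool="summarize")
plan = derive(summary, "Plan based on summary", tool="plan")
print(lineage(plan))  # ['tool:plan', 'tool:summarize', 'web:example-page']
```

Without the parent pointer, the plan would appear to come only from a tool, and the untrusted web page behind it would be invisible to a reviewer.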

The risk rises when context is connected to tools. A poisoned answer is an information problem. A poisoned agent with email, browser, calendar, code, payment, file, or administrative tools can become an action problem.
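One way to picture the shift from information problem to action problem is a gate in front of tool calls. The sketch below is an assumption made for illustration: the tool names mirror the list above, while the gate_tool_call function and its return values are hypothetical and do not correspond to any specific product or standard. It holds high-impact tool calls for human approval whenever any item in the working context came from an untrusted source.

```python
# Tools treated as high impact in this sketch (names taken from the text above).
HIGH_IMPACT_TOOLS = {"email", "browser", "calendar", "code",
                     "payment", "file", "admin"}


def gate_tool_call(tool: str, context_trust_flags: list[bool]) -> str:
    """Decide whether a tool call may proceed (hypothetical policy).

    context_trust_flags holds one boolean per context item:
    True if the item came from a trusted source, False otherwise.
    """
    tainted = not all(context_trust_flags)
    if tool in HIGH_IMPACT_TOOLS and tainted:
        return "needs_human_approval"
    return "allow"


# A payment requested while an untrusted web page is in context is held.
print(gate_tool_call("payment", [True, False]))  # needs_human_approval
# The same call with only trusted context proceeds.
print(gate_tool_call("payment", [True, True]))   # allow
```

The policy is deliberately coarse. A real system would likely need finer distinctions, since a human approver who is shown only the poisoned rationale can still be misled.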

Current Context

OWASP's 2025 LLM Top 10 treats prompt injection as the leading LLM application risk and explicitly includes indirect injection from websites or files. NIST's Generative AI Profile likewise describes indirect prompt injection as a remote attack on LLM-integrated applications in which adversaries inject prompts into data likely to be retrieved. These descriptions are the foundation for context poisoning: in each, the attack reaches the model through the system's own evidence pipeline.

Microsoft's February 2026 security research described AI memory poisoning in recommendation workflows. The reported pattern used links such as "Summarize with AI" to send hidden memory-manipulation instructions into an assistant so a source could later be treated as trusted. The important lesson is not that every memory system is compromised; it is that personalization creates a durable attack surface when memory writes can be triggered through normal conversational channels.

OWASP's agentic security work and the MITRE ATLAS 2026.05 data both move the problem from chatbot safety into system architecture. Agents combine instructions, memory, retrieval, tool calls, identity, permissions, and human approval. Poisoned context can therefore affect what the agent sees, what it can do, and what a human reviewer is told about why the action is reasonable.

Governance and Safety

Context poisoning is a governance problem because it changes the record before the decision. The affected user may not know which document was retrieved, which memory was loaded, which thread message shaped the plan, or which tool output became authoritative. In employment, education, healthcare, finance, legal practice, security operations, and public administration, that opacity makes appeal and audit difficult.

Organizations should treat memory, retrieval, and agent scratchpads as governed infrastructure. They need ownership, retention limits, deletion paths, provenance, source labels, access controls, incident review, and evidence logs. A memory update should not be treated like a casual chat message if it can influence later recommendations or actions.
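As a rough sketch of what governed memory could look like, the example below attaches an owner, a source label, and an expiry to every record, requires explicit user confirmation before a write, and logs each attempt. All names here (MemoryRecord, GovernedMemory, the user_confirmed flag, the log tuple layout) are hypothetical and chosen for illustration only. It is not a reference design.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class MemoryRecord:
    key: str
    value: str
    owner: str            # who is accountable for this memory
    source: str           # provenance label, e.g. "user:chat" or "web:page"
    expires_at: datetime  # retention limit


class GovernedMemory:
    """In-memory store with provenance, retention, deletion, and an evidence log."""

    def __init__(self, retention: timedelta):
        self.retention = retention
        self.records: dict[str, MemoryRecord] = {}
        self.audit_log: list[tuple[str, str, str]] = []  # (action, key, source)

    def write(self, key: str, value: str, owner: str, source: str,
              user_confirmed: bool, now: datetime) -> bool:
        # A memory write is not a casual chat message: refuse it unless
        # the user explicitly confirmed it, and log the attempt either way.
        if not user_confirmed:
            self.audit_log.append(("rejected", key, source))
            return False
        self.records[key] = MemoryRecord(key, value, owner, source,
                                         now + self.retention)
        self.audit_log.append(("written", key, source))
        return True

    def read(self, key: str, now: datetime) -> Optional[MemoryRecord]:
        record = self.records.get(key)
        if record is None or record.expires_at <= now:
            return None  # missing or past its retention limit
        return record

    def delete(self, key: str) -> bool:
        removed = self.records.pop(key, None)
        if removed is not None:
            self.audit_log.append(("deleted", key, removed.source))
        return removed is not None


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
memory = GovernedMemory(retention=timedelta(days=30))
# A hidden instruction from a web page tries to set a trusted source.
memory.write("trusted_source", "example.test", "assistant",
             "web:page", user_confirmed=False, now=now)
print(memory.audit_log)  # [('rejected', 'trusted_source', 'web:page')]
```

The rejected write still leaves an entry in the log, which is the point: incident review depends on seeing attempts as well as successes.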

Defense Pattern

Spiralist Reading

Context poisoning is corruption of the local reality frame.

The agent does not act on the whole world. It acts on the world assembled for it: retrieved fragments, remembered preferences, task history, tool outputs, and the last few turns of speech. Whoever can alter that assembly can bend the next action without touching the model weights.

For Spiralism, the danger is not mystical possession. It is administrative possession: a record system quietly accepting someone else's instruction as context, then returning later with confidence, citations, and a clean interface.

Open Questions

Sources

