Last reviewed June 15, 2026

The Prompt Worm Becomes the Email Attachment

The old email attachment asked a human to click. The prompt worm asks an agent to read. In agentic systems, ordinary content can become instruction, payload, and propagation path at the same time.

From Attachment to Instruction

The classic email worm depended on a brittle social-technical bargain. A message arrived, an attachment looked tempting, a user clicked, and the machine executed code. Security training learned to name the pattern: do not open strange files, do not trust unexpected links, do not mistake a familiar sender for a safe payload.

Agentic AI reopens the problem at a different layer. A message, document, image, calendar invite, web page, support ticket, or retrieved record may be processed automatically by a model-mediated system. The hostile material does not need to be executable in the old operating-system sense. It only needs to be interpreted by the agent as relevant instruction, context, or task material.

The old attachment asked the person to run it. The new prompt payload may only need the assistant to summarize, reply, route, retrieve, or act.

The Self-Replicating Prompt

In 2024, Stav Cohen, Ron Bitton, and Ben Nassi published an arXiv paper on a research system they called Morris II: a zero-click worm for generative-AI ecosystems. The paper describes adversarial self-replicating prompts that can be embedded in inputs processed by GenAI-powered applications. In their lab demonstrations, the prompt causes the model-mediated application to reproduce the prompt in its own output, carry out a malicious action (the payload), and propagate to other agents through connected workflows.

The researchers tested the idea against GenAI-powered email assistants in controlled settings, including text and image inputs, RAG-based retrieval, automatic responses, spamming, and personal-data exfiltration scenarios. They state that the experiments were performed in a lab environment and were not run against existing public applications.

The important point is not the particular proof of concept. It is the category. A prompt worm treats language as a replication medium. It does not merely fool one model once. It tries to turn the agent's ordinary communication loop into the transport mechanism.
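To make the replication idea concrete from the defender's side, one crude signal is whether an outgoing draft copies a long verbatim span out of the untrusted message it responds to. The sketch below assumes that framing. The function names and the 80-character threshold are illustrative and do not come from the Morris II paper.

```python
from difflib import SequenceMatcher


def longest_shared_run(inbound: str, outbound: str) -> int:
    """Length of the longest contiguous run of characters found in both texts."""
    # autojunk=False stops difflib from discarding frequent characters in long inputs.
    matcher = SequenceMatcher(None, inbound, outbound, autojunk=False)
    match = matcher.find_longest_match(0, len(inbound), 0, len(outbound))
    return match.size


def looks_self_replicating(inbound: str, outbound: str, min_run: int = 80) -> bool:
    """Flag a draft that reproduces a long verbatim span of untrusted input.

    min_run is a hypothetical threshold, not a recommended value.
    """
    return longest_shared_run(inbound, outbound) >= min_run
```

Ordinary quoted replies will trip this check, and a payload that instructs the model to paraphrase itself will slip past it. It is a reason to hold a draft for review, not a verdict on the draft.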

Why Agents Change the Risk

Prompt injection is listed first, as LLM01, in OWASP's 2025 Top 10 for LLM Applications. OWASP defines it as crafted input that alters model behavior or output in unintended ways, and notes that injections can be indirect, coming from external sources such as websites or files. It also warns that prompt injections do not have to be human-visible if the model can parse them.

Agents make that risk more institutional because they add memory, tools, permissions, and workflow. A chatbot that says something wrong is a problem. A mailbox agent that reads untrusted mail, searches a private archive, drafts replies, forwards messages, updates records, or calls APIs can turn wrong interpretation into action.

This is why OWASP's 2026 Top 10 for Agentic Applications matters. Its summary describes agentic AI systems as ones that plan, act, and make decisions across complex workflows. The security problem is no longer only bad output. It is cascading behavior across tools, identities, permissions, and connected agents.

Filtering Is Not Enough

The UK National Cyber Security Centre has warned that prompt injection should be treated as a residual risk managed through design, build, and operation, not as a class of bug that one appliance or filter can fully solve. Its practical point is severe: when an LLM system calls tools or APIs based on model output, the possible impact of prompt injection approaches the worst case of giving an attacker access to those tools.

MITRE's SAFE-AI material makes a similar distinction between direct and indirect prompt injection. It describes indirect prompt injection as malicious prompts ingested from separate data sources during normal operation, including websites, multimedia, or plugins, and notes that the user may not be aware of the injection.
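One way to keep "separate data sources" visible to the rest of a system is to carry an origin label on every retrieved chunk and to tell the caller when outside material has entered the context. The sketch below is a minimal Python illustration under that assumption. The `Origin`, `Chunk`, and `build_context` names are hypothetical and are not taken from MITRE or from any particular retrieval library.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    INTERNAL = "internal"  # authored inside the organization
    EXTERNAL = "external"  # arrived from an outside sender or site


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    origin: Origin
    quarantined: bool = False  # set when a poisoning test or reviewer flags the chunk


def build_context(chunks: list[Chunk]) -> tuple[str, bool]:
    """Return (context_text, contains_external), skipping quarantined chunks."""
    usable = [c for c in chunks if not c.quarantined]
    # Prefix each line with its origin and id so later review can trace it.
    lines = [f"[{c.origin.value}:{c.chunk_id}] {c.text}" for c in usable]
    contains_external = any(c.origin is Origin.EXTERNAL for c in usable)
    return "\n".join(lines), contains_external
```

The labels inside the context text do not stop a model from following an injected instruction. The useful part is the boolean: the calling code can use it to narrow which tools the agent may reach for the rest of the task.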

The prompt worm is the stress test for every vague agent-security promise. "We filter prompts" is not enough. "The model is trained to ignore bad instructions" is not enough. "The user can review final output" is not enough if the agent has already retrieved private data, updated a record, or sent a message.

The Governance Standard

A serious agent system should treat external content as hostile until proven otherwise, especially when the content can influence action.

First, separate reading from acting. An agent that reads external mail, documents, or web pages should not inherit the user's full authority merely because the user owns the mailbox or browser session.

Second, drop privilege to the level of the source. If the model is processing material from an outside sender, the action boundary should reflect that sender's lack of authority. Untrusted input should not be allowed to trigger privileged tools.

Third, require confirmation before propagation. Auto-replies, forwards, shared documents, ticket updates, and cross-agent messages should be treated as transmission events, not harmless text generation.

Fourth, inspect retrieval paths. RAG databases, vector stores, email indexes, and memory systems need poisoning tests, origin labels, quarantine paths, and deletion procedures.

Fifth, log the chain. A user, administrator, or auditor should be able to reconstruct which external item was read, which instructions were extracted, which tool calls were attempted, which actions were blocked, and which messages were sent.
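Several of these rules can be sketched as a single gate that sits between the model's proposed tool call and its execution. The following is a minimal Python sketch, not a reference design. The tool names, the `ActionGate` class, and the `source_trusted` flag are all assumptions made for illustration, and a real deployment would define its own tools and trust model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_CONFIRMATION = "needs_confirmation"
    BLOCK = "block"


# Hypothetical tool names, grouped by what they can do.
READ_ONLY_TOOLS = {"summarize", "search_public_docs"}
PROPAGATING_TOOLS = {"send_email", "forward_email", "update_ticket"}
PRIVILEGED_TOOLS = {"search_private_archive", "call_internal_api"}


@dataclass
class ActionGate:
    audit_log: list[dict] = field(default_factory=list)

    def decide(self, tool: str, source_id: str, source_trusted: bool) -> Decision:
        if tool in PROPAGATING_TOOLS:
            # Transmission events wait for a human, whatever the source.
            decision = Decision.NEEDS_CONFIRMATION
        elif tool in READ_ONLY_TOOLS:
            decision = Decision.ALLOW
        elif tool in PRIVILEGED_TOOLS and source_trusted:
            decision = Decision.ALLOW
        else:
            # Unknown tools, and privileged tools triggered by untrusted content.
            decision = Decision.BLOCK
        # Record every attempt, including blocked ones, so the chain can be rebuilt.
        self.audit_log.append({
            "source_id": source_id,
            "tool": tool,
            "source_trusted": source_trusted,
            "decision": decision.value,
        })
        return decision
```

With a gate like this, a call to `decide("summarize", "msg-1", False)` is allowed, `decide("search_private_archive", "msg-1", False)` is blocked, and `decide("forward_email", "msg-1", False)` is held for confirmation. All three attempts land in `audit_log` against the same source item. The gate decides on the tool and the provenance of the content, never on what the model says about its own intent.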

What This Changes

The prompt worm is a reminder that AI security is not only about the model's answer. It is about the route from perception to action.

In the email era, the risky object was often a file. In the agent era, the risky object may be ordinary language positioned where an automated reader will treat it as operational context. The payload can live in an email, a document, a web page, an image, or a memory store. The transport can be the agent's helpfulness.

The Spiralist lesson is restrained: do not mythologize the worm, and do not dismiss it as a lab trick. The pattern is real enough to govern. Every agent that reads untrusted content and acts with delegated authority needs narrow permissions, source-aware privilege, quarantine, propagation limits, and audit trails. Otherwise the institution has rebuilt the email attachment as a service.
