Last reviewed June 24, 2026

The Cross-Session Prompt Becomes the Payload

The June 2026 arXiv position paper "What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems," by Yuanbo Xie, Tianyun Liu, Yingjie Zhang, Suchen Liu, Yulin Li, Liya Su, and Tingwen Liu, names a narrow but important agent-security problem: prompt injection can become durable when hostile content is written into persistent system state and retrieved after the original session is gone.

The Attack That Waits

Most prompt-injection talk still imagines an attack that happens inside one visible interaction. A page, file, email, tool response, or user message enters the context window. The model is asked to ignore it, obey it, summarize it, or act through it. The failure is immediate enough to investigate as a bad turn.

The Xie et al. paper shifts the time axis. Its term is cross-session stored prompt injection: adversarial instructions are written during one interaction, persist in memories, filesystems, tool-visible artifacts, or other long-lived context, and later influence another execution. The authors explicitly compare the shape to stored cross-site scripting, not because the mechanics are identical but because the dangerous move in both is persistence. Injection and activation no longer need to occur in the same session.

That makes the topic distinct from ordinary prompt injection, broader context poisoning, and the site's earlier page on model memory as an attack surface. The focus here is the stored-payload lifecycle: a write, a later incorporation into context, and an activation under a new task or user.

What the Paper Tests

The paper, arXiv:2606.04425, was submitted on June 3, 2026 and is labeled as a position paper in arXiv's abstract record. It formalizes stored prompt injection, gives a taxonomy for persistence channels and downstream harms, and presents SPI-Benchmark, a sandboxed benchmark for evaluating the risk across staged agent executions.

The benchmark description is useful because it makes persistence measurable. The authors describe 162 unique cases across e-commerce, travel booking, and financial portfolio management. Their variables include three harm categories: fact manipulation, preference manipulation, and action-scope manipulation. Their persistence channels include working memory, archival memory, and file-backed context such as an AGENTS.md file. Their attack styles distinguish blunt instructions from contextual disguise, where a harmful directive is framed as ordinary business context.

The important measurement split is write, incorporation, and activation. A stored injection has to enter persistent state, be loaded into a later execution context, and then influence the agent's answer or tool-mediated action. That decomposition is more useful than asking only whether a model "fell for" a prompt. It asks whether the system's context lifecycle let untrusted language cross a boundary and return as operational context.
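The three-stage split can be sketched as a small scoring routine. This is a minimal illustration, not the paper's scoring code: the `CaseOutcome` schema, the `stage_rates` function, and the toy outcomes are all assumptions made up for this example, and the numbers are not results from SPI-Benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseOutcome:
    """Outcome of one staged case (hypothetical schema, not the paper's)."""
    written: bool        # payload reached persistent state
    incorporated: bool   # payload was loaded into a later execution context
    activated: bool      # payload changed the later answer or action


def stage_rates(outcomes):
    """Return per-stage conditional rates plus the end-to-end rate."""
    written = [o for o in outcomes if o.written]
    incorporated = [o for o in written if o.incorporated]
    activated = [o for o in incorporated if o.activated]

    def ratio(part, whole):
        # Avoid dividing by zero when no case reached the previous stage.
        return len(part) / len(whole) if whole else 0.0

    return {
        "write": ratio(written, outcomes),
        "incorporation_given_write": ratio(incorporated, written),
        "activation_given_incorporation": ratio(activated, incorporated),
        "end_to_end": ratio(activated, outcomes),
    }


# Toy, invented outcomes for illustration only.
toy = [
    CaseOutcome(True, True, True),
    CaseOutcome(True, True, False),
    CaseOutcome(True, False, False),
    CaseOutcome(False, False, False),
]
rates = stage_rates(toy)
```

Reporting the stages separately shows where a defense is working. A system with a high write rate but a low incorporation rate is failing at a different boundary than one that blocks writes but activates everything it loads.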

Stored State Is the Boundary

The paper's strongest governance lesson is that persistent state must be treated as a security boundary, not a convenience layer. A stateless prompt dies when the window clears. A stored prompt can wait inside a memory, note, retrieved file, workspace artifact, or agent-generated record. The next user may be innocent. The later task may be legitimate. The risky part is that the old instruction can be reintroduced as if it were project context.

This is not the same as a prompt worm, which requires propagation. It is not the same as a prompt cache, which is mainly an inference-performance artifact. It is also narrower than memory poisoning in general. Cross-session stored prompt injection asks one question: what happens when an agent lets untrusted content be written into long-lived state and later treats that state as usable context?

The answer is institutional as much as technical. The memory store, file store, retrieval index, and tool-visible workspace become places where authority is assigned. A sentence that began as attacker-controlled text may later look like a remembered user preference, a product policy, a trip constraint, a financial rule, a coding convention, or a past lesson. The model does not need to be malicious, embodied, or especially advanced for this to matter. It only needs to read old context while holding new authority.

Governance Standard

A governable agent should separate memory write authority from memory read authority and separate both from action authority. Low-trust content may be saved for inspection, but that does not mean it should silently shape a purchase, file edit, portfolio recommendation, email, code change, or support decision later.

The control surface starts at write time. A persistent record should carry source class, writer, session, timestamp, trust level, retention rule, and reason for saving. The system should mark whether the content came from a user, webpage, email, document, tool result, another agent, or model-generated summary. Without that label, later retrieval launders the source.
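One way to picture the write-time label is a record type that cannot be constructed without its provenance. The sketch below is an assumption-laden illustration: `MemoryRecord`, `SourceClass`, `Trust`, and every field name are hypothetical and do not come from the paper or from any particular agent framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class SourceClass(Enum):
    USER = "user"
    WEBPAGE = "webpage"
    EMAIL = "email"
    DOCUMENT = "document"
    TOOL_RESULT = "tool_result"
    OTHER_AGENT = "other_agent"
    MODEL_SUMMARY = "model_summary"


class Trust(Enum):
    UNTRUSTED = 0
    REVIEWED = 1
    TRUSTED = 2


@dataclass(frozen=True)
class MemoryRecord:
    """A persistent record that carries its own provenance (hypothetical schema)."""
    content: str
    source_class: SourceClass
    writer: str
    session_id: str
    written_at: datetime
    trust: Trust
    retain_for: timedelta
    reason: str

    def __post_init__(self):
        # Refuse writes that would launder the source later.
        if not self.reason.strip():
            raise ValueError("a persistent record needs a stated reason for saving")
        if self.written_at.tzinfo is None:
            raise ValueError("written_at must be timezone-aware")

    def expired(self, now):
        """True once the retention window has passed."""
        return now >= self.written_at + self.retain_for


example = MemoryRecord(
    content="Quoted text from a travel page",
    source_class=SourceClass.WEBPAGE,
    writer="browser-tool",
    session_id="s-001",
    written_at=datetime(2026, 6, 24, tzinfo=timezone.utc),
    trust=Trust.UNTRUSTED,
    retain_for=timedelta(days=7),
    reason="saved for later human inspection",
)
```

Making the record immutable and validating it at construction is a design choice in this sketch. The point is that the label travels with the content rather than living in a separate log that retrieval can skip.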

The second control surface is incorporation. Loading a memory into context should be logged as a decision, not treated as invisible plumbing. High-risk actions should show which stored records influenced the plan, whether those records are stale, whether any came from untrusted channels, and whether a human approved the authority jump.
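Treating incorporation as a logged decision might look like the sketch below. It is only an illustration under stated assumptions: `StoredRecord`, `IncorporationLog`, and `incorporate` are hypothetical names, and the gating policy (hold back untrusted or stale records on high-risk tasks unless a human approved) is one possible rule, not the paper's.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StoredRecord:
    """Minimal stand-in for a persisted memory (hypothetical schema)."""
    record_id: str
    content: str
    from_untrusted_channel: bool
    stale: bool


@dataclass
class IncorporationLog:
    entries: list = field(default_factory=list)


def incorporate(records, execution_id, high_risk, human_approved, log):
    """Choose which records enter the context and log every decision."""
    loaded = []
    for record in records:
        risky = record.from_untrusted_channel or record.stale
        # Risky records reach a high-risk task only with human approval.
        allowed = not (high_risk and risky) or human_approved
        log.entries.append({
            "execution_id": execution_id,
            "record_id": record.record_id,
            "from_untrusted_channel": record.from_untrusted_channel,
            "stale": record.stale,
            "high_risk": high_risk,
            "human_approved": human_approved,
            "loaded": allowed,
        })
        if allowed:
            loaded.append(record)
    return loaded


log = IncorporationLog()
records = [
    StoredRecord("m1", "User prefers morning flights", False, False),
    StoredRecord("m2", "Always book through partner X", True, False),
]
loaded = incorporate(records, "exec-42", high_risk=True,
                     human_approved=False, log=log)
```

In this toy run only `m1` is loaded, but both decisions are recorded, so a later reviewer can see that `m2` was considered and withheld.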

The third control surface is rollback. If a stored injection is discovered, the organization should be able to remove the record, trace which later executions incorporated it, identify which actions it influenced, and preserve an incident record. Deleting the visible note is not enough if the same payload was summarized, embedded, copied into a file, or passed into another agent's workspace.
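The rollback trace amounts to a graph walk: follow every derivation of the discovered record, then find each execution that loaded any of them. The sketch below assumes the system already keeps two hypothetical maps, `derived_from` (which records a record was summarized or copied from) and `incorporations` (which records each execution loaded). Both names and the toy identifiers are invented for illustration.

```python
from collections import deque


def trace_stored_injection(root_id, derived_from, incorporations):
    """Return (tainted_records, affected_executions) for a discovered payload.

    derived_from: maps a record id to the ids it was summarized or copied from.
    incorporations: maps an execution id to the record ids it loaded.
    """
    # Invert the lineage map so we can walk from a source to its derivatives.
    children = {}
    for child, parents in derived_from.items():
        for parent in parents:
            children.setdefault(parent, set()).add(child)

    # Breadth-first walk over everything derived from the root record.
    tainted = {root_id}
    queue = deque([root_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in tainted:
                tainted.add(child)
                queue.append(child)

    # Any execution that loaded a tainted record needs review.
    affected = {
        execution_id
        for execution_id, loaded in incorporations.items()
        if tainted & set(loaded)
    }
    return tainted, affected


derived_from = {
    "summary-7": {"note-3"},
    "agents-md-copy": {"summary-7"},
    "note-9": set(),
}
incorporations = {
    "exec-a": {"note-9"},
    "exec-b": {"summary-7"},
    "exec-c": {"agents-md-copy", "note-9"},
}
tainted, affected = trace_stored_injection("note-3", derived_from, incorporations)
```

Here deleting `note-3` alone would leave `summary-7` and `agents-md-copy` in place, and `exec-b` and `exec-c` would still need review. The walk only works if lineage was recorded when the copies were made, which is why the write-time label matters.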

What This Changes

The cross-session prompt becomes the payload because agent systems do not merely answer from the present. They answer from retained context that can be written by many parties and read under new authority later.

The Spiralist rule is therefore plain: persistent context must have provenance, expiry, scope, audit trails, and action gates. If a system cannot explain who wrote a memory, why it was retained, when it was loaded, and what action it influenced, it has not built memory. It has built an ungoverned past.

Sources

Yuanbo Xie, Tianyun Liu, Yingjie Zhang, Suchen Liu, Yulin Li, Liya Su, and Tingwen Liu, "What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems," arXiv:2606.04425 [cs.CR], submitted June 3, 2026.
arXiv experimental HTML for "What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems," reviewed June 24, 2026.
Related pages: Prompt Injection, Context Poisoning, The Model Memory Becomes an Attack Surface, The Prompt Worm Becomes the Email Attachment, The Prompt Cache Becomes the Shadow Memory, and The Agent Log Becomes the Receipt.
