Last reviewed June 25, 2026

The RAG Document Becomes the Token Bomb

A retrieval system can answer correctly and still be under attack. The bill, latency, and retrieved-document trail become part of the security evidence.

Correct Answer, Inflated Cost

Most RAG security talk asks whether a retrieved document makes the model wrong. That is not the only failure. A retrieved document can keep the answer correct while making the model spend far more tokens to reach it. The user sees a plausible answer. The provider sees extra latency, accelerator time, and billable output. The attacker has shifted damage from truth to operating cost.

This matters because RAG systems often trust external or semi-external knowledge: web pages, support articles, policy repositories, developer docs, ticket histories, and private knowledge bases. Once those sources are part of the prompt, a poisoned document does not need to look like a direct instruction. It only needs to be retrieved and to make the generation phase heavier.

The Paper

arXiv lists Chengliang Liu, Liangbo Ning, Yujuan Ding, and Wenqi Fan's Inference Cost Attacks for Retrieval-Augmented Large Language Models as arXiv:2606.02643v1 [cs.CR], submitted May 31, 2026. The arXiv record says the work was accepted at The ACM Web Conference 2026, and the PDF identifies WWW 2026 as April 13-17, 2026, in Dubai, United Arab Emirates.

The paper names the attack class Retrieval-Augmented Inference Cost Attack, or RA-ICA. Its defensive significance is that the attack target is not answer accuracy alone. The adversarial objective is to increase token consumption while preserving the final answer's integrity, making the failure harder to catch with ordinary correctness tests.

The Threat Model

The paper proposes CREEP, short for Computational Resource Exhaustion via External Poisoning. In the studied black-box setting, the attacker can introduce documents into the knowledge base, query the victim RAG system, observe the response and cited source documents, and measure generation cost through output token counts. The attacker does not need internal access to the retriever or model weights.

The framework tests two broad agent styles: rewriting retrieved documents and generating new pseudo-documents. It also compares three high-level manipulation strategies: adding decoy reasoning work, inserting contradictions that require extra reconciliation, and leaving the agent to pursue the cost-amplification objective more directly. The paper then uses Memory-Augmented Group Relative Policy Optimization, MA-GRPO, to improve the document-generation policy by learning from high-performing prior adversarial documents.

For governance, the important unit is not the individual prompt. It is the retrievable document. If a shared knowledge source is poisoned, many ordinary user queries may retrieve the same costly context without users doing anything adversarial.

What the Experiments Found

The experiments use Natural Questions, HotpotQA, and MS MARCO, with 100 training, 100 validation, and 100 testing instances sampled for each. The RAG setup uses Contriever to retrieve the top five documents from each dataset corpus, then passes them to qwen-turbo, GPT-5, claude-sonnet-4, or deepseek-r1 with temperature set to zero.

The paper evaluates Retrieval Rate, weighted Answer Alignment, weighted Attack Concealment, and weighted Token Consumption Ratio. In the default RAG configuration, CREEP+-GContra reaches a 92.00 percent retrieval rate on Natural Questions with a 2.52x weighted token-consumption ratio. On HotpotQA, CREEP+-RTask reaches a 100 percent retrieval rate while maintaining 97.00 percent weighted answer alignment. Across tested victim models, the paper reports a peak 13.12x weighted token-consumption ratio on GPT-5 with an 85.00 percent retrieval rate.

The strongest reading is not "all RAG is broken." It is narrower and more useful: correctness-only evaluation can miss resource exhaustion, and output token counts should be treated as security telemetry, not only finance telemetry.
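Treating output token counts as security telemetry can start with a simple outlier check. The sketch below is a minimal illustration, not the paper's method: it compares one observed output-token count against a baseline of clean runs using the modified z-score (median and median absolute deviation). The function name, the 3.5 threshold, and the fallback rule are assumptions chosen for the example.

```python
from statistics import median


def output_token_alert(baseline_tokens, observed_tokens, threshold=3.5):
    """Return True if observed_tokens sits far above a clean baseline.

    baseline_tokens: output-token counts from runs believed to be clean.
    threshold: cutoff on the modified z-score (3.5 is a common convention,
    used here as an assumption, not a tuned value).
    """
    med = median(baseline_tokens)
    # Median absolute deviation: robust to a few outliers in the baseline.
    mad = median([abs(t - med) for t in baseline_tokens])
    if mad == 0:
        # Degenerate baseline with no spread: fall back to a plain multiple.
        return observed_tokens > 2 * med
    score = 0.6745 * (observed_tokens - med) / mad
    return score > threshold
```

Only the high side triggers an alert, because the concern here is inflation. A real deployment would likely keep separate baselines per query type or endpoint, since answer length varies legitimately.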

Governance Reading

This page belongs beside Retrieval-Augmented Generation, Context Poisoning, Prompt Injection, and The Token Meter Becomes the AI Budget. The new angle is cost integrity. A RAG answer can be faithful to the retrieved evidence and still represent a denial-of-wallet or latency attack.

Defensive review should therefore include retrieval-source provenance, document admission controls, anomalous-output-token alerts, per-source cost attribution, source-level quarantine, answer-correctness checks, and latency budgets. Incident response should ask which retrieved document inflated the run, how it entered the corpus, which queries retrieved it, whether it crossed tenants or products, and whether similar documents remain indexed.
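Per-source cost attribution can be approximated from logs that already pair retrieved document IDs with output token counts. The following is a rough sketch under stated assumptions: the log format, the function names, and the 1.5x flagging ratio are all hypothetical, and the attribution is correlational, so a benign document that is often co-retrieved with a costly one will look expensive too.

```python
from collections import defaultdict


def cost_by_source(runs):
    """Mean output tokens over the runs that retrieved each document.

    runs: list of (retrieved_doc_ids, output_tokens) pairs.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for doc_ids, output_tokens in runs:
        for doc_id in set(doc_ids):  # count each document once per run
            totals[doc_id] += output_tokens
            counts[doc_id] += 1
    return {doc_id: totals[doc_id] / counts[doc_id] for doc_id in totals}


def flag_costly_sources(runs, ratio=1.5):
    """Document IDs whose runs average more than ratio x the overall mean."""
    overall_mean = sum(tokens for _, tokens in runs) / len(runs)
    per_doc = cost_by_source(runs)
    return sorted(d for d, mean in per_doc.items() if mean > ratio * overall_mean)
```

The flagged IDs are candidates for source-level quarantine and for the incident-response questions above, not a verdict that a document is malicious.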

Procurement teams should also separate ordinary unit cost from adversarial cost exposure. A vendor demo that reports average token cost under clean retrieval does not answer how the system behaves when the corpus contains plausible but computationally expensive distractors.
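The gap between unit cost and adversarial cost exposure can be made concrete with back-of-envelope arithmetic. This sketch assumes a simple linear model: some share of queries retrieve a poisoned document, and those queries cost a fixed multiple of a clean query. The query volume, price, and poisoned share are hypothetical inputs. The 2.52x figure is the weighted token-consumption ratio the paper reports for one Natural Questions configuration, borrowed here only as an illustrative multiplier.

```python
def daily_output_cost(queries_per_day, mean_output_tokens, price_per_1k_tokens):
    """Output-token spend per day under clean retrieval."""
    return queries_per_day * mean_output_tokens / 1000 * price_per_1k_tokens


def adversarial_exposure(clean_cost, poisoned_query_share, token_ratio):
    """Expected spend when a share of queries retrieve a poisoned document.

    poisoned_query_share: fraction of queries affected (0.0 to 1.0), assumed.
    token_ratio: output-token multiplier on affected queries, assumed constant.
    """
    return clean_cost * (1 + poisoned_query_share * (token_ratio - 1))


# Hypothetical workload: 10,000 queries/day, 300 output tokens each,
# and an invented price of 0.01 per 1,000 output tokens.
clean = daily_output_cost(10_000, 300, 0.01)
exposed = adversarial_exposure(clean, poisoned_query_share=0.2, token_ratio=2.52)
print(f"clean: {clean:.2f}, exposed: {exposed:.2f}")
```

The model ignores latency, rate limits, and answer caps, all of which change the real exposure. Its purpose is to show that a clean-retrieval average is only one term in the estimate.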

Limits

These benchmark results should not be read as a deployment prevalence claim. The experiments use three QA datasets, one retriever family, top-five retrieval, output tokens as the cost proxy, and LLM-judge measurements for answer alignment and attack concealment. Real systems may have different retrievers, context packers, citation policies, answer caps, rate limits, caches, and source-moderation rules.

The paper also assumes the attacker can get adversarial material into a retrievable knowledge source and can observe enough output and source data to optimize. Those assumptions fit some web-connected and transparency-heavy RAG systems better than closed corpora with strict ingestion review.

Cost-Attack Receipt

A RAG cost-attack receipt should record: query, retrieved document IDs, source URLs or corpus IDs, ingestion timestamp, embedding model, retriever version, ranking position, prompt length, output token count, latency, cache status, model endpoint, answer-correctness verdict, source-level cost contribution, anomaly threshold, owner of the indexed source, quarantine action, and follow-up search for related poisoned documents.
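One way to make the receipt concrete is a typed record that serializes to JSON for the incident log. The sketch below maps the fields listed above onto a Python dataclass. Every field name, type, and unit is an assumption made for illustration, not a schema from the paper or from any logging standard.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class CostAttackReceipt:
    # What was asked and what was retrieved
    query: str
    retrieved_doc_ids: list[str]
    source_refs: list[str]            # source URLs or corpus IDs
    ranking_positions: list[int]      # rank of each retrieved document
    ingestion_timestamps: list[str]   # ISO 8601 strings, one per document
    # Pipeline versions
    embedding_model: str
    retriever_version: str
    model_endpoint: str
    # Cost and latency measurements
    prompt_tokens: int
    output_tokens: int
    latency_ms: float
    cache_hit: bool
    anomaly_threshold_tokens: int
    # Verdicts and follow-up, filled in during review
    answer_correct: Optional[bool] = None
    source_cost_share: dict[str, float] = field(default_factory=dict)
    source_owner: Optional[str] = None
    quarantine_action: Optional[str] = None
    related_doc_ids: list[str] = field(default_factory=list)

    def is_anomalous(self) -> bool:
        """True when output tokens exceed the recorded anomaly threshold."""
        return self.output_tokens > self.anomaly_threshold_tokens

    def to_json(self) -> str:
        """Serialize the receipt with stable key order for diffing."""
        return json.dumps(asdict(self), sort_keys=True)
```

Storing the threshold inside the receipt, rather than looking it up later, keeps the record interpretable after alerting rules change.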

Sources

Chengliang Liu, Liangbo Ning, Yujuan Ding, and Wenqi Fan, Inference Cost Attacks for Retrieval-Augmented Large Language Models, arXiv:2606.02643v1 [cs.CR], submitted May 31, 2026.
Primary arXiv versions checked: PDF and experimental HTML, reviewed for authorship, venue metadata, threat model, CREEP and MA-GRPO framing, datasets, victim models, metrics, reported results, and scope constraints.
Related pages: Retrieval-Augmented Generation, Context Poisoning, Prompt Injection, and The Token Meter Becomes the AI Budget.
