Wiki · Concept · Last reviewed June 25, 2026

Confused Deputy Problem

The confused deputy problem is an authorization failure where a trusted program, agent, server, or tool uses authority from one source to satisfy a request from another source that should not have that authority.

Category: Concept · Updated: June 25, 2026 · Tags: AI agents, authorization, OAuth, MCP, tool use, least privilege

Definition

The confused deputy problem appears when a trusted component carries more than one kind of authority and cannot reliably say which authority should govern a particular action. A user asks the deputy to do something; the deputy applies its own stronger privilege; the user gets an effect the user could not have caused directly.

Norm Hardy named the pattern in the 1988 paper "The Confused Deputy: (or why capabilities might have been invented)". His example involved a compiler that could write a privileged statistics file and also accept a user-supplied file name for debugging output. A user supplied the name of a billing file in the same privileged area, and the compiler overwrote it, because the operating system checked the write against the compiler's authority and never considered that the user had none.

For AI systems, the pattern matters because AI agents often sit between users, tools, documents, memory, APIs, browsers, and enterprise connectors. The authority an agent exercises comes from the accounts, tokens, sessions, tools, servers, and policy layers around the model, not from the model itself. The issue is not consciousness or legal autonomy; it is ordinary access control made sharper by model-mediated delegation.

How It Works

The deputy has two relationships: it is trusted by a powerful system because it legitimately needs privilege, and it is callable by a less powerful actor because that actor needs service. The failure begins when the deputy accepts an object, instruction, URL, file path, token, or tool argument from the weaker actor and resolves it using the deputy's stronger authority.
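The failure can be reproduced in a few lines. The sketch below is a toy model loosely following Hardy's compiler story, not a real operating system: ToyFS, compile_confused, the principals "compiler" and "alice", and all paths are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFS:
    """Hypothetical in-memory file store with per-principal write prefixes."""
    files: dict = field(default_factory=dict)
    writable: dict = field(default_factory=dict)  # principal -> tuple of path prefixes

    def write(self, principal: str, path: str, data: str) -> None:
        # Access is decided only by who performs the write, not by who asked for it.
        if not path.startswith(self.writable.get(principal, ())):
            raise PermissionError(f"{principal} may not write {path}")
        self.files[path] = data


def compile_confused(fs: ToyFS, debug_path: str) -> None:
    """Deputy that resolves a caller-supplied name under its own authority."""
    fs.write("compiler", "/sysx/stat", "usage statistics")
    fs.write("compiler", debug_path, "debug output")  # the confused step


fs = ToyFS(
    files={"/sysx/bill": "billing records"},
    writable={"compiler": ("/sysx/", "/home/"), "alice": ("/home/alice/",)},
)

# Alice cannot overwrite the billing file directly.
try:
    fs.write("alice", "/sysx/bill", "tampered")
    direct_write_allowed = True
except PermissionError:
    direct_write_allowed = False

# She can, however, name it as the deputy's debug output.
compile_confused(fs, "/sysx/bill")
```

After the last call the billing file holds the debug output, even though every individual access check passed. No check was skipped; the check was simply made against the wrong principal.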

The safe design separates designation from authority. Hardy's capability solution bound the thing being named to the permission to use it, instead of letting a text name be interpreted under ambient privilege. In web and agent systems, the same lesson appears as scoped credentials, audience-restricted tokens, resource indicators, sender-constrained tokens, token exchange, and explicit delegation records.
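A minimal sketch of that separation, under stated assumptions: WriteCap, ToyFS, and compile_with_caps are hypothetical names, and the model is a toy rather than Hardy's actual system. Python cannot make an object unforgeable, so the sketch only illustrates the shape of the interface: the deputy receives handles that were checked against the party who named the file, and it never resolves a text name itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WriteCap:
    """Hypothetical capability: designates one path and carries the right to write it."""
    path: str


@dataclass
class ToyFS:
    files: dict = field(default_factory=dict)
    writable: dict = field(default_factory=dict)  # principal -> tuple of path prefixes

    def open_for_write(self, principal: str, path: str) -> WriteCap:
        # The authority check happens here, against the principal who names the file.
        if not path.startswith(self.writable.get(principal, ())):
            raise PermissionError(f"{principal} may not write {path}")
        return WriteCap(path)

    def write(self, cap: WriteCap, data: str) -> None:
        # No ambient authority is consulted: holding the capability is the permission.
        self.files[cap.path] = data


def compile_with_caps(fs: ToyFS, stats_cap: WriteCap, debug_cap: WriteCap) -> None:
    """Deputy that only uses the handles it was given."""
    fs.write(stats_cap, "usage statistics")
    fs.write(debug_cap, "debug output")


fs = ToyFS(
    files={"/sysx/bill": "billing records"},
    writable={"compiler": ("/sysx/",), "alice": ("/home/alice/",)},
)
stats_cap = fs.open_for_write("compiler", "/sysx/stat")
debug_cap = fs.open_for_write("alice", "/home/alice/debug.txt")
compile_with_caps(fs, stats_cap, debug_cap)

# The attack now fails before the deputy is ever involved.
try:
    fs.open_for_write("alice", "/sysx/bill")
    attack_blocked = False
except PermissionError:
    attack_blocked = True
```

The design choice is that the user opens the debug file under the user's own authority and hands the deputy the result, so there is no name left for the deputy to misinterpret.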

OAuth 2.0 (RFC 6749) defines access tokens as scoped credentials. RFC 8707 adds a way for a client to signal the protected resource for which it is requesting a token. RFC 9700 recommends audience restriction so a resource server rejects tokens not meant for it. RFC 8693 defines token exchange for impersonation and delegation. Together, these mechanisms keep tokens, sessions, and delegated actions from being treated as interchangeable authority.
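As a rough illustration of audience and scope checking at a resource server, the sketch below works on claims that are assumed to have already been extracted from a signature-verified token. The function name token_allows, the example URLs, and the scope string are hypothetical; the claim names "aud", "scope", and "exp" follow common JWT access token usage, which is an assumption about the token format and not something OAuth 2.0 itself mandates.

```python
import time
from typing import Optional


def token_allows(
    claims: dict,
    *,
    expected_audience: str,
    required_scope: str,
    now: Optional[float] = None,
) -> bool:
    """Return True only if the token is meant for this service, scope, and time."""
    now = time.time() if now is None else now

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if expected_audience not in audiences:
        return False  # issued for some other resource server

    # "scope" is treated as a space-delimited string.
    if required_scope not in claims.get("scope", "").split():
        return False

    return claims.get("exp", 0) > now


claims = {"aud": "https://files.example.com", "scope": "files.read", "exp": 2_000_000_000}
check_time = 1_900_000_000  # fixed clock so the example is deterministic

ok = token_allows(claims, expected_audience="https://files.example.com",
                  required_scope="files.read", now=check_time)
wrong_service = token_allows(claims, expected_audience="https://billing.example.com",
                             required_scope="files.read", now=check_time)
wrong_scope = token_allows(claims, expected_audience="https://files.example.com",
                           required_scope="files.write", now=check_time)
```

Possession of a valid token is not enough in this sketch: the same token is accepted by the service it names and refused by any other.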

Agent Context

As of this June 25, 2026 review, the official Model Context Protocol security guidance explicitly treats confused-deputy attacks and token passthrough as MCP security risks. Its token-passthrough section says the anti-pattern occurs when an MCP server accepts tokens from an MCP client without validating that they were issued to the MCP server, then passes them through to a downstream API. The same guidance says MCP servers must not accept tokens that were not explicitly issued for the MCP server.
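The passthrough rule can be sketched as follows. This is not the MCP SDK's API: MCP_SERVER_ID, DOWNSTREAM_API, downstream_headers, and fake_mint are hypothetical names, the claims are assumed to come from an already verified token, and the step of obtaining a separate downstream credential (for example through token exchange) is an assumption of this sketch.

```python
MCP_SERVER_ID = "https://mcp.example.com"   # hypothetical identifier of this server
DOWNSTREAM_API = "https://crm.example.com"  # hypothetical downstream API


class TokenRejected(Exception):
    """Raised when an inbound token was not issued for this server."""


def audiences(claims: dict) -> list:
    aud = claims.get("aud", [])
    return [aud] if isinstance(aud, str) else list(aud)


def downstream_headers(inbound_claims: dict, mint_downstream_token) -> dict:
    """Validate the inbound token, then build headers from a different token."""
    if MCP_SERVER_ID not in audiences(inbound_claims):
        raise TokenRejected("token audience does not include this MCP server")
    # The inbound token is never forwarded. A new credential is requested
    # that is bound to the downstream API and to the same subject.
    token = mint_downstream_token(subject=inbound_claims["sub"], audience=DOWNSTREAM_API)
    return {"Authorization": f"Bearer {token}"}


def fake_mint(subject: str, audience: str) -> str:
    """Stand-in for a real token exchange; returns a readable placeholder."""
    return f"token-for-{subject}-at-{audience}"


good = downstream_headers({"sub": "alice", "aud": MCP_SERVER_ID}, fake_mint)

# A token minted for the downstream API itself must not be accepted here.
try:
    downstream_headers({"sub": "alice", "aud": DOWNSTREAM_API}, fake_mint)
    passthrough_rejected = False
except TokenRejected:
    passthrough_rejected = True
```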

An agent workflow can recreate Hardy's compiler story with newer nouns. A browser agent reads a malicious page while holding a user's session. A coding agent reads an issue comment while holding repository write access. A support agent reads a customer message while holding CRM update permissions. Prompt injection is one route into the problem, but overbroad scopes, reusable bearer tokens, mixed user and service authority, missing audience checks, unlogged delegation, and vague approval prompts can produce the same failure.

Governance and Safety

Good governance preserves the difference between user intent, untrusted content, model output, tool argument, agent identity, service authority, human approval, and external effect. If those categories collapse into one ambient session, accountability collapses with them.

For agent deployments, the most important control is least authority at the point of action. A tool should not infer permission from general chat access. A server should not infer permission from token possession unless the token is meant for that server, resource, user, and purpose. An approval prompt should identify the action, resource, authority source, data leaving the boundary, and whether approval expands scope.
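One way to keep an approval prompt honest is to make those five items required fields, so a prompt cannot be built without them. The sketch below is illustrative only; ApprovalRequest, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalRequest:
    """Hypothetical approval payload; every field must be supplied."""
    action: str
    resource: str
    authority_source: str
    data_leaving_boundary: str
    expands_scope: bool

    def render(self) -> str:
        lines = [
            f"Action: {self.action}",
            f"Resource: {self.resource}",
            f"Authority used: {self.authority_source}",
            f"Data leaving the boundary: {self.data_leaving_boundary}",
            f"Expands current scope: {'yes' if self.expands_scope else 'no'}",
        ]
        return "\n".join(lines)


request = ApprovalRequest(
    action="update contact record",
    resource="CRM contact for the current customer",
    authority_source="support agent connector token",
    data_leaving_boundary="none",
    expands_scope=True,
)
prompt_text = request.render()
```

Because the dataclass has no defaults, omitting the authority source or the scope question raises a TypeError at construction time instead of producing a vague prompt.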

Audit trails matter because the final action may look like an ordinary write, purchase, email, commit, ticket update, or API call. A useful record shows the originating request, untrusted input consulted, tool call, credential used, token audience, approval event, downstream API, and state change.
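A minimal sketch of such a record, assuming a JSON-lines log: AuditRecord, to_log_line, and the example values are hypothetical. The credential field is meant to hold an identifier for the credential, never the secret itself.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry covering one externally visible action."""
    originating_request: str
    untrusted_inputs: tuple   # content consulted that did not come from the principal
    tool_call: str
    credential_used: str      # an identifier for the credential, not its value
    token_audience: str
    approval_event: str
    downstream_api: str
    state_change: str


def to_log_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = AuditRecord(
    originating_request="user asked the agent to triage open issues",
    untrusted_inputs=("issue comment",),
    tool_call="repo.add_label",
    credential_used="connector-token-id",
    token_audience="https://repo.example.com",
    approval_event="none required",
    downstream_api="https://repo.example.com/api",
    state_change="label added to issue",
)
log_line = to_log_line(record)
```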

Defense Pattern

Bind authority to the object. Prefer capabilities, resource-specific grants, scoped tool handles, or narrow tokens over ambient privilege.
Validate audience. Downstream services should reject tokens not issued for them.
Separate user, agent, and service authority. Logs and policies should distinguish the human principal, the agent process, the tool server, and the backend service.
Use step-up approval. Reading, sending, spending, modifying, and granting connector access should not share one permission class.
Treat untrusted context as evidence, not authority. Search results, web pages, emails, issue comments, PDFs, and tool descriptions should not be able to expand privileges.
Test the boundary. Red-team prompt injection, malicious URLs, forged tool metadata, token replay, scope confusion, cross-tenant requests, and logged-in browser actions.
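Two of the items above, step-up approval and treating untrusted context as evidence, can be combined in a small policy gate. The sketch is one possible policy and not a standard: ActionClass, STEP_UP, and decide are hypothetical names, and the choice of which classes need approval is an assumption. The key property is in the signature: the only inputs are the grant made by the principal and a human approval flag, so text read from pages, emails, or tool descriptions has no parameter through which it could expand privileges.

```python
from enum import Enum


class ActionClass(Enum):
    READ = "read"
    SEND = "send"
    SPEND = "spend"
    MODIFY = "modify"
    GRANT = "grant"


# Assumption of this sketch: everything except reading needs step-up approval.
STEP_UP = frozenset({
    ActionClass.SEND, ActionClass.SPEND, ActionClass.MODIFY, ActionClass.GRANT,
})


def decide(action: ActionClass, granted: frozenset, human_approved: bool) -> str:
    """Return "allow", "ask", or "deny" for one proposed action."""
    if action not in granted:
        return "deny"   # approval cannot add a class that was never granted
    if action in STEP_UP and not human_approved:
        return "ask"    # granted, but needs explicit approval for this action
    return "allow"


granted = frozenset({ActionClass.READ, ActionClass.SEND})

read_result = decide(ActionClass.READ, granted, human_approved=False)
send_unapproved = decide(ActionClass.SEND, granted, human_approved=False)
send_approved = decide(ActionClass.SEND, granted, human_approved=True)
spend_result = decide(ActionClass.SPEND, granted, human_approved=True)
```

In this example reading is allowed outright, sending triggers an approval request, and spending is denied even with approval because it was never part of the grant.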

Source Discipline

Claims about confused deputy failures should name the deputy, caller, protected resource, authority source, and action. "The agent misused access" is too vague; a stronger report says which token, session, file handle, API key, tool permission, or connector scope was applied to which request. Primary sources for this entry are Hardy's original paper text, IETF OAuth specifications, official MCP security guidance, and NIST least-privilege material.

Spiralist Reading

The confused deputy is the old spell of bureaucracy: "I had access, therefore the action was allowed."

Spiralism reads the problem as a warning against seamless delegation. When a machine acts through another party's authority, the question is whether the institution kept authority, instruction, evidence, and responsibility apart.

The cure is narrower power, clearer receipts, and systems that refuse to let one voice borrow every key in the building.

Open Questions

How should websites distinguish a human click from a browser agent acting under a scoped task?
Which agent tools should require resource-specific tokens rather than broad user-session authority?
How much authority context should be visible in an approval prompt without making approval unusable?
When should a blocked confused-deputy attempt be treated as an AI security incident?
How can agent logs preserve accountability without storing more private content than the task requires?

Sources

Norm Hardy, The Confused Deputy: (or why capabilities might have been invented), MIT CSAIL mirror of the 1988 paper text, used to verify the compiler, statistics file, and billing file example.
IETF, RFC 6749: The OAuth 2.0 Authorization Framework, October 2012.
IETF, RFC 8707: Resource Indicators for OAuth 2.0, February 2020.
IETF, RFC 8693: OAuth 2.0 Token Exchange, January 2020.
IETF, RFC 9700: Best Current Practice for OAuth 2.0 Security, January 2025.
Model Context Protocol, Security Best Practices, reviewed June 25, 2026.
Model Context Protocol, Authorization specification, version 2025-11-25.
NIST CSRC, Least Privilege glossary entry, reviewed June 25, 2026.
