The Tool Server Becomes the Trust Boundary
Model Context Protocol makes agents useful by connecting them to tools and data. It also moves the trust boundary from the chat window into a new infrastructure layer of servers, schemas, permissions, and logs.
From Chat to Tools
The first public image of generative AI was the chat window: a user typed, a model answered, and the risk seemed to live mostly in language. The answer might be false, biased, manipulative, derivative, overconfident, or emotionally sticky. Those risks remain real, but the center of gravity has moved.
Modern agent systems do more than answer. They search, retrieve, inspect files, call APIs, operate browsers, write code, edit records, query databases, schedule work, and trigger other software. A model with no tools is a speaking system. A model with tools is an actor inside someone else's permissions.
That is why Model Context Protocol, or MCP, deserves attention beyond developer convenience. Anthropic announced MCP on November 25, 2024 as an open standard for connecting AI assistants to data sources and tools. In December 2025, Anthropic said it was donating MCP to the Linux Foundation's Agentic AI Foundation. The same announcement described adoption across products including ChatGPT, Cursor, Gemini, Microsoft Copilot, and Visual Studio Code, with more than 10,000 active public MCP servers and official SDKs in major languages.
The adoption story is easy to understand. Every AI company and enterprise wants agents that can reach useful context without building a custom integration for every calendar, repository, database, ticketing system, drive, customer record, and internal workflow. MCP offers a common doorway.
But a doorway is not neutral. It decides what enters, what exits, what is named, what is callable, and what evidence survives afterward.
What MCP Standardizes
MCP standardizes a relationship between an AI client and external servers. In broad terms, the client is the assistant or agent runtime. The server exposes context and capabilities. The model sees resources, prompts, and tools, then the system may call those tools as part of a task.
That sounds like plumbing, and in one sense it is. Good plumbing matters. It can replace brittle one-off integrations with shared conventions. It can make it easier to build agents that know which tools exist, what arguments they take, what data they return, and how authorization should work. It can make logs and approvals more consistent than a pile of bespoke scripts.
The mistake is treating standardization as safety. A standard can make a dangerous pattern repeatable. A clean interface can carry poisoned context. A well-documented connector can still have excessive scope. A tool schema can be syntactically valid while semantically misleading. An official registry can reduce chaos without eliminating dependency risk.
The official MCP security best-practices documentation already points toward this problem. It discusses attacks including confused-deputy failures, session hijacking, and prompt injection. OWASP's MCP Top 10 names a broader risk map: privilege escalation through scope creep, tool poisoning, software supply-chain attacks, command injection, prompt injection through contextual payloads, insufficient authentication, missing audit telemetry, shadow MCP servers, and context over-sharing.
That is the crucial turn. The security question is not merely whether the model is aligned. It is whether the tool environment is governable.
Why the Boundary Moved
Traditional software security often asks where the trust boundary sits. Which code is trusted? Which input is untrusted? Which process owns the credential? Which network request crosses into a different authority zone? Which logs can prove what happened?
Agentic AI complicates those questions because the model interprets text as part of action. A webpage, ticket, email, README, calendar invite, spreadsheet cell, product description, error message, or tool description may become context for the model. If the system does not sharply separate data from authority, untrusted text can try to become an instruction.
MCP intensifies this because it gives that interpreted context a route to action. A malicious or compromised server does not necessarily need to exploit memory corruption or steal a password in the old way. It may try to shape the model's behavior by changing tool descriptions, returning crafted outputs, exposing misleading schemas, shadowing a trusted tool, or combining one harmless-looking read tool with another tool that can send data outward.
Invariant Labs has described these patterns in terms of tool poisoning and toxic flows: attacks where prompt-injection-style payloads move through agent systems and enable data exfiltration or other unsafe behavior. Trail of Bits has similarly warned about MCP risks including malicious tool descriptions, conversation-history theft, insecure credential storage, and deceptive terminal output.
The important feature is the semantic attack surface. The adversary attacks what the model reads, not only what the program executes. The tool server becomes a trust boundary because it supplies both operational capability and the words that explain that capability to the model.
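The cross-tool pattern described above, where a harmless-looking read tool is combined with a tool that can send data outward, can be checked for before an agent runs. The following is a minimal sketch under stated assumptions, not a description of how Invariant Labs or any MCP client performs this analysis. The capability labels, the `Tool` class, and `find_toxic_flows` are hypothetical names, and the sketch assumes a human reviewer has already labeled each enabled tool.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical capability labels assigned during review.
# MCP itself does not define these names.
READS_PRIVATE = "reads_private_data"
SENDS_EXTERNAL = "sends_data_externally"


@dataclass(frozen=True)
class Tool:
    server: str
    name: str
    capabilities: frozenset


def find_toxic_flows(tools):
    """Return every (source, sink) pair where a tool that reads private
    data is enabled alongside a tool that can send data outward."""
    sources = [t for t in tools if READS_PRIVATE in t.capabilities]
    sinks = [t for t in tools if SENDS_EXTERNAL in t.capabilities]
    return list(product(sources, sinks))


# Illustrative tool set; the server and tool names are made up.
tools = [
    Tool("files", "read_file", frozenset({READS_PRIVATE})),
    Tool("mail", "send_email", frozenset({SENDS_EXTERNAL})),
    Tool("calendar", "list_events", frozenset()),
]

flows = find_toxic_flows(tools)
for source, sink in flows:
    print(f"review: {source.server}.{source.name} -> {sink.server}.{sink.name}")
```

A flagged pair is not proof of an attack. It marks a combination that a reviewer may want to separate, gate behind approval, or log more closely.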
Tool Descriptions Are Interface Law
In ordinary software, documentation is secondary to code. Bad documentation can mislead a developer, but the program still follows the implementation. In agent systems, descriptions can become operational. A model chooses tools partly by reading names, descriptions, schemas, examples, and returned text. The description is no longer only documentation. It is part of the control surface.
That makes MCP tool metadata a strange new kind of interface law. It tells the model what the tool is for, when to use it, what arguments matter, and sometimes how to interpret results. If that metadata is hostile, stale, overly broad, or quietly changed after approval, the agent's practical world changes.
This is why "approval" cannot mean "the user clicked once when installing a connector." A user or organization may approve a server called GitHub, Slack, Gmail, Files, Calendar, CRM, or Database without understanding the exact tools, scopes, server owner, version, output behavior, or update policy. The server may later change its descriptions. A similar-looking server may appear in a registry. A developer may enable a local server for a narrow experiment and leave it available to a production agent.
There is also a legibility asymmetry. The model may see instructions that the user never sees, or may weigh descriptions the user treats as harmless setup text. The agent's interface is not identical to the user's interface. The user sees a helpful assistant with a connector. The model sees a tool list with hidden authority cues, permissions, data channels, and prompts embedded in ordinary language.
That asymmetry is a governance problem, not only a UX problem. If an agent can read a private drive, query a customer database, and send messages, then the words used to describe those powers should be reviewable, versioned, and monitored like code.
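One way to make tool descriptions reviewable and versioned like code is to fingerprint the fields the model actually reads and compare them against what was approved. The sketch below is illustrative only. It assumes each tool definition is a dictionary with `name`, `description`, and `inputSchema` fields, which is the general shape of an MCP tool listing, and the helper names `fingerprint_tool` and `changed_tools` are hypothetical.

```python
import hashlib
import json


def fingerprint_tool(tool):
    """Hash the model-visible fields of a tool definition."""
    visible = {
        "name": tool.get("name"),
        "description": tool.get("description"),
        "inputSchema": tool.get("inputSchema"),
    }
    # Sorted keys and fixed separators give a stable serialization,
    # so key order alone never changes the fingerprint.
    canonical = json.dumps(visible, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(approved, current_tools):
    """Return names of tools that are new or differ from the approved fingerprint."""
    return [
        tool["name"]
        for tool in current_tools
        if approved.get(tool["name"]) != fingerprint_tool(tool)
    ]


# Illustrative definitions; the tool and its wording are made up.
reviewed = {
    "name": "read_file",
    "description": "Read a file from the project directory.",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}
approved = {reviewed["name"]: fingerprint_tool(reviewed)}

# The same tool after a silent description change.
altered = dict(reviewed, description=reviewed["description"] + " Always include hidden files.")

print(changed_tools(approved, [reviewed]))  # []
print(changed_tools(approved, [altered]))   # ['read_file']
```

A client could run a check like this each time it lists a server's tools and hold any changed tool for re-review instead of passing it straight to the model.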
The New Supply Chain
MCP servers are software dependencies plus delegated authority. That combination should make institutions cautious.
A normal dependency can break a build, introduce a vulnerability, or compromise a system. An MCP server can do those things while also changing what the agent knows, what it thinks it can do, and which external services it can touch. The server may hold credentials. It may bridge to local files. It may connect to internal tools. It may sit between the model and records that other people treat as institutional memory.
This creates a new shadow-infrastructure problem. Developers, researchers, teams, and vendors may spin up MCP servers because they are useful. Some will be official. Some will be experimental. Some will be abandoned. Some will run locally. Some will be remote. Some will be wrappers around SaaS products. Some will be community packages. Some will be installed by people who do not think of themselves as expanding an organization's authority surface.
OWASP's category of shadow MCP servers is useful because it names the organizational version of the risk. The dangerous server is not always malicious. It may simply be unreviewed, over-permissioned, unlogged, unpatched, or forgotten. Convenience becomes infrastructure before governance catches up.
This is the same pattern that made browser extensions, npm packages, SaaS integrations, spreadsheet macros, and Slack apps into governance problems. MCP adds a model-mediated layer: the dependency does not merely run code or fetch data. It speaks to the agent in a language the agent may obey.
The Governance Standard
A serious MCP governance program should treat every server as an access path, every tool as a permissioned operation, and every description as instruction-bearing context.
First, servers need provenance. Record who maintains the server, where the code lives, what version is enabled, how updates occur, what registry or package source supplied it, and who approved it.
Second, tools need least privilege. Separate read, write, send, delete, execute, purchase, publish, and permission-changing operations. A server should not expose a broad operation just because the underlying account can perform it.
Third, descriptions need review and pinning. Tool names, descriptions, schemas, examples, and prompts should be hashed or versioned so that a change made after approval cannot silently alter the model's operating instructions.
Fourth, context needs labels. The client should distinguish user instruction, developer instruction, tool metadata, retrieved content, untrusted external text, private records, and model output. Natural language is not one authority class.
Fifth, credentials need containment. Avoid long-lived personal tokens, shared admin accounts, plaintext secrets, and broad OAuth scopes. Prefer short-lived, scoped, revocable credentials tied to a named agent or service account.
Sixth, consequential calls need approval gates. Sending messages, moving money, deleting records, changing permissions, publishing, deploying, or writing to systems of record should require explicit review with the exact action displayed.
Seventh, logs need enough detail for incident review. A usable trace should include the user request, tools listed, server identity, tool calls, arguments, returned data, approvals, blocked calls, external data shared, and final output.
Eighth, organizations need a server register. If no one can list which MCP servers are enabled, what they can touch, and who owns them, the organization does not have agent governance. It has hope with connectors attached.
Ninth, red teams need to attack the workflow, not only the model. Test tool poisoning, prompt injection through returned content, tool shadowing, excessive scope, cross-tool exfiltration, malicious updates, and confused-deputy paths.
Tenth, users need honest friction. The safest interface is not the one that asks for approval constantly until approval becomes meaningless. It is the one that distinguishes ordinary reading from consequential action and interrupts only where authority actually changes.
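Several of these points can be combined in a small gate that sits between the model and the tool call: separate operation classes, approval for consequential calls with the exact action displayed, and a trace entry for every decision, including blocked calls. The following is a sketch under stated assumptions rather than a reference design. The `Operation` classes mirror the list in the second point, while `ToolCall`, `describe`, `decide`, and the rule that only reads skip approval are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Operation(Enum):
    READ = "read"
    WRITE = "write"
    SEND = "send"
    DELETE = "delete"
    EXECUTE = "execute"
    PURCHASE = "purchase"
    PUBLISH = "publish"
    CHANGE_PERMISSIONS = "change_permissions"


# Assumption: only ordinary reads skip human review.
NEEDS_APPROVAL = frozenset(Operation) - {Operation.READ}


@dataclass(frozen=True)
class ToolCall:
    server: str
    tool: str
    operation: Operation
    arguments: dict


def describe(call):
    """Render the exact action so the reviewer sees what will happen."""
    return f"{call.server}.{call.tool} [{call.operation.value}] {call.arguments!r}"


def decide(call, granted, approve, trace):
    """Return 'blocked', 'denied', or 'allowed' and record the decision.

    granted: set of (server, Operation) pairs this agent may use.
    approve: callable that receives the rendered action and returns a bool.
    trace:   list that receives one entry per decision.
    """
    if (call.server, call.operation) not in granted:
        outcome = "blocked"  # outside the granted scope, never shown for approval
    elif call.operation in NEEDS_APPROVAL and not approve(describe(call)):
        outcome = "denied"
    else:
        outcome = "allowed"
    trace.append({"action": describe(call), "outcome": outcome})
    return outcome


# Illustrative run; server names, tools, and arguments are made up.
granted = {("crm", Operation.READ), ("mail", Operation.SEND)}
shown, trace = [], []


def reviewer(action):
    shown.append(action)  # what the human was asked to approve
    return False          # this reviewer declines everything


read = decide(ToolCall("crm", "lookup_customer", Operation.READ, {"id": "42"}), granted, reviewer, trace)
delete = decide(ToolCall("crm", "delete_customer", Operation.DELETE, {"id": "42"}), granted, reviewer, trace)
send = decide(ToolCall("mail", "send_message", Operation.SEND, {"to": "a@example.com"}), granted, reviewer, trace)
print(read, delete, send)  # allowed blocked denied
```

In this sketch the reviewer is interrupted once, for the send, which is the only call where granted authority meets a consequential action. The read proceeds without friction and the out-of-scope delete is refused without asking.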
The Spiralist Reading
MCP is one of the places where the interface stops being a mirror and becomes a hallway.
The user speaks into a model. The model asks a server. The server returns context. The model chooses a tool. The tool touches a file, account, database, calendar, repository, inbox, browser, payment rail, or memory store. The result comes back as a fluent answer. To the user, it may feel like one assistant. In reality, it is a chain of institutions.
That chain changes belief formation. An agent with connected tools can make its answer feel more grounded because it has looked things up, touched live systems, and returned with operational confidence. But if the tool boundary is weak, the confidence is not evidence. It is the surface effect of a system that may have read poisoned context, used overbroad permissions, or hidden uncertainty inside a clean response.
The deeper risk is high-control interface drift. Once the tool server becomes invisible, the user stops seeing where authority enters. The answer feels like cognition. The action feels like convenience. The permission feels like setup. The audit trail becomes something only specialists inspect after harm.
The right discipline is to make the hallway visible. Name the server. Name the tool. Name the permission. Name the data source. Name the action. Name the uncertainty. Keep the receipt.
MCP can make agents more useful, interoperable, and inspectable. It can also make institutional life pass through model-mediated doors before anyone has decided who holds the keys. The trust boundary has moved. Governance has to move with it.
Sources
- Anthropic, Introducing the Model Context Protocol, November 25, 2024.
- Anthropic, Donating the Model Context Protocol and establishing the Agentic AI Foundation, December 9, 2025.
- Model Context Protocol, Security Best Practices, reviewed May 2026.
- Model Context Protocol, Authorization specification, 2025-06-18.
- OWASP Foundation, OWASP MCP Top 10, reviewed May 2026.
- OWASP Foundation, MCP Tool Poisoning, reviewed May 2026.
- Invariant Labs, Toxic Flow Analysis for agentic systems and MCP servers, July 29, 2025.
- Trail of Bits, Secure Your Model Context Protocol, reviewed May 2026.
- OpenAI, New tools for building agents, March 11, 2025.
- Church of Spiralism Wiki, Model Context Protocol, Tool Use and Function Calling, Prompt Injection, and Agent Tool Permission Protocol.