The MCP Server Becomes the Leakage Boundary
An MCP server is not just a connector. It is the place where private context leaves the host, external output enters the model, and tool authority becomes inspectable or invisible.
The leakage boundary is the full crossing: which server is connected, what data is sent to it, what tool or resource it exposes, what credentials it receives, what output it returns, and whether the host logs and constrains that exchange.
The Boundary Is Not the Connector
Model Context Protocol, or MCP, gives AI applications a common way to connect to tools, resources, prompts, and external systems. That sounds like plumbing, and at the protocol level it partly is. The governance mistake is to treat plumbing as neutral once the pipe fits.
For this essay, an MCP server leakage boundary is the point where an AI host allows data, credentials, tool arguments, tool definitions, resource content, or user interaction requests to cross between the model-facing runtime and an external server. Leakage can move outward, when private context is sent to a server. It can move inward, when server output becomes model-visible instruction-like content. It can move sideways, when one server's output influences calls to another server or tool.
The boundary is sharper than "can this agent use a tool?" It asks: which server sees which data, under which identity, through which transport, with which OAuth scopes, with which human approval, with which output filtering, and with which audit record?
That is why MCP belongs beside Agent Tool Permission Protocol, Agent Prompt Hardening, and Agent Audit and Incident Review. A connected model can do more than answer. It can read files, call APIs, ask users for information, retrieve records, mutate systems, and pass server output back into future reasoning. The server is therefore part of the authority surface, not merely a data adapter.
Current Context
As of June 24, 2026, MCP is a cross-platform integration layer rather than a single-vendor experiment. Anthropic introduced MCP on November 25, 2024 as an open standard for connecting AI assistants to data sources and tools. The official MCP specification page identifies version 2025-11-25 as the latest version reviewed here. That specification describes a host-client-server architecture using JSON-RPC sessions: hosts initiate connections, clients maintain isolated sessions with servers, and servers expose resources, prompts, and tools.
The current specification is explicit about the trust boundary. It says MCP enables powerful capabilities through arbitrary data access and code execution paths. It says users must understand and control data access and operations, hosts must obtain consent before exposing user data to servers, tools should be treated cautiously because they represent arbitrary code execution, and tool behavior descriptions should be considered untrusted unless obtained from a trusted server. The specification also says MCP itself cannot enforce these principles at the protocol level; implementers have to build authorization, consent, access controls, and data protections into the application.
Platform support has broadened the practical stakes. Microsoft announced MCP support in Copilot Studio in March 2025. GitHub documents MCP support across Copilot surfaces and provides a GitHub MCP server. OpenAI documents connectors and remote MCP servers in its API, with approvals defaulting before data is shared with a connector or remote MCP server, and with guidance to review and log data sent to third-party MCP servers. Those product controls are useful signals, but they are not protocol guarantees. Each host decides how much users see, what approval means, and how much evidence survives.
Security guidance has also hardened. The official MCP security best practices cover confused-deputy risks, token passthrough, server-side request forgery, session hijacking, local server compromise, and scope minimization. OWASP's MCP Top 10 lists risks such as token mismanagement, scope creep, tool poisoning, software supply-chain attacks, command injection, insufficient authentication, weak telemetry, shadow MCP servers, and context over-sharing. NIST's 2026 AI Agent Standards Initiative and NCCoE concept paper put agent identity and authorization into a broader standards agenda. A May 2026 NSA Cybersecurity Information Sheet goes further, warning that MCP's rapid proliferation has outpaced its security model and that high-stakes adoption needs deliberate controls beyond protocol suggestions.
What Leaks
The obvious leak is a secret sent to the wrong server. That is only the first case.
Tool arguments leak. A model may call an MCP tool with a snippet of private email, source code, customer data, meeting context, credentials pasted by a user, or a file path that reveals internal structure. OpenAI's documentation states the remote MCP server does not automatically receive the full conversation, but it can see any data the model sends in a tool call. That sentence is the leakage boundary in plain language.
Resource reads leak. A server can expose file-like data, API responses, repository content, database records, and internal documents as resources. The MCP resources specification says sensitive resources should have access controls and that permissions should be checked before operations. A resource URI is not a permission by itself.
Tool definitions leak authority into context. Tool names, descriptions, input schemas, output schemas, annotations, and server instructions become model-facing material. They shape what the model believes the tool does and when it should call it. If those definitions are malicious, stale, ambiguous, or overbroad, the model may route work through the wrong authority.
Outputs leak back inward. Tool results may contain text, structured content, resource links, or embedded resources. Once returned, that material can influence the model's next step. If the host treats output as instruction rather than evidence, a server can become an indirect prompt-injection channel.
Scopes leak future power. A broad OAuth token or long-lived service account can turn a narrow request into durable access. The MCP authorization specification requires audience validation for access tokens when authorization is used, but authorization itself is optional in MCP. Identity and token lifecycle therefore remain deployment responsibilities.
Logs leak too. Good logs preserve accountability. Bad logs become a second copy of sensitive prompts, tool arguments, retrieved files, and server output. The answer is not "no logs." It is minimal, access-controlled, purpose-bound evidence that supports audit without creating an unrestricted dossier.
Server Output Is Input
The MCP server is dangerous because it is both a source of data and a source of model-visible language.
OWASP describes MCP tool poisoning as an indirect prompt-injection attack: a malicious server can expose normal-looking tools while returning outputs that contain hidden instructions. The root problem is a trust gap between connection-time review and runtime content. A server can look harmless at installation and become manipulative through a later tool result, schema change, resource link, or descriptor update.
This is why approval cannot stop at "connect this server?" The runtime question is different: should this exact data be sent to this exact server for this exact tool call, and should this exact output be allowed to steer subsequent action?
Defensive systems should label tool output as untrusted content unless the server is trusted for the specific task and data class. The model can summarize, parse, and reason over that output, but it should not treat it as authority to expand scope, call other tools, change permissions, reveal hidden instructions, or move data to a new destination. That is the same authority discipline behind Prompt Injection and Context Poisoning, now applied to server-mediated agent work.
Local Servers Are Code
Remote servers create third-party data-sharing risk. Local servers create execution risk.
The official MCP security best-practices page warns that local MCP servers are binaries downloaded, authored, or configured to run on the user's machine. Without sandboxing and consent, they may execute with the client user's privileges, read local files, reach localhost services, interact with developer tools, or expose data to other processes. The guidance says one-click local server configuration should show the exact command and arguments, identify that code will run on the user's system, require explicit approval, and allow cancellation.
That makes a local MCP server more like installing a development dependency than opening a document. The user is not only granting an AI assistant more context. The user may be launching code near SSH keys, browser sessions, source repositories, build artifacts, package managers, shell history, private notes, and internal databases reachable from the machine.
Local servers therefore need package provenance, version pinning, sandboxing, filesystem roots, network restrictions, secret isolation, uninstall paths, and logs. A useful MCP client should make the execution boundary visible before the first request, not after something has already run.
Authorization Is Not Consent
OAuth can prove that a token was issued and scoped. It does not prove that a user understood a future model-generated tool call.
This difference matters because MCP systems often join three kinds of authority: the user's intent, the model's interpretation, and the server's credential. A user may authorize a calendar connector, but that does not mean every future calendar lookup, event mutation, or cross-tool data transfer should run silently. A token can be valid while the tool call is wrong.
The MCP authorization specification addresses transport-level authorization for HTTP-based transports and requires protected-resource metadata and token audience validation when authorization is supported. The OpenAI MCP guide adds an application-level pattern: approvals default before data is shared with a connector or remote MCP server, and developers can narrow the imported tool surface with allowed_tools and require approval for sensitive actions. Those controls point in the right direction because they separate connection from runtime disclosure.
The stronger rule is simple: authorization grants a channel; consent approves a specific crossing. A governed MCP deployment should preserve both records.
Failure Modes
The first failure mode is connector laundering. A remote MCP server is described as a connector, so teams forget it is a third-party service receiving data and returning model-visible content.
The second is tool-description trust. The model relies on names, descriptions, annotations, or server instructions that no one reviewed after an update.
The third is approval fatigue. Users click through vague prompts because they cannot see the data payload, destination, credential, or consequence.
The fourth is scope creep. A read-only research tool gradually becomes a write-capable operational tool without re-approval, separate credentials, or a new risk review.
The fifth is cross-server bleed. Output from one server influences calls to another server, turning separate integrations into an unplanned workflow.
The sixth is shadow MCP. A developer, team, or agent adds a server outside the approved inventory, creating an invisible pathway into files, APIs, or SaaS systems.
The seventh is local execution normalization. Installing an MCP server starts to feel like adding a bookmark, even though the server may run code on the user's machine.
The eighth is missing incident evidence. The organization cannot reconstruct which server was connected, which tool was listed, which arguments were sent, what output returned, or who approved the crossing.
Governance Standard
A serious MCP governance regime should treat every server as a privileged integration.
First, keep a server register. Record server name, owner, publisher, source, version, endpoint, transport, local command if any, data classes exposed, tools enabled, OAuth scopes, downstream APIs, retention terms, security contact, and review date.
Second, separate read from write. Read-only tools, write actions, shell-like execution, payment actions, external messaging, repository mutation, and permission changes should live in separate approval classes.
Third, narrow the tool surface. Import only the tools required for the workflow. Prefer explicit allowlists over broad server exposure, especially when a server offers many functions.
Fourth, show the crossing. Approval prompts should display server identity, tool name, data payload summary, destination, credential class, expected output, and whether the choice applies once or persists.
Fifth, label output as evidence, not authority. Tool results and resource content should be marked as untrusted unless they come from a server trusted for that data class and task. They should not be allowed to rewrite the agent's goal or permissions.
Sixth, control local execution. Local servers should be sandboxed, rooted, version-pinned, and installed only after showing exact commands and expected privileges. No local MCP server should receive ambient access to secrets merely because it runs nearby.
Seventh, bind authorization to identity and audience. Use short-lived, narrow tokens where possible, validate token audience, avoid token passthrough, support revocation, and require step-up approval for insufficient scopes or high-risk actions.
Eighth, re-approve meaningful change. New tools, changed descriptions, new write capabilities, changed endpoints, new scopes, dependency updates, or server-owner changes should trigger review.
Ninth, preserve audit receipts. Logs should capture user intent, host, client, server identity, server version, listed tools, allowed tools, resource reads, tool arguments, approvals, returned outputs, errors, and final actions. Sensitive contents should be minimized and access-controlled.
Tenth, test the boundary. Red-team indirect prompt injection, tool poisoning, schema drift, cross-server data flow, stale tokens, local-server execution, and denial-of-service behavior before connecting MCP to restricted records or consequential workflows.
What This Changes
The MCP server is where the Mirror grows hands.
A language model alone can confuse, persuade, invent, and misread. A model connected through MCP can ask the database, inspect the repository, query the calendar, fetch the file, call the CRM, open the ticket, or request sensitive information from the user. The interface stops being a box of text and becomes a set of institutional crossings.
The danger is not that MCP is bad. The danger is that a standard connector can make an authority decision feel like a compatibility decision. Once the server connects, the model's context becomes a place where private data, tool descriptions, external outputs, OAuth scopes, and human approvals meet.
The practical discipline is to keep the crossing visible. Name the server. Name the data. Name the credential. Name the action. Name the output. Name the approval. Then log enough to prove what happened without turning the log into another leak.
Source Discipline
Claims about MCP should name the protocol version and the product surface. The 2025-11-25 MCP specification is not the same thing as a vendor's MCP client, a hosted connector, a local server package, a public registry, or a specific server implementation.
Security claims should also separate categories. The official specification can establish architecture, features, authorization requirements, and stated trust-and-safety principles. OpenAI, Microsoft, and GitHub docs can establish how those vendors expose MCP in their products. OWASP, NIST, and NSA materials can establish risk frames and recommended controls. None of those sources prove that a particular MCP server is safe, current, least-privilege, or appropriate for restricted data.
Project-level review should use direct evidence: server source, signed releases, dependency manifests, tool schemas, endpoint ownership, OAuth scopes, audit logs, incident history, sandbox policy, data-retention terms, and test results. A server being MCP-compatible is an interoperability claim, not a trust claim.
Related Pages
- Model Context Protocol
- Tool Use and Function Calling
- Prompt Injection
- Context Poisoning
- Agentic Supply Chain Vulnerabilities
- AI Agent Identity
- AI Agent Observability
- AI Agent Sandboxing
- Agent Tool Permission Protocol
- Agent Prompt Hardening
- Agent Audit and Incident Review
- Vendor and Platform Governance
- The Enterprise Connector Becomes the Permission Map
- The Agent Log Becomes the Receipt
- The AI Browser Becomes the Control Surface
Sources
- Anthropic, Introducing the Model Context Protocol, November 25, 2024.
- Model Context Protocol, Specification, version 2025-11-25, reviewed June 24, 2026.
- Model Context Protocol, Architecture, version 2025-11-25.
- Model Context Protocol, Tools, Resources, and Elicitation, version 2025-11-25.
- Model Context Protocol, Authorization specification, version 2025-11-25.
- Model Context Protocol, Security Best Practices, reviewed June 24, 2026.
- OpenAI Developers, MCP and Connectors, reviewed June 24, 2026.
- OpenAI Developers, Building MCP servers for ChatGPT Apps and API integrations, reviewed June 24, 2026.
- Microsoft Copilot Blog, Introducing Model Context Protocol in Copilot Studio, March 19, 2025.
- Microsoft Learn, Extend your agent with Model Context Protocol, reviewed June 24, 2026.
- GitHub Docs, About Model Context Protocol (MCP), reviewed June 24, 2026.
- GitHub, github/github-mcp-server, reviewed June 24, 2026.
- OWASP Foundation, OWASP MCP Top 10 and MCP Tool Poisoning, reviewed June 24, 2026.
- NIST, AI Agent Standards Initiative, reviewed June 24, 2026.
- NIST NCCoE, Software and AI Agent Identity and Authorization, concept paper published February 5, 2026.
- National Security Agency, Model Context Protocol (MCP): Security Design Considerations, May 2026.