Blog · Analysis · Last reviewed June 23, 2026

The Tool Server Becomes the Trust Boundary

Model Context Protocol makes agents useful by connecting them to tools and data. It also moves the trust boundary from the chat window into a new infrastructure layer of servers, tool metadata, credentials, permissions, approvals, and logs.

For this essay, a tool server is any MCP server, connector wrapper, local subprocess, remote endpoint, tunnel-backed private service, or registry-discovered package that exposes model-visible context or model-callable tools. The trust boundary is the point where natural-language context, delegated credentials, and executable authority meet.

From Chat to Tools

The first public image of generative AI was the chat window: a user typed, a model answered, and the risk seemed to live mostly in language. The answer might be false, biased, manipulative, derivative, overconfident, or emotionally sticky. Those risks remain real, but the center of gravity has moved.

Modern agent systems do not only answer. They search, retrieve, inspect files, call APIs, operate browsers, write code, edit records, query databases, schedule work, and trigger other software. A model with no tools is a speaking system. A model with tools is an actor inside someone else's permissions.

That is why Model Context Protocol, or MCP, deserves attention beyond developer convenience. Every AI company and enterprise wants agents that can reach useful context without building a custom integration for every calendar, repository, database, ticketing system, drive, customer record, and internal workflow. MCP offers a common doorway.

But a doorway is not neutral. It decides what enters, what exits, what is named, what is callable, and what evidence survives afterward.

Current Context

As of June 23, 2026, MCP is no longer only an Anthropic experiment or a local-developer convenience. Anthropic announced MCP on November 25, 2024 as an open standard for connecting AI assistants to data sources and tools. On December 9, 2025, Anthropic said it was donating MCP to the Linux Foundation's Agentic AI Foundation, and described adoption across ChatGPT, Cursor, Gemini, Microsoft Copilot, Visual Studio Code, and other products. The Linux Foundation's announcement the same day named MCP, Block's goose, and OpenAI's AGENTS.md as founding contributions to the new foundation, and said MCP had more than 10,000 published servers.

The official MCP Registry changes the distribution problem. Its documentation describes a preview centralized metadata repository for publicly accessible MCP servers, meant mainly to feed downstream registries and marketplaces rather than to be consumed directly by host applications. Its terms also say the registry is provided as-is and does not guarantee the accuracy, completeness, safety, durability, or availability of listed servers or registry data. Registry presence is therefore provenance evidence, not trust.

The protocol itself has also matured. The latest published MCP specification reviewed for this article is version 2025-11-25. It defines stdio and Streamable HTTP transports, treats tools as server-provided capabilities that can be invoked through JSON-RPC, and builds HTTP authorization around OAuth-oriented discovery, protected resource metadata, token audience validation, PKCE, HTTPS, and step-up authorization. The November 2025 changelog also added authorization-discovery improvements, tool-name guidance, URL-mode elicitation, icon metadata for protocol objects, and other details that make metadata and consent more operational. That is a real protocol surface, not a branding wrapper around function calling.

Product adoption has moved MCP into distribution and enterprise governance surfaces. OpenAI's current API documentation distinguishes OpenAI-maintained connectors from arbitrary remote MCP servers on the public internet, says the API lists a server's tools before tool calls, and says approvals are required by default before data is shared with a connector or remote MCP server. OpenAI also documents Secure MCP Tunnel for private, on-premises, or firewalled MCP servers, which keeps the server off the public internet while routing supported product calls through an outbound tunnel. That improves the network shape, but it does not remove the need for authorization, app-level logging, data-retention rules, and review of what the server returns.

OpenAI's ChatGPT developer-mode documentation also says remote MCP apps can use SSE or streaming HTTP, OAuth, no authentication, or mixed authentication, and that refreshing an app can pull new tools, descriptions, and server instructions from the MCP server. It treats tools without the readOnlyHint annotation as write actions and requires confirmation by default for write actions. Those defaults are useful, but they also prove the point: metadata, annotations, refresh behavior, and remembered approvals are part of the boundary.

Security guidance has moved in parallel. The MCP project's own security best practices warn against token passthrough, broad scopes, session misuse, local-server compromise, SSRF, and weak consent. OWASP now maintains an MCP Top 10 and an Agentic Applications Top 10, naming risks such as tool poisoning, prompt injection via contextual payloads, insufficient authentication and authorization, missing audit telemetry, shadow MCP servers, and context over-sharing. NIST's AI Agent Standards Initiative places agent authentication, identity infrastructure, secure operation, and interoperability inside active standards work.

None of this proves the ecosystem is safe. It proves the boundary matters. Once a tool server becomes a common way for models to reach files, accounts, APIs, browsers, and internal systems, its identity, metadata, scopes, logs, registry status, update channel, and approval model become governance infrastructure.

What MCP Standardizes

MCP standardizes a relationship between an AI client and external servers. In broad terms, the client is the assistant or agent runtime. The server exposes context and capabilities. The model sees resources, prompts, and tools, then the system may call those tools as part of a task.

A tool server is therefore not just an adapter. It is a model-facing integration boundary that supplies callable operations, retrieved data, prompt material, tool names, descriptions, schemas, output shapes, authentication context, and sometimes local process access. Once the model can use it, the server's metadata and responses become part of the action path.

The boundary has four separable layers. The transport layer decides whether the server is a local subprocess, a remote HTTP endpoint, or a hosted connector. The authorization layer decides which token, user grant, scope, or service account can be used. The context layer decides which descriptions, prompts, resources, and returned content the model reads. The action layer decides which calls can read, write, send, delete, execute, publish, or change permissions.

There is also a discovery layer: registry metadata, package identifiers, install commands, remote URLs, namespace ownership signals, server versions, and downstream marketplace listings. Discovery helps a client or organization find a server. It should not be confused with enabling that server, trusting that server, or granting that server access to private data.

That sounds like plumbing, and in one sense it is. Good plumbing matters. It can replace brittle one-off integrations with shared conventions. It can make it easier to build agents that know which tools exist, what arguments they take, what data they return, and how authorization should work. It can make logs and approvals more consistent than a pile of bespoke scripts.

The mistake is treating standardization as safety. A standard can make a dangerous pattern repeatable. A clean interface can carry poisoned context. A well-documented connector can still have excessive scope. A tool schema can be syntactically valid while semantically misleading. An official registry can reduce chaos without eliminating dependency risk. Transport, discovery, schemas, and authorization flows are necessary controls; they are not the whole safety case.

The official MCP security best-practices documentation already points toward this problem. It discusses attacks including token passthrough, session hijacking, SSRF, local server compromise, scope inflation, and prompt-injection-style payloads in resumed sessions. OWASP's MCP Top 10 names a broader risk map: privilege escalation through scope creep, tool poisoning, software supply-chain attacks, command injection, prompt injection through contextual payloads, insufficient authentication, missing audit telemetry, shadow MCP servers, and context over-sharing.

That is the crucial turn. The security question is not merely whether the model is aligned. It is whether the tool environment is governable.

Why the Boundary Moved

Traditional software security often asks where the trust boundary sits. Which code is trusted? Which input is untrusted? Which process owns the credential? Which network request crosses into a different authority zone? Which logs can prove what happened?

Agentic AI complicates those questions because the model interprets text as part of action. A webpage, ticket, email, README, calendar invite, spreadsheet cell, product description, error message, or tool description may become context for the model. If the system does not sharply separate data from authority, untrusted text can try to become an instruction.

MCP intensifies this because it gives interpreted context a route to action. A malicious or compromised server does not necessarily need to exploit memory corruption or steal a password in the old way. It may try to shape the model's behavior by changing tool descriptions, returning crafted outputs, exposing misleading schemas, shadowing a trusted tool, or combining one harmless-looking read tool with another tool that can send data outward.

The dangerous combination is private data, untrusted content, and external communication or state change in the same workflow. OpenAI's current MCP and connectors guidance warns that malicious remote MCP servers can exfiltrate sensitive data from anything that enters model context, and separately recommends logging and reviewing data shared with third-party MCP servers. Trusting the developer of one server does not remove the risk introduced by hostile content from another source.

Invariant Labs has described these patterns in terms of tool poisoning and toxic flows: attacks where prompt-injection-style payloads move through agent systems and enable data exfiltration or other unsafe behavior. Trail of Bits has similarly warned about MCP risks including malicious tool descriptions, conversation-history theft, insecure credential storage, and deceptive terminal output.

The important feature is semantic attack surface. The adversary attacks what the model reads, not only what the program executes. The tool server becomes a trust boundary because it supplies both operational capability and the words that explain that capability to the model.

Tool Descriptions Are Interface Law

In ordinary software, documentation is secondary to code. Bad documentation can mislead a developer, but the program still follows the implementation. In agent systems, descriptions can become operational. The MCP tools specification describes tools as model-controlled: the model can discover and invoke them based on contextual understanding. It also says a tool definition includes a name, human-readable description, input schema, optional output schema, annotations, and execution properties. The description is no longer only documentation. It is part of the control surface.

That makes MCP tool metadata a strange new kind of interface law. It tells the model what the tool is for, when to use it, what arguments matter, and sometimes how to interpret results. If that metadata is hostile, stale, overly broad, or quietly changed after approval, the agent's practical world changes.

This is why "approval" cannot mean "the user clicked once when installing a connector." A user or organization may approve a server called GitHub, Slack, Gmail, Files, Calendar, CRM, or Database without understanding the exact tools, scopes, server owner, version, output behavior, or update policy. The server may later change its descriptions, and MCP includes a tool-list-changed notification for clients that support it. A similar-looking server may appear in a registry. A developer may enable a local server for a narrow experiment and leave it available to a production agent.

Product behavior already reflects this. OpenAI's ChatGPT developer-mode documentation says write actions require confirmation by default, that the system respects the MCP readOnlyHint annotation, and that tools without that hint are treated as write actions. It also lets users remember an approve or deny choice for a tool for the rest of a conversation. Those are useful controls, but they also show why annotations, defaults, and approval memory become part of the security boundary.

There is also a legibility asymmetry. The model may see instructions that the user never sees, or may weigh descriptions the user treats as harmless setup text. The agent's interface is not identical to the user's interface. The user sees a helpful assistant with a connector. The model sees a tool list with hidden authority cues, permissions, data channels, and prompts embedded in ordinary language.

That asymmetry is a governance problem, not only a UX problem. If an agent can read a private drive, query a customer database, and send messages, then the words used to describe those powers should be reviewable, versioned, and monitored like code.

The New Supply Chain

MCP servers are software dependencies plus delegated authority. That combination should make institutions cautious.

A normal dependency can break a build, introduce a vulnerability, or compromise a system. An MCP server can do those things while also changing what the agent knows, what it thinks it can do, and which external services it can touch. The server may hold credentials. It may bridge to local files. It may connect to internal tools. It may sit between the model and records that other people treat as institutional memory.

This creates a new shadow-infrastructure problem. Developers, researchers, teams, and vendors may spin up MCP servers because they are useful. Some will be official. Some will be experimental. Some will be abandoned. Some will run locally. Some will be remote. Some will be wrappers around SaaS products. Some will be community packages. Some will be installed by people who do not think of themselves as expanding an organization's authority surface.

OWASP's category of shadow MCP servers is useful because it names the organizational version of the risk. The dangerous server is not always malicious. It may simply be unreviewed, over-permissioned, unlogged, unpatched, or forgotten. Convenience becomes infrastructure before governance catches up.

The official registry can reduce part of this problem by making server metadata more discoverable and structured. It can also create a new false comfort if organizations treat a registry entry, package namespace, marketplace listing, or "official" label as a substitute for security review. A registry tells you where a server claims to come from and how it can be installed or reached. It does not tell you whether the tool descriptions are safe, whether the scopes are narrow enough, whether the server is appropriate for a regulated workflow, or whether its next update will preserve the same behavior.

This is the same pattern that made browser extensions, npm packages, SaaS integrations, spreadsheet macros, and Slack apps into governance problems. MCP adds a model-mediated layer: the dependency does not merely run code or fetch data. It speaks to the agent in a language the agent may obey. That connects this essay to the site's work on agent stores, enterprise connector permission maps, and agentic supply-chain vulnerabilities.

The Governance Standard

A serious MCP governance program should treat every server as an access path, every tool as a permissioned operation, and every description as instruction-bearing context.

First, servers need provenance. Record who maintains the server, where the code lives, what version is enabled, how updates occur, what registry or package source supplied it, and who approved it.

Second, transport and hosting need labels. A local stdio server, a remote Streamable HTTP endpoint, an OpenAI-maintained connector, a ChatGPT developer-mode app, and a private enterprise server have different operators, egress paths, credential flows, and incident contacts. The interface should not flatten them into the same word: tool.

Third, tools need least privilege. Separate read, write, send, delete, execute, purchase, publish, and permission-changing operations. A server should not expose broad action just because the underlying account can perform it.

Fourth, descriptions need review and pinning. Tool names, descriptions, schemas, examples, annotations, server instructions, and prompts should be hashed or versioned so a silent change cannot quietly change the model's operating instructions.

Fifth, context needs labels. The client should distinguish user instruction, developer instruction, tool metadata, retrieved content, untrusted external text, private records, and model output. Natural language is not one authority class.

Sixth, credentials need containment. Avoid long-lived personal tokens, shared admin accounts, plaintext secrets, and broad OAuth scopes. Prefer short-lived, scoped, revocable credentials tied to a named agent or service account.

Seventh, consequential calls need approval gates. Sending messages, moving money, deleting records, changing permissions, publishing, deploying, or writing to systems of record should require explicit review with the exact server, tool, arguments, recipient, data class, and consequence displayed.

Eighth, approval memory needs limits. Remembered approvals should be scoped by conversation, tool, action class, and server version. A remembered approval for a read operation should not become permission for later writes, exports, deletes, or cross-server data movement.

Ninth, runtime drift needs alarms. A refreshed tool list, changed description, new server instruction, altered endpoint, new OAuth scope, or changed read-only annotation should trigger review proportional to risk, not silent acceptance.

Tenth, logs need enough detail for incident review. A usable trace should include the user request, tools listed, server identity, transport, version, scopes, tool calls, arguments, returned data, approvals, blocked calls, external data shared, and final output.

Eleventh, organizations need a server register. If no one can list which MCP servers are enabled, what they can touch, and who owns them, the organization does not have agent governance. It has hope with connectors attached.

Twelfth, red teams need to attack the workflow, not only the model. Test tool poisoning, prompt injection through returned content, tool shadowing, excessive scope, cross-tool exfiltration, malicious updates, confused-deputy paths, SSRF through discovery, local-server compromise, and approval fatigue.

Thirteenth, users need honest friction. The safest interface is not the one that asks for approval constantly until approval becomes meaningless. It is the one that distinguishes ordinary reading from consequential action and interrupts only where authority actually changes.

Fourteenth, discovery must be separated from enablement. Finding a server in the MCP Registry, a marketplace, an internal catalog, or a vendor directory should create an intake record, not an automatic grant. Enablement should require task purpose, owner, data classes, scopes, transport, retention, and incident contact.

Fifteenth, manifest changes need review. Server metadata, tool descriptions, schemas, annotations, prompts, remote URLs, install commands, package references, headers, OAuth scopes, and tunnel associations should be diffed and approved when they change. This is AI change management applied to connectors.

Sixteenth, data boundaries must follow the call. Once data is sent to a third-party MCP server, connector, or tunnel-backed service, the server's own retention, residency, logging, and subprocessor rules matter. Data sharing should be recorded in an AI system inventory and governed through data minimization, not hidden inside a tool call.

Seventeenth, revocation must be rehearsed. The organization should know how to disable a server, revoke tokens, rotate secrets, quarantine logs, remove a registry or marketplace listing from internal allowlists, notify affected users, and preserve evidence for AI incident reporting.

Those controls belong with adjacent disciplines: agent identity, agent logs as receipts, permission classes, incident review, prompt-injection defense, agent sandboxing, agent observability, and AI audit trails. The trust boundary is not a single product setting. It is a chain of controls that has to survive live use.

What This Changes

MCP is one of the places where the interface stops being a mirror and becomes a hallway.

The user speaks into a model. The model asks a server. The server returns context. The model chooses a tool. The tool touches a file, account, database, calendar, repository, inbox, browser, payment rail, or memory store. The result comes back as a fluent answer. To the user, it may feel like one assistant. In reality, it is a chain of institutions.

That chain changes belief formation. An agent with connected tools can make its answer feel more grounded because it has looked things up, touched live systems, and returned with operational confidence. But if the tool boundary is weak, the confidence is not evidence. It is the surface effect of a system that may have read poisoned context, used overbroad permissions, or hidden uncertainty inside a clean response.

The deeper risk is high-control interface drift. Once the tool server becomes invisible, the user stops seeing where authority enters. The answer feels like cognition. The action feels like convenience. The permission feels like setup. The audit trail becomes something only specialists inspect after harm.

The right discipline is to make the hallway visible. Name the server. Name the tool. Name the permission. Name the data source. Name the action. Name the uncertainty. Keep the receipt.

MCP can make agents more useful, interoperable, and inspectable. It can also make institutional life pass through model-mediated doors before anyone has decided who holds the keys. The trust boundary has moved. Governance has to move with it.

Source Discipline

This essay separates protocol claims from product claims. The MCP specification defines protocol objects, transports, authorization expectations, and tool schemas for a particular version. OpenAI documentation describes OpenAI's current API and ChatGPT implementation choices. A deployment can follow one without inheriting the guarantees of the other.

It also separates risk taxonomies from prevalence claims. OWASP's MCP Top 10 and Agentic Applications Top 10 are used here as security categories, not as measurements of how often each attack succeeds. NIST's AI Agent Standards Initiative is evidence that identity, authentication, interoperability, and secure operation are standards priorities, not a certification that any product is safe.

Registry and marketplace sources are metadata sources, not assurance sources. They can establish that a server is listed, named, versioned, published through a particular package or remote endpoint, or connected to a particular namespace. They do not prove that the server is safe, least-privilege, well-maintained, suitable for restricted data, or unchanged since a prior approval.

For live deployments, source discipline means naming the MCP spec version, product surface, transport, server operator, server version, OAuth scope, tool metadata version, approval setting, logging retention, and data-residency boundary. "Supports MCP" is not a safety claim until those details are visible.

Current-source claims in this essay were checked against the named primary sources on June 23, 2026.

Sources

Anthropic, Introducing the Model Context Protocol, November 25, 2024.
Anthropic, Donating the Model Context Protocol and establishing the Agentic AI Foundation, December 9, 2025.
Linux Foundation, Linux Foundation announces the formation of the Agentic AI Foundation, December 9, 2025.
Model Context Protocol, Specification, version 2025-11-25, reviewed June 23, 2026.
Model Context Protocol, Transports, version 2025-11-25.
Model Context Protocol, Tools specification, version 2025-11-25.
Model Context Protocol, Key Changes, version 2025-11-25, reviewed June 23, 2026.
Model Context Protocol, Security Best Practices, reviewed June 23, 2026.
Model Context Protocol, Authorization specification, version 2025-11-25.
Model Context Protocol, The MCP Registry, reviewed June 23, 2026.
Model Context Protocol, Official MCP Registry Terms of Service, reviewed June 23, 2026.
OpenAI Developers, MCP and Connectors, reviewed June 23, 2026.
OpenAI Developers, ChatGPT Developer mode, reviewed June 23, 2026.
OpenAI Developers, Secure MCP Tunnel, reviewed June 23, 2026.
OWASP Foundation, OWASP MCP Top 10, reviewed June 23, 2026.
OWASP Foundation, MCP Tool Poisoning, reviewed June 23, 2026.
OWASP GenAI Security Project, OWASP Top 10 for Agentic Applications for 2026, December 9, 2025.
NIST, AI Agent Standards Initiative, created February 17, 2026, updated April 20, 2026.
NIST, Announcing the AI Agent Standards Initiative, February 17, 2026.
Invariant Labs, Toxic Flow Analysis for agentic systems and MCP servers, July 29, 2025.
Trail of Bits, Secure Your Model Context Protocol, reviewed June 23, 2026.
OpenAI, New tools for building agents, March 11, 2025.
Related references: Model Context Protocol, Tool Use and Function Calling, Prompt Injection, Context Poisoning, AI System Inventory, AI Change Management, AI Agent Observability, AI Audit Trails, AI Incident Reporting, Data Minimization, Agent Tool Permission Protocol, Agent Prompt Hardening, Agent Audit and Incident Review, The Agent Sandbox Becomes the Airlock, The Agent Identity Becomes the Service Account, The Agent Log Becomes the Receipt, The Agent Store Becomes the App Store, The Agent-to-Agent Protocol Becomes the Handshake, The AI Browser Becomes the Control Surface, The Enterprise Connector Becomes the Permission Map, Agentic Supply Chain Vulnerabilities, and AI Agent Sandboxing.

Return to Blog