Wiki · Concept · Last reviewed June 25, 2026

MCP Sampling

MCP sampling is the Model Context Protocol client feature that lets a server ask the client to run a language-model generation, keeping model access with the client while allowing servers to build agentic workflows.

Category: Concept Published: June 25, 2026 Modified: June 25, 2026 Last reviewed: June 25, 2026 Tags: MCP, Sampling, AI Agents, Tool Use, Agent Security

Definition

MCP sampling is a client feature in the Model Context Protocol that lets a server request an LLM generation from the client. The server sends a sampling/createMessage request; the client chooses and invokes a model it controls; and the generated result returns to the server. The point is architectural: the server can use model capability without holding the user's model API keys or deciding which provider the client must use.

This is not "sampling" in the narrow decoding sense of temperature or top-p alone. In MCP, sampling names a protocol flow: a server asks the client to create a message. The 2025-11-25 specification reviewed for this entry describes support for text, image, and audio content in sampling messages, model preferences, optional context inclusion, and tool-enabled sampling.

How It Works

Sampling is negotiated during MCP lifecycle initialization. A client that supports the feature declares a sampling capability. The lifecycle page lists sampling beside other client capabilities such as roots and elicitation, and says both parties must use only capabilities that were successfully negotiated. That means a server should not treat sampling as automatically available in every MCP session.

A basic request includes messages, optional model preferences, an optional system prompt, and a token limit. Model preferences let a server state priorities such as cost, speed, or intelligence, and may include model hints. The client still makes the final model choice. That matters for governance because the requester's preference is not the same as an entitlement to a particular model, provider, budget, or data path.

The spec recommends a human-review posture. Applications should make sampling requests easy to review, allow users to view and edit prompts before sending, and present generated responses for review before delivery. This is not just a user-experience concern. Sampling lets a server put text in front of a model the client controls, so the review step is part of the trust boundary.

Tool-Enabled Sampling

The 2025-11-25 key-changes page identifies tool calling support in sampling, through tools and toolChoice, as a major change from the previous MCP revision. A server can include tool definitions and ask the client's model to use them during sampling. The client must declare support for tool-enabled sampling through sampling.tools, and servers must not send tool-enabled sampling requests to clients that have not declared that support.

Tool-enabled sampling creates a nested agent loop. The model may request tool use, the server executes those tool uses, and the server can send a follow-up sampling request with tool results appended. The specification requires tool use and tool result messages to stay balanced, and it recommends iteration limits for tool loops. Without those limits, a server-requested generation can become a long-running chain of delegated model calls and tool executions.

Governance Requirements

A governed client should treat every sampling request as an auditable request for model labor and model exposure. The record should name the server, request timestamp, negotiated capabilities, selected model, prompt text or redacted prompt hash, model preferences, tool list, tool-choice mode, user approval state, token budget, iteration cap, response review state, and whether the response was returned to the server.

Enterprises should separate three approvals that are easy to blur: permission for the server to connect, permission for the server to request sampling, and permission for a particular prompt or tool loop. A server that is safe for reading project metadata may not be safe to feed arbitrary workspace excerpts into a model. A server that may request one generation may not be safe to run a multi-turn loop with tools.

Failure Modes

Prompt laundering. A server can wrap hostile or sensitive text inside a legitimate-looking sampling request, asking the client's model to summarize, transform, or act on it.

Model-access confusion. Users may think the server is doing the computation, while the client is actually spending model budget, exposing context, or invoking a provider under the user's account.

Tool-loop drift. A tool-enabled sampling request can keep accumulating tool results and follow-up prompts until the original user intent is no longer visible.

Preference overreach. Model hints are advisory, but a weak client UI can make the server's requested model family look like a requirement rather than a preference.

Source Discipline

Claims about MCP sampling should cite the protocol version, because the 2025-11-25 specification added tool calling support to sampling. Claims about availability should cite lifecycle negotiation, not only the sampling page. Security claims should cite both the sampling page's security considerations and broader MCP security guidance on consent, token handling, local-server compromise, scope minimization, and auditability.

Spiralist Reading

MCP sampling turns the client model into a callable resource for the server. That inversion is useful, but it also changes who is borrowing whose authority.

The Spiralist lesson is to preserve the handoff. A sampling request should not vanish into the smoothness of the assistant. It should leave a receipt: who asked, what text crossed the boundary, which model answered, which tools were offered, and when a human had the chance to stop the loop.

Open Questions

Should clients require separate approval for tool-enabled sampling even when basic sampling has been approved?
How should clients display the cost and privacy impact of a sampling request before it is sent?
What iteration limits are appropriate for nested tool loops in high-risk workflows?
How much of a prompt can be logged safely while still preserving evidence for incident review?

Sources

Model Context Protocol, Sampling, version 2025-11-25, reviewed June 25, 2026.
Model Context Protocol, Lifecycle, version 2025-11-25, reviewed June 25, 2026.
Model Context Protocol, Key Changes, version 2025-11-25, reviewed June 25, 2026.
Model Context Protocol, Security Best Practices, reviewed June 25, 2026.
GitHub, modelcontextprotocol/modelcontextprotocol, official specification and documentation repository, reviewed June 25, 2026.

Return to Wiki