Wiki · Concept · Last reviewed June 25, 2026

MCP Tool Annotations

MCP tool annotations are optional behavior hints in Model Context Protocol tool definitions; they help clients display and gate tool risk, but they are not security contracts.

Definition

MCP tool annotations are metadata fields attached to a Model Context Protocol tool definition. The current MCP schema reference for version 2025-11-25 defines a ToolAnnotations interface with title, readOnlyHint, destructiveHint, idempotentHint, and openWorldHint. The same reference states that these properties are hints and that clients should not base tool-use decisions on annotations from untrusted servers. That caveat is the heart of the concept: annotations are useful labels, not proof.

How It Works

An MCP server exposes tools that can be listed and called by a client. The tools specification describes MCP tools as model-controlled: a language model can discover and invoke them automatically according to context and the user's prompt, while implementations design their own user interface and approval flow.

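A tool definition can include a programmatic name, description, input schema, optional output schema, execution metadata, and annotations. In the schema reference, readOnlyHint defaults to false, destructiveHint defaults to true, idempotentHint defaults to false, and openWorldHint defaults to true. A tool that declares no annotations therefore resolves to the most cautious reading: not read-only, potentially destructive, not idempotent, and open-world. Missing annotations never imply that a tool is safe.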
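The defaults described above can be sketched as a small resolver. This is a minimal illustration, not MCP SDK code: ResolvedHints, resolve_hints, and _flag are hypothetical names, and the input is assumed to be the annotations object of a tool definition already parsed from JSON into a dict (or None when the tool declares none).

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ResolvedHints:
    """Effective hint values after applying the schema defaults (hypothetical type)."""
    read_only: bool
    destructive: bool
    idempotent: bool
    open_world: bool


def _flag(raw: Mapping[str, Any], key: str, default: bool) -> bool:
    # Fall back to the default unless the server sent an actual boolean.
    value = raw.get(key)
    return value if isinstance(value, bool) else default


def resolve_hints(annotations: Optional[Mapping[str, Any]]) -> ResolvedHints:
    raw = annotations or {}
    return ResolvedHints(
        read_only=_flag(raw, "readOnlyHint", False),
        destructive=_flag(raw, "destructiveHint", True),
        idempotent=_flag(raw, "idempotentHint", False),
        open_world=_flag(raw, "openWorldHint", True),
    )


# A tool with no annotations resolves to the most cautious reading.
print(resolve_hints(None))
print(resolve_hints({"readOnlyHint": True, "openWorldHint": False}))
```

Resolving the defaults once, at the point where the tool list is received, keeps later display and logging code from quietly treating a missing field as false. The resolved values are still only hints: nothing here verifies what the tool actually does.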

A well-behaved client can use annotations to improve displays and preflight decisions. A destructive, non-idempotent, open-world tool should usually trigger clearer confirmation, tighter logging, and more skepticism about returned content.
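One way a client might turn hints into preflight behavior is sketched below. The policy itself is an assumption made for illustration, not something the specification prescribes: Preflight, preflight, the three confirmation levels, and the server_trusted flag are all hypothetical, and a real client would choose its own thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preflight:
    """What the client does before and after a call (hypothetical type)."""
    confirm: str  # "none", "standard", or "explicit"
    detailed_logging: bool
    treat_output_as_untrusted: bool


def preflight(
    server_trusted: bool,
    read_only: bool = False,
    destructive: bool = True,
    idempotent: bool = False,
    open_world: bool = True,
) -> Preflight:
    # Hints from an untrusted server are ignored: assume the worst case.
    if not server_trusted:
        return Preflight("explicit", True, True)
    if read_only:
        confirm = "none"
    elif destructive and not idempotent:
        confirm = "explicit"
    else:
        confirm = "standard"
    return Preflight(
        confirm=confirm,
        detailed_logging=not read_only,
        treat_output_as_untrusted=open_world,
    )


print(preflight(server_trusted=True))                  # schema defaults
print(preflight(server_trusted=False, read_only=True)) # label not believed
```

The keyword defaults mirror the schema defaults, so a trusted tool with no annotations lands on explicit confirmation. The result only shapes the approval flow and logging; it is not an authorization decision.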

Current Context

As of June 25, 2026, MCP lists version 2025-11-25 as the latest specification version. The MCP project's March 16, 2026 post on tool annotations says the feature shipped in the 2025-03-26 spec revision and frames the four boolean hints as a basic risk vocabulary. It also describes open questions about response annotations and runtime evaluation, so annotations should be treated as an active governance surface rather than finished plumbing.

Why It Matters

Tool annotations matter because agent systems often make decisions from machine-readable context. A human may never read the full schema, but the model, client, approval UI, and logging system may all rely on tool metadata to decide what to show, what to permit, and what to remember.

That gives annotations a double role. They reduce ambiguity by telling the client that a tool is read-only, destructive, repeat-safe, or externally connected. They also become a failure point if a malicious, stale, or compromised server lies. A false readOnlyHint is not just bad documentation; it can soften the confirmation path for a tool that changes the world.

For institutional use, the strongest reading is conservative: annotations should shape user experience, triage, and audit labels, while actual enforcement lives in authorization scopes, runtime sandboxing, server-side policy, validation, and human approval.

Failure Modes

False safety labels. A server can mark a tool read-only while the implementation writes, sends, purchases, publishes, deletes, or changes permissions.

Schema and descriptor poisoning. OWASP treats tampered schemas, manifests, and metadata as a supply-chain style attack because agents may execute valid-looking calls under a malicious contract.

Runtime drift. A tool can be reviewed with one description or annotation set, then return with changed metadata after an update, reconnect, registry change, or compromised deployment.

Open-world underlabeling. A tool that reaches websites, email, repos, chat, documents, or external APIs may bring untrusted content back into the model loop.

Client inconsistency. Different clients may honor annotations differently. One client may ask for confirmation, another may only display a label, and a third may apply no special handling when annotations are missing.

Governance Requirements

Governed deployments should treat annotations as part of the tool manifest and supply chain. Store the server identity, tool name, schema version, annotations, owner, approval decision, and review date. Diff those fields when a server refreshes or a registry entry changes.
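The diffing step can be sketched as a fingerprint over the reviewed fields of a tool definition plus a field-level comparison. This is an illustrative sketch only: REVIEWED_FIELDS, fingerprint, and changed_fields are hypothetical names, the choice of fields is an assumption, and the example tool is invented.

```python
import hashlib
import json
from typing import Any, List, Mapping

# Assumed set of fields that were part of the approval review.
REVIEWED_FIELDS = ("name", "description", "inputSchema", "annotations")


def fingerprint(tool: Mapping[str, Any]) -> str:
    """Stable SHA-256 hash over the reviewed fields of one tool definition."""
    subset = {key: tool.get(key) for key in REVIEWED_FIELDS}
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_fields(approved: Mapping[str, Any], current: Mapping[str, Any]) -> List[str]:
    """Names of reviewed fields whose values differ from the approved copy."""
    return [key for key in REVIEWED_FIELDS if approved.get(key) != current.get(key)]


approved = {
    "name": "archive_search",
    "description": "Search the archive.",
    "inputSchema": {"type": "object"},
    "annotations": {"readOnlyHint": True},
}
# The same tool after a server refresh, with an altered annotation set.
current = dict(approved, annotations={"readOnlyHint": True, "openWorldHint": False})

if fingerprint(current) != fingerprint(approved):
    print("re-review needed:", changed_fields(approved, current))
```

Storing the fingerprint next to the approval decision makes a refresh cheap to check, while the field list tells the reviewer where to look. A changed fingerprint is a reason to re-review, not proof of tampering.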

Clients should display annotation-driven risk honestly: read-only versus write-capable, destructive versus additive, idempotent versus repeat-sensitive, closed-domain versus open-world. Sensitive operations should still show arguments and expected side effects before execution.

Authorization should not depend on a hint. A tool that claims to be read-only still needs least-privilege credentials, and a tool that claims to be closed-domain still needs result validation before its output enters the model context.
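A minimal sketch of that separation follows, assuming an operator-maintained table of required scopes. REQUIRED_SCOPES, is_authorized, the scope strings, and the tool names are all hypothetical; the point is only that the decision never reads the annotations.

```python
from typing import Any, Dict, FrozenSet, Mapping

# Hypothetical operator-maintained policy: tool name -> scopes the call needs.
REQUIRED_SCOPES: Dict[str, FrozenSet[str]] = {
    "archive_search": frozenset({"archive:read"}),
    "archive_delete": frozenset({"archive:read", "archive:write"}),
}


def is_authorized(tool: Mapping[str, Any], granted: FrozenSet[str]) -> bool:
    """Decide from operator policy and granted scopes; annotations are never consulted."""
    required = REQUIRED_SCOPES.get(tool["name"])
    if required is None:
        return False  # unknown tools are denied by default
    return required <= granted


# A tool that mislabels itself as read-only gains nothing from the label.
mislabeled = {"name": "archive_delete", "annotations": {"readOnlyHint": True}}
print(is_authorized(mislabeled, frozenset({"archive:read"})))
```

Because the check is keyed on policy the operator controls, a false readOnlyHint can at worst mislead the display, not widen what the credentials allow.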

Audit logs should preserve the annotations the model and user actually saw, not only the final tool call. If an incident turns on a misleading label, the investigation needs historical metadata, server version, client behavior, and approval path.
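An audit entry along these lines might look like the sketch below. The field names, the audit_record function, and the approval labels are assumptions for illustration, not a format defined by MCP; the entry is serialized immediately so that later changes to the tool definition cannot alter what was recorded.

```python
import json
from datetime import datetime, timezone
from typing import Any, Mapping, Optional


def audit_record(
    server_id: str,
    server_version: str,
    tool: Mapping[str, Any],
    arguments: Mapping[str, Any],
    approval: str,
    now: Optional[datetime] = None,
) -> str:
    """Return one JSON line capturing the metadata shown at call time."""
    now = now or datetime.now(timezone.utc)
    record = {
        "timestamp": now.isoformat(),
        "server_id": server_id,
        "server_version": server_version,
        "tool_name": tool["name"],
        # Snapshot of the annotations as presented, not as currently served.
        "annotations_shown": tool.get("annotations"),
        "arguments": dict(arguments),
        "approval": approval,
    }
    return json.dumps(record, sort_keys=True)


tool = {"name": "archive_delete", "annotations": {"readOnlyHint": True}}
line = audit_record("srv-1", "1.4.0", tool, {"id": 7}, "user_confirmed")
print(line)
```

Recording the annotations alongside the approval path lets an investigation distinguish a misleading label from a client that ignored an accurate one.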

Source Discipline

Claims about MCP annotations should name the MCP specification version. The 2025-11-25 tools and schema pages are the primary sources for the current interface and trust warning reviewed here. Separate specification from implementation: the spec defines fields, while a client decides how to display, confirm, log, or enforce them.

Spiralist Reading

MCP tool annotations are tiny governance texts written for machines.

They say: this action is only looking, this one may change the archive, this one touches the open world. The danger is not that such labels exist. The danger is treating the label as the law.

The Spiralist discipline is to keep the label visible without worshiping it. The boundary has to be made of permissions, review, logs, and accountable humans.
