Wiki · Concept · Last reviewed June 25, 2026

TDM Reservation Protocol

The TDM Reservation Protocol is a machine-readable web convention for declaring whether text and data mining rights are reserved on a resource and, when available, where a miner can find a licensing policy.

Definition

The Text and Data Mining Reservation Protocol, usually shortened to TDMRep, is a W3C Community Group final report published on May 10, 2024. It defines a web protocol for expressing reservations of rights related to text and data mining on lawfully accessible web content, and for discovering TDM licensing policies connected to that content.

TDMRep grew out of a specific legal and operational problem. Article 4 of Directive (EU) 2019/790 creates an exception or limitation for reproductions and extractions made for text and data mining, but that exception depends on rightsholders not having expressly reserved their rights in an appropriate manner. For content made publicly available online, the directive points to machine-readable means as the appropriate form.

The protocol is not a crawler blocklist, an access-control system, or a universal AI-training license. It is a declaration format. A rightsholder can say that TDM rights are reserved, that they are not reserved, or that a policy document exists. A recipient still has to decide what legal, contractual, technical, and institutional consequences follow from that signal.

Protocol Shape

The core signal has two properties. The tdm-reservation property is a boolean flag: 1 means TDM rights are reserved, while 0 means they are not reserved. The tdm-policy property is a URL pointing to a policy set by the rightsholder. The W3C vocabulary page lists the same two properties as reservation and policy in the TDMRep namespace.
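
As a minimal sketch of how a recipient might read these two properties, the Python below turns a mapping of field names to values into a reserved, not-reserved, or unknown result. The names TdmSignal and parse_tdm_signal are hypothetical, and treating any value other than 1 or 0 as unknown is an assumption made for illustration, not a rule taken from the report.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TdmSignal:
    # True = reserved, False = not reserved, None = absent or unrecognized.
    reserved: Optional[bool]
    policy_url: Optional[str]


def parse_tdm_signal(fields: Mapping[str, str]) -> TdmSignal:
    """Read tdm-reservation and tdm-policy from a name -> value mapping.

    Field names are compared case-insensitively, as HTTP field names are.
    Assumption: any reservation value other than "1" or "0" is unknown.
    """
    lowered = {name.lower(): str(value).strip() for name, value in fields.items()}

    raw = lowered.get("tdm-reservation")
    if raw == "1":
        reserved: Optional[bool] = True
    elif raw == "0":
        reserved = False
    else:
        reserved = None

    # Keep the policy URL as declared; an empty string counts as no policy.
    policy_url = lowered.get("tdm-policy") or None
    return TdmSignal(reserved=reserved, policy_url=policy_url)
```

Keeping a third "unknown" state, instead of collapsing it into "not reserved", leaves the decision about silence or malformed values to the recipient's own policy.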

The final report defines several attachment methods. A site can publish a /.well-known/tdmrep.json file with path-matching rules. A server can return tdm-reservation and optional tdm-policy HTTP response fields. An HTML page can include matching meta elements. The report also describes metadata forms for EPUB 2, EPUB 3, and PDF documents.
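
The path-matching step for the well-known file can be sketched as follows. This is an illustration under stated assumptions: the file is taken to be a JSON array of rule objects carrying location, tdm-reservation, and optional tdm-policy keys; the wildcard handling (an asterisk matches any run of characters, patterns match from the start of the path) and the first-match-wins order are simplifications chosen here, so the final report should be checked for the exact matching rules. The function names are hypothetical.

```python
import json
import re
from typing import Optional


def _location_to_regex(location: str) -> "re.Pattern":
    # Assumption: '*' is the only wildcard; every other character is literal.
    literal_parts = (re.escape(part) for part in location.split("*"))
    return re.compile(".*".join(literal_parts))


def select_rule(tdmrep_json: str, path: str) -> Optional[dict]:
    """Return the first rule whose location pattern matches the URL path.

    Returns None when the document is not a list or when no rule matches.
    Raises json.JSONDecodeError when the document is not valid JSON, so the
    caller can record the file as malformed.
    """
    rules = json.loads(tdmrep_json)
    if not isinstance(rules, list):
        return None
    for rule in rules:
        if not isinstance(rule, dict):
            continue
        location = rule.get("location")
        # re.match anchors at the start of the path (prefix-style matching).
        if isinstance(location, str) and _location_to_regex(location).match(path):
            return rule
    return None


# Hypothetical example file content, for illustration only.
EXAMPLE = json.dumps([
    {"location": "/open/*", "tdm-reservation": 0},
    {"location": "/*", "tdm-reservation": 1,
     "tdm-policy": "https://example.com/policy.json"},
])
```

Returning the whole matched rule, not just the flag, lets the caller store which rule was selected alongside the decision.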

The policy layer uses JSON or JSON-LD. The final report says a machine-readable TDM Policy can use application/json or application/ld+json, and its JSON-LD context combines ODRL with the TDMRep context. That policy can identify the rightsholder and describe permissions, duties, contact requirements, financial compensation, and research or non-research constraints.

Governance and Safety

TDMRep is useful because it separates collection from use. A publisher may want ordinary indexing, citation, or access to continue while reserving rights for large-scale mining. Robots.txt is too coarse for that distinction, and a private opt-out form is invisible to independent crawlers and auditors.

Its limit is that machine readability is not enforcement. A crawler can ignore the field. A model developer can mishandle the policy. A site can publish conflicting signals across HTTP headers, metadata, terms of service, and contracts. A collector can also fetch through mirrors or archives where the original signal has been lost.
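
A collector that reads several sources can at least detect when they disagree. The sketch below is one hedged way to do that: it takes the reservation value found in each source (True, False, or None when the source is silent) and reports whether the declared values are absent, consistent, or in conflict. The function name, the source labels, and the status strings are hypothetical, and the code deliberately does not pick a winner, since no precedence order is assumed here.

```python
from typing import Mapping, Optional


def reconcile(signals: Mapping[str, Optional[bool]]) -> dict:
    """Summarize reservation values collected from several sources.

    `signals` maps a source label (for example "http_header") to True
    (reserved), False (not reserved), or None (no usable signal).
    """
    declared = {source: value for source, value in signals.items() if value is not None}
    distinct = set(declared.values())

    if not distinct:
        status = "absent"
    elif len(distinct) == 1:
        status = "consistent"
    else:
        status = "conflict"

    return {
        "status": status,
        "reserved_by": sorted(s for s, v in declared.items() if v),
        "not_reserved_by": sorted(s for s, v in declared.items() if not v),
    }
```

A "conflict" result is a prompt for review or a conservative default, not a decision in itself.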

For AI governance, TDMRep belongs beside AI Preferences, Robots Exclusion Protocol, llms.txt, and AI Data Licensing. It is narrower than all-purpose AI preference vocabularies, more rights-centered than a context map such as llms.txt, and more specific to mining than ordinary crawler access rules.

Evidence Record

A TDM-aware collector should keep the resource URL, fetch time, response headers, relevant HTML metadata, /.well-known/tdmrep.json result, selected matching rule, policy URL, policy document hash, crawler identity, declared purpose, and the downstream dataset or index that used the resource. The record should also preserve whether the signal was absent, malformed, inaccessible, or in conflict with another source.
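
The fields listed above could be captured in a record along these lines. This is a sketch, not a prescribed schema: the class name, field names, and the allowed values of signal_state are assumptions chosen to mirror the list, and SHA-256 is simply one reasonable choice for the policy document hash.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AcquisitionRecord:
    resource_url: str
    crawler_identity: str
    declared_purpose: str
    # One of: "present", "absent", "malformed", "inaccessible", "conflict".
    signal_state: str
    response_headers: dict = field(default_factory=dict)
    html_meta: dict = field(default_factory=dict)
    wellknown_result: Optional[str] = None   # raw body, or None if not fetched
    matched_rule: Optional[dict] = None
    policy_url: Optional[str] = None
    policy_sha256: Optional[str] = None
    downstream_uses: list = field(default_factory=list)
    # Fetch time in UTC, ISO 8601.
    fetched_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def hash_policy(policy_bytes: bytes) -> str:
    """Hash the policy document exactly as received, before any parsing."""
    return hashlib.sha256(policy_bytes).hexdigest()


def to_json(record: AcquisitionRecord) -> str:
    """Serialize with sorted keys so stored records are easy to diff."""
    return json.dumps(asdict(record), sort_keys=True)
```

Hashing the raw bytes, not a re-serialized form, means the stored hash can later be compared against an archived copy of the policy without depending on parser behavior.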

That evidence matters because TDM rights signals are temporal. A page can change its policy after a crawl; a CDN can strip a header; an archive can keep the content without the metadata. Without a stored acquisition record, later claims about compliance collapse into memory and screenshots.

Source Discipline

Cite TDMRep as a W3C Community Group final report, not as a W3C Recommendation. The report itself says it is not a W3C Standard and is not on the W3C Standards Track. Status matters because a crawler policy should not present a community final report as having the same standing as a completed web standard.

When making legal claims, cite Directive (EU) 2019/790 directly and avoid flattening national implementation differences into one rule. The directive supplies the Article 4 machine-readable reservation concept; TDMRep supplies one technical mechanism for expressing it.

Spiralist Reading

Spiralism reads TDMRep as a machine-readable boundary mark. It does not close the gate. It says where the gate is, who claims to hold it, and where negotiation may begin.

The important test is institutional. A system that can read a reservation but proceeds as if none had been made has not missed the signal; it has chosen convenience over record. The protocol makes that choice easier to audit.

Open Questions

Sources

