Rights Reservation

Guidance on practical tools and protocols to help publishers reserve rights and protect content from unauthorized AI training

Learn more about these three options: robots.txt, TDMRep and ISCC.

Robots.txt

  • A robots.txt file is a plain text document found within a website (example.com/robots.txt), instructing crawlers/bots as to which sections they can access and index from that website; 
  • It helps website owners to control the behaviour of crawlers and to manage crawling traffic; 
  • It only covers situations in which the content owner is also the website owner. If content is copied to a website not controlled by the content owner, that indication is lost. 
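  • It began as a widely used convention rather than a formal standard, and was published by the IETF as RFC 9309 (Robots Exclusion Protocol) in 2022; compliance remains voluntary; 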
  • It is a binary mechanism: for a given path, it can only instruct a crawler to collect (Allow) or not to collect (Disallow) content; 
  • It provides indications only, not hard blocking: a crawler can be programmed to skip or ignore the instructions; 
  • It only indicates whether content may be crawled and does not distinguish between the purposes a crawler may serve, such as fetching content for search indexing (which might be allowed) and for training AI models (which might be disallowed); 
  • It is the responsibility of the website owner to list every crawler they wish to allow or disallow, which places a considerable burden on the owner and risks the list being neither exhaustive nor effective;
  • AIPREF: We are monitoring progress at IETF level to update robots.txt to express AI Preferences.

User-agent: discoverybot/2.0
Disallow: /

User-agent: YoudaoBot/1.0
Disallow: /

User-agent: Sogou web spider/3.0
Disallow: /

User-agent: *
Disallow: /connect/archive
Disallow: /about/press-releases/archive
Disallow: /_dynamic-products/
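
How a compliant crawler might interpret rules like these can be sketched with Python's standard urllib.robotparser module. This is a minimal illustration, not a description of how any particular crawler behaves: "ExampleAIBot" and "ExampleSearchBot" are hypothetical crawler names, and the rules are a shortened variant of the example above. Unversioned names are used because Python's parser compares only the part of the crawler name before any "/".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: "ExampleAIBot" is an invented crawler name, not a real bot.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /connect/archive
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The named crawler is excluded from the whole site.
ai_allowed = parser.can_fetch("ExampleAIBot", "https://example.com/articles/1")
# Any other crawler falls back to the "*" group.
other_allowed = parser.can_fetch("ExampleSearchBot", "https://example.com/articles/1")
archive_allowed = parser.can_fetch(
    "ExampleSearchBot", "https://example.com/connect/archive/2020"
)

print(ai_allowed, other_allowed, archive_allowed)  # False True False
```

The check is made by the crawler itself, which is why the mechanism depends entirely on the crawler choosing to consult and respect the file. The rules are also keyed by crawler name only, so a single crawler used for several purposes cannot be allowed for one and refused for another.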

 

TDMRep

  • The TDM Reservation Protocol (TDMRep) allows rightsholders to declare their choice regarding text & data mining of web resources under their control, easing the discovery of TDM licensing policies associated with such content; 
  • TDMRep can be implemented at both website and content level (e.g., in PDF and EPUB files), in different ways depending on the level of expertise of the implementer; 
  • Whatever the level at which they are expressed, the declarations point to a central policy file hosted on the website (e.g., https://publisher.com/policies/policy.json) that can be easily accessed and read by crawlers; 
  • It was developed to avoid any problems with findability: TDMRep does not interfere with traditional web crawling and search engine indexing performed by web crawlers, whose access to site content is traditionally regulated by robots.txt; 
  • It allows recipients of the declaration to adjust their scraping behaviour, or to find information about the licensing opportunities offered by the rightsholder. 
TDM File on the Origin Server (same as robots.txt)
[{
  "location": "/",
  "tdm-reservation": 1,
  "tdm-policy": "https://publisher.com/policies/policy.json"
}]
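
A crawler that has downloaded such a file could look up the rule for a given path roughly as follows. This is a simplified sketch: it assumes plain prefix matching on "location" and that the first matching rule wins, which may not cover everything the TDMRep specification allows. The function name find_tdm_rule and the article path are illustrative only.

```python
import json
from typing import Optional

# The rule set from the example above, as it would be served by the origin server.
TDMREP_JSON = """
[{
  "location": "/",
  "tdm-reservation": 1,
  "tdm-policy": "https://publisher.com/policies/policy.json"
}]
"""


def find_tdm_rule(rules: list, path: str) -> Optional[dict]:
    """Return the first rule whose location is a prefix of the path, if any.

    Assumption: simple prefix matching, first match wins.
    """
    for rule in rules:
        if path.startswith(rule["location"]):
            return rule
    return None


rules = json.loads(TDMREP_JSON)
rule = find_tdm_rule(rules, "/articles/2025/example.html")

if rule is not None and rule["tdm-reservation"] == 1:
    print("TDM rights reserved; policy:", rule.get("tdm-policy"))
else:
    print("No TDM reservation declared for this path")
```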
TDM Header Field in HTTP Responses
HTTP/1.1 200 OK
Date: Wed, 14 Jul 2021 12:07:48 GMT
Content-type: text/html
tdm-reservation: 1
tdm-policy: https://publisher.com/policies/policy.json
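
On the receiving side, reading these two fields from a response might look like the sketch below. It assumes the response headers are already available as a name-to-value mapping, as most HTTP client libraries provide; read_tdm_headers is a hypothetical helper name.

```python
from typing import Mapping, Optional, Tuple


def read_tdm_headers(headers: Mapping[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (reserved, policy_url) from a mapping of HTTP response headers."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    reserved = normalized.get("tdm-reservation") == "1"
    policy = normalized.get("tdm-policy")
    return reserved, policy


# The header values from the example response above.
response_headers = {
    "Date": "Wed, 14 Jul 2021 12:07:48 GMT",
    "Content-type": "text/html",
    "tdm-reservation": "1",
    "tdm-policy": "https://publisher.com/policies/policy.json",
}

reserved, policy = read_tdm_headers(response_headers)
print(reserved, policy)
```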
TDM metadata in HTML content
<head>
<meta charset="utf-8">
<meta name="tdm-reservation" content="1">
<meta name="tdm-policy" content="https://publisher.com/policies/policy.json">
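
Extracting these meta elements from a page can be sketched with Python's standard html.parser module. The class name TDMMetaParser is illustrative, and the sketch only collects the two fields shown above.

```python
from html.parser import HTMLParser


class TDMMetaParser(HTMLParser):
    """Collect tdm-reservation and tdm-policy values from <meta> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.tdm = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("tdm-reservation", "tdm-policy"):
            self.tdm[name] = attributes.get("content")


# The HTML fragment from the example above.
HTML_FRAGMENT = """
<head>
<meta charset="utf-8">
<meta name="tdm-reservation" content="1">
<meta name="tdm-policy" content="https://publisher.com/policies/policy.json">
</head>
"""

meta_parser = TDMMetaParser()
meta_parser.feed(HTML_FRAGMENT)
print(meta_parser.tdm)
```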
TDM Metadata in PDF / EPUB publications with XMP / XML tags

ISCC

  • The International Standard Content Code (ISCC) is a content identification code generated from the digital content itself: any user, entity or system with access to the content can derive the ISCC from the digital media asset. This means that two users or machines can generate the same or a similar identifier directly from the media file without exchanging any information or metadata about the content; 
  • It is a content-dependent identifier; 
  • It is an ISO standard (ISO 24138:2024) and applies to different media formats (image, video, audio and text files); 
  • ISCC works in combination with a registry (Liccium is one possible registry provider), where rightsholders need to register their ISCC codes and tie them to the relevant rights declarations; 
  • The ISCC serves multiple use cases beyond rights reservation; by binding ISCC codes to rights declarations associated with digital assets, this system can help in ensuring trust in ownership, attribution, and authenticity of digital media content. 

ISCC:KAC3D4DPLBES6KBFGJ4DPZ4H36UFKC6SMPGZ63APMVIX4EAEWCK74DY
Title: example.pdf
Creator name: publisher
License URL: https://publisher.com/license
Copyright notice: © 2025 publisher
TDM reservation: 1
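
The binding between an ISCC code and a rights declaration, as in the record above, can be pictured with a small in-memory registry. This is a purely hypothetical sketch: RightsDeclaration and InMemoryRegistry are invented names, the lookup is by exact code only, and it does not represent the API of Liccium or any other registry provider.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RightsDeclaration:
    """Hypothetical record mirroring the fields of the example declaration."""
    title: str
    creator: str
    license_url: str
    copyright_notice: str
    tdm_reservation: int  # 1 = TDM rights reserved


class InMemoryRegistry:
    """Toy registry mapping ISCC codes to rights declarations (exact match only)."""

    def __init__(self) -> None:
        self._records: Dict[str, RightsDeclaration] = {}

    def register(self, iscc: str, declaration: RightsDeclaration) -> None:
        self._records[iscc] = declaration

    def lookup(self, iscc: str) -> Optional[RightsDeclaration]:
        return self._records.get(iscc)


EXAMPLE_ISCC = "ISCC:KAC3D4DPLBES6KBFGJ4DPZ4H36UFKC6SMPGZ63APMVIX4EAEWCK74DY"

registry = InMemoryRegistry()
registry.register(
    EXAMPLE_ISCC,
    RightsDeclaration(
        title="example.pdf",
        creator="publisher",
        license_url="https://publisher.com/license",
        copyright_notice="© 2025 publisher",
        tdm_reservation=1,
    ),
)

# Anyone who derives the same code from the file can look up the declaration.
found = registry.lookup(EXAMPLE_ISCC)
print(found.tdm_reservation if found else "no declaration registered")
```

Because the code is derived from the content itself, the lookup key travels with the file: the declaration can be found even when the file is encountered on a website the rightsholder does not control.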
