https://arstechnica.com/tech-policy/2025/09/pay-per-output-ai-firms-blindsided-by-beefed-up-robots-txt-instructions/
Interesting solution that just might work. Unlike spambots, the AI bots are run by companies with deep pockets, so if the content license specifically disallows use for training and a publisher can prove that the license was violated, it puts them in some jeopardy.
For now, I have flipped the toggle on Cloudflare, which automatically adds this to every robots.txt file:
# BEGIN Cloudflare Managed content
User-agent: Amazonbot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: meta-externalagent
Disallow: /
# END Cloudflare Managed Content
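
To sanity-check what these rules mean for a given crawler, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The rules are pasted in as a string rather than fetched, and the `example.com` URL and the `allowed` helper are placeholders of mine, not anything Cloudflare provides. Keep in mind that robots.txt is only a request: this shows what a compliant bot should conclude, not what any bot actually does.

```python
from urllib.robotparser import RobotFileParser

# The managed block from above, pasted verbatim.
ROBOTS_TXT = """\
# BEGIN Cloudflare Managed content
User-agent: Amazonbot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: meta-externalagent
Disallow: /
# END Cloudflare Managed Content
"""

AI_BOTS = [
    "Amazonbot", "Applebot-Extended", "Bytespider", "CCBot",
    "ClaudeBot", "Google-Extended", "GPTBot", "meta-externalagent",
]


def allowed(user_agent, url="https://example.com/some-post/"):
    """Return True if the rules above let `user_agent` fetch `url`.

    The URL is a hypothetical placeholder; any path gives the same
    answer here because every rule is `Disallow: /`.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Map each listed AI crawler to whether it may fetch the page.
results = {bot: allowed(bot) for bot in AI_BOTS}

if __name__ == "__main__":
    for bot, ok in results.items():
        print(f"{bot}: {'allowed' if ok else 'blocked'}")
    # A crawler that is not listed (regular search indexing) is unaffected.
    print(f"Googlebot: {'allowed' if allowed('Googlebot') else 'blocked'}")
```

Every listed bot should come back as blocked, while an unlisted agent such as Googlebot is still allowed, since the managed block has no catch-all `User-agent: *` rule.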