How to Configure robots.txt for AI Bots (GPTBot, ClaudeBot, Bingbot)

A complete guide to allowing or blocking AI crawlers — GPTBot, ClaudeBot, PerplexityBot, CCBot — in your robots.txt file. Includes copy-paste rules and gotchas.

Which AI bots exist?

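Major AI companies run their own crawlers to train models or power search-style features. The most common ones you'll encounter in your server logs are listed below. Most operators document that their crawlers honor robots.txt directives, but compliance is voluntary: the file is a request to crawlers, not an access control.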

AI crawler list

Here are the user-agent strings for the most common AI crawlers in 2025–2026:

AI crawler user-agents
GPTBot          — OpenAI model training crawler
ChatGPT-User    — ChatGPT live browsing (user-initiated requests)
ClaudeBot       — Anthropic (Claude) crawler
anthropic-ai    — Anthropic crawler
PerplexityBot   — Perplexity AI
CCBot           — Common Crawl (used by many LLM trainers)
Applebot        — Apple intelligence features
Bytespider      — ByteDance / TikTok
Amazonbot       — Amazon Alexa AI
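
To see which of these crawlers already visit your site, you can match the tokens above against the User-Agent values in your server logs. The following is a minimal sketch, not a definitive detector: the function name match_ai_crawler and the sample headers are made up for illustration, and because User-Agent headers can be spoofed, a match only tells you what a client claims to be.

```python
from typing import Optional

# Tokens taken from the crawler list above.
AI_CRAWLER_TOKENS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai", "PerplexityBot",
    "CCBot", "Applebot", "Bytespider", "Amazonbot",
]


def match_ai_crawler(user_agent_header: str) -> Optional[str]:
    """Return the first known AI crawler token found in the header, else None.

    Hypothetical helper: a case-insensitive substring match, nothing more.
    """
    header = user_agent_header.lower()
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in header:
            return token
    return None


# Illustrative headers, not copied from real traffic.
samples = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/128.0",
]
for sample in samples:
    print(match_ai_crawler(sample) or "no AI crawler token from this list")
```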

Allow all AI bots (recommended for most sites)

If you want AI assistants to be able to cite and surface your content, the simplest approach is to allow every crawler by default:

robots.txt
User-agent: *
Allow: /

Sitemap: https://yourdomain.com/sitemap.xml

Allow specific AI bots only

You can be selective: allow the crawlers run by the AI assistants you want to be cited by (OpenAI, Anthropic, Perplexity) while blocking bulk training data scrapers (CCBot, Bytespider):

robots.txt
# Block training data scrapers
User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

# Allow crawlers run by AI assistant vendors
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Default for everyone else
User-agent: *
Allow: /
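
Before deploying a selective file like this, it can help to check how a parser actually reads it. The sketch below uses Python's standard urllib.robotparser on the rules above (comments omitted); the helper name access_report and the Googlebot entry are added here for illustration. Real crawlers use their own parsers, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The selective rules from the example above.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Allow: /
"""


def access_report(robots_txt, agents, url="https://yourdomain.com/"):
    """Map each agent name to True (allowed) or False (blocked) for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


report = access_report(
    ROBOTS_TXT,
    ["CCBot", "Bytespider", "GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot"],
)
for agent, allowed in report.items():
    print(f"{agent:14} {'allowed' if allowed else 'blocked'}")
```

With these rules, CCBot and Bytespider come back blocked and every other agent, including ones that only match the wildcard group, comes back allowed.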

Block all AI bots

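To opt out of all AI crawling, whether for training or live search, block each agent explicitly. Note: a blanket User-agent: * with Disallow: / also blocks conventional search engines such as Google, so be specific. The rules below cover the most common agents; add a group for any other crawler from the list above that you want to exclude: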

robots.txt
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
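
Because every blocked agent needs its own group, the file grows quickly and is easy to get wrong by hand. One option is to generate it from a list. The sketch below assumes a hypothetical helper, build_robots_txt, and reuses the six agents from the example above; extend the list with whichever other crawlers you want to exclude.

```python
from typing import Iterable, Optional

# The agents blocked in the example above.
AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot",
    "anthropic-ai", "PerplexityBot", "CCBot",
]


def build_robots_txt(blocked_agents: Iterable[str],
                     sitemap_url: Optional[str] = None) -> str:
    """Emit one Disallow group per blocked agent, then allow everyone else."""
    lines = []
    for agent in blocked_agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


print(build_robots_txt(AI_AGENTS))
```

Keeping the agent list in one place means adding a new crawler is a one-line change, and the wildcard group at the end keeps ordinary search engines allowed.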

Common mistakes

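Using User-agent: * with Disallow: / blocks Google and every other search engine too. Misspelled user-agent tokens (e.g. 'GTPBot' instead of 'GPTBot') never match any crawler, so the rule is silently ignored. Under the Robots Exclusion Protocol (RFC 9309) token matching is case-insensitive, so 'GPTbot' still matches, but paths in Allow and Disallow rules are case-sensitive; copying each token exactly as the vendor documents it avoids surprises with stricter parsers. Test with our scanner to confirm AI bots see explicit rules rather than a wildcard.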
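
Two of these mistakes are easy to reproduce locally. The sketch below uses Python's standard urllib.robotparser; the helper name is_allowed, the test URL, and the typo 'GTPBot' are illustrative assumptions, and other parsers may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, agent, url="https://yourdomain.com/page"):
    """Return True if agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Mistake 1: a blanket wildcard block catches search engines as well.
BLANKET = "User-agent: *\nDisallow: /\n"

# Mistake 2: a misspelled token ('GTPBot') never matches GPTBot,
# so GPTBot falls through to the wildcard group and is allowed.
MISSPELLED = "User-agent: GTPBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"

# The same rules with the token spelled correctly.
FIXED = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"

print("blanket block, Googlebot allowed:", is_allowed(BLANKET, "Googlebot"))
print("misspelled rule, GPTBot allowed:", is_allowed(MISSPELLED, "GPTBot"))
print("fixed rule, GPTBot allowed:", is_allowed(FIXED, "GPTBot"))
```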
