Check if AI search platforms can crawl your content. Analyze robots.txt for GPTBot, PerplexityBot, ClaudeBot, Google-Extended, CCBot, and Bytespider. 12 checks.
6 AI bots · robots.txt analysis · Meta directives
Frequently asked questions
Which AI bots does this tool check?
It checks 6 AI crawlers: GPTBot (ChatGPT/OpenAI), CCBot (Common Crawl, used to train many AI models), PerplexityBot (Perplexity AI search), Google-Extended (Google Gemini), ClaudeBot (Anthropic/Claude), and Bytespider (TikTok/ByteDance AI products).
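You can run the same kind of check yourself. Below is a minimal sketch using only Python's standard library, not the tool's actual implementation; "https://example.com" is a placeholder for the site you want to test:

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = [
    "GPTBot",           # ChatGPT / OpenAI
    "CCBot",            # Common Crawl
    "PerplexityBot",    # Perplexity AI search
    "Google-Extended",  # Google Gemini
    "ClaudeBot",        # Anthropic / Claude
    "Bytespider",       # TikTok / ByteDance
]

parser = RobotFileParser("https://example.com/robots.txt")
parser.read()  # fetch and parse the live robots.txt

for bot in AI_BOTS:
    status = "allowed" if parser.can_fetch(bot, "https://example.com/") else "blocked"
    print(f"{bot}: {status}")
```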
How do AI crawlers discover my content?
AI platforms use web crawlers (bots) to discover and index content from websites, much as Google crawls the web. Well-behaved bots honor robots.txt rules: if you block a specific bot with 'Disallow: /', a compliant crawler from that platform will not crawl or use your content.
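For example, this robots.txt blocks GPTBot from the entire site while leaving every other crawler unrestricted:

```
# Block OpenAI's GPTBot from the whole site
User-agent: GPTBot
Disallow: /

# All other crawlers remain unrestricted
User-agent: *
Disallow:
```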
Should I block AI bots?
It depends on your goals. If you want AI platforms to cite your content and drive traffic, keep the bots allowed. Some publishers block AI bots over copyright concerns or to protect premium content. This tool shows your current settings so you can make an informed decision.
What is the noai meta directive?
The noai meta robots directive signals that AI platforms should not use your page content for AI purposes. It's similar to noindex but specific to AI; note it is not an official standard, so support varies by platform. Some sites add it to prevent AI training on their content while still allowing regular search indexing.
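As a sketch, the directive typically goes in a page's head; how widely it is honored varies, since compliance is voluntary:

```html
<!-- Ask AI crawlers not to use this page's content for AI purposes.
     Regular search directives (index/follow) are unaffected. -->
<meta name="robots" content="noai">
```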
What is Google-Extended?
Google-Extended is a product token, not a separate crawler: Google fetches pages with its existing user agents, then checks robots.txt for Google-Extended to decide whether your content may be used for Gemini and other generative AI features. Blocking Google-Extended prevents your content from being used in Gemini responses while still allowing regular Google Search indexing via Googlebot.
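A robots.txt that opts out of generative AI use while keeping Search crawling intact looks like this (the Googlebot group is shown only for contrast; it is allowed by default):

```
# Opt out of Gemini / generative AI use of this site's content
User-agent: Google-Extended
Disallow: /

# Regular Google Search crawling is unaffected
User-agent: Googlebot
Allow: /
```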
How do I unblock an AI bot?
Edit your robots.txt file and remove or modify the User-agent and Disallow rules for that bot; for example, remove the 'User-agent: GPTBot / Disallow: /' section. Changes take effect once the bot re-fetches your robots.txt and re-crawls your site, which can take days to weeks.
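Concretely, if your robots.txt contains the group below, deleting it restores access for GPTBot (or you can flip the rule to an explicit allow, as the comment notes):

```
# Delete this group to unblock GPTBot,
# or change "Disallow: /" to "Allow: /"
User-agent: GPTBot
Disallow: /
```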
Need help with AI search strategy?
Our team helps brands optimize for both traditional search and AI platforms, from robots.txt configuration to content strategy that gets cited by ChatGPT and Perplexity.