What is Google-Extended?
Google-Extended is the robots.txt user-agent token Google provides for controlling AI training use of your content, separate from the Googlebot token that governs Search indexing. Blocking Google-Extended keeps your content out of training for Gemini and other Google AI products but does not affect Google Search rankings or AI Overviews. It's a rare AI control that lets you opt out of training without losing search visibility.
Last updated: January 15, 2026
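Google-Extended is a dedicated robots.txt token that tells Google whether content from your site may be used for AI training. It is not a separate crawler with its own user-agent string: Google fetches pages with its existing user agents and treats Google-Extended purely as a usage control. Crucially, it's separate from Googlebot, so blocking Google-Extended does not affect your Google Search rankings.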
This separation gives website owners a unique choice: you can prevent your content from training Google's AI models while maintaining full Google Search visibility.
Google-Extended vs Googlebot
Key insight: You can block Google-Extended while allowing Googlebot with no search penalty.
What Google-Extended Controls
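Blocking Google-Extended prevents your content from being used to train Gemini and other Google AI products.
Blocking Google-Extended does NOT affect your Google Search rankings or your eligibility for AI Overviews.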
Configuring robots.txt for Google-Extended
Block AI training only:
# Block AI training
User-agent: Google-Extended
Disallow: /
# Allow search indexing (implicit, but explicit is clearer)
User-agent: Googlebot
Allow: /
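As a rough local check of the block-only rules above, the sketch below feeds them to Python's standard-library urllib.robotparser. That parser is a stand-in, not Google's own robots.txt implementation, and the example.com URL and the allowed helper are placeholders, so treat the result as a sanity check rather than proof of how Google will read the file.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# Block AI training
User-agent: Google-Extended
Disallow: /

# Allow search indexing
User-agent: Googlebot
Allow: /
"""

def allowed(robots_txt: str, token: str, url: str) -> bool:
    """Return True if `token` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)

if __name__ == "__main__":
    url = "https://example.com/articles/pricing"  # placeholder URL
    for token in ("Google-Extended", "Googlebot"):
        print(token, "allowed" if allowed(ROBOTS_TXT, token, url) else "blocked")
```

Running the same check against a staging copy of your file before deploying can catch a misplaced Disallow line early.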
Allow everything (default if not specified):
User-agent: Google-Extended
Allow: /
User-agent: Googlebot
Allow: /
Partial blocking:
User-agent: Google-Extended
Disallow: /proprietary-content/
Disallow: /internal-docs/
Allow: /
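For partial blocking, Google documents that the most specific (longest) matching rule wins, with the less restrictive rule winning ties. Here is a minimal sketch of that precedence, assuming plain path prefixes only (no * or $ wildcards). The is_allowed helper and the sample paths are hypothetical and exist only for illustration.

```python
# Rules from the partial-blocking example, as (directive, path prefix) pairs.
RULES = [
    ("Disallow", "/proprietary-content/"),
    ("Disallow", "/internal-docs/"),
    ("Allow", "/"),
]

def is_allowed(rules, path):
    """Longest matching prefix wins; on equal length, Allow beats Disallow.

    Simplified: handles literal prefixes only, not `*` or `$` patterns.
    """
    best_len = -1
    verdict = True  # no matching rule means the path is allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and allow):
            best_len, verdict = len(prefix), allow
    return verdict

if __name__ == "__main__":
    for path in ("/blog/launch", "/internal-docs/runbook", "/proprietary-content/q3.pdf"):
        print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

Because the longest match decides, the position of the Allow: / line in the group does not change the outcome under this rule.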
AI Overviews Misconception
A common misconception: blocking Google-Extended will remove you from AI Overviews.
This is false.
Google AI Overviews draws on the Google Search index built by Googlebot, which the Google-Extended setting does not govern. Your content can appear in AI Overviews even if you block Google-Extended.
Google-Extended specifically controls training data for models like Gemini, not Search features.
When to Block Google-Extended
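Consider blocking if control over how your content is used for AI training matters more to you than influencing what Gemini learns.
Consider allowing if you would rather maximize your content's influence on Google's AI models.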
Google's Official Position
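Google's crawler documentation presents Google-Extended as a standalone token that publishers can use to control whether their content helps train Gemini models, and says that it does not affect a site's inclusion or ranking in Google Search.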
This is one of the clearer opt-out mechanisms in the AI ecosystem.
Google-Extended and Other AI Crawlers
Google-Extended only affects Google's AI training. You need separate rules for other companies' crawlers, such as OpenAI's GPTBot and Anthropic's ClaudeBot.
Each AI company operates independently. Blocking Google-Extended has no effect on other AI systems.
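To make that independence concrete, the sketch below audits one robots.txt body against several tokens at once, again using Python's urllib.robotparser as an approximation of how each vendor reads the file. The audit helper is hypothetical and the token list is only a starting point. With a file that blocks Google-Extended alone, the other tokens come back as allowed.

```python
from urllib.robotparser import RobotFileParser

# Tokens named in this article; extend the tuple for other crawlers you care about.
AI_TOKENS = ("Google-Extended", "GPTBot", "ClaudeBot")

def audit(robots_txt: str, tokens=AI_TOKENS, path: str = "/") -> dict:
    """Map each token to True (allowed) or False (blocked) for `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, path) for token in tokens}

if __name__ == "__main__":
    only_google = "User-agent: Google-Extended\nDisallow: /\n"
    print(audit(only_google))
```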
Verifying Google-Extended Activity
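Google-Extended will not show up in your server logs. It is a robots.txt control token with no HTTP user-agent string of its own, and Google crawls with its existing user agents. To see Google crawl activity in general, search for those instead:
grep "Googlebot" /var/log/nginx/access.log
Google Search Console also reports crawl activity for Google's crawlers, but it cannot isolate Google-Extended for the same reason. The practical check is to confirm that your deployed robots.txt contains the rule you intended.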
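A small tally of user-agent strings is one way to see which Google crawlers actually visit. Google's documentation says Google-Extended has no separate HTTP user-agent string, so a count of zero for it is expected and does not mean the rule is being ignored. The sketch assumes the common "combined" access log format, where the user agent is the last quoted field. The tally helper and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# In the "combined" log format the user agent is the last quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')

def tally(lines, needles=("Googlebot", "Google-Extended")):
    """Count log lines whose user-agent field contains each needle."""
    counts = Counter({needle: 0 for needle in needles})
    for line in lines:
        match = UA_FIELD.search(line)
        if not match:
            continue  # line is not in the expected format
        user_agent = match.group(1)
        for needle in needles:
            if needle in user_agent:
                counts[needle] += 1
    return counts

if __name__ == "__main__":
    # Made-up sample lines; read your real access log instead.
    sample = [
        '192.0.2.10 - - [15/Jan/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
        '192.0.2.11 - - [15/Jan/2026:10:00:05 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" '
        '"Mozilla/5.0 (X11; Linux x86_64)"',
    ]
    print(dict(tally(sample)))
```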
Strategic Considerations
The unique Google opportunity:
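Unlike other AI providers, Google separates training from visibility. You can keep your content out of AI training while staying fully visible in Google Search and eligible for AI Overviews.
This isn't possible with most other AI systems, where blocking the crawler removes you entirely.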
Competitive dynamics:
If competitors allow Google-Extended and you don't, their content may influence Gemini's knowledge base more than yours. Consider whether this matters for your industry.
Common Mistakes
1. Thinking blocking affects Search
Google-Extended and Googlebot are independent. Blocking one doesn't affect the other.
2. Confusing Google-Extended with AI Overviews
AI Overviews uses Search index data, which the Google-Extended setting does not control.
3. Blocking Googlebot to avoid AI
This would devastate your Search rankings. Only block Google-Extended.
4. Assuming one crawler setting covers all AI
Each AI company has separate crawlers requiring separate rules.
Summary
Google-Extended is Google's opt-out mechanism for AI training. It's unusual in allowing granular control: you can prevent your content from training Google's AI models while maintaining full Search visibility and AI Overview eligibility.
For most businesses, the decision comes down to priorities: maximum AI influence (allow) vs maximum control over training data (block).