AI platforms like ChatGPT, Perplexity, and Meta AI often retrieve your website content in real time when a user asks about your brand. To confirm whether your site is being accessed by these AI crawlers, you can inspect your CDN logs. This guide starts with Cloudflare, one of the most common providers.
Why This Matters
Visibility in AI search: If AI crawlers can’t reach your pages, your brand may not be cited in responses.
Bot blocking issues: Over-aggressive firewall or bot settings can unintentionally block AI user agents.
Debugging access: Checking logs helps you verify if key AI bots are allowed through your configuration.
Scrunchbot diagnostics: Scrunch runs its own diagnostic bot that mimics how AI systems retrieve content, so you can see the same issues AI would encounter.
Step 1. Know Which AI User Agents to Look For
In Cloudflare logs, filter for these AI crawler user agents:
ChatGPT real-time retrieval
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot
Perplexity
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
Meta AI
meta-externalagent, meta-externalfetcher, or facebookexternalhit
Google Gemini
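Appears as Googlebot. Google-Extended is a robots.txt control token rather than a separate user agent, so it does not show up in logs under its own name.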
Scrunchbot (for testing)
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Scrunchbot/1.0; +https://scrunchai.com/bots)
Scrunchbot does not crawl the general internet. It only accesses your site when you configure Scrunch to run a Site Audit or test page access. Its purpose is to mimic how AI systems search and retrieve content, so you can diagnose issues before they affect visibility.
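As a minimal sketch, the user agents above can be matched by a short, distinctive token instead of the full string. The `AI_BOT_TOKENS` table and the `classify_user_agent` helper are illustrative names, not part of Cloudflare or Scrunch, and the token list simply mirrors the agents listed in this step.

```python
from typing import Optional

# Distinctive substrings taken from the user agents listed above.
# The labels are illustrative; adjust the table to the bots you care about.
AI_BOT_TOKENS = {
    "ChatGPT-User": "ChatGPT real-time retrieval",
    "PerplexityBot": "Perplexity",
    "meta-externalagent": "Meta AI",
    "meta-externalfetcher": "Meta AI",
    "facebookexternalhit": "Meta AI",
    "Googlebot": "Google Gemini",
    "Scrunchbot": "Scrunchbot (for testing)",
}


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the label of the first matching AI bot token, or None."""
    lowered = user_agent.lower()
    for token, label in AI_BOT_TOKENS.items():
        # Case-insensitive substring match against the full user agent.
        if token.lower() in lowered:
            return label
    return None
```

Calling `classify_user_agent` on the Perplexity string shown above returns "Perplexity", while an ordinary browser user agent returns `None`.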
Step 2. Access Your Cloudflare Logs
Log into your Cloudflare Dashboard.
Go to Analytics & Logs > Logs.
Enterprise customers can also export logs to a SIEM or storage bucket.
Use the Logpull API or Log Explorer to filter by User Agent.
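For the Logpull route, a request URL can be assembled as sketched below. This assumes the Logpull endpoint `zones/{zone_id}/logs/received` with `start`, `end`, and `fields` query parameters, and the field names `ClientRequestUserAgent`, `ClientRequestURI`, and `EdgeResponseStatus` from Cloudflare's HTTP request log dataset; check these against your account's documentation before relying on them. `build_logpull_url` is a hypothetical helper, and the sketch only builds the URL: sending it still requires your own API credentials.

```python
from urllib.parse import urlencode

# Assumed base URL of the Cloudflare v4 API.
API_BASE = "https://api.cloudflare.com/client/v4"

# Assumed log fields: user agent, requested path, and response status.
DEFAULT_FIELDS = ("ClientRequestUserAgent", "ClientRequestURI", "EdgeResponseStatus")


def build_logpull_url(zone_id: str, start: str, end: str, fields=DEFAULT_FIELDS) -> str:
    """Build a Logpull request URL for one zone and time window.

    start and end are RFC 3339 timestamps such as 2024-01-01T00:00:00Z.
    """
    query = urlencode(
        {"start": start, "end": end, "fields": ",".join(fields)},
        safe=",:",  # keep commas and colons readable in the query string
    )
    return f"{API_BASE}/zones/{zone_id}/logs/received?{query}"
```

The zone ID is a placeholder here; substitute the ID shown on your zone's overview page.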
Step 3. Filter for AI Bots
Example Log Explorer filter:
http.user_agent contains "ChatGPT-User"
Repeat for PerplexityBot, meta-externalagent, Scrunchbot, etc.
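If you export logs instead of using Log Explorer, the same filter can be applied offline. The sketch below assumes newline-delimited JSON with a `ClientRequestUserAgent` field, which is how Cloudflare HTTP request logs are commonly exported; `filter_ai_hits` and `BOT_TOKENS` are illustrative names.

```python
import json
from typing import Iterable, Iterator

# Tokens to search for, mirroring the filters described in this step.
BOT_TOKENS = ("ChatGPT-User", "PerplexityBot", "meta-externalagent", "Scrunchbot")


def filter_ai_hits(lines: Iterable[str], tokens=BOT_TOKENS) -> Iterator[dict]:
    """Yield log records whose user agent contains any of the tokens."""
    lowered = [token.lower() for token in tokens]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        user_agent = str(record.get("ClientRequestUserAgent", "")).lower()
        if any(token in user_agent for token in lowered):
            yield record
```

Passing an open file handle, for example `filter_ai_hits(open("logs.ndjson"))` with a hypothetical file name, streams the export one line at a time without loading it all into memory.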
Step 4. Confirm AI Visits
When you find hits from these user agents:
Check if Cloudflare’s bot protection is challenging or blocking them.
Look at HTTP status codes: 200 = success; 403 or 429 = blocked (429 specifically indicates rate limiting).
Review request paths to see which pages AI or Scrunchbot accessed.
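The checks above can be tallied per bot, as in the rough sketch below. It assumes records shaped like Cloudflare HTTP request logs, with `ClientRequestUserAgent`, `EdgeResponseStatus`, and `ClientRequestURI` fields; `summarize_bot_hits` is an illustrative helper, and treating only 403 and 429 as blocked follows the rule of thumb in this step.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Status codes this guide treats as a blocked request.
BLOCKED_STATUSES = {403, 429}


def summarize_bot_hits(records: Iterable[dict], token: str) -> Tuple[Counter, List[str]]:
    """Count success/blocked/other responses for one bot and list blocked paths."""
    counts: Counter = Counter()
    blocked_paths: List[str] = []
    needle = token.lower()
    for record in records:
        if needle not in str(record.get("ClientRequestUserAgent", "")).lower():
            continue  # request came from a different client
        status = int(record.get("EdgeResponseStatus", 0))
        if status == 200:
            counts["success"] += 1
        elif status in BLOCKED_STATUSES:
            counts["blocked"] += 1
            blocked_paths.append(str(record.get("ClientRequestURI", "")))
        else:
            counts["other"] += 1
    return counts, blocked_paths
```

A non-empty list of blocked paths for a bot you want to allow is the signal to move on to the next step.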
Step 5. Take Action if Blocked
If AI crawlers are blocked:
Update Cloudflare Bot Management rules to allowlist these agents.
Confirm robots.txt allows them.
Test with Scrunchbot: run a Site Audit to simulate AI retrieval and confirm the fix.
Recheck with an AI system (ask ChatGPT or Perplexity about your brand) to see if your content is cited.
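The robots.txt check can be done locally with Python's standard `urllib.robotparser` module. The sketch below parses robots.txt text you supply and reports which agents may fetch a given URL; `allowed_agents` is an illustrative helper, and the sample rules and example.com URL in the usage note are made up for demonstration.

```python
from typing import Dict, Iterable
from urllib.robotparser import RobotFileParser


def allowed_agents(robots_txt: str, url: str, agents: Iterable[str]) -> Dict[str, bool]:
    """Map each agent name to whether robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    # Parse the rules from text so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}
```

For example, with rules that disallow `/` for `PerplexityBot` and allow everything for `*`, checking `https://example.com/pricing` reports `PerplexityBot` as blocked and `ChatGPT-User` as allowed. To test a live site instead, `RobotFileParser` can also fetch the file itself through `set_url` followed by `read`.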
Scrunch Tip
Our Site Audit uses Scrunchbot to test how AI systems see your site. It checks accessibility, flags blocking issues, and helps you confirm that fixes are working. This saves you from manual log digging and provides continuous monitoring.
✅ Summary:
Checking Cloudflare logs for AI crawler visits lets you verify whether ChatGPT, Perplexity, Meta AI, Google Gemini, or Scrunchbot is successfully reaching your site. Look for their user agent strings, confirm 200 responses, and update bot rules if needed. Scrunchbot gives you a safe way to simulate these AI checks before issues impact real-world visibility.