Connecting your website to Bot Traffic Analytics using Cloudflare Workers

How to connect a website served through Cloudflare's CDN to Scrunch's Bot Traffic Analytics tool using Cloudflare Workers, so you can view LLM bot requests and metrics for your domain

Updated over 3 weeks ago

Overview

The Bot Traffic Analytics tool in Scrunch lets you monitor how much access your site is getting from LLM bots—including ChatGPT, Perplexity, Gemini, Grok, and others.

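If your website is proxied by Cloudflare, you can use Cloudflare Workers to send traffic logs to Scrunch. The Worker runs at the edge, passes each request through to your origin, and forwards the request details to Scrunch asynchronously, so logging does not delay the response to your visitors.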


What You’ll See

Once your Cloudflare Worker is connected, the Bot Traffic dashboard will show:

  • Total Bot Traffic in the last period

  • Bot traffic over time

  • Traffic distribution between Retrieval, Indexer, and Training LLM Bots

  • Comparison between the current period and the last period (%)

  • Top bot agents and when they were last seen

  • Top content pages accessed by LLM bots

  • Recent bot requests

  • A date filter to see data from the last 24 hours, last 7 days, or last 30 days
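
To illustrate what the Retrieval, Indexer, and Training split means, here is a minimal sketch that sorts user-agent strings into those three buckets. The category type, the pattern list, and the function name are assumptions made for this example. They are based on how the bot operators publicly describe their crawlers, not on Scrunch's actual classification, which may differ.

```typescript
// Hypothetical category names mirroring the dashboard's three buckets.
type BotCategory = "retrieval" | "indexer" | "training" | "unknown";

// Illustrative user-agent substrings. This mapping is an assumption for the
// sketch and is not Scrunch's bot list.
const BOT_PATTERNS: Array<[string, BotCategory]> = [
  ["chatgpt-user", "retrieval"], // fetches pages on behalf of a user prompt
  ["perplexity-user", "retrieval"],
  ["oai-searchbot", "indexer"], // builds a search index
  ["perplexitybot", "indexer"],
  ["gptbot", "training"], // collects content for model training
  ["claudebot", "training"],
];

// Returns the first matching category, or "unknown" for everything else.
function classifyUserAgent(userAgent: string): BotCategory {
  const ua = userAgent.toLowerCase();
  for (const [pattern, category] of BOT_PATTERNS) {
    if (ua.includes(pattern)) return category;
  }
  return "unknown";
}

console.log(classifyUserAgent("Mozilla/5.0 (compatible; GPTBot/1.2)"));
console.log(classifyUserAgent("Mozilla/5.0 (compatible; ChatGPT-User/1.0)"));
```

The Worker in this guide does not need to classify anything itself. It forwards every request and leaves the grouping to Scrunch.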

Scrunch AI's Bot Traffic Analytics feature lets customers track in detail which AI platforms are consuming their content, and for what purpose, so they can better understand how that content:

  • will be surfaced in AI platforms like ChatGPT

  • drives AI responses to relevant questions

  • ultimately influences how AI describes and recommends their brand, products, and services, and whether users click through to their site(s).


Adding Your Website

  1. Open the Scrunch app.

  2. Go to the Sites menu.

  3. You’ll see the list of your websites connected to Bot Traffic Analytics.

  4. Click + Connect Site.

  5. Select Cloudflare as the platform.

  6. You’ll see an instructions page with your Webhook URL, Site ID, and API Key.

Your page will look like this one:

ℹ️ Each site has its own endpoint and key. Don’t reuse them across different sites or integrations.


Integrating Cloudflare Workers

Step 1: Create a Worker

  • Log in to your Cloudflare dashboard.

  • Navigate to Workers & Pages → Create Application.

  • Choose Create Worker and give it a name (e.g. bot-tracking-worker).

Step 2: Add Worker Code

  • The Worker intercepts requests, collects traffic data (user agent, path, status code, etc.), and sends it to Scrunch asynchronously.

  • Replace the default Worker code with the following TypeScript snippet:

interface ExecutionContext {
  waitUntil(promise: Promise<any>): void;
  passThroughOnException(): void;
}

interface Env {
  SCRUNCH_SITE_ID: string;
  SCRUNCH_WEBHOOK_URL: string;
  SCRUNCH_API_KEY: string;
}

function getUserAgent(request: Request): string {
  return (
    request.headers.get("X-Original-UA")?.toLowerCase() ||
    request.headers.get("User-Agent")?.toLowerCase() ||
    "unknown"
  );
}

async function logToScrunchWebhooks(
  data: {
    timestamp: string;
    site_id: string;
    domain: string;
    user_agent: string;
    url: string;
    path: string;
    method: string;
    status_code: number;
    response_time_ms: number;
    ip_address: string;
  },
  env: Env,
): Promise<void> {
  try {
    const response = await fetch(env.SCRUNCH_WEBHOOK_URL, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-Api-Key": env.SCRUNCH_API_KEY,
      },
      body: JSON.stringify(data),
    });

    if (!response.ok) {
      console.error("Failed to log to Scrunch webhooks:", response.status, response.statusText);
    }
  } catch (error) {
    console.error("Error logging to Scrunch webhooks:", error);
  }
}

export default {
  async fetch(
    request: Request,
    env: Env,
    ctx: ExecutionContext,
  ): Promise<Response> {
    const url = new URL(request.url);
    const startTime = Date.now();
    const userAgent = getUserAgent(request);

    const logAndReturn = (response: Response) => {
      ctx.waitUntil(
        logToScrunchWebhooks(
          {
            timestamp: new Date().toISOString(),
            site_id: env.SCRUNCH_SITE_ID,
            domain: url.hostname,
            user_agent: userAgent,
            url: url.toString(),
            path: url.pathname,
            method: request.method,
            status_code: response.status,
            response_time_ms: Date.now() - startTime,
            ip_address: request.headers.get("CF-Connecting-IP") || "unknown",
          },
          env,
        ),
      );
    };

    // Forward the request to your origin server
    try {
      // Create a new request that preserves all original properties
      const forwardedRequest = new Request(request.url, {
        method: request.method,
        headers: request.headers,
        body: request.body,
        redirect: request.redirect,
        signal: request.signal,
      });

      const response = await fetch(forwardedRequest);

      // Log the traffic data asynchronously (non-blocking)
      logAndReturn(response);

      // Return the original response immediately
      return response;
    } catch (error) {
      // If there's an error, still try to log it and return an error response
      const errorResponse = new Response("Internal Server Error", { status: 500 });
      logAndReturn(errorResponse);
      return errorResponse;
    }
  },
};
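
To make the data flow concrete, the following sketch builds the same JSON body that the Worker posts to the webhook for a single request. The field names come from the Worker code above. The interface name `ScrunchLogPayload`, the helper `buildLogPayload`, and all sample values (including `site-123`) are hypothetical and exist only for illustration.

```typescript
// Field names match the `data` parameter of logToScrunchWebhooks in the Worker.
// The interface name and the sample values below are illustrative assumptions.
interface ScrunchLogPayload {
  timestamp: string;
  site_id: string;
  domain: string;
  user_agent: string;
  url: string;
  path: string;
  method: string;
  status_code: number;
  response_time_ms: number;
  ip_address: string;
}

// Hypothetical helper that assembles one log entry from plain values.
function buildLogPayload(
  requestUrl: string,
  method: string,
  userAgent: string,
  statusCode: number,
  startTime: number,
  siteId: string,
  ipAddress: string | null,
): ScrunchLogPayload {
  const url = new URL(requestUrl);
  return {
    timestamp: new Date().toISOString(),
    site_id: siteId,
    domain: url.hostname,
    user_agent: userAgent.toLowerCase(), // the Worker lowercases the user agent
    url: url.toString(),
    path: url.pathname,
    method,
    status_code: statusCode,
    response_time_ms: Date.now() - startTime,
    ip_address: ipAddress || "unknown", // same fallback as the Worker
  };
}

const samplePayload = buildLogPayload(
  "https://example.com/blog/post?ref=1",
  "GET",
  "Mozilla/5.0 (compatible; GPTBot/1.2)",
  200,
  Date.now() - 42, // pretend the origin took about 42 ms
  "site-123", // placeholder, not a real Site ID
  null,
);
console.log(JSON.stringify(samplePayload, null, 2));
```

Note that `url` keeps the query string while `path` does not, and that `ip_address` falls back to "unknown" when the `CF-Connecting-IP` header is absent.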

Step 3: Set Environment Variables

  • Open your Worker and go to Settings → Variables.

  • Add the following variables, using the values from your Scrunch instructions page:

SCRUNCH_SITE_ID=<your-site-id>
SCRUNCH_WEBHOOK_URL=<your-webhook-url>
SCRUNCH_API_KEY=<your-api-key>
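
A missing or placeholder value is an easy mistake to make at this step. The sketch below shows one way to sanity-check the three variables before relying on them. The helper `findEnvProblems` and the `ScrunchEnv` type are hypothetical names for this example, and the `https` check is an assumption about the webhook URL rather than a documented Scrunch requirement.

```typescript
// Same three variables as the Env interface in Step 2, made optional here
// so that missing values can be detected.
interface ScrunchEnv {
  SCRUNCH_SITE_ID?: string;
  SCRUNCH_WEBHOOK_URL?: string;
  SCRUNCH_API_KEY?: string;
}

// Hypothetical helper: returns a list of human-readable problems,
// or an empty list when all three variables look usable.
function findEnvProblems(env: ScrunchEnv): string[] {
  const problems: string[] = [];
  const names = ["SCRUNCH_SITE_ID", "SCRUNCH_WEBHOOK_URL", "SCRUNCH_API_KEY"] as const;

  for (const name of names) {
    const value = env[name];
    if (!value || value.trim() === "") {
      problems.push(`${name} is missing or empty`);
    } else if (value.startsWith("<") && value.endsWith(">")) {
      problems.push(`${name} still contains a placeholder`);
    }
  }

  const webhook = env.SCRUNCH_WEBHOOK_URL;
  if (webhook && !webhook.startsWith("<")) {
    try {
      // Assumption: the webhook is served over HTTPS.
      if (new URL(webhook).protocol !== "https:") {
        problems.push("SCRUNCH_WEBHOOK_URL should use https");
      }
    } catch {
      problems.push("SCRUNCH_WEBHOOK_URL is not a valid URL");
    }
  }
  return problems;
}

console.log(findEnvProblems({ SCRUNCH_SITE_ID: "<your-site-id>" }));
```

A check like this could run at the top of the Worker's `fetch` handler and write any problems to `console.error`, so they show up in the Worker logs.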

Step 4: Deploy the Worker

  • Click Deploy.

  • Once deployed, your Worker will begin forwarding traffic logs to Scrunch.

Step 5: Configure Custom Domain (Optional)

  • If you want your Worker attached to a custom domain:

    • Go to Workers & Pages → your-worker → Settings → Triggers → Custom Domains.

    • Add your domain.

Step 6: Verify Traffic

  • Wait up to 5 minutes for your site to show as “Active” in Scrunch.

  • If you don’t see traffic, test with:

curl -v -H "User-Agent: ScrunchAI-Testbot" https://yourdomain.com

This sends a sample request through your Worker so you can confirm that logs are flowing.

Note that Step 5 is optional: your Worker will still track traffic without a custom domain.

👉 Once configured, your Cloudflare Worker will continuously forward logs to Scrunch, giving you real-time visibility into how LLM bots access your website.


Troubleshooting and Tips

Don’t see any traffic?

  • Make sure your Worker is deployed and active.

  • Double-check environment variables (SCRUNCH_SITE_ID, SCRUNCH_WEBHOOK_URL, SCRUNCH_API_KEY).

  • Check your Worker logs in Cloudflare for any errors.

  • Wait up to 5 minutes—Scrunch may take a short time to detect your configuration.
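
If the Worker logs show failed webhook calls, it can help to test the credentials separately from the Worker. The sketch below builds the same kind of POST request the Worker sends, with the same headers and field names. The helper name `buildWebhookRequest` and the sample field values are assumptions for this example. How Scrunch responds to such a request, and whether a synthetic entry appears in your dashboard, is not covered by this guide.

```typescript
// Hypothetical helper that builds the same POST the Worker sends, so the
// webhook URL and API key can be checked without going through the Worker.
function buildWebhookRequest(
  webhookUrl: string,
  apiKey: string,
  siteId: string,
  domain: string,
): Request {
  return new Request(webhookUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Api-Key": apiKey,
    },
    // Same fields as the Worker's log entry; the values are sample data.
    body: JSON.stringify({
      timestamp: new Date().toISOString(),
      site_id: siteId,
      domain,
      user_agent: "scrunchai-testbot",
      url: `https://${domain}/`,
      path: "/",
      method: "GET",
      status_code: 200,
      response_time_ms: 0,
      ip_address: "unknown",
    }),
  });
}

// Usage (replace the placeholders with the values from your Scrunch app):
// const res = await fetch(
//   buildWebhookRequest("<your-webhook-url>", "<your-api-key>", "<your-site-id>", "yourdomain.com"),
// );
// console.log(res.status, res.statusText);
```

A non-2xx status here would point to the URL or key rather than to the Worker itself, since the Worker only logs an error when `response.ok` is false.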

Tips for better results:

  • The Worker code logs asynchronously, so it won’t block traffic.

  • Always use the credentials shown in your Scrunch app for each site.

  • Repeat the process if you manage multiple sites.
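
The first tip relies on `ctx.waitUntil`, which lets the Worker return the response while the log request is still in flight. The following sketch reproduces that ordering locally with a stand-in context object. `MockContext`, `slowLog`, and `handleRequest` are hypothetical names, and the 50 ms timer merely simulates the POST to Scrunch.

```typescript
// Minimal stand-in for the Workers ExecutionContext, for local illustration only.
class MockContext {
  readonly pending: Promise<unknown>[] = [];
  waitUntil(promise: Promise<unknown>): void {
    this.pending.push(promise);
  }
}

const orderOfEvents: string[] = [];

// Simulated slow log delivery (stands in for the POST to the webhook).
function slowLog(): Promise<void> {
  return new Promise((resolve) => {
    setTimeout(() => {
      orderOfEvents.push("log delivered");
      resolve();
    }, 50);
  });
}

// Mirrors the Worker's pattern: schedule the log, then return right away.
function handleRequest(ctx: MockContext): string {
  ctx.waitUntil(slowLog()); // scheduled, not awaited
  orderOfEvents.push("response returned");
  return "response";
}

const mockCtx = new MockContext();
handleRequest(mockCtx);

Promise.all(mockCtx.pending).then(() => {
  console.log(orderOfEvents); // [ 'response returned', 'log delivered' ]
});
```

In the real runtime, `waitUntil` also keeps the Worker alive until the passed promise settles, which is why the log request is not cut off when the response is returned.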
