Overview

The Agent Traffic API provides aggregated analytics on AI bot activity across your websites. Track which AI platforms are crawling your content, what pages they access, and understand patterns in retrieval, training, and indexing behavior. This API is optimized for time-series analysis and bot classification, making it ideal for SEO teams, content strategists, and developers building AI visibility monitoring into their workflows. Each request returns aggregated bot traffic metrics grouped by the dimensions you select.

What the Agent Traffic API includes

The Agent Traffic API returns aggregated metrics including:
  • Request counts by bot source
  • Traffic patterns by date or week
  • Bot activity by page path
  • Classification by bot type (retrieval, training, indexer)
These metrics can be grouped by dimensions including:
  • Date (day or week buckets)
  • Site domain
  • URL path
  • Agent source (e.g., chatgpt-user, claudebot, gptbot)
  • Agent type (retrieval, training, indexer)
All results are aggregated summaries optimized for trend analysis and monitoring.

When to use the Agent Traffic API

Use the Agent Traffic API when you need:
  • Weekly or daily bot traffic reporting
  • Trend analysis of AI crawler behavior over time
  • Path-level breakdowns of bot activity
  • Bot classification by purpose (retrieval vs. training)
  • Data for SEO dashboards monitoring AI visibility
  • Automated alerts based on traffic patterns
The Agent Traffic API is designed to answer questions like: “Which AI bots are crawling my content?” and “How is bot traffic trending across different sections of my site?”

When not to use the Agent Traffic API

The Agent Traffic API is not appropriate if you need:
  • Raw access log entries
  • Individual request details (user agents, IP addresses, timestamps)
  • Real-time streaming data
  • Non-bot traffic analytics
For CDN setup and log configuration, see the Agent Traffic Integration Guide.

Example query

curl -X GET \
  "https://api.scrunchai.com/v1/1234/sites/01JW849S5DJZ3CCE4DA6TFMYEY/agent-traffic?start_date=2025-01-01&end_date=2025-01-31&fields=date,agent_source,agent_type&time_bucket=week" \
  -H "Authorization: Bearer $SCRUNCH_API_TOKEN"
Response:
{
  "meta": {
    "start_date": "2025-01-01",
    "end_date": "2025-01-31",
    "time_bucket": "week"
  },
  "data": [
    {
      "date": "2025W01",
      "agent_source": "chatgpt-user",
      "agent_type": "retrieval",
      "requests": 1247
    },
    {
      "date": "2025W01",
      "agent_source": "claudebot",
      "agent_type": "training",
      "requests": 892
    }
  ]
}
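The aggregated rows are straightforward to post-process. A minimal sketch (using the example response above as a literal) that totals requests per agent_type:

```python
import json
from collections import defaultdict

# The example response from above, embedded as a literal for illustration.
response = json.loads("""
{
  "meta": {"start_date": "2025-01-01", "end_date": "2025-01-31", "time_bucket": "week"},
  "data": [
    {"date": "2025W01", "agent_source": "chatgpt-user", "agent_type": "retrieval", "requests": 1247},
    {"date": "2025W01", "agent_source": "claudebot", "agent_type": "training", "requests": 892}
  ]
}
""")

# Sum request counts per bot category.
totals = defaultdict(int)
for row in response["data"]:
    totals[row["agent_type"]] += row["requests"]

print(dict(totals))  # {'retrieval': 1247, 'training': 892}
```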

Available dimensions

Field          Description        Example Values
-----          -----------        --------------
date           Timestamp bucket   20250115 (day) or 2025W03 (week)
site           Domain             example.com
path           URL path           /blog/article
agent_source   Bot identifier     chatgpt-user, claudebot, gptbot
agent_type     Bot category       retrieval, training, indexer
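When assembling requests programmatically, it helps to build the query string from a dict rather than by hand. A sketch (the parameter names match the example query above; the helper itself is hypothetical):

```python
from urllib.parse import urlencode

def build_agent_traffic_query(start_date, end_date, fields, time_bucket="day", **extra):
    """Assemble the query string for an agent-traffic request.

    `fields` is the list of dimensions to group by; extra keyword
    arguments (e.g. path="/blog/") become additional filters.
    """
    params = {
        "start_date": start_date,
        "end_date": end_date,
        "fields": ",".join(fields),
        "time_bucket": time_bucket,
        **extra,
    }
    return urlencode(params)  # percent-encodes values, e.g. commas in `fields`

qs = build_agent_traffic_query("2025-01-01", "2025-01-31",
                               ["date", "agent_source", "agent_type"],
                               time_bucket="week")
print(qs)
```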

Time bucketing

Control the granularity of date aggregation using the time_bucket parameter:
  • day (default): Daily aggregation with dates formatted as YYYYMMDD
  • week: Weekly aggregation with dates formatted as YYYYWww using the ISO week number (e.g., 2025W03)
Weekly buckets reduce result size and are recommended for long-range trend analysis.
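If you need to align your own data with the API's weekly labels, Python's isocalendar() produces the same ISO week numbering (the YYYYWww label shape is taken from the example response above):

```python
from datetime import date

def week_bucket(d: date) -> str:
    """Format a date as an ISO-week bucket label, e.g. 2025W03."""
    iso = d.isocalendar()  # (ISO year, ISO week, ISO weekday)
    return f"{iso[0]}W{iso[1]:02d}"

print(week_bucket(date(2025, 1, 15)))  # 2025W03
print(week_bucket(date(2025, 1, 1)))   # 2025W01
```

Note that the ISO year can differ from the calendar year near January 1, which is why the label uses isocalendar()'s year rather than d.year.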

Path filtering

Use the path parameter to filter results by URL path prefix:
# Only show bot traffic to blog articles
?path=/blog/

# Only show traffic to a specific section
?path=/products/widgets
Path matching uses prefix-based filtering with SQL LIKE patterns (path LIKE '/blog/%'). All user input is properly escaped to prevent SQL injection.
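Server-side, prefix filtering of this kind is typically implemented by escaping the LIKE metacharacters in the user-supplied prefix before appending the trailing wildcard. A sketch of the idea (not Scrunch's actual implementation):

```python
def like_prefix_pattern(path_prefix: str) -> str:
    """Escape LIKE metacharacters in a user-supplied prefix, then
    append % so the pattern matches the prefix and everything under it."""
    escaped = (path_prefix
               .replace("\\", "\\\\")   # escape the escape character first
               .replace("%", "\\%")
               .replace("_", "\\_"))
    return escaped + "%"

print(like_prefix_pattern("/blog/"))      # /blog/%
print(like_prefix_pattern("/a_b/100%/"))  # /a\_b/100\%/%
```

Escaping % and _ matters because both are wildcards inside a LIKE pattern; without escaping, a path like /a_b/ would also match /axb/.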

Limits and performance considerations

  • Maximum rows per request: 100,000
  • Default limit: 10,000 rows
  • Results are pre-aggregated for fast retrieval
  • Use pagination (limit and offset) for large result sets
For best performance:
  • Use weekly bucketing when possible to reduce cardinality
  • Keep path filters specific to reduce result size
  • Request only the dimensions you need
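Offset-based pagination can be wrapped in a generator that keeps fetching until a short page signals the end. A sketch assuming a fetch_page callable that returns the parsed data array for a given limit and offset (swap in your real HTTP call):

```python
def paginate(fetch_page, limit=10_000):
    """Yield every row, advancing offset until a short page signals the end."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page
        if len(page) < limit:  # short (or empty) page: no more results
            return
        offset += limit

# Usage with a stand-in fetch function over 25 fake rows:
rows = list(range(25))
fake_fetch = lambda limit, offset: rows[offset:offset + limit]
print(len(list(paginate(fake_fetch, limit=10))))  # 25
```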

Security and validation

The Agent Traffic API implements strict security measures:
  • Site ID validation: All site IDs are validated against ULID format using regex
  • Parameter validation: All query parameters are validated before SQL generation
  • SQL injection prevention: Path filters use escaped LIKE patterns with no direct string concatenation
  • Authentication: All requests require valid bearer token authentication
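Site IDs are ULIDs: 26 characters drawn from the Crockford base32 alphabet (digits plus uppercase letters, excluding I, L, O, and U). A client-side validation sketch mirroring the format check described above:

```python
import re

# Crockford base32 alphabet used by ULIDs: 0-9 plus A-Z without I, L, O, U.
ULID_RE = re.compile(r"^[0-9A-HJKMNP-TV-Z]{26}$")

def is_valid_site_id(site_id: str) -> bool:
    return bool(ULID_RE.fullmatch(site_id))

print(is_valid_site_id("01JW849S5DJZ3CCE4DA6TFMYEY"))  # True
print(is_valid_site_id("not-a-ulid"))                  # False
```

Validating locally before sending a request gives a faster, clearer failure than a round trip to the API.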

Best practices

  • Use time_bucket=week for trend analysis spanning more than 30 days
  • Filter by path when analyzing specific site sections
  • Group by agent_type to distinguish retrieval bots from training crawlers
  • Run separate queries for different reporting needs rather than over-selecting dimensions
  • Monitor agent_source trends to identify new AI platforms crawling your content

Typical use cases

Teams commonly use the Agent Traffic API to:
  • Monitor which AI platforms are indexing their content
  • Identify pages with high bot traffic for SEO optimization
  • Track changes in crawler behavior after content updates
  • Build dashboards showing AI visibility by site section
  • Alert on unusual bot traffic patterns
  • Analyze the impact of robots.txt changes on AI crawler access
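For the alerting use case, a simple week-over-week ratio check is often enough to start with. A sketch assuming you already have per-week request totals (the 2x threshold is illustrative):

```python
def traffic_alerts(weekly_totals, threshold=2.0):
    """Given ordered (week_label, requests) pairs, flag any week whose
    traffic grew or shrank by more than `threshold`x versus the prior week."""
    alerts = []
    for (_, prev), (week, cur) in zip(weekly_totals, weekly_totals[1:]):
        if prev > 0 and (cur / prev > threshold or cur / prev < 1 / threshold):
            alerts.append(week)
    return alerts

totals = [("2025W01", 1200), ("2025W02", 1300), ("2025W03", 4100)]
print(traffic_alerts(totals))  # ['2025W03']
```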

Prerequisites

Before using the Agent Traffic API, you must:
  1. Configure your CDN or hosting provider to send access logs to Scrunch
  2. Verify your site is properly configured in your Scrunch account
  3. Obtain your site ID from the Scrunch dashboard

Set up Agent Traffic logging

Configure your CDN integration →