
Documentation Index

Fetch the complete documentation index at: https://developers.scrunch.com/llms.txt

Use this file to discover all available pages before exploring further.

Overview

The Sitemap API exposes the pages Scrunch has discovered while crawling your brand’s sites, together with their AI search readiness audit score, priority and optimized-content flags, and per-page time-series metrics for citations, AI referrals, and AI agent traffic. Use it to enumerate the pages in your sitemap, drill into a single page, pull its performance trend, or dump everything to CSV or XLSX for downstream reporting. All endpoints are read-only and return data from the most recent completed (non-stale) crawl for the brand.

What the Sitemap API includes

  • The pages from the brand’s latest finished crawl, paginated and filterable
  • Title, meta description, depth, canonical URL, and most recent audit score
  • Priority flag (matches the brand’s priority_page and priority_path overrides)
  • Optimized-content flag (page has active AXP content deployed)
  • Per-page time-series for citations, AI referrals, and AI agent traffic, bucketed daily or weekly
  • A full CSV or XLSX export that mirrors the filters, columns, and trend percentages from the sitemap view in the Scrunch dashboard

When to use the Sitemap API

Use the Sitemap API when you need to:
  • Build a content inventory for the brand from the latest crawl
  • Identify priority pages or pages missing optimized content
  • Pull per-page citations, AI referrals, or agent traffic for a custom dashboard
  • Export the sitemap (with totals and trends) to a spreadsheet for review

For aggregated bot traffic across a whole site, use the Agent Traffic API instead.

Endpoints

Method  Path                                         Purpose
GET     /{brand_id}/sitemap/pages                    Paginated list of pages with filters.
GET     /{brand_id}/sitemap/pages/{page_id}          Detail for a single page.
GET     /{brand_id}/sitemap/pages/{page_id}/metrics  Time-series metrics for one page.
GET     /{brand_id}/sitemap/export                   CSV or XLSX export of the filtered page set.
All endpoints require a bearer token with query scope. See Authentication.
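
For scripting, it can help to centralize URL and header construction. A minimal Python sketch, assuming the base URL from the curl examples on this page and a token in SCRUNCH_API_TOKEN; the helper name and shape are illustrative, not an official client:

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://api.scrunchai.com/v1"  # matches the curl examples on this page

def sitemap_request(brand_id, path="", params=None, token=None):
    """Build (url, headers) for a Sitemap API GET; the caller performs the request."""
    token = token or os.environ.get("SCRUNCH_API_TOKEN", "")
    query = "?" + urlencode(params) if params else ""
    url = f"{BASE_URL}/{brand_id}/sitemap{path}{query}"
    return url, {"Authorization": f"Bearer {token}"}

url, headers = sitemap_request(1234, "/pages",
                               {"max_depth": 2, "is_priority": "true", "limit": 20})
```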

Filters

The list and export endpoints share the same filter set:
Filter                 Description
domain                 Restrict to a specific domain when the brand has multiple registered sites.
max_depth              Maximum URL path depth (0 = root only).
path_prefix            Segment-aligned path prefix. /blog matches /blog and /blog/post but not /blogger.
is_priority            Keep only (or exclude) pages flagged as priority for the brand.
has_optimized_content  Keep only (or exclude) pages with active AXP optimized content.
search                 Case-insensitive substring match on the page URL or title.
The list endpoint additionally supports limit and offset for pagination.
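
The segment-aligned semantics of path_prefix can be mirrored locally, for example to pre-check which URLs a filter would keep. A sketch of the documented rule (the server-side implementation may differ in edge cases such as query strings):

```python
def matches_path_prefix(path, prefix):
    """True when prefix is a whole-segment prefix of path:
    /blog matches /blog and /blog/post, but not /blogger."""
    path, prefix = path.rstrip("/"), prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")
```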

Example: list pages

curl -X GET \
  "https://api.scrunchai.com/v1/1234/sitemap/pages?max_depth=2&is_priority=true&limit=20" \
  -H "Authorization: Bearer $SCRUNCH_API_TOKEN"
Response:
{
  "items": [
    {
      "id": 9087,
      "url": "https://example.com/products/widgets",
      "title": "Widgets — Example",
      "description": "Our flagship widget lineup.",
      "depth": 2,
      "audit_score": 84,
      "canonical_url": "https://example.com/products/widgets",
      "is_priority": true,
      "has_optimized_content": false
    }
  ],
  "total": 1,
  "offset": 0,
  "limit": 20,
  "domain": "example.com",
  "last_crawl_completed": "2025-05-08T03:14:00Z"
}
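
A common pattern on top of the list endpoint is walking the full inventory with limit and offset. A minimal sketch; the fetch callable stands in for the HTTP call shown above and is not part of the API:

```python
def iter_pages(fetch, limit=100):
    """Yield every page by advancing offset until total is exhausted.
    fetch(limit, offset) should perform the GET and return the parsed
    JSON response shown above (with "items" and "total" keys)."""
    offset = 0
    while True:
        resp = fetch(limit=limit, offset=offset)
        yield from resp["items"]
        offset += limit
        if offset >= resp["total"]:
            return

# Demonstration against an in-memory stand-in for the API:
_all = [{"id": i} for i in range(5)]
def _fake_fetch(limit, offset):
    return {"items": _all[offset:offset + limit], "total": len(_all)}

pages = list(iter_pages(_fake_fetch, limit=2))
```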

Example: per-page metrics

curl -X GET \
  "https://api.scrunchai.com/v1/1234/sitemap/pages/9087/metrics?start_date=2025-04-01&end_date=2025-05-01" \
  -H "Authorization: Bearer $SCRUNCH_API_TOKEN"
The response returns three series (citations, ai_referrals, agent_traffic), each as a list of buckets keyed by ISO date with counts split by AI platform. Granularity (daily or weekly) is chosen automatically based on the requested range.
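
A response in that shape can be reduced to totals per series. A sketch, assuming each bucket carries an ISO date plus per-platform counts; the exact field names used here (date, counts) are assumptions, so check them against a live response:

```python
def series_total(buckets):
    """Sum one series (e.g. citations) across all buckets and AI platforms."""
    return sum(sum(bucket["counts"].values()) for bucket in buckets)

# Hypothetical citations series in the assumed shape:
citations = [
    {"date": "2025-04-07", "counts": {"chatgpt": 3, "perplexity": 1}},
    {"date": "2025-04-14", "counts": {"chatgpt": 5}},
]
```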

Example: export to XLSX

curl -X GET \
  "https://api.scrunchai.com/v1/1234/sitemap/export?format=xlsx&include_metrics=true&start_date=2025-04-01&end_date=2025-05-01" \
  -H "Authorization: Bearer $SCRUNCH_API_TOKEN" \
  -o sitemap.xlsx
With include_metrics=true (the default), each row carries totals for agent_traffic, citations, and ai_referrals over the supplied date range, plus a percent-change column comparing the first and last buckets — matching the trend percentages shown in the dashboard. The filename comes from the response’s Content-Disposition header.
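
The percent-change column is plain first-versus-last-bucket arithmetic. Sketched below; the handling of a zero first bucket is my assumption, not documented behavior:

```python
def trend_percent(first_total, last_total):
    """Percent change from the first bucket's total to the last bucket's total."""
    if first_total == 0:
        return None  # trend undefined when the range starts at zero
    return round((last_total - first_total) / first_total * 100, 1)
```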

Limits and behavior

  • The export endpoint caps at 10,000 rows per call. Tighten the filters or contact support for a bulk export.
  • The metrics and export endpoints cap the start_date/end_date window at 366 days.
  • All endpoints read from the most recent completed, non-stale crawl. If no crawl has finished, the list endpoint returns 404.
  • User-controlled strings (URL, title, description, canonical URL) are sanitized in CSV and XLSX output to prevent spreadsheet formula injection.
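
Formula injection is typically neutralized by escaping cells whose first character a spreadsheet would treat as a formula opener. One common approach, sketched for illustration; Scrunch's exact sanitization rules are not documented here:

```python
RISKY_LEADING = ("=", "+", "-", "@", "\t", "\r")

def sanitize_cell(value):
    """Prefix risky leading characters with a single quote so spreadsheet
    apps render the cell as text rather than evaluating it as a formula."""
    if value and value.startswith(RISKY_LEADING):
        return "'" + value
    return value
```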

Best practices

  • Use path_prefix to scope queries to a section of the site rather than filtering client-side.
  • Use search for find-as-you-type style flows; it matches the dashboard’s “Find pages…” input.
  • Pull metrics in weekly granularity (longer ranges) for trend reporting; the API chooses this automatically when the range exceeds the daily cap.
  • Pair is_priority=true with the export endpoint to produce a focused priority-page report.