Build a data mart from the Scrunch Responses API by incrementally loading response-level data, normalizing it into fact and dimension tables, and computing metrics in your warehouse.

Prerequisites

  • A Scrunch API key with Responses access (provision one here)
  • A brand ID for the brand you want to load
  • A cloud data warehouse (BigQuery, Snowflake, Redshift, Databricks, etc.)

How the Responses API works

The Responses API returns one record per AI response. Each record includes the full response text, citations, brand and competitor evaluations, and prompt metadata. Key characteristics:
  • 1:1 response model — each item represents a single, distinct AI response
  • Denormalized — arrays like tags, competitors, and citations are embedded in each record
  • Reverse chronological — newest responses are returned first
  • Supports date filtering and pagination — ideal for incremental ETL
For the full response schema, see the Responses API overview.

Incremental loading strategy

Use date windows and pagination to pull new responses on a recurring schedule.
1. Choose your sync window

Use start_date and end_date (both YYYY-MM-DD, UTC) to define a date range. For daily loads, pull the previous UTC day after midnight to ensure completeness.
2. Paginate through all results

The API returns paginated results. Increment offset by limit until you have all records for the window.
curl "https://api.scrunchai.com/v1/$BRAND_ID/responses?start_date=2025-06-01&end_date=2025-06-02&limit=1000&offset=0" \
  -H "Authorization: Bearer $SCRUNCH_API_KEY"
The response includes a total count so you know when you’re done:
{
  "total": 2430,
  "offset": 0,
  "limit": 1000,
  "items": [ ... ]
}
3. Deduplicate on id

Each response has a globally unique id. Use it as a natural key and upsert into your warehouse to handle re-runs or overlapping date windows.
Some fields, such as brand_present, brand_sentiment, tags, and stage, may be re-evaluated if prompt metadata or brand configuration changes in Scrunch. Always upsert rather than append, so re-evaluated fields stay current.
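The upsert semantics can be illustrated in memory: later copies of the same id replace earlier ones, so repeated loads converge on one row per response. In a warehouse this becomes MERGE (or INSERT ... ON CONFLICT) keyed on id.

```python
def upsert(existing: dict[str, dict], incoming: list[dict]) -> dict[str, dict]:
    """Upsert records keyed by their globally unique `id`.

    A re-run or overlapping window overwrites rather than duplicates --
    the in-memory analogue of MERGE keyed on the response id.
    """
    for record in incoming:
        existing[record["id"]] = record  # replace, never append a second copy
    return existing
```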

Schema design

Each API record includes arrays (tags, competitors, citations) that should be normalized into separate tables to avoid row explosion.
responses (1 row per response)
   |
   |—— response_tags (1 row per response x tag)
   |
   |—— response_competitors (1 row per response x competitor)
   |
   |—— response_citations (1 row per response x citation)
If your warehouse has strong support for JSON or ARRAY columns (e.g., BigQuery, Snowflake, Databricks), you can skip the bridge tables and store tags, competitors, and citations as native array or JSON fields directly on the responses table. Use UNNEST or LATERAL FLATTEN at query time instead of pre-normalizing. This simplifies your ETL at the cost of slightly more complex analytics queries.
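A minimal normalization step, assuming `tags` is a list of strings and `competitors` / `citations` are lists of objects (matching the bridge-table columns defined below):

```python
def normalize(record: dict) -> dict[str, list[dict]]:
    """Split one denormalized API record into fact and bridge rows."""
    rid = record["id"]
    # Scalar fields become the fact row; arrays are stripped out.
    fact = {k: v for k, v in record.items()
            if k not in ("tags", "competitors", "citations")}
    return {
        "responses": [fact],
        "response_tags": [{"response_id": rid, "tag": t}
                          for t in record.get("tags", [])],
        "response_competitors": [{"response_id": rid, **c}
                                 for c in record.get("competitors", [])],
        "response_citations": [{"response_id": rid, **c}
                               for c in record.get("citations", [])],
    }
```

Each returned list maps onto one of the target tables, so a batch load is a flat insert per table.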

Responses fact table

Store one row per response with scalar fields:
Column           Type       Notes
id               STRING     Primary key (globally unique)
created_at       TIMESTAMP  When the response was captured
prompt_id        STRING     Foreign key to the prompt
prompt           STRING     Text of the prompt
platform         STRING     ChatGPT, Perplexity, Claude, etc.
persona_id       STRING
persona_name     STRING
stage            STRING     Funnel stage enum
branded          BOOLEAN    Whether the prompt includes brand names
brand_present    BOOLEAN
brand_sentiment  STRING     positive, mixed, negative, or null
brand_position   STRING     top, middle, bottom, or null
country          STRING
response_text    STRING     Full AI-generated answer (markdown)

Bridge tables

Unnest each array into its own table, keyed by response_id.

response_tags

Column       Type
response_id  STRING
tag          STRING

response_competitors

Column           Type
response_id      STRING
competitor_name  STRING
present          BOOLEAN
sentiment        STRING
position         STRING

response_citations

Column       Type
response_id  STRING
url          STRING
title        STRING
snippet      STRING
source_type  STRING
Never unnest multiple arrays in the same query. Crossing two arrays (e.g., tags x competitors) causes row explosion and inflates every metric.
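The arithmetic behind that warning is simple to demonstrate: crossing two arrays multiplies their lengths, so a single response is counted many times.

```python
from itertools import product

# One response with 3 tags and 2 competitors. Unnesting both arrays in the
# same query produces the cross product: 3 x 2 = 6 rows for one response,
# so COUNT(*) and every derived metric is inflated sixfold.
tags = ["pricing", "setup", "support"]
competitors = [{"competitor_name": "X"}, {"competitor_name": "Y"}]

exploded = list(product(tags, competitors))
```

Joining each bridge table to the fact table separately avoids this, because each join fans out on only one array at a time.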

Computing metrics

Once your data mart is loaded, you can aggregate responses into daily performance summaries. Group on DATE(created_at), prompt_id, prompt, platform to match the way Scrunch reports prompt-level performance by default. The following examples use PostgreSQL syntax. Translate as needed for your warehouse.

Response count

COUNT(*) AS responses

Brand presence percentage

COUNT(*) FILTER (WHERE brand_present = true)::float
  / NULLIF(COUNT(*)::float, 0.0) AS brand_presence_pct

Brand sentiment score

AVG(
  CASE
    WHEN brand_sentiment = 'positive' THEN 100
    WHEN brand_sentiment = 'mixed' THEN 50
    WHEN brand_sentiment = 'negative' THEN 0
    ELSE NULL
  END
) AS brand_sentiment_score

Brand position score

AVG(
  CASE
    WHEN brand_position = 'top' THEN 100
    WHEN brand_position = 'middle' THEN 50
    WHEN brand_position = 'bottom' THEN 0
    ELSE NULL
  END
) AS brand_position_score

Putting it together

SELECT
  DATE(created_at)  AS response_date,
  prompt_id,
  prompt,
  platform,
  COUNT(*)          AS responses,
  COUNT(*) FILTER (WHERE brand_present)::float
    / NULLIF(COUNT(*)::float, 0.0)
                    AS brand_presence_pct,
  AVG(CASE
    WHEN brand_sentiment = 'positive' THEN 100
    WHEN brand_sentiment = 'mixed'    THEN 50
    WHEN brand_sentiment = 'negative' THEN 0
  END)              AS brand_sentiment_score,
  AVG(CASE
    WHEN brand_position = 'top'    THEN 100
    WHEN brand_position = 'middle' THEN 50
    WHEN brand_position = 'bottom' THEN 0
  END)              AS brand_position_score
FROM responses
GROUP BY 1, 2, 3, 4;
These metric definitions match the ones used in the Scrunch dashboard and Query API. You can also define your own calculations if your use case requires it.
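To make the metric definitions explicit outside SQL, the same calculations can be sketched over in-memory rows. This mirrors the query above (nulls excluded from the averages, exactly as SQL's AVG skips NULL); it is an illustration, not a replacement for computing in your warehouse.

```python
SENTIMENT = {"positive": 100, "mixed": 50, "negative": 0}
POSITION = {"top": 100, "middle": 50, "bottom": 0}

def daily_metrics(rows: list[dict]) -> dict:
    """Compute the summary metrics for one (date, prompt, platform) group."""
    n = len(rows)
    present = sum(1 for r in rows if r.get("brand_present"))
    # Null sentiment/position rows are excluded, matching SQL AVG over CASE.
    sent = [SENTIMENT[r["brand_sentiment"]] for r in rows
            if r.get("brand_sentiment") in SENTIMENT]
    pos = [POSITION[r["brand_position"]] for r in rows
           if r.get("brand_position") in POSITION]
    return {
        "responses": n,
        "brand_presence_pct": present / n if n else None,
        "brand_sentiment_score": sum(sent) / len(sent) if sent else None,
        "brand_position_score": sum(pos) / len(pos) if pos else None,
    }
```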