Most GA4 users start with Data Studio's native GA4 connector. It takes two minutes to set up, it's free, and it gets you a working dashboard quickly. For many use cases, that's enough.
But native connector users hit the same walls eventually: data sampling on longer date ranges, a fixed set of available dimensions and metrics, no ability to blend GA4 data with CRM records or advertising cost data, and aggregated numbers that can't be broken down at the session or user level. These aren't edge cases — they're limits that affect any serious reporting setup within a few months.
The BigQuery connector for Data Studio solves all of them. Because you're querying raw event data rather than GA4's pre-aggregated API, you get unsampled results, complete flexibility over what you calculate, and the ability to join GA4 data to anything else you've loaded into BigQuery — ad spend, orders, CRM data, product inventory.
Prerequisites: you need a GA4 property with the BigQuery export enabled, and a Google Cloud project with billing set up. If you haven't configured the export yet, start with our guide on setting up the GA4 BigQuery export. The Data Studio BigQuery connector is free — you only pay for the BigQuery query costs the connector triggers.
Two ways to connect — table vs. custom query
When you add BigQuery as a data source in Data Studio, you're given two connection modes. Choosing the right one upfront saves significant rework later.
Table
Point Data Studio at a specific BigQuery table or view. Data Studio reads the full table schema and lets you build metrics and calculated fields on top of it. Simple to set up, but you're working with raw GA4 event structure — including the nested event_params arrays — which Data Studio can't unnest natively.
Custom Query
Write a SQL query that shapes the data before it reaches Data Studio. You control exactly which columns appear, unnest event parameters into flat columns, pre-aggregate where appropriate, and join to other tables. More setup upfront, but the resulting data source is far easier to build reports on.
For GA4 data, the custom query approach is almost always the right choice. The raw GA4 event table schema — with its nested event_params, user_properties, and items arrays — is not designed for direct reporting. Unnesting and flattening it in SQL before it reaches Data Studio is what makes the connector genuinely useful.
Use BigQuery Views instead of inline custom queries where possible. Rather than pasting a long SQL query directly into the Data Studio connector, create a BigQuery View containing your query and point the connector at the view. This keeps your SQL in one place, makes it easier to update, and means multiple Data Studio reports can share the same logic without duplicating queries.
Adding BigQuery as a data source in Data Studio
The connection process itself is straightforward. Here's the exact sequence:
-
1
Open Data Studio and create or edit a report Go to datastudio.google.com. Open an existing report or create a new one. Click Add data in the toolbar — or, if starting fresh, you'll be prompted to add a data source on the blank canvas.
-
2
Select the BigQuery connector In the data source panel, scroll to find BigQuery in the Google connectors list. Click it. You'll be asked to authorise access if this is your first time — grant the requested permissions using the Google account that has access to your GCP project.
-
3
Choose your project, dataset, and connection mode Select your GCP project from the dropdown, then your GA4 dataset (named
analytics_XXXXXXXXX). At this point, choose Custom Query rather than Table for the reasons described above. -
4
Enable date range parameters Before writing your query, tick the Enable date range parameters checkbox. This is essential — it tells Data Studio to inject the dashboard's date filter into your query automatically, so users can change the date range in the report without you hardcoding dates in SQL.
-
5
Write your query and connect Enter your SQL in the query box. Click Add — Data Studio will run the query, infer the schema from the results, and create the data source. You can then rename fields, change aggregation types, and add calculated fields before using it in a report.
Using date range parameters correctly
The date range parameter is the most important part of a well-built BigQuery data source in Data Studio. Get it wrong and either your dashboard ignores the date filter users set, or it scans your entire dataset on every load — which is both slow and expensive.
When you enable date range parameters, Data Studio makes two variables available inside your query: @DS_START_DATE and @DS_END_DATE. These are injected as strings in YYYYMMDD format, which maps directly to GA4's BigQuery table suffix format.
BigQuery SQL — Using date range parameters in a Data Studio custom query
SELECT
event_date,
event_name,
COUNT(*) AS event_count,
COUNT(DISTINCT user_pseudo_id) AS unique_users
FROM
`your-project.analytics_XXXXXXXXX.events_*`
WHERE
_TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE
GROUP BY
event_date, event_name
ORDER BY
event_date DESC
The _TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE clause is what restricts the query to only the date partitions that fall within the user's selected date range. Without this, every query would scan all historical data regardless of what date range the report shows.
Always use _TABLE_SUFFIX for date filtering, not event_date. Filtering on WHERE event_date BETWEEN '2026-01-01' AND '2026-06-29' tells BigQuery which rows to return but not which partitions to skip. BigQuery still scans every table in the wildcard before applying the row filter. _TABLE_SUFFIX pruning happens at the partition level — it's what actually limits the data scanned and keeps query costs manageable.
Writing a flattening query for GA4 event data
The raw GA4 event table is not flat — key information is stored in nested arrays that SQL and Data Studio need help unpacking. A good flattening query extracts the parameters you actually need into named columns, so report builders don't have to think about GA4's underlying schema.
Here's a practical starting query for a GA4-powered Data Studio dashboard covering sessions, page views, and key conversion events:
BigQuery SQL — GA4 flattening query for Data Studio
SELECT
event_date,
user_pseudo_id,
event_name,
geo.country AS country,
device.category AS device_type,
traffic_source.source AS source,
traffic_source.medium AS medium,
traffic_source.name AS campaign,
-- Extract specific event parameters from the nested array
MAX(CASE WHEN ep.key = 'page_location'
THEN ep.value.string_value END) AS page_location,
MAX(CASE WHEN ep.key = 'page_title'
THEN ep.value.string_value END) AS page_title,
MAX(CASE WHEN ep.key = 'session_engaged'
THEN ep.value.string_value END) AS session_engaged,
MAX(CASE WHEN ep.key = 'ga_session_id'
THEN ep.value.int_value END) AS session_id,
MAX(CASE WHEN ep.key = 'transaction_id'
THEN ep.value.string_value END) AS transaction_id,
MAX(CASE WHEN ep.key = 'value'
THEN ep.value.double_value END) AS revenue
FROM
`your-project.analytics_XXXXXXXXX.events_*`,
UNNEST(event_params) AS ep
WHERE
_TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE
GROUP BY
event_date, user_pseudo_id, event_name,
country, device_type, source, medium, campaign
ORDER BY
event_date DESC
This query gives Data Studio a flat, one-row-per-event table with all the key dimensions already extracted. Report builders can then use Data Studio's native calculated fields and filters on top of it without needing to understand GA4's event schema.
Only extract the event parameters you'll actually use in reports. Every additional MAX(CASE WHEN...) clause adds to the query's complexity and cost. Start with the parameters your reports need, and add more as requirements grow. A leaner query is a faster, cheaper query.
Calculating sessions and engagement rate in BigQuery
One important difference between the BigQuery connector and the native GA4 connector: Data Studio can't calculate sessions automatically from raw event data the way GA4's API does. Sessions, engagement rate, and bounce rate need to be defined explicitly in your SQL.
A session in GA4 is identified by the combination of user_pseudo_id and ga_session_id. An engaged session is one where session_engaged = '1'. Here's how to calculate these correctly at the session level before surfacing them in Data Studio:
BigQuery SQL — Session-level metrics for Data Studio
WITH session_data AS (
SELECT
event_date,
user_pseudo_id,
MAX(CASE WHEN ep.key = 'ga_session_id'
THEN ep.value.int_value END) AS session_id,
traffic_source.source AS source,
traffic_source.medium AS medium,
device.category AS device_type,
geo.country AS country,
MAX(CASE WHEN ep.key = 'session_engaged'
THEN ep.value.string_value END) AS session_engaged,
SUM(CASE WHEN event_name = 'purchase'
THEN 1 ELSE 0 END) AS purchases,
SUM(CASE WHEN ep.key = 'value'
THEN ep.value.double_value ELSE 0 END) AS revenue
FROM
`your-project.analytics_XXXXXXXXX.events_*`,
UNNEST(event_params) AS ep
WHERE
_TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE
GROUP BY
event_date, user_pseudo_id, source, medium, device_type, country
)
SELECT
event_date,
source,
medium,
device_type,
country,
COUNT(*) AS sessions,
COUNT(DISTINCT user_pseudo_id) AS users,
SUM(CASE WHEN session_engaged = '1' THEN 1 ELSE 0 END) AS engaged_sessions,
ROUND(
SAFE_DIVIDE(
SUM(CASE WHEN session_engaged = '1' THEN 1 ELSE 0 END),
COUNT(*)
) * 100, 1
) AS engagement_rate_pct,
SUM(purchases) AS purchases,
ROUND(SUM(revenue), 2) AS revenue,
ROUND(
SAFE_DIVIDE(SUM(purchases), COUNT(*)) * 100, 2
) AS conversion_rate_pct
FROM
session_data
GROUP BY
event_date, source, medium, device_type, country
ORDER BY
event_date DESC
Note the use of SAFE_DIVIDE instead of a plain division operator. BigQuery returns an error if you divide by zero — SAFE_DIVIDE returns null instead, which Data Studio handles gracefully. Always use it when calculating rates.
Keeping your reports fast and costs low
The biggest mistake people make with BigQuery-connected Data Studio reports is not thinking about query cost until they receive their first GCP billing statement. Every time a Data Studio report loads or a filter changes, BigQuery runs your custom query — scanning whatever partitions your date range covers. On a large dataset, a poorly written query can cost several dollars every time someone opens the report.
These are the practices that keep BigQuery-powered reports both fast and affordable:
- Always use partition pruning with
_TABLE_SUFFIX. As described above, this is the single most impactful thing you can do. A query that scans 90 days of data instead of 365 days costs roughly a quarter as much. - Enable Data Studio's data freshness cache. In the data source settings, set a cache duration (typically 12 hours for daily dashboards). Data Studio caches query results and serves them from cache rather than re-querying BigQuery on every page load. This alone can reduce query costs by 80–90% for dashboards that don't need live data.
- Pre-aggregate in BigQuery, not in Data Studio. If your Data Studio report only ever shows data at the day and channel level, aggregate to that level in your SQL before it reaches Data Studio. Don't send Data Studio millions of event-level rows and ask it to aggregate them in the browser.
- Use SELECT * sparingly. BigQuery charges based on bytes scanned.
SELECT *reads every column in every partition in scope. Name only the columns you need. - Create BigQuery Views for shared queries. If multiple reports use the same underlying query, a view ensures they share one maintained SQL definition. It also makes it easy to add clustering or materialization later if performance becomes an issue.
Set a BigQuery budget alert before sharing dashboards widely. In Google Cloud Console, go to Billing → Budgets & Alerts and create an alert for your analytics project. A report that's opened 50 times a day by a team runs your query 50 times — with caching disabled, that adds up quickly. An alert at $20/month gives you early warning before costs become significant.
Joining BigQuery GA4 data with other sources
The most powerful capability the BigQuery connector unlocks — the one the native GA4 connector categorically cannot offer — is joining your analytics data to other datasets inside the same report.
Common joins that deliver real business value:
| Join type | What it enables | Join key |
|---|---|---|
| GA4 events + CRM orders | Match GA4 sessions to actual order records — reconcile GA4 conversion counts against backend data, identify sessions that converted without a GA4 event | transaction_id |
| GA4 sessions + ad spend | Calculate true ROAS and cost-per-acquisition using actual ad spend data alongside GA4 conversion values, broken down by campaign and channel | campaign / source / medium |
| GA4 events + product catalogue | Enrich item-level ecommerce events with product metadata — category, margin, stock status — that isn't available in GA4's event data | item_id |
| GA4 users + CRM segments | Overlay CRM customer segments (lifetime value tier, acquisition cohort) onto GA4 behavioural data to understand how different customer types engage with the site | user_id (requires user_id to be set in GA4) |
These joins happen in your BigQuery SQL, not in Data Studio's blend feature. Data Studio blending is limited and slow on large datasets — do the join in BigQuery where it's fast, cheap, and properly optimised, then surface the already-joined result to Data Studio.
BigQuery SQL — Joining GA4 conversions to CRM order data
WITH ga4_purchases AS (
SELECT
event_date,
user_pseudo_id,
traffic_source.source AS source,
traffic_source.medium AS medium,
MAX(CASE WHEN ep.key = 'transaction_id'
THEN ep.value.string_value END) AS transaction_id,
MAX(CASE WHEN ep.key = 'value'
THEN ep.value.double_value END) AS ga4_revenue
FROM
`your-project.analytics_XXXXXXXXX.events_*`,
UNNEST(event_params) AS ep
WHERE
_TABLE_SUFFIX BETWEEN @DS_START_DATE AND @DS_END_DATE
AND event_name = 'purchase'
GROUP BY
event_date, user_pseudo_id, source, medium
)
SELECT
p.event_date,
p.source,
p.medium,
p.transaction_id,
p.ga4_revenue,
o.actual_order_value,
o.order_status,
o.customer_type,
-- Flag discrepancies between GA4 and backend values
CASE
WHEN o.transaction_id IS NULL THEN 'GA4 only — not in CRM'
WHEN p.transaction_id IS NULL THEN 'CRM only — missed by GA4'
ELSE 'Matched'
END AS reconciliation_status
FROM
ga4_purchases p
FULL OUTER JOIN `your-project.crm_dataset.orders` o
USING (transaction_id)
ORDER BY
p.event_date DESC
The reconciliation_status column in this query is immediately useful as a Data Studio scorecard or table — it surfaces GA4 tracking gaps and over-counting in a way that no analytics-only tool can show you.
Data source schema tips for cleaner reports
Once your BigQuery query is connected, Data Studio creates a data source schema — a list of all the fields your query returns, with inferred types and default aggregation settings. Spending five minutes cleaning this up saves hours of confusion for anyone building reports on it.
- Rename fields to human-readable labels. Column names like
traffic_source_sourceandengagement_rate_pctare fine in SQL but confusing in a report builder. Rename them toChannel SourceandEngagement Rate (%)in the data source editor. - Set correct field types. Data Studio infers types from your SQL output. Check that date fields are typed as Date (not Text), that revenue fields are typed as Number with Currency formatting, and that percentage fields are typed as Percent. Incorrect types cause chart errors and formatting problems downstream.
- Set default aggregations. For metric fields, set whether the default aggregation is Sum, Average, or Count Distinct. A sessions field should default to Sum; an engagement rate field should default to Average. Report builders can override these, but sensible defaults prevent mistakes.
- Mark dimension vs. metric fields correctly. Data Studio distinguishes between dimensions (things you group by — country, device, channel) and metrics (things you measure — sessions, revenue, conversions). Make sure your schema reflects this accurately. A dimension set as a metric, or vice versa, breaks chart configurations.
When to switch from the native GA4 connector
Not every Data Studio report needs a BigQuery data source. The native GA4 connector is faster to set up, requires no SQL knowledge, and is perfectly adequate for many reporting needs. Here are the signals that it's time to make the switch:
Sampling warnings
If you regularly see the orange sampling indicator in GA4 reports, or if your Data Studio totals don't match GA4's UI for long date ranges, you're hitting the native connector's data threshold.
Missing dimensions
You need a dimension that GA4's pre-built schema doesn't expose — a custom event parameter, a calculated dimension, or anything that requires looking at the raw event data directly.
Cross-dataset reporting
You want to show GA4 data alongside ad spend, CRM data, or backend order data in the same report — something the native connector can only approximate through Data Studio's limited blending feature.
Custom metric definitions
The way GA4 calculates sessions, conversions, or engagement rate doesn't match your business definition. BigQuery lets you define these metrics from first principles using your own SQL logic.
If you're hitting any of these, the BigQuery connector is the right move. The setup investment — writing a solid flattening query, configuring the schema — typically takes a few hours. The payoff is a reporting layer that's faster, more accurate, and substantially more flexible than anything the native connector can produce.
Want a BigQuery-powered Data Studio setup built for your business?
We design and build BigQuery data pipelines and Data Studio dashboards that pull from your raw GA4 event data — unsampled, joined to your other data sources, and built to scale. Book a free 30-minute audit and we'll show you exactly what your current reporting setup is leaving on the table.