SQL Tips: User Segmentation and Analysis w BigQuery

Using GA4 data for user segmentation is crucial to understanding the behaviors of different groups and developing tailored strategies. In this article, I provide an example SQL query for user segmentation and a step-by-step explanation.

Query: Session Behavior by City

The following query segments users by their city (e.g., Istanbul, Ankara) and returns total session counts, average session duration, and revenue (conversions) data.

-- City-based segmentation query for GA4 BigQuery dataset
WITH city_segments AS (
SELECT
geo.city AS city_name,
user_pseudo_id,
event_name,
event_timestamp,
concat(user_pseudo_id,(select value.int_value from unnest(event_params) where key = 'ga_session_id')) as session_id, (max(event_timestamp)-min(event_timestamp))/1000000 as session_length_in_seconds
,
ecommerce.purchase_revenue AS revenue
FROM
your_project_id.analytics_XXXXXX.events_*//Dataset
WHERE
_TABLE_SUFFIX BETWEEN '20240101' AND '20240131'//Date Range
)

SELECT
city_name,
COUNT(DISTINCT user_pseudo_id) AS unique_users,
COUNTIF(event_name = 'session_start') AS total_sessions,
sum(session_length_in_seconds) / count(distinct session_id) as average_session_duration_seconds
SUM(revenue) AS total_revenue
FROM
city_segments
GROUP BY
city_name
ORDER BY
total_sessions DESC;

Step-by-Step Explanation

  1. Data Source: The query pulls data from the events_* tables in the GA4 BigQuery dataset. The date range is defined using the _TABLE_SUFFIX filter (January 2024 data).
  2. City Segmentation: The user’s city is retrieved using the geo.city field.
  3. Average Session Duration: For each session, the query calculates its duration in seconds by finding the difference between the maximum (max(event_timestamp)) and minimum (min(event_timestamp)) event timestamps within the session.The timestamps in BigQuery are stored in microseconds, so the result is divided by 1,000,000 to convert it to seconds.
  4. Grouping and Summarizing:
    For each city, the query calculates total sessions, unique users, average session duration (in seconds), and total revenue. Results are ordered by total session counts.

Interpreting Results

  • Which Cities Are More Active? Use total session counts and unique user data to identify the most active cities.
  • Revenue-Generating Cities: Highlight the cities contributing the most revenue.
  • Engagement Times: Compare average session durations across cities to understand behavioral differences.

This query provides a foundation for both beginner and advanced analyses. Apply it to your GA4 projects to start interpreting the results. Stay tuned for more segmentation ideas!

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top