Blog

Basic SQL Recipes for Web Data in Snowplow

By
Snowplow Team
&
September 5, 2024
Share this post

Extracting actionable insights from Snowplow data often begins with mastering foundational SQL queries. In this post, we outline essential SQL recipes for analyzing web data, including page views, session counts, and conversion rates.

Q: How can I count page views by URL?

Tracking page views helps identify high-traffic content and assess user engagement. Use the following query to count page views by URL:

SELECT
  page_urlpath AS url,
  COUNT(*) AS page_views
FROM atomic.events
WHERE event_name = 'page_view'
GROUP BY page_urlpath
ORDER BY page_views DESC;

Q: How can I calculate average session duration?

Understanding session duration can reveal how long users engage with content. Calculate it using the following query:

WITH sessions AS (
  SELECT
    domain_sessionid,
    MIN(derived_tstamp) AS session_start,
    MAX(derived_tstamp) AS session_end
  FROM atomic.events
  GROUP BY domain_sessionid
)
SELECT
  domain_sessionid,
  (session_end - session_start) / 60000 AS session_duration_minutes
FROM sessions;

Q: How do I identify top referrers?

Referrer analysis helps track sources of incoming traffic. Extract referrer data with this query:

SELECT
  refr_urlhost AS referrer,
  COUNT(*) AS visits
FROM atomic.events
WHERE refr_urlhost IS NOT NULL
GROUP BY refr_urlhost
ORDER BY visits DESC;
Q: How can I calculate conversion rates?
To assess conversion rates, track key actions like sign-ups or purchases. Example query:
WITH conversions AS (
  SELECT
    user_id,
    COUNT(*) AS total_events,
    SUM(CASE WHEN event_name = 'conversion' THEN 1 ELSE 0 END) AS conversions
  FROM atomic.events
  GROUP BY user_id
)
SELECT
  user_id,
  (conversions::FLOAT / total_events) * 100 AS conversion_rate
FROM conversions;

Q: How do I track users across sessions?

Identifying repeat users across sessions enables deeper behavioral analysis. Use this query to list users and their session counts:

SELECT
  user_id,
  COUNT(DISTINCT domain_sessionid) AS session_count
FROM atomic.events
GROUP BY user_id
ORDER BY session_count DESC;

Final Thoughts

Snowplow’s event-level data model offers rich opportunities for web data analysis using SQL. By leveraging these foundational SQL recipes, data engineers and analysts can quickly extract insights and build more complex data models. Stay tuned for advanced SQL recipes and use cases in upcoming posts.

Subscribe to our newsletter

Get the latest content to your inbox monthly.

Get Started

Whether you’re modernizing your customer data infrastructure or building AI-powered applications, Snowplow helps eliminate engineering complexity so you can focus on delivering smarter customer experiences.