Google Analytics alternatives: How to Try Snowplow in an afternoon

With the sunsetting of Universal Analytics next year, many data teams are considering alternatives to GA; concerned about ongoing compliance issues as well as the complexities and limitations of migrating to GA4, they’re on the lookout for a data platform that’s secure, flexible, and future-proof.

They want a solution that will enable them to fuel significant growth whilst staying fully compliant with evolving privacy regulations.

As it just so happens, Snowplow fits the bill. 

Snowplow is a Behavioral Data Platform which generates, enhances and models incredibly rich, granular behavioral data. Our open core technology delivers high-quality, AI-ready behavioral data to your cloud data warehouse, lake, or stream, ready to power sophisticated data applications.

This includes data applications that you would generally associate with Google Analytics, including web analytics and ecommerce tracking.

If you’re curious to see how Snowplow compares with GA in delivering these data apps, you can simply Try Snowplow; it’s the free, easy-to-install version of our technology which allows you to generate and query Snowplow data. 

In this guide, we’ll show you how. We’ll walk through setting up Try Snowplow and implementing the web and ecommerce data apps, highlighting how our technology differs from GA along the way. (Note: this information is also available in our documentation, which includes tutorials on implementing other data apps, such as funnel analysis and content analytics.)

Installing Try Snowplow 

The first step is to sign up to Try Snowplow. Once signed up, you’ll be guided through the process of installing a minified version of Snowplow BDP technology which creates and processes events, as well as a Postgres database (where the data will be delivered). 

Each user that signs up is given their own pipeline and Postgres database (hosted by Snowplow), which expires after 14 days. The entire stack is deleted once this period has finished, along with all data collected and stored in Postgres. 

Tracking events with Try Snowplow 

Once you’ve signed up to Try Snowplow, you’ll have access to a limited version of the Snowplow console.

Using a code snippet from the console (under ‘implementation instructions’), you can instrument web tracking in your application or via Google Tag Manager.

Instrumenting the web tracker in your application

  • Copy the JavaScript code snippet from the Try Snowplow console.
  • Paste the tracking code into the page source <head> section of your application and deploy the changes.
  • Your pipeline should now capture events from your application.

Instrumenting the web tracker via Google Tag Manager

  • Copy the JavaScript code snippet from the Try Snowplow console.
  • Navigate to the Google Tag Manager account in which you wish to instrument tracking.
  • Create a new Custom HTML tag and paste the JavaScript snippet into the tag.
  • Set it to fire on ‘All Pages’ or a trigger of your choosing.
  • You can preview your tag to send some events into Try Snowplow before publishing it.
  • Your pipeline should now capture events from your application.

Debugging 

In the ‘Pipeline Status’ section of the Try Snowplow console, you can check the health of the application. If the first two lines are checked, your pipeline is ready to receive events. 

Accessing your data with Try Snowplow 

The events Try Snowplow creates will be stored in a Postgres database, which contains the standard Snowplow schemas: atomic (for raw data), bad_rows (for data that has failed pipeline validation) and derived (for modeled tables).

To access this data, you need to request a password from within the Snowplow console; please bear in mind, you can only do this once for security reasons. 

Querying your data 

Like Snowplow BDP, Try Snowplow encourages you to connect your BI or query tool of choice to access the database and query your data.

You can either copy a sample query from the console tutorial, check out the Recipes, or start exploring your data with your own queries (the queries for the web and ecommerce recipes are included below).

Web Analytics 

Snowplow Vs GA4 

Snowplow is an incredibly flexible tool, allowing you to gain an extremely detailed view of how users engage with your website/s. As opposed to GA4, our technology ensures that only high-quality, structured behavioral data reaches your data warehouse – ‘failed’ events do not make it through to the database, and no arbitrary opinions are expressed on the data via blackbox sampling. 

It’s also worth highlighting that GA4 has a daily export limit of 1 million events to BigQuery. In their words, “If your property consistently exceeds the export limit, the daily BigQuery export will be paused and previous days’ reports will not be reprocessed.” This represents a potentially significant blocker for teams looking to collect a high volume of events (to put this number into perspective, Netflix’s team has reported collecting 8 million events per second during peak hours). 

Implementing Web Analytics on Try Snowplow 

You already set up Snowplow’s out-of-the-box tracking for web analytics when you instrumented the JavaScript Tracker above. This includes the page_view and page_ping events.

The next step in implementing a web analytics use case on Try Snowplow is to aggregate this data into sessions. Whilst sessions don’t tell you everything, and don’t necessarily represent the whole customer journey, they’re generally a good starting point to investigate questions such as: 

  • How many sessions does each of your marketing channels generate?
  • What is the average time users spend engaging with your site in a given session? How does that compare to the average session length?
  • How many pages do users look at in a given session?

Updating the sessionization logic (optional)

The Snowplow JavaScript tracker automatically tracks a session identifier and a session index with all web events. Sessions are reset after 30 minutes of inactivity by default, but this can be changed in the tracker initialization by setting sessionCookieTimeout (in seconds):

window.snowplow("newTracker", "sp", ..., {
  appId: "try-snowplow-tracking",
  platform: "web",
  sessionCookieTimeout: 3600, // session timeout in seconds (here, 60 minutes)
  contexts: {
    webPage: true,
    performanceTiming: true
  }
});

Furthermore, you can manually reset a session, for example after a conversion, like so:

window.snowplow("newSession");

Go ahead and update the sessionization logic in your tracker implementation if you would like to. More information on the Snowplow session cookie can be found here.
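
To make the effect of the timeout concrete, here is a minimal sketch of inactivity-based sessionization. It is not the tracker’s internal implementation: the function name assignSessionIndexes and the input shape (event timestamps in milliseconds, sorted ascending) are assumptions made purely for illustration.

```javascript
// Hypothetical helper: illustrates inactivity-based sessionization only.
// It does NOT reproduce how the Snowplow tracker works internally.
function assignSessionIndexes(timestampsMs, timeoutSeconds = 1800) {
  const timeoutMs = timeoutSeconds * 1000;
  let sessionIndex = 0;
  let previous = null;
  return timestampsMs.map((ts) => {
    // A new session starts on the first event, or after a gap longer than the timeout.
    if (previous === null || ts - previous > timeoutMs) {
      sessionIndex += 1;
    }
    previous = ts;
    return sessionIndex;
  });
}

const oneMinuteMs = 60 * 1000;
const sampleTimestamps = [0, 5 * oneMinuteMs, 40 * oneMinuteMs, 45 * oneMinuteMs];

// Default 30-minute timeout: the 35-minute gap starts a second session.
console.log(assignSessionIndexes(sampleTimestamps));
// 60-minute timeout (sessionCookieTimeout: 3600): all four events share one session.
console.log(assignSessionIndexes(sampleTimestamps, 3600));
```

The same four events yield two sessions or one depending on the timeout, which is why it is worth choosing a value that matches how people actually use your site.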

Modeling sessions for web analytics

What does the model do?

For this recipe you’ll create a simple session table describing web engagement by running the following query in your query tool of choice. This is a very simplified version of the sessions table produced by our standard web data model. For each session, it will capture the session ID, session start and end times, marketing channel as well as engagement information: page views, link clicks and time engaged (in seconds).

First generate the table:

CREATE TABLE derived.sessions AS (
  WITH sessions AS (
    SELECT
      ev.domain_sessionid AS session_id,
      MIN(ev.derived_tstamp) AS session_start,
      MAX(ev.derived_tstamp) AS session_end,
      SUM(CASE WHEN ev.event_name = 'page_view' THEN 1 ELSE 0 END) AS page_views,
      SUM(CASE WHEN ev.event_name = 'link_click' THEN 1 ELSE 0 END) AS link_clicks,
      -- each page_ping is counted as 10 seconds of engagement (assumes a 10-second heartbeat)
      10 * SUM(CASE WHEN ev.event_name = 'page_ping' THEN 1 ELSE 0 END) AS time_engaged_in_s
    FROM atomic.events AS ev
    GROUP BY 1
  )
  SELECT
    -- session information
    s.session_id,
    s.session_start,
    s.session_end,
    -- marketing channel, derived from the first event of the session
    CASE
      WHEN ev.refr_medium IS NULL AND ev.page_url NOT ILIKE '%utm_%' THEN 'Direct'
      WHEN (ev.refr_medium = 'search' AND ev.mkt_medium IS NULL) OR (ev.refr_medium = 'search' AND ev.mkt_medium = 'organic') THEN 'Organic Search'
      WHEN ev.refr_medium = 'search' AND ev.mkt_medium SIMILAR TO '%(cpc|ppc|paidsearch)%' THEN 'Paid Search'
      WHEN ev.refr_medium = 'social' OR ev.mkt_medium SIMILAR TO '%(social|social-network|social-media|sm|social network|social media)%' THEN 'Social'
      WHEN ev.refr_medium = 'email' OR ev.mkt_medium ILIKE 'email' THEN 'Email'
      WHEN ev.mkt_medium SIMILAR TO '%(display|cpm|banner)%' THEN 'Display'
      ELSE 'Other'
    END AS marketing_channel,
    -- activity
    s.page_views,
    s.link_clicks,
    s.time_engaged_in_s
  FROM atomic.events AS ev
  INNER JOIN sessions AS s
    ON ev.domain_sessionid = s.session_id AND ev.derived_tstamp = s.session_start
  GROUP BY 1,2,3,4,5,6,7
);

And then view it:

SELECT * FROM derived.sessions;
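
The marketing channel rules in the CASE expression above can be easier to reason about as a plain function. The sketch below mirrors those rules in JavaScript so you can check how a given referrer and UTM medium would be classified. The function name classifyChannel and its argument names are hypothetical, and the 'utm_' check treats the underscore literally, whereas in SQL LIKE patterns it matches any single character.

```javascript
// Hypothetical mirror of the SQL CASE expression, for illustration only.
// The rules are evaluated in the same order as in the query.
function classifyChannel({ refrMedium = null, mktMedium = null, pageUrl = "" } = {}) {
  // True when the marketing medium is set and contains a match for the pattern.
  const mediumContains = (pattern) => mktMedium !== null && pattern.test(mktMedium);

  if (refrMedium === null && typeof pageUrl === "string" && !pageUrl.toLowerCase().includes("utm_")) {
    return "Direct";
  }
  if (refrMedium === "search" && (mktMedium === null || mktMedium === "organic")) {
    return "Organic Search";
  }
  if (refrMedium === "search" && mediumContains(/cpc|ppc|paidsearch/)) {
    return "Paid Search";
  }
  // The SQL alternatives (social-network, social media, ...) all contain "social" or "sm".
  if (refrMedium === "social" || mediumContains(/social|sm/)) {
    return "Social";
  }
  if (refrMedium === "email" || (mktMedium !== null && mktMedium.toLowerCase() === "email")) {
    return "Email";
  }
  if (mediumContains(/display|cpm|banner/)) {
    return "Display";
  }
  return "Other";
}

console.log(classifyChannel({ pageUrl: "https://example.com/" }));
console.log(classifyChannel({ refrMedium: "search", mktMedium: "cpc" }));
console.log(classifyChannel({ mktMedium: "banner", pageUrl: "https://example.com/?utm_medium=banner" }));
```

Because the rules are checked in order, a session is labeled by the first rule it satisfies; reordering the conditions would change how overlapping cases are classified.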

Other queries you might want to run:

Sessions by marketing channel:

SELECT
  marketing_channel,
  COUNT(DISTINCT session_id) AS sessions
FROM derived.sessions
GROUP BY 1 ORDER BY 2 DESC;

Average number of page views and time engaged in seconds per session:

SELECT
  AVG(page_views) AS avg_page_views,
  AVG(time_engaged_in_s) AS avg_time_engaged_in_s
FROM derived.sessions;

Web Analytics with Try Snowplow – Breakdown 

By this point, we hope you have a good idea of how Snowplow can be used to capture accurate web analytics. Following the above:

  • You have captured a session identifier with all web events, and customized the sessionization logic to match your requirements.
  • You have run a simple SQL query to model the Snowplow data collected from your website into sessions. Based on the sessions table, you can easily see how users are engaging with your site.

Ecommerce analytics 

Snowplow vs GA4 

The same advantages of Snowplow for web analytics apply to e-commerce analytics, when comparing it to GA4. The behavioral data that Snowplow creates is richer, more reliable, and can be used to power advanced AI and ML use cases. With Snowplow, you can define precisely the behavioral data you need to show how and why customers are purchasing (or not purchasing) products from your online store. 

Implementing Ecommerce Analytics on Try Snowplow 

You already set up the out-of-the-box tracking needed for ecommerce analytics when you instrumented the JavaScript Tracker. 

To understand how people are engaging with your products, however, you’ll need to make a couple of additions to the tracking. This includes: 

  • Adding a product entity to your events, to describe which product a user is engaging with.
  • Extending tracking to include cart actions and purchases; for this purpose, we’ve created a couple of custom events for you to instrument. 

Designing and implementing the product entity

Designing the product entity

We have already created a custom product entity for you, and uploaded its data structure to your Iglu server.

Snowplow uses self-describing JSON schemas to structure events and entities so that they can be validated in the pipeline and loaded into tidy tables in the warehouse. You can learn more about these data structures here, and about why we take this approach here.

While Try Snowplow only ships with a predesigned set of custom events and entities required for the recipes, Snowplow BDP lets you create an unlimited number of your own via the Data Structures UI (and API).

The product entity has the following fields:

| Field | Description | Type | Validation | Required? |
|---|---|---|---|---|
| name | The name of the piece of content | string | maxLength: 255 | ✅ |
| price | The current price of the product | number | minimum: 0, maximum: 100000, multipleOf: 0.01 | |
| quantity | The number of this product (used in basket events) | integer | minimum: 0, maximum: 100000 | |
| category | The category of the product | string | maxLength: 255 | |
| sku | The SKU for the product | string | maxLength: 255 | |
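
Events whose entities break these rules fail validation in the pipeline, so it can help to check product data before sending it. The following is a minimal client-side sketch of the constraints listed above, not the pipeline’s actual validator. The function name validateProduct is hypothetical, and treating every field except name as optional is an assumption based on the Required? column.

```javascript
// Hypothetical pre-check for the product entity; the real validation is done
// in the pipeline against the JSON schema stored in Iglu.
function validateProduct(data) {
  const errors = [];
  const isShortString = (v) => typeof v === "string" && v.length <= 255;
  const inRange = (v) => typeof v === "number" && v >= 0 && v <= 100000;

  // name is the only required field.
  if (!isShortString(data.name)) {
    errors.push("name is required and must be a string of at most 255 characters");
  }
  // category and sku are assumed optional, but must be short strings when present.
  for (const field of ["category", "sku"]) {
    if (data[field] !== undefined && !isShortString(data[field])) {
      errors.push(`${field} must be a string of at most 255 characters`);
    }
  }
  if (data.price !== undefined) {
    // multipleOf: 0.01 is checked in cents, with a small tolerance for floating point error.
    const cents = data.price * 100;
    if (!inRange(data.price) || Math.abs(cents - Math.round(cents)) > 1e-6) {
      errors.push("price must be between 0 and 100000 with at most two decimal places");
    }
  }
  if (data.quantity !== undefined && !(Number.isInteger(data.quantity) && inRange(data.quantity))) {
    errors.push("quantity must be an integer between 0 and 100000");
  }
  return errors;
}

console.log(validateProduct({ name: "example_name", price: 19.99, quantity: 1 }));
console.log(validateProduct({ price: 0.005, quantity: 1.5 }));
```

An empty array means the data satisfies the constraints in the table; anything else lists what would be rejected.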

Implementing the product entity

In the JavaScript Tracker

Add the product entity to your page_view and page_ping events by editing your trackPageView events to include the entity. Specifically, you’ll update

window.snowplow('trackPageView');

to

window.snowplow('trackPageView', {
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name",
      "quantity": 1,
      "price": 100,
      "category": "example_category",
      "sku": "example_sku"
    }
  }]
});

Via Google Tag Manager

If you are using Google Tag Manager, you can add the variables like this:

window.snowplow('trackPageView', {
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "{{example_name_variable}}",
      "quantity": {{example_quantity_variable}},
      "price": {{example_price_variable}},
      "category": "{{example_category_variable}}",
      "sku": "{{example_sku_variable}}"
    }
  }]
});

Designing and implementing the cart_action event

Designing the cart_action event

The cart_action event records actions that the user performs on their cart. In this simplified version you’ll be recording a single property that describes whether an item was added or removed.

| Field | Description | Type | Validation | Required? |
|---|---|---|---|---|
| type | The type of action taken by the user | string | enum: ["add", "remove"] | ✅ |

Implementing the cart_action event

When you trigger the cart_action event, you’ll also want to attach the product entity that we designed earlier to describe which product is being changed in the cart.

Instrument the cart_action event when items are added to or removed from the cart on your website.

window.snowplow('trackSelfDescribingEvent', {
  "event": {
    "schema": "iglu:com.trysnowplow/cart_action/jsonschema/1-0-0",
    "data": {
      "type": "add" // or "remove"
    }
  },
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name",
      "quantity": 1,
      "price": 100,
      "category": "example_category",
      "sku": "example_sku"
    }
  }]
});
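
If you fire cart_action from several places (for example, add-to-cart buttons and a cart page), one way to keep the payloads consistent is a small builder function. This is a sketch under assumptions: buildCartAction is a hypothetical helper, not part of the Snowplow tracker, and it only assembles the same object shape shown above.

```javascript
const CART_ACTION_SCHEMA = "iglu:com.trysnowplow/cart_action/jsonschema/1-0-0";
const PRODUCT_SCHEMA = "iglu:com.trysnowplow/product/jsonschema/1-0-0";

// Hypothetical helper: builds the argument for trackSelfDescribingEvent.
function buildCartAction(type, product) {
  // The cart_action schema only allows these two values.
  if (type !== "add" && type !== "remove") {
    throw new Error(`Unsupported cart action type: ${type}`);
  }
  return {
    event: { schema: CART_ACTION_SCHEMA, data: { type } },
    context: [{ schema: PRODUCT_SCHEMA, data: { ...product } }],
  };
}

const cartPayload = buildCartAction("remove", {
  name: "example_name",
  quantity: 1,
  price: 100,
  category: "example_category",
  sku: "example_sku",
});
console.log(JSON.stringify(cartPayload, null, 2));

// In the browser, the payload would be handed to the tracker:
// window.snowplow("trackSelfDescribingEvent", cartPayload);
```

Rejecting unknown action types in the browser surfaces mistakes during development rather than as failed events in the bad_rows schema.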

Designing and implementing the purchase event

Designing the purchase event

The purchase event is a simple event that should be triggered when a purchase is made.

The event itself has no properties, but should be sent with one or more product entities that describe which products were purchased.

Implementing the purchase event

When you trigger the purchase event, you’ll want to attach one or more product entities to describe what has been purchased.

Instrument the purchase event when a purchase is made in your store.

Example for a single product purchase

window.snowplow('trackSelfDescribingEvent', {
  "event": {
    "schema": "iglu:com.trysnowplow/purchase/jsonschema/1-0-0",
    "data": {}
  },
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name",
      "quantity": 1,
      "price": 100,
      "category": "example_category",
      "sku": "example_sku"
    }
  }]
});

Example for a multi-product purchase

window.snowplow('trackSelfDescribingEvent', {
  "event": {
    "schema": "iglu:com.trysnowplow/purchase/jsonschema/1-0-0",
    "data": {}
  },
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name",
      "quantity": 1,
      "price": 100,
      "category": "example_category",
      "sku": "example_sku"
    }
  },{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name_2",
      "quantity": 1,
      "price": 50,
      "category": "example_category_2",
      "sku": "example_sku_2"
    }
  }]
});
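
In practice the purchased products usually come from a cart array rather than being written out by hand. The sketch below maps such an array onto one product entity per item. The helper name buildPurchase and the shape of the cart items are assumptions for illustration; only the schema URIs are taken from the examples above.

```javascript
const PURCHASE_SCHEMA = "iglu:com.trysnowplow/purchase/jsonschema/1-0-0";
const PRODUCT_ENTITY_SCHEMA = "iglu:com.trysnowplow/product/jsonschema/1-0-0";

// Hypothetical helper: turns a list of cart items into a purchase event payload.
function buildPurchase(cartItems) {
  if (!Array.isArray(cartItems) || cartItems.length === 0) {
    throw new Error("A purchase needs at least one product");
  }
  return {
    // The purchase event itself carries no properties.
    event: { schema: PURCHASE_SCHEMA, data: {} },
    // One product entity per purchased item.
    context: cartItems.map((item) => ({
      schema: PRODUCT_ENTITY_SCHEMA,
      data: { ...item },
    })),
  };
}

const purchasePayload = buildPurchase([
  { name: "example_name", quantity: 1, price: 100, category: "example_category", sku: "example_sku" },
  { name: "example_name_2", quantity: 1, price: 50, category: "example_category_2", sku: "example_sku_2" },
]);
console.log(purchasePayload.context.length);

// In the browser, the payload would be handed to the tracker:
// window.snowplow("trackSelfDescribingEvent", purchasePayload);
```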

Modeling the data you’ve collected

What does the model do?

The tracking above captures events about the user’s product purchasing journey, and attaches the context of which product was engaged with to all events you are firing. You can now start to get a better understanding of how your products are performing.

For this recipe you’ll create a simple table describing product engagement. Specifically, for each product you’ll aggregate the number of product views, add to carts, remove from carts and purchases, as well as the revenue earned.

Once you have collected some data with your new tracking you can run the following two queries in your tool of choice.

First generate the table:

CREATE TABLE derived.products AS (
  SELECT
    p.category AS product_category,
    p.name AS product_name,
    p.sku AS product_sku,
    p.price AS product_price,
    SUM(CASE WHEN ev.event_name = 'page_view' THEN 1 ELSE 0 END) AS product_views,
    SUM(CASE WHEN ev.event_name = 'cart_action' AND ca.type = 'add' THEN p.quantity ELSE 0 END) AS add_to_carts,
    SUM(CASE WHEN ev.event_name = 'cart_action' AND ca.type = 'remove' THEN p.quantity ELSE 0 END) AS remove_from_carts,
    SUM(CASE WHEN ev.event_name = 'purchase' THEN p.quantity ELSE 0 END) AS purchases,
    SUM(CASE WHEN ev.event_name = 'purchase' THEN 1 ELSE 0 END * p.quantity * p.price) AS revenue
  FROM atomic.events AS ev
  INNER JOIN atomic.com_trysnowplow_product_1 AS p
    ON ev.event_id = p.root_id AND ev.collector_tstamp = p.root_tstamp
  LEFT JOIN atomic.com_trysnowplow_cart_action_1 AS ca
    USING(root_id,root_tstamp)
  WHERE ev.event_name IN ('page_view', 'cart_action', 'purchase')
  GROUP BY 1,2,3,4
);

And then view it:

SELECT * FROM derived.products;
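
To see what the aggregation in derived.products computes, it can help to run the same logic over a handful of in-memory rows. The sketch below is an illustrative JavaScript equivalent, not part of Snowplow: summarizeProducts and the row shape (one row per event and attached product entity) are hypothetical.

```javascript
// Hypothetical in-memory version of the derived.products aggregation.
// Each row stands for one event joined to one attached product entity.
function summarizeProducts(rows) {
  const summary = new Map();
  for (const { eventName, cartActionType = null, product } of rows) {
    // Same filter as the WHERE clause in the query.
    if (!["page_view", "cart_action", "purchase"].includes(eventName)) continue;
    // Same grouping columns as the query: category, name, SKU and price.
    const key = JSON.stringify([product.category, product.name, product.sku, product.price]);
    if (!summary.has(key)) {
      summary.set(key, {
        productCategory: product.category,
        productName: product.name,
        productSku: product.sku,
        productPrice: product.price,
        productViews: 0,
        addToCarts: 0,
        removeFromCarts: 0,
        purchases: 0,
        revenue: 0,
      });
    }
    const entry = summary.get(key);
    if (eventName === "page_view") entry.productViews += 1;
    if (eventName === "cart_action" && cartActionType === "add") entry.addToCarts += product.quantity;
    if (eventName === "cart_action" && cartActionType === "remove") entry.removeFromCarts += product.quantity;
    if (eventName === "purchase") {
      entry.purchases += product.quantity;
      entry.revenue += product.quantity * product.price;
    }
  }
  return [...summary.values()];
}

const exampleProduct = { name: "example_name", category: "example_category", sku: "example_sku", price: 100 };
const exampleRows = [
  { eventName: "page_view", product: { ...exampleProduct, quantity: 1 } },
  { eventName: "page_view", product: { ...exampleProduct, quantity: 1 } },
  { eventName: "cart_action", cartActionType: "add", product: { ...exampleProduct, quantity: 2 } },
  { eventName: "purchase", product: { ...exampleProduct, quantity: 2 } },
];
console.log(summarizeProducts(exampleRows));
```

Note that price is one of the grouping columns, so a product whose price changes appears as a separate row for each price, both here and in the SQL table.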

Ecommerce analytics with Try Snowplow – Breakdown  

  • You have captured granular data around how your users are engaging with your products throughout their purchasing journeys.
  • You have modeled this data into a product engagement table that surfaces the user engagement per product.

What you might want to do next

Understanding how your users are engaging with your products is the first step in optimizing your e-commerce store. Next, you might want to:

  • Extend this table to include returns by joining this data with data from your transactional databases, so you get a more accurate picture of how products are actually performing.
  • Extend this table to include where these products are being promoted on your site to understand how visual merchandising affects performance.
  • Join this data with your inventory data to get a 360 view of e-commerce strategy.
  • Start mapping the relationships between products based on user behavior, working towards compelling product recommendations.
  • Pivot this data to look at users instead: understand which marketing channels customers come from, and their customer lifetime value.

Try Snowplow: What now? 

We hope this guide has helped you implement web and ecommerce analytics in Try Snowplow. By following these steps, you’ll have a glimpse into what a post-Universal Analytics world could look like outside of the Google ecosystem. With Snowplow delivering clean, granular event-level data to your warehouse, you could power increasingly complex data applications to propel your organization forward. 

If you have any questions on how Try Snowplow works or how it compares to GA4, feel free to get in contact with us here.

More about the author

Will Stolton

Solutions Marketing Manager at Snowplow
