Google Analytics migration: step up to advanced analytics with Snowplow’s free trial
With the sunsetting of Universal Analytics next month, many data teams are considering migrating from Universal Analytics to an alternative tool.
If you’re concerned about ongoing GA compliance issues, as well as the complexities and limitations of migrating to GA4, here’s an alternative data platform that’s secure, flexible, and future-proof. You can try Snowplow for free, or check out a sample data set to see how many more fields you get out of the box with Snowplow, not to mention the easy-to-document customizations you can build on top.
Installing Try Snowplow
The first step is to sign up to Try Snowplow. Once signed up, you’ll be guided through the process of installing a minified version of Snowplow BDP technology that creates and processes events, as well as a Postgres database (where the data will be delivered).
Each user that signs up is given their own pipeline and Postgres database (hosted by Snowplow), which expires after 14 days. The entire stack is deleted once this period has finished, along with all data collected and stored in Postgres.
Tracking events with Try Snowplow
Once you’ve signed up to Try Snowplow, you’ll have access to a limited version of the Snowplow console.
Using a code snippet from the console (under ‘implementation instructions’), you can instrument web tracking in your application or via Google Tag Manager.
Instrumenting the web tracker in your application
- Copy the JavaScript code snippet from the Try Snowplow console.
- Paste the tracking code into the page source <head> section of your application and deploy the changes.
- Your pipeline should now capture events from your application.
Instrumenting the web tracker via Google Tag Manager
- Copy the JavaScript code snippet from the Try Snowplow console.
- Navigate to the Google Tag Manager account in which you want to instrument tracking.
- Create a new Custom HTML tag and paste the JavaScript snippet into the tag.
- Set it to fire on ‘All Pages’ or a trigger of your choosing.
- You can preview your tag to send some events into Try Snowplow before publishing it.
- Your pipeline should now capture events from your application.
Debugging
In the ‘Pipeline Status’ section of the Try Snowplow console, you can check the health of the application. If the first two lines are checked, your pipeline is ready to receive events.
Accessing your data with Try Snowplow
The events Try Snowplow creates will be stored in a Postgres database, which contains the standard Snowplow schemas: atomic (for raw data), bad_rows (for data that has failed pipeline validation) and derived (for modeled tables).
To access this data, you need to request a password from within the Snowplow console; please bear in mind that you can only do this once, for security reasons.
Querying your data
Like Snowplow BDP, Try Snowplow encourages you to connect your BI or query tool of choice to access the database and query your data.
You can either copy a sample query from the console tutorial, check out the Recipes or start exploring your data with your own queries (below, you’ll find the queries
Web Analytics
Snowplow vs GA4
Snowplow is an incredibly flexible tool, allowing you to gain an extremely detailed view of how users engage with your websites. Unlike GA4, our technology ensures that only high-quality, structured behavioral data reaches your data warehouse: ‘failed’ events do not make it through to the database, and no arbitrary opinions are imposed on the data via black-box sampling.
It’s also worth highlighting that GA4 has a daily export limit of 1 million events to BigQuery. In their words, “If your property consistently exceeds the export limit, the daily BigQuery export will be paused and previous days’ reports will not be reprocessed.” This represents a potentially significant blocker for teams looking to collect a high volume of events (to put this number into perspective, Netflix’s team has reported collecting 8 million events per second during peak hours).
Implementing Web Analytics on Try Snowplow
You have already set up Snowplow’s out-of-the-box tracking for web analytics above when you instrumented the JavaScript Tracker. This includes page_view and page_ping events.
The next step in implementing a web analytics use case on Try Snowplow is to aggregate this data into sessions. Whilst sessions don’t tell you everything, and don’t necessarily represent the whole customer journey, they’re generally a good starting point to investigate questions such as:
- How many sessions does each of your marketing channels generate?
- What is the average time users spend engaging with your site in a given session? How does that compare to the average session length?
- How many pages do users look at in a given session?
Updating the sessionization logic (optional)
The Snowplow JavaScript tracker automatically tracks a session identifier and a session index with all web events. Sessions are reset after 30 minutes of inactivity by default, but this can be changed in the tracker initialization by setting sessionCookieTimeout (in seconds):
window.snowplow("newTracker", "sp", ..., {
  appId: "try-snowplow-tracking",
  platform: "web",
  sessionCookieTimeout: 3600, // 1 hour, in seconds
  contexts: {
    webPage: true,
    performanceTiming: true
  }
});
Furthermore, you can manually reset a session, for example after a conversion, like so:
window.snowplow("newSession");
Go ahead and update the sessionization logic in your tracker implementation if you would like to. More information on the Snowplow session cookie can be found here.
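To make the inactivity rule concrete, here is a minimal sketch of how a timeout splits a stream of event times into sessions. It is an illustration only, not the tracker's source code: the function name assignSessionIndexes, its parameters, and the exact boundary behaviour (a gap strictly greater than the timeout starts a new session) are assumptions made for this example.

```javascript
// Illustrative only: NOT the tracker's implementation, just the inactivity
// rule described above applied to a list of event times in milliseconds.
// `assignSessionIndexes` is a hypothetical name.
function assignSessionIndexes(eventTimesMs, sessionCookieTimeoutSeconds = 1800) {
  const timeoutMs = sessionCookieTimeoutSeconds * 1000;
  const sorted = [...eventTimesMs].sort((a, b) => a - b);
  let sessionIndex = 0;
  let previous = null;
  return sorted.map((time) => {
    // A new session starts on the first event, or when the gap since the
    // previous event is longer than the timeout (assumed boundary rule).
    if (previous === null || time - previous > timeoutMs) {
      sessionIndex += 1;
    }
    previous = time;
    return { time, sessionIndex };
  });
}

const minutes = (n) => n * 60 * 1000;
const eventTimes = [0, minutes(5), minutes(40), minutes(45)];

// Default 30-minute timeout: the 35-minute gap starts a second session.
console.log(assignSessionIndexes(eventTimes).map((e) => e.sessionIndex)); // [ 1, 1, 2, 2 ]

// 60-minute timeout (sessionCookieTimeout: 3600): everything is one session.
console.log(assignSessionIndexes(eventTimes, 3600).map((e) => e.sessionIndex)); // [ 1, 1, 1, 1 ]
```

A longer timeout merges visits that a shorter one would split, which in turn changes session counts and per-session averages in the model that follows.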
Modeling sessions for web analytics
What does the model do?
For this recipe you’ll create a simple session table describing web engagement by running the following query in your query tool of choice. This is a very simplified version of the sessions table produced by our standard web data model. For each session, it will capture the session ID, session start and end times, marketing channel as well as engagement information: page views, link clicks and time engaged (in seconds).
First generate the table:
CREATE TABLE derived.sessions AS (
  WITH sessions AS (
    SELECT
      ev.domain_sessionid AS session_id,
      MIN(ev.derived_tstamp) AS session_start,
      MAX(ev.derived_tstamp) AS session_end,
      SUM(CASE WHEN ev.event_name = 'page_view' THEN 1 ELSE 0 END) AS page_views,
      SUM(CASE WHEN ev.event_name = 'link_click' THEN 1 ELSE 0 END) AS link_clicks,
      10*SUM(CASE WHEN ev.event_name = 'page_ping' THEN 1 ELSE 0 END) AS time_engaged_in_s
    FROM atomic.events AS ev
    GROUP BY 1
  )
  SELECT
    -- session information
    s.session_id,
    s.session_start,
    s.session_end,
    -- marketing channel
    CASE
      WHEN ev.refr_medium IS NULL AND ev.page_url NOT ILIKE '%utm_%' THEN 'Direct'
      WHEN (ev.refr_medium = 'search' AND ev.mkt_medium IS NULL) OR (ev.refr_medium = 'search' AND ev.mkt_medium = 'organic') THEN 'Organic Search'
      WHEN ev.refr_medium = 'search' AND ev.mkt_medium SIMILAR TO '%(cpc|ppc|paidsearch)%' THEN 'Paid Search'
      WHEN ev.refr_medium = 'social' OR ev.mkt_medium SIMILAR TO '%(social|social-network|social-media|sm|social network|social media)%' THEN 'Social'
      WHEN ev.refr_medium = 'email' OR ev.mkt_medium ILIKE 'email' THEN 'Email'
      WHEN ev.mkt_medium SIMILAR TO '%(display|cpm|banner)%' THEN 'Display'
      ELSE 'Other'
    END AS marketing_channel,
    -- activity
    s.page_views,
    s.link_clicks,
    s.time_engaged_in_s
  FROM atomic.events AS ev
  INNER JOIN sessions AS s
    ON ev.domain_sessionid = s.session_id AND ev.derived_tstamp = s.session_start
  GROUP BY 1,2,3,4,5,6,7
);
And then view it:
SELECT * FROM derived.sessions;
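The marketing channel rules in the query can be hard to read inside a CASE expression, so the sketch below restates them as a small JavaScript function that you can test against sample values. The function name classifyChannel is hypothetical, and the translation is approximate: for example, the underscore in the SQL pattern '%utm_%' matches any character, whereas this sketch looks for a literal "utm_".

```javascript
// Hypothetical helper that restates the CASE expression from the sessions query.
// Inputs use the same names as the atomic.events columns; null stands for SQL NULL.
function classifyChannel({ refr_medium = null, mkt_medium = null, page_url = null }) {
  // SIMILAR TO '%(a|b)%' is a case-sensitive "contains any of" match.
  const mktContains = (pattern) => mkt_medium !== null && pattern.test(mkt_medium);

  // Approximation: SQL LIKE treats "_" as a single-character wildcard,
  // while this check looks for a literal "utm_".
  if (refr_medium === null && page_url !== null && !/utm_/i.test(page_url)) {
    return "Direct";
  }
  if (refr_medium === "search" && (mkt_medium === null || mkt_medium === "organic")) {
    return "Organic Search";
  }
  if (refr_medium === "search" && mktContains(/cpc|ppc|paidsearch/)) {
    return "Paid Search";
  }
  if (
    refr_medium === "social" ||
    mktContains(/social|social-network|social-media|sm|social network|social media/)
  ) {
    return "Social";
  }
  // ILIKE 'email' is a case-insensitive exact match.
  if (refr_medium === "email" || mktContains(/^email$/i)) {
    return "Email";
  }
  if (mktContains(/display|cpm|banner/)) {
    return "Display";
  }
  return "Other";
}

console.log(classifyChannel({ page_url: "https://example.com/" })); // Direct
console.log(
  classifyChannel({
    refr_medium: "search",
    mkt_medium: "cpc",
    page_url: "https://example.com/?utm_medium=cpc",
  })
); // Paid Search
```

As in the query, the order of the rules matters: the first matching rule wins, so a paid search visit is never counted as social even if its medium would also match a later rule.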
Other queries you might want to run:
Sessions by marketing channel:
SELECT
  marketing_channel,
  COUNT(DISTINCT session_id) AS sessions
FROM derived.sessions
GROUP BY 1 ORDER BY 2 DESC;
Average number of page views and time engaged in seconds per session:
SELECT
  AVG(page_views) AS avg_page_views,
  AVG(time_engaged_in_s) AS avg_time_engaged_in_s
FROM derived.sessions;
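The difference between time engaged and session length is easier to see on a tiny example. The sketch below repeats the per-session aggregation from the sessions model on a handful of in-memory events. The function name summarizeSessions and the sample events are made up for illustration, and the 10-second heartbeat is an assumption inferred from the multiplier of 10 in the query; if your activity tracking uses a different heartbeat, the multiplier should change with it.

```javascript
// Assumption: one page_ping is sent per 10 seconds of engagement, which is
// what the `10 *` multiplier in the sessions query implies.
const HEARTBEAT_SECONDS = 10;

// Hypothetical helper: aggregates events into one row per domain_sessionid.
// Timestamps are ISO 8601 strings in the same format, so they compare as text.
function summarizeSessions(events) {
  const sessions = new Map();
  for (const ev of events) {
    const s = sessions.get(ev.domain_sessionid) ?? {
      session_id: ev.domain_sessionid,
      session_start: ev.derived_tstamp,
      session_end: ev.derived_tstamp,
      page_views: 0,
      link_clicks: 0,
      time_engaged_in_s: 0,
    };
    if (ev.derived_tstamp < s.session_start) s.session_start = ev.derived_tstamp;
    if (ev.derived_tstamp > s.session_end) s.session_end = ev.derived_tstamp;
    if (ev.event_name === "page_view") s.page_views += 1;
    if (ev.event_name === "link_click") s.link_clicks += 1;
    if (ev.event_name === "page_ping") s.time_engaged_in_s += HEARTBEAT_SECONDS;
    sessions.set(ev.domain_sessionid, s);
  }
  return [...sessions.values()];
}

const sampleEvents = [
  { domain_sessionid: "s1", event_name: "page_view", derived_tstamp: "2023-06-01T10:00:00Z" },
  { domain_sessionid: "s1", event_name: "page_ping", derived_tstamp: "2023-06-01T10:00:10Z" },
  { domain_sessionid: "s1", event_name: "page_ping", derived_tstamp: "2023-06-01T10:00:20Z" },
  { domain_sessionid: "s1", event_name: "link_click", derived_tstamp: "2023-06-01T10:05:00Z" },
  { domain_sessionid: "s2", event_name: "page_view", derived_tstamp: "2023-06-01T11:00:00Z" },
];

console.log(summarizeSessions(sampleEvents));
```

In this sample, session s1 spans five minutes from its first to its last event but records only 20 seconds of engagement, because only two page pings were received.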
Web Analytics with Try Snowplow – Breakdown
By this point, we hope you have a good idea of how Snowplow can be used to capture accurate web analytics. Following the above:
- You have captured a session identifier with all web events, and customized the sessionization logic to match your requirements.
- You have run a simple SQL query to model the Snowplow data collected from your website into sessions. Based on the sessions table, you can easily see how users are engaging with your site.
Ecommerce analytics
Snowplow vs GA4
The same advantages that Snowplow has over GA4 for web analytics also apply to e-commerce analytics. The behavioral data that Snowplow creates is richer, more reliable, and can be used to power advanced AI and ML use cases. With Snowplow, you can define precisely the behavioral data you need to show how and why customers are purchasing (or not purchasing) products from your online store.
Implementing Ecommerce Analytics on Try Snowplow
You have already set up the out-of-the-box trackers needed for ecommerce analytics when instrumenting the JavaScript Tracker.
To understand how people are engaging with your products, however, you’ll need to make a couple of additions to the tracking. This includes:
- Attaching a product entity to all of your product-related events. This ensures that you’re able to tie all product-related events to a specific product, rather than just pages. (Learn more about Snowplow’s approach to events and entities here)
- Extending tracking to include cart actions and purchases; for this purpose, we’ve created a couple of custom events for you to instrument.
Designing and implementing the product entity
Designing the product entity
We have already created a custom product entity for you, and uploaded its data structure to your Iglu server.
Snowplow uses self-describing JSON schemas to structure events and entities so that they can be validated in the pipeline and loaded into tidy tables in the warehouse. You can learn more about these data structures here, and about why we take this approach here.
While Try Snowplow only ships with a predesigned set of custom events and entities required for the recipes, Snowplow BDP lets you create an unlimited number of your own via the Data Structures UI (and API).
The product entity has the following fields:
Field | Description | Type | Validation | Required? |
name | The name of the piece of content | string | maxLength: 255 | ✅ |
price | The current price of the product | number | minimum: 0, maximum: 100000, multipleOf: 0.01 | ✅ |
quantity | The number of this product (used in basket events) | integer | minimum: 0, maximum: 100000 | ✅ |
category | The category of the product | string | maxLength: 255 | ❌ |
sku | The SKU for the product | string | maxLength: 255 | ❌ |
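Because events that fail validation do not reach the atomic tables, it can be useful to check product data in the browser before sending it. The sketch below is a hand-written check based only on the field table above; it is not the Iglu schema itself or the pipeline's validator, and the function name validateProduct is hypothetical.

```javascript
// Hand-written check mirroring the field table above (not the real schema).
// Returns a list of human-readable problems; an empty list means "looks valid".
function validateProduct(data) {
  const errors = [];
  const isShortString = (v) => typeof v === "string" && v.length <= 255;

  if (!isShortString(data.name)) {
    errors.push("name: required string, maxLength 255");
  }

  if (typeof data.price !== "number" || !Number.isFinite(data.price) ||
      data.price < 0 || data.price > 100000) {
    errors.push("price: required number between 0 and 100000");
  } else {
    // multipleOf 0.01: compare in cents with a small tolerance for float error.
    const cents = data.price * 100;
    if (Math.abs(cents - Math.round(cents)) > 1e-6) {
      errors.push("price: must be a multiple of 0.01");
    }
  }

  if (!Number.isInteger(data.quantity) || data.quantity < 0 || data.quantity > 100000) {
    errors.push("quantity: required integer between 0 and 100000");
  }

  // Optional fields are only checked when present.
  for (const field of ["category", "sku"]) {
    if (data[field] !== undefined && !isShortString(data[field])) {
      errors.push(`${field}: optional string, maxLength 255`);
    }
  }
  return errors;
}

console.log(validateProduct({ name: "example_name", quantity: 1, price: 19.99 }));
console.log(validateProduct({ name: "example_name", quantity: 1.5, price: 1.005 }));
```

The first call returns an empty list; the second reports both the fractional quantity and the price that is not a whole number of cents.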
Implementing the product entity
In the JavaScript Tracker
Add the product entity to your page_view and page_ping events by editing your trackPageView events to include the entity. Specifically, you’ll update
window.snowplow('trackPageView');
to
window.snowplow('trackPageView', {
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name",
      "quantity": 1,
      "price": 100,
      "category": "example_category",
      "sku": "example_sku"
    }
  }]
});
Via Google Tag Manager
If you are using Google Tag Manager, you can add the variables like this:
window.snowplow('trackPageView', {
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "{{example_name_variable}}",
      "quantity": {{example_quantity_variable}},
      "price": {{example_price_variable}},
      "category": "{{example_category_variable}}",
      "sku": "{{example_sku_variable}}"
    }
  }]
});
Designing and implementing the cart_action event
Designing the cart_action event
The cart_action event records actions that the user performs on their cart. In this simplified version you’ll be recording a single property that describes whether an item was added or removed.
Field | Description | Type | Validation | Required? |
type | The type of action taken by the user | string | enum: ["add", "remove"] | ✅ |
Implementing the cart_action event
When you trigger the cart_action event, you’ll also want to attach the product entity that we designed earlier to describe which product is being changed in the cart.
Instrument the cart_action event when items are added to or removed from the cart on your website.
window.snowplow('trackSelfDescribingEvent', {
  "event": {
    "schema": "iglu:com.trysnowplow/cart_action/jsonschema/1-0-0",
    "data": {
      "type": "add" // or "remove"
    }
  },
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name",
      "quantity": 1,
      "price": 100,
      "category": "example_category",
      "sku": "example_sku"
    }
  }]
});
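If you fire cart_action from several places (for example, an "add to cart" button and a cart page), a small wrapper keeps the payload consistent. The following is a sketch rather than part of the tracker: trackCartAction is a hypothetical helper, and the tracker function is passed in as an argument so the example can run outside a browser with a stub in place of window.snowplow.

```javascript
const PRODUCT_SCHEMA = "iglu:com.trysnowplow/product/jsonschema/1-0-0";
const CART_ACTION_SCHEMA = "iglu:com.trysnowplow/cart_action/jsonschema/1-0-0";

// Hypothetical wrapper: `snowplow` is the tracker function (window.snowplow
// in the browser), `type` is "add" or "remove", `product` is the entity data.
function trackCartAction(snowplow, type, product) {
  // The cart_action schema only allows these two values.
  if (type !== "add" && type !== "remove") {
    throw new Error(`Unsupported cart action type: ${type}`);
  }
  snowplow("trackSelfDescribingEvent", {
    event: { schema: CART_ACTION_SCHEMA, data: { type } },
    context: [{ schema: PRODUCT_SCHEMA, data: product }],
  });
}

// Outside the browser, a stub records what would have been sent.
const recordedCalls = [];
const stubTracker = (...args) => {
  recordedCalls.push(args);
};

trackCartAction(stubTracker, "add", {
  name: "example_name",
  quantity: 1,
  price: 100,
  category: "example_category",
  sku: "example_sku",
});

console.log(JSON.stringify(recordedCalls[0], null, 2));
```

On your site you would call trackCartAction(window.snowplow, "remove", product) from the relevant click handler.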
Designing and implementing the purchase event
Designing the purchase event
The purchase event is a simple event that should be triggered when a purchase is made.
The event itself has no properties, but should be sent with one or more product entities that describe which products were purchased.
Implementing the purchase event
When you trigger the purchase event, you’ll want to attach one or more product entities to describe what has been purchased.
Instrument the purchase event when a purchase is made in your store.
Example for a single product purchase
window.snowplow('trackSelfDescribingEvent', {
  "event": {
    "schema": "iglu:com.trysnowplow/purchase/jsonschema/1-0-0",
    "data": {}
  },
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name",
      "quantity": 1,
      "price": 100,
      "category": "example_category",
      "sku": "example_sku"
    }
  }]
});
Example for a multi-product purchase
window.snowplow('trackSelfDescribingEvent', {
  "event": {
    "schema": "iglu:com.trysnowplow/purchase/jsonschema/1-0-0",
    "data": {}
  },
  "context": [{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name",
      "quantity": 1,
      "price": 100,
      "category": "example_category",
      "sku": "example_sku"
    }
  },{
    "schema": "iglu:com.trysnowplow/product/jsonschema/1-0-0",
    "data": {
      "name": "example_name_2",
      "quantity": 1,
      "price": 50,
      "category": "example_category_2",
      "sku": "example_sku_2"
    }
  }]
});
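In practice the list of purchased products usually comes from a cart object rather than being written out by hand. The sketch below builds the same payload shape as the examples above from an array of items. The helper name buildPurchaseEvent and the shape of the cart array are assumptions for illustration; your own cart data will likely need mapping to the product fields first.

```javascript
const PRODUCT_SCHEMA = "iglu:com.trysnowplow/product/jsonschema/1-0-0";
const PURCHASE_SCHEMA = "iglu:com.trysnowplow/purchase/jsonschema/1-0-0";

// Hypothetical helper: turns an array of product objects into the argument
// for trackSelfDescribingEvent, with one product entity per cart item.
function buildPurchaseEvent(cartItems) {
  if (cartItems.length === 0) {
    throw new Error("A purchase needs at least one product");
  }
  return {
    event: { schema: PURCHASE_SCHEMA, data: {} }, // the event has no properties
    context: cartItems.map((item) => ({
      schema: PRODUCT_SCHEMA,
      data: { ...item },
    })),
  };
}

const cart = [
  { name: "example_name", quantity: 1, price: 100, category: "example_category", sku: "example_sku" },
  { name: "example_name_2", quantity: 1, price: 50, category: "example_category_2", sku: "example_sku_2" },
];

const purchasePayload = buildPurchaseEvent(cart);
console.log(JSON.stringify(purchasePayload, null, 2));

// In the browser you would then send it with:
// window.snowplow("trackSelfDescribingEvent", purchasePayload);
```

The same helper covers the single-product case, since a cart with one item simply yields one product entity.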
Modeling the data you’ve collected
What does the model do?
The tracking above captures events about the user’s product purchasing journey, and attaches the context of which product was engaged with to all events you are firing. You can now start to get a better understanding of how your products are performing.
For this recipe you’ll create a simple table describing product engagement. Specifically, for each product you’ll aggregate the number of product views, add to carts, remove from carts and purchases, as well as the revenue earned.
Once you have collected some data with your new tracking you can run the following two queries in your tool of choice.
First generate the table:
CREATE TABLE derived.products AS (
  SELECT
    p.category AS product_category,
    p.name AS product_name,
    p.sku AS product_sku,
    p.price AS product_price,
    SUM(CASE WHEN ev.event_name = 'page_view' THEN 1 ELSE 0 END) AS product_views,
    SUM(CASE WHEN ev.event_name = 'cart_action' AND ca.type = 'add' THEN p.quantity ELSE 0 END) AS add_to_carts,
    SUM(CASE WHEN ev.event_name = 'cart_action' AND ca.type = 'remove' THEN p.quantity ELSE 0 END) AS remove_from_carts,
    SUM(CASE WHEN ev.event_name = 'purchase' THEN p.quantity ELSE 0 END) AS purchases,
    SUM(CASE WHEN ev.event_name = 'purchase' THEN 1 ELSE 0 END * p.quantity * p.price) AS revenue
  FROM atomic.events AS ev
  INNER JOIN atomic.com_trysnowplow_product_1 AS p
    ON ev.event_id = p.root_id AND ev.collector_tstamp = p.root_tstamp
  LEFT JOIN atomic.com_trysnowplow_cart_action_1 AS ca
    USING(root_id,root_tstamp)
  WHERE ev.event_name IN ('page_view', 'cart_action', 'purchase')
  GROUP BY 1,2,3,4
);
And then view it:
SELECT * FROM derived.products;
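To check that the model produces the numbers you expect, it can help to run the same aggregation on a few hand-made rows. The sketch below does this in JavaScript. The function name summarizeProducts and the row shape (one row per event and attached product entity, with the cart action type alongside) are assumptions for illustration and do not correspond to an actual export format.

```javascript
// Hypothetical in-memory version of the derived.products aggregation.
// Each row stands for one event joined to one attached product entity.
function summarizeProducts(rows) {
  const byProduct = new Map();
  for (const { event_name, cart_action_type = null, product } of rows) {
    // Same filter as the WHERE clause in the query.
    if (!["page_view", "cart_action", "purchase"].includes(event_name)) continue;

    const { category = null, name, sku = null, price, quantity } = product;
    const key = JSON.stringify([category, name, sku, price]); // GROUP BY 1,2,3,4
    const agg = byProduct.get(key) ?? {
      product_category: category,
      product_name: name,
      product_sku: sku,
      product_price: price,
      product_views: 0,
      add_to_carts: 0,
      remove_from_carts: 0,
      purchases: 0,
      revenue: 0,
    };
    if (event_name === "page_view") agg.product_views += 1;
    if (event_name === "cart_action" && cart_action_type === "add") agg.add_to_carts += quantity;
    if (event_name === "cart_action" && cart_action_type === "remove") agg.remove_from_carts += quantity;
    if (event_name === "purchase") {
      agg.purchases += quantity;
      agg.revenue += quantity * price;
    }
    byProduct.set(key, agg);
  }
  return [...byProduct.values()];
}

const shirt = { name: "T-shirt", sku: "TS-1", category: "Apparel", price: 25 };
const sampleRows = [
  { event_name: "page_view", product: { ...shirt, quantity: 1 } },
  { event_name: "page_view", product: { ...shirt, quantity: 1 } },
  { event_name: "cart_action", cart_action_type: "add", product: { ...shirt, quantity: 2 } },
  { event_name: "cart_action", cart_action_type: "remove", product: { ...shirt, quantity: 1 } },
  { event_name: "purchase", product: { ...shirt, quantity: 1 } },
];

console.log(summarizeProducts(sampleRows));
```

Note that, as in the query, price is part of the grouping key, so a product whose price changes over time appears as more than one row.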
Ecommerce analytics with Try Snowplow – Breakdown
- You have captured granular data around how your users are engaging with your products throughout their purchasing journeys.
- You have modeled this data into a product engagement table that surfaces the user engagement per product.
What you might want to do next
Understanding how your users are engaging with your products is the first step in optimizing your e-commerce store. Next, you might want to:
- Extend this table to include returns by joining this data with data from your transactional databases, so you get a more accurate picture of how products are actually performing.
- Extend this table to include where these products are being promoted on your site to understand how visual merchandising affects performance.
- Join this data with your inventory data to get a 360-degree view of your e-commerce strategy.
- Start mapping the relationships between products based on user behavior, working towards compelling product recommendations.
- Pivot this data to look at users instead: understand which marketing channels customers come from, and their customer lifetime value.
Try Snowplow: What now?
We hope this guide has helped you implement web and ecommerce analytics in Try Snowplow. By following these steps, you’ll have a glimpse into what a post-Universal Analytics world could look like outside of the Google ecosystem. With Snowplow delivering clean, granular event-level data to your warehouse, you could power increasingly complex data applications to propel your organization forward.
If you have any questions on how Try Snowplow works or how it compares to GA4, feel free to get in contact with us here.