Snowplow 0.8.3 released with unstructured events
We’re pleased to announce the release of Snowplow 0.8.3. This release updates our JavaScript Tracker to version 0.11.2, adding the ability to send custom unstructured events to a Snowplow collector with trackUnstructEvent(). The Clojure Collector is also bumped to 0.5.0, to include some important bug fixes.
Please note that this release only adds unstructured events to the JavaScript Tracker. Adding unstructured events to our Enrichment process and storage targets is still on the roadmap, but rest assured we are working on it!
Many thanks to community members Gabor Ratky, Andras Tarsoly and Laszlo Bacsi, all from Secret Sauce Partners, for contributing this great feature: Gabor and his team took JavaScript unstructured events from an item on our roadmap to a code-complete feature. Big thanks, guys! (And if you are interested in seeing how the design and implementation of this powerful feature evolved, do have a read of the original GitHub pull request.)
In the rest of this post, then, we will cover:
- What are unstructured events?
- When to use unstructured events?
- Usage
- Upgrading
- Roadmap for unstructured events
- Getting help
1. What are unstructured events?
Custom unstructured events are user events which do not fit one of the existing Snowplow event types (page views, ecommerce transactions etc), and do not fit easily into our existing custom structured event format. A custom unstructured event consists of two elements:
- A name, e.g. “Game saved” or “returned-order”
- A set of name: value properties (also known as a hash, associative array or dictionary)
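To make the two elements concrete, here is a minimal sketch of an unstructured event written as a plain JavaScript object. The wrapper object and every property name in it are hypothetical and for illustration only; this is not the tracker's internal representation.

```javascript
// Hypothetical illustration of the two elements of an unstructured event:
// a name, plus a free-form set of name: value properties.
var unstructEvent = {
    name: 'returned-order',
    properties: {
        order_id: 'ORD-1092',   // made-up property names and values
        reason: 'wrong size',
        items: 2,
        refunded: true
    }
};

// Because the properties are arbitrary name: value pairs, they can only be
// discovered by inspecting the object itself.
var propertyNames = Object.keys(unstructEvent.properties);
```

The point of the sketch is that nothing constrains which properties appear, which is what distinguishes this format from the fixed fields of a structured event.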
You might recognise what we call custom unstructured events from other analytics tools including MixPanel, KISSmetrics and Keen.io, where they are the primary trackable event type.
2. When to use unstructured events?
Custom unstructured events are great for a couple of use cases:
- Where you want to track event types which are proprietary/specific to your business (i.e. not already part of Snowplow)
- Where you want to track events which have unpredictable or frequently changing properties
Note: because unstructured events are not currently processed by the ETL and enrichment step, or added to storage, we recommend using custom structured events for custom event types, assuming that you can fit your events into our custom structured event schema.
3. Usage
Tracking an unstructured event with the JavaScript Tracker is very straightforward – use the trackUnstructEvent(name, properties) function.
Here is an example taken from our codebase:
We have written a follow-up blog post with more information on using the new trackUnstructEvent functionality; please read that post for the details.
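As a rough sketch of what a trackUnstructEvent(name, properties) call can look like, the snippet below queues one event. It assumes the tracker is loaded asynchronously and that commands are pushed onto a global _snaq array; that queue name, the event name and all of the properties are assumptions made for illustration, so check the follow-up post for the authoritative syntax.

```javascript
// Assumption: the asynchronous Snowplow tag uses a global command queue named
// _snaq. Until sp.js has loaded, it is an ordinary array of pending commands.
var _snaq = _snaq || [];

// Hypothetical event name and properties, for illustration only.
_snaq.push(['trackUnstructEvent', 'Game saved', {
    save_id: '4321',
    level: 23,
    difficulty: 'hard',
    dl_content: true
}]);
```

The first array element names the tracker method, and the remaining elements are its arguments: the event name followed by the properties object.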
4. Upgrading
There are two components to upgrade in this release:
- The JavaScript Tracker, to version 0.11.2
- The Clojure Collector, to version 0.5.0
If you are running the Clojure Collector, you must upgrade the Clojure Collector before upgrading the JavaScript Tracker, or you will experience some data loss.
Clojure Collector
This release bumps the Clojure Collector to version 0.5.0. To upgrade to this release:
- Download the new warfile by right-clicking on this link and selecting “Save As…”
- Log in to your Amazon Elastic Beanstalk console
- Browse to your Collector’s application
- Click the “Upload New Version” button and upload your warfile
JavaScript Tracker
Please update your website(s) or tag manager to use the latest version of the JavaScript Tracker, which is version 0.11.2. As always, the updated minified tracker is available here:
http(s)://d1fc8wv8zag5ca.cloudfront.net/0.11.2/sp.js
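The http(s):// prefix indicates that the same file is available over either scheme. As a small sketch, assuming you want the scheme to match the page that loads the tracker, a helper along these lines could build the URL. The trackerUrl function is hypothetical and not part of the tracker.

```javascript
// Hypothetical helper: build the sp.js URL for a given page protocol.
// pageProtocol is expected in the form the browser reports it, e.g. 'https:'.
function trackerUrl(pageProtocol) {
    var scheme = (pageProtocol === 'https:') ? 'https' : 'http';
    return scheme + '://d1fc8wv8zag5ca.cloudfront.net/0.11.2/sp.js';
}
```

In a browser you would typically pass document.location.protocol and use the result as the src of the script element that loads the tracker.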
5. Roadmap
We are well aware that this release is only the start of adding custom unstructured events to Snowplow.
It makes sense to work next on extracting unstructured events in our Enrichment process; unfortunately this is not trivial, because our Enrichment process currently only outputs to Redshift, and Redshift has no support for JSON objects or maps of properties, which we would need to store the unstructured event properties.
Therefore we are exploring two different strands:
- Storing Snowplow events in Avro. Avro is a rich data serialization system that will allow us to store the unstructured event properties within the event object. Initially, you would be able to query these Avro-serialized events using a range of tools on Hadoop including Pig, Hive, Scalding and Cascalog. It should also be relatively straightforward to load these events into NoSQL databases such as MongoDB. We would then work on mapping the Avro events into Redshift
- Storing Snowplow events in PostgreSQL. Postgres has a JSON datatype, although the querying capabilities on that JSON datatype are so far very primitive. Nonetheless, it should be possible to at least store the unstructured event properties in an appropriate JSON field in Postgres
If you have a preference for one of the two options above, or a suggested third approach, then get in touch and let us know as soon as possible, as we are thinking through these alternatives now.
Please keep an eye on our Roadmap wiki page to see how Snowplow’s support for unstructured events evolves.
6. Getting help
As always, if you do run into any issues or don’t understand any of the above changes, please raise an issue or get in touch with us via the usual channels.
And if you want to find out more about the syntax for trackUnstructEvent, do check out our Snowplow Unstructured Events Guide, which was also published today.