Snowplow Open Source vs Snowplow Behavioral Data Platform (BDP)
What do Snowplow BDP and Open Source have in common?
Snowplow is the most advanced behavioral data analytics tool available today. We help companies track thousands of digital events – a basic example being ‘time spent on page’ – in the most compliant and accurate way possible.
At the core of both Snowplow’s BDP products and Open Source (OS) lies a principle of tracking called Data Creation.
Data Creation for OS and BDP
Fundamentally, Data Creation is the process of deliberately designing and tracking digital events in order to power advanced analytics. Every data point has been planned, defined, and validated by the time it reaches your storage destination, preventing the dreaded data swamp that can result from ad hoc data collection.
It differs from data exhaust, which is the by-product of data from tools with their own parameters that must be unpicked to create a unified data set. Most data teams are dealing with this exhaust data, made from tools such as packaged analytics platforms, CDPs or SaaS tools.
This data was never really designed to be used in a data warehouse or lake, so significant preparation is necessary to make it useful. The result is often the failure of data projects, as there simply isn’t time to make a house from a heap of random, unsorted building materials.
We designed our product to be open-core to make Data Creation universally accessible. This is the foundation on which we build, so everyone can experience the benefits of Snowplow data – reliability, explainability, compliance, accuracy and predictiveness.
What is Snowplow Open Source and who is it for?
Snowplow Open Source includes all the basic building blocks for Data Creation: tracking SDKs, enrichments, data-warehouse and data-lake loaders, etc. You can think of OS as a selection of processes and components that, if assembled correctly, can form a robust pipeline.
OS is great for:
- teams with the skill, time and money to build a pipeline from scratch and deal with ongoing updates and maintenance
- anyone that wants to experiment with cutting-edge data projects without dealing with a vendor
- smaller teams without the need for significant collaboration with numerous stakeholders
- those looking for a proof of concept to better understand the foundations of Snowplow BDP
The benefits of Snowplow Open Source
1. An OS solution that really scales
Many open-core companies ensure that their OS solutions don’t scale to force users into their paid offering, almost like a freemium model. We have ample evidence that Snowplow OS can handle large implementations. In fact, it’s the third-most-used tracker in the world, used on 1.7m websites, which is a testament to the number of data teams making Snowplow’s tech their own.
This ability to scale is a benefit of OS and BDP, both of which scale to billions of events per day.
2. A large technical feature set
Snowplow OS has an enormous technical feature set:
- Create data with 20+ out-of-the-box trackers for web, mobile, and server-side events, as well as various webhooks
- Control what your data looks like with our schema-based validation technology
- Define your own custom event types, attributes, and so on
- Load your data into any major data warehouse or data lake (Amazon Redshift, Amazon S3, Google BigQuery, Google GCS, Snowflake, or Databricks), with a well-defined schema
- Get detailed insights with customizable data models (using dbt), which build on your page-, session- and user-level data sets
- Benefit from compliant and auditable data with GDPR contexts appended to every event
Check out this Snowplow data sample to see how comprehensive the results can be, with 100+ out-of-the-box attributes.
With the above features, your data is clean, compliant, meaningful, and well-structured from the start, making it a breeze to query it or build advanced data applications.
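To make the idea of schema-based validation concrete, here is a minimal sketch in plain Python. It is not Snowplow's actual validation technology: the `add_to_basket` event, its fields, and the `validate_event` helper are all hypothetical, invented for illustration. It only shows the principle that an event is checked against a definition agreed in advance, and rejected with explicit errors if it does not match.

```python
from typing import Any

# Hypothetical schema for a custom "add_to_basket" event.
# Illustrative only; this is not an official Snowplow schema or format.
ADD_TO_BASKET_SCHEMA: dict[str, dict[str, Any]] = {
    "sku": {"type": str, "required": True},
    "quantity": {"type": int, "required": True},
    "unit_price": {"type": float, "required": True},
    "currency": {"type": str, "required": False},
}


def validate_event(event: dict[str, Any], schema: dict[str, dict[str, Any]]) -> list[str]:
    """Return a list of validation errors; an empty list means the event is valid."""
    errors: list[str] = []
    for field, rule in schema.items():
        if field not in event:
            if rule["required"]:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(event[field], rule["type"]):
            errors.append(f"wrong type for field: {field}")
    # Reject any property that the schema does not define.
    for field in event:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


good = {"sku": "ABC-123", "quantity": 2, "unit_price": 9.99}
bad = {"sku": "ABC-123", "quantity": "two", "colour": "red"}

print(validate_event(good, ADD_TO_BASKET_SCHEMA))  # no errors
print(validate_event(bad, ADD_TO_BASKET_SCHEMA))   # three errors
```

In this sketch the malformed event is caught before it reaches storage, which is the point of defining event structures up front rather than cleaning data afterwards.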
You can further increase the value of your data by enabling any of our 16 out-of-the-box real-time enrichments, or by plugging in your own custom logic. This can add extra context to your events, such as currency conversion.
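As a rough illustration of what an enrichment step does, the sketch below chains a currency conversion with a piece of custom logic. Everything here is an assumption made for the example: the event fields, the made-up exchange rates, and the function names are hypothetical and do not reflect how Snowplow's enrichments are configured or implemented.

```python
from typing import Any, Callable

Event = dict[str, Any]
Enrichment = Callable[[Event], Event]

# Made-up exchange rates to EUR; a real enrichment would look these up.
RATES_TO_EUR = {"EUR": 1.0, "USD": 0.9, "GBP": 1.2}


def convert_currency(event: Event) -> Event:
    """Add the transaction value in EUR when the event carries a known currency."""
    enriched = dict(event)  # copy so the raw event is left untouched
    rate = RATES_TO_EUR.get(event.get("currency"))
    if rate is not None and "value" in event:
        enriched["value_eur"] = round(event["value"] * rate, 2)
    return enriched


def tag_device(event: Event) -> Event:
    """Custom logic: derive a coarse device class from the user agent string."""
    enriched = dict(event)
    agent = event.get("useragent", "").lower()
    enriched["device_class"] = "mobile" if "mobile" in agent else "desktop"
    return enriched


def run_enrichments(event: Event, enrichments: list[Enrichment]) -> Event:
    """Apply each enrichment in order, passing the result to the next one."""
    for enrich in enrichments:
        event = enrich(event)
    return event


raw = {
    "event": "purchase",
    "value": 20.0,
    "currency": "USD",
    "useragent": "Mozilla/5.0 (iPhone) Mobile",
}
enriched = run_enrichments(raw, [convert_currency, tag_device])
print(enriched)
```

Each step returns a copy with extra context attached, so the original event stays intact and custom logic can be slotted in alongside the built-in steps.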
3. Snowplow Quick Start guide and Data Product Accelerators
Snowplow Quick Start makes it easy to experience what Open Source is about. If you have your cloud and data warehouse/lake ready to go and are familiar with some deployment tooling, it will get you from zero to your first working pipeline in a few hours.
Building a new data application can be a daunting business, as the fully flexible nature of Snowplow’s data means you can literally take an application in any direction. Data Product Accelerators (DPAs) will get you started with some of the most common use cases and give you a chance to iterate on firm foundations. These can be used with both Snowplow OS and BDP.
4. An active community
We have 7000+ community members, comprising both BDP & OS users. Snowplow hosts regular meetups and, in 2022 alone, we hosted Snowplow meetups in London, Vienna, Amsterdam, Copenhagen, Stockholm, Sydney, Berlin and Boston.
At these events Snowplow users have contributed presentations on how they use Snowplow and have helped answer questions from fellow Snowplow users. On our YouTube channel you can see recordings from some of these meetups.
Our community actively supports the development of our product by raising GitHub issues, helping to maintain the highest standards in our product. The community also helps by providing feedback, answering surveys and helping us conduct product and market research. More details of members’ contributions can be found in our monthly Product Office Hours – check out the most recent editions.
You can also visit Snowplow’s Discourse to learn more.
What is Snowplow BDP and who is it for?
Snowplow BDP is our paid offering, which works on top of our open-source core. This model is key to how we believe technology should be made available. In essence, it means the key technical stack is open source and available for anyone to access and inspect; this in turn strengthens our commercial offering by building on principles of transparency and data sovereignty.
BDP is definitely the better choice for:
- large or complex teams. Snowplow BDP is extremely useful for large companies struggling to enforce standards and collaborate effectively. Structured workflows help teams break down the barriers to building advanced data applications.
- those prioritising time to value. In many cases, BDP can save a year’s worth of setup effort, not to mention the time needed to build extra functionality on top. Time is also saved because Snowplow handles the ongoing maintenance and management of your pipeline.
- organizations with business-critical applications or use cases. If your OS pipeline breaks, significant value can be lost for both internal stakeholders and customers. BDP puts assurances in place – guarantees and SLAs – regarding data quality and timeliness. Furthermore, the pipeline is managed by Snowplow, so data applications are not overly reliant on one particular person or small group who may leave the business.
Benefits of Snowplow BDP
1. Configure and manage your pipeline more effectively with an intuitive user interface (UI)
Data quality UI and API: Snowplow’s UI provides automatic alerting on the emergence of all new data quality issues as they occur. With these alerts, your data team can avoid costly data downtime.
Pipeline configuration and monitoring UI: the user interface allows you to easily set up and monitor your data pipeline, and safely apply configuration changes once it is running.
2. Workflows that manage evolving business requirements
Snowplow’s data structures UI and API have best-practice workflows to define, govern and evolve your data structures (events and entities). This means your team can build on solid foundations, maintain data integrity and socialise information effectively.
3. Better collaboration and data discovery
Features such as Tracking Catalog mean that anyone you give access to can understand what is being tracked, why, and when it was last updated. Granular event maps show all the properties and entities included within an event, as well as a common-language definition of the purpose of the event. This means anyone in the team can understand the tracking without having to ask for technical support or make assumptions based on the final data. This avoids relying on a G-doc, G-sheet or, worse, tribal knowledge in your team.
4. Single sign-on (SSO) and advanced user permissions
Maintain complete control over access with SSO and provide granular access permissions for flexible data security.
5. Outage protection for data uptime assurance
According to Gartner, data downtime costs an average of $5,600 per minute. Our outage protection ensures that your mission-critical use cases keep working day in, day out.
Our multi-region approach creates a “backup region” for your pipelines, so traffic is immediately re-routed in the event of an outage in your main region, significantly lowering the risk of data loss.
Snowplow’s outage-protected pipelines in backup regions are deliberately minimally specced to reduce additional costs, but scale up quickly in the event of an outage to minimize data loss. Downtime is consequently cut to a few seconds, compared with several hours or days.
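The failover behaviour described above can be sketched in a few lines. This is a simplified model under stated assumptions, not Snowplow's implementation: the `Region` class, the `instances` unit, and the `pick_region` function are hypothetical names, and the health check is just a flag set by hand.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    instances: int        # current number of pipeline instances (hypothetical unit)
    healthy: bool = True


def pick_region(primary: Region, backup: Region) -> Region:
    """Route to the primary region when healthy; otherwise fail over to the backup."""
    if primary.healthy:
        return primary
    # The backup is normally minimally specced, so scale it up to match
    # the primary's capacity before it takes the full traffic.
    backup.instances = max(backup.instances, primary.instances)
    return backup


primary = Region("primary-region", instances=8)
backup = Region("backup-region", instances=1)

print(pick_region(primary, backup).name)   # normal operation
primary.healthy = False                    # simulate an outage in the main region
print(pick_region(primary, backup).name, backup.instances)
```

The trade-off modelled here is cost against readiness: the backup stays small while idle and only grows once traffic is re-routed to it.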
6. Custom onboarding
Snowplow’s Customer Success team are on hand to help you get up and running with a custom onboarding process lasting 6-8 weeks.
The major benefit of this is that our solutions architects have an enormous amount of experience with optimizing the design of pipelines to match your business objectives, so you can drive maximum value from your use cases.
7. Security
With Snowplow BDP Enterprise, we make sure that everything works for your security team, so that when they sign off on the implementation, they have been able to check thoroughly that the configuration is up to spec.
This includes optimizing access permissions, designing in accordance with security protocols and integrating new features and requests to the product as part of onboarding.
We also have an AWS security bundle, which includes features such as VPC housing for your pipeline, custom tagging, custom security agents, a custom IAM policy, as well as SSH and HTTP access controls.
8. Pipeline optimization and customization
With Snowplow BDP Enterprise, our engineering team makes sure your data pipeline is set up and tailored to your specific services.
We can fully customize how we manage instance reservations and security protocols, and how we right-size pipelines for your particular traffic patterns. Similarly, we can limit the number of destinations you load to based on projected expenses. No other mainstream provider really allows you to design your pipeline around the optimal way for you to consume information.
9. 24/7 infrastructure and product support
We have a global support team that provides 24/7 support, monitoring the health of your pipeline day and night, and on hand to help you get the most out of Snowplow.
We also offer a support SLA, for total assurance, with technical assistance available within as little as 30 minutes.
10. Private SaaS and SaaS options
While our OS offering is self-hosted, Snowplow BDP Enterprise is “private SaaS” and BDP Cloud is SaaS.
With BDP Enterprise, the pipeline is hosted in your own cloud environment, but Snowplow’s engineers have limited access to ensure everything is set up correctly and is running smoothly. Our access is strictly controlled by your team. This model is known as “private SaaS” and offers many benefits regarding privacy and ownership.
BDP Cloud is hosted on your behalf, to reduce technical requirements and increase the speed of implementation.
Open source to BDP, a case study from “JustWatch”
Snowplow BDP has helped countless companies create more targeted online advertisements.
One recent example is JustWatch. This AdTech company enables movie studios and video-on-demand providers to advertise their content to the right audiences. This highly personalized advertising is effective for the advertiser and enjoyable for the consumer, but is not possible without granular behavioral data.
Having been a long-term member of the Snowplow Open Source community, JustWatch became a BDP customer in order to focus on creating more advanced use cases. These range from the development of a unified customer view in their data warehouse, right up to advanced machine learning algorithms designed to better understand users’ interests.
Start creating data with Snowplow today
Snowplow Open Source
The world’s leading open source project for data creation.
Snowplow Behavioral Data Platform
The private SaaS deployment, with data management, governance and enterprise support.