
‘Tis the Season for Data Quality’: Unwrapping Snowplow's Data Quality Roundup for 2024

By Daniela Howard
December 18, 2024

As we wrap up 2024, we're celebrating a transformative year in data quality - one where we've helped organizations shift their data governance "left," catching and preventing issues before they become costly downstream problems. 

Through our Data Product Studio, we've delivered a host of new tools that enable teams to lead with data quality. Let's unwrap these gifts that keep your data AI-ready from collection to delivery!

🎁 The Gift of Organization: Source Applications

We've introduced Source Applications to help you keep your tracking neat and tidy. Previously, application IDs and global contexts were managed implicitly, which left them prone to errors such as typos. Now they have a proper home in Data Product Studio. This new feature serves as your central source of truth, helping you:

  • Track how many applications have Snowplow tracking instrumented, and how many data products are set up within each
  • Manage application entities (formerly global contexts) more efficiently 
  • Streamline the design of new data products

Each Source Application comes wrapped with:

  • A name
  • Clear description
  • Designated owner
  • Application IDs (0 or more)
  • Application entities
  • Associated data products

As a bonus, we've added visibility into unassigned app_ids, which can be added to a Source Application, making it easier to maintain consistent tracking across your implementation.
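To make the shape of a Source Application concrete, here is a minimal TypeScript sketch. It is an illustration only: the field names and the `findUnassignedAppIds` helper are assumptions made for this example, not Snowplow's API.

```typescript
// Hypothetical model of a Source Application. Field names are illustrative
// and are not taken from Snowplow's API.
interface SourceApplication {
  name: string;
  description: string;
  owner: string;
  appIds: string[];              // zero or more application IDs
  applicationEntities: string[]; // schema URIs of entities sent with every event
  dataProducts: string[];        // names of associated data products
}

// Return the app_ids seen in tracked events that no Source Application claims.
function findUnassignedAppIds(
  seenAppIds: string[],
  applications: SourceApplication[],
): string[] {
  const assigned: string[] = [];
  for (const app of applications) {
    for (const id of app.appIds) assigned.push(id);
  }
  const result: string[] = [];
  for (const id of seenAppIds) {
    // Skip IDs that are assigned or already reported.
    if (assigned.indexOf(id) === -1 && result.indexOf(id) === -1) {
      result.push(id);
    }
  }
  return result;
}

const website: SourceApplication = {
  name: "Marketing website",
  description: "Public site and blog",
  owner: "web-team@example.com",
  appIds: ["website-prod", "website-staging"],
  applicationEntities: ["iglu:com.example/user/jsonschema/1-0-0"],
  dataProducts: ["Web analytics"],
};

// "websit-prod" is a typo of "website-prod", so it shows up as unassigned.
const unassigned = findUnassignedAppIds(
  ["website-prod", "websit-prod", "ios-app"],
  [website],
);
console.log(unassigned); // [ 'websit-prod', 'ios-app' ]
```

A typo such as `websit-prod` surfaces as an unassigned ID instead of silently creating a second stream of data.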

🎄 Template Treasures: Built-in Best Practices

Why risk data quality issues when you can start with battle-tested templates? Our new templated data products accelerate time-to-value with Snowplow. Quickly track (and model) all the common events for a standard implementation, such as ecommerce analytics, web analytics, and media tracking.

These templates aren't just about speed - they're about starting with quality baked in. Each template includes validated event specifications, schemas, and tracking plugins, integrated seamlessly with our dbt packages, enabling teams to get up and running with new use cases in hours rather than months.


🔄 Propagating Quality: Data Product Cloning

You can now clone data products and their associated event specifications, keeping quality standards consistent while customizing and extending them instead of starting from scratch.


🎁 Gifts for the Developers: Git-Backed Workflows

We've delivered two major gifts for developers this year: Git-backed Data Products and Git-backed Data Structures, enabling better collaboration, stronger controls, and seamless integration with modern development workflows.

With Git-backed Data Structures (released in October), we introduced version control for your JSON schema specifications, enabling teams to validate and deploy schemas through familiar Git workflows.

Building on this success, our December release of Git-backed Data Products extends these capabilities to your entire tracking design process - from source applications to event specifications - with changes automatically synchronized between your Git repositories and Snowplow's Console UI. 

Together, these features bring software development best practices to data management, enabling teams to leverage pull request workflows, CI/CD pipelines, and automated compliance checks for their entire data infrastructure.
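As an illustration of the kind of automated check a pull request pipeline could run, here is a small TypeScript sketch that inspects the `self` block of a self-describing JSON schema. The helper names `checkSelfBlock` and `toIgluUri` are hypothetical and this is not Snowplow's tooling; the sketch only assumes the usual `self` fields (vendor, name, format, version) and the `iglu:vendor/name/format/version` URI layout.

```typescript
// Illustrative pre-merge check for a self-describing schema file.
// Helper names are hypothetical; this is not part of any Snowplow tool.
interface SchemaSelf {
  vendor: string;
  name: string;
  format: string;
  version: string;
}

// Collect problems with the "self" block instead of failing on the first one.
function checkSelfBlock(schema: { self?: Partial<SchemaSelf> }): string[] {
  const meta = schema.self;
  if (!meta) return ['missing "self" block'];
  const problems: string[] = [];
  const keys: (keyof SchemaSelf)[] = ["vendor", "name", "format", "version"];
  for (const key of keys) {
    if (typeof meta[key] !== "string" || meta[key] === "") {
      problems.push(`self.${key} must be a non-empty string`);
    }
  }
  // Versions are expected in MODEL-REVISION-ADDITION form, e.g. 1-0-0.
  if (meta.version && !/^\d+-\d+-\d+$/.test(meta.version)) {
    problems.push(`self.version "${meta.version}" is not MODEL-REVISION-ADDITION`);
  }
  return problems;
}

// Build the schema URI that events and entities use to reference the schema.
function toIgluUri(meta: SchemaSelf): string {
  return `iglu:${meta.vendor}/${meta.name}/${meta.format}/${meta.version}`;
}

const buttonClick = {
  self: { vendor: "com.example", name: "button_click", format: "jsonschema", version: "1-0-0" },
  type: "object",
  properties: { label: { type: "string" } },
};

console.log(checkSelfBlock(buttonClick)); // []
console.log(toIgluUri(buttonClick.self)); // iglu:com.example/button_click/jsonschema/1-0-0
```

Returning a list of problems rather than throwing lets a CI job report every issue in a single run.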

⭐ Proactive Quality Management: Data Structure Upgrades

We've evolved how teams handle schema evolution with our new upgrade mechanism. Instead of risking broken pipelines or invalid data, our system:

  • Proactively identifies upgrade opportunities
  • Automatically handles compatible changes
  • Provides guided workflows for complex transitions
  • Ensures downstream compatibility

Here is an example:

When a new version of a Data Structure becomes available, the system indicates that the event or entities referenced by the data structure have a new version available, and shows an 'Upgrade' button in the UI.

Clicking the 'Upgrade' button opens a new page that shows the version you are upgrading to, along with a 'View Changes' option.

Clicking 'View Changes' shows the differences between the current version of the Data Structure and the one you intend to upgrade to.

Read the documentation to find out more.
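To illustrate the idea of separating compatible changes from ones that need a guided workflow, here is a TypeScript sketch based on SchemaVer, the MODEL-REVISION-ADDITION versioning scheme used for Snowplow schemas. The function names and the rule used here (only an ADDITION bump counts as automatically compatible) are assumptions for this example, not a description of how Data Product Studio decides.

```typescript
// Sketch of classifying a schema upgrade by its SchemaVer bump.
// The classification rule is an assumption made for illustration.
type SchemaVer = { model: number; revision: number; addition: number };
type UpgradeKind = "none" | "compatible" | "needs-review";

function parseSchemaVer(version: string): SchemaVer {
  const match = /^(\d+)-(\d+)-(\d+)$/.exec(version);
  if (!match) throw new Error(`Invalid SchemaVer: ${version}`);
  return {
    model: Number(match[1]),
    revision: Number(match[2]),
    addition: Number(match[3]),
  };
}

function classifyUpgrade(current: string, latest: string): UpgradeKind {
  const from = parseSchemaVer(current);
  const to = parseSchemaVer(latest);

  // Compare MODEL, then REVISION, then ADDITION to see if `latest` is newer.
  const newer =
    to.model > from.model ||
    (to.model === from.model &&
      (to.revision > from.revision ||
        (to.revision === from.revision && to.addition > from.addition)));
  if (!newer) return "none";

  // Only an ADDITION bump is treated as safe to apply automatically here.
  const sameLine = to.model === from.model && to.revision === from.revision;
  return sameLine ? "compatible" : "needs-review";
}

console.log(classifyUpgrade("1-0-0", "1-0-2")); // compatible
console.log(classifyUpgrade("1-0-2", "2-0-0")); // needs-review
console.log(classifyUpgrade("1-0-2", "1-0-2")); // none
```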

🔍 Quality at Scale: Enhanced Search and Filtering

As data implementations grow, maintaining quality becomes more challenging. Within Data Product Studio, teams can now search data products and filter them by Source Application, quickly narrowing down to the data that is relevant to them. In addition, data products are now displayed in a more compact table view.


❄️ The Gift of Enhanced Validation: Snowtype

Catch data quality issues instantly during development with our newly enhanced Snowtype tracking code generation tool, which now features comprehensive client-side validation across nine platforms including web, mobile, and server environments. The improvements deliver schema validation, cardinality rules validation, and property rules validation, eliminating the need for costly downstream cleanup. This advancement demonstrates our commitment to high-quality data collection at the source, providing developers with type-safe code generation, IDE autocomplete suggestions, and comprehensive error messaging across platforms like Browser JS/TS, iOS, Android, React Native, and Flutter.
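To give a feel for what client-side validation of this kind involves, here is a hand-written TypeScript sketch covering required properties, property rules, and entity cardinality rules. It is not Snowtype's generated code: every type and function name below is hypothetical, and the rule format is an assumption made for illustration.

```typescript
// Hypothetical event specification and validator, written for illustration.
// This does not reproduce Snowtype's generated output.
interface EntityRule {
  schema: string;
  minCardinality: number;
  maxCardinality?: number;
}

interface EventSpec {
  requiredProperties: string[];
  allowedValues: Record<string, string[]>; // property rules
  entityRules: EntityRule[];               // cardinality rules
}

interface TrackedEvent {
  data: Record<string, unknown>;
  entities: { schema: string }[];
}

function validateEvent(tracked: TrackedEvent, spec: EventSpec): string[] {
  const errors: string[] = [];

  // Schema-level check: every required property must be present.
  for (const prop of spec.requiredProperties) {
    if (tracked.data[prop] === undefined) {
      errors.push(`missing required property "${prop}"`);
    }
  }

  // Property rules: restrict a property to an allowed set of values.
  for (const prop of Object.keys(spec.allowedValues)) {
    const value = tracked.data[prop];
    if (value !== undefined && spec.allowedValues[prop].indexOf(String(value)) === -1) {
      errors.push(`property "${prop}" has disallowed value "${String(value)}"`);
    }
  }

  // Cardinality rules: count attached entities per schema.
  for (const rule of spec.entityRules) {
    const count = tracked.entities.filter((e) => e.schema === rule.schema).length;
    if (count < rule.minCardinality) {
      errors.push(`expected at least ${rule.minCardinality} of ${rule.schema}, found ${count}`);
    }
    if (rule.maxCardinality !== undefined && count > rule.maxCardinality) {
      errors.push(`expected at most ${rule.maxCardinality} of ${rule.schema}, found ${count}`);
    }
  }
  return errors;
}

const productSchema = "iglu:com.example/product/jsonschema/1-0-0";
const addToCartSpec: EventSpec = {
  requiredProperties: ["product_id", "currency"],
  allowedValues: { currency: ["USD", "EUR"] },
  entityRules: [{ schema: productSchema, minCardinality: 1, maxCardinality: 1 }],
};

const badEvent: TrackedEvent = {
  data: { product_id: "sku-1", currency: "GBP" },
  entities: [],
};

// Reports the disallowed currency and the missing product entity.
console.log(validateEvent(badEvent, addToCartSpec));
```

Running checks like these during development means a malformed event is flagged in the developer's console before it ever reaches the pipeline.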

As we close out 2024, we’re excited to continue delivering tools that make data governance an inherent part of data collection, not just an afterthought. Here's to another year of quality data and successful implementations!

For detailed instructions on unwrapping any of these features, visit our documentation or contact us today. 

🎅🎄  Happy Holidays from the Snowplow team! ❄️ 🎁
