How Omio builds a strong culture of data quality and data ownership with Snowplow

A self-serve data culture and quality behavioral data place Omio’s data teams in the driver’s seat.

Omio is an online travel platform that aims to present the fastest, cheapest and easiest travel options to consumers for train, bus and flight journeys. Operating in 15 countries (and counting) and reaching 27 million monthly users, Omio is growing fast. 

To solve data quality issues and foster a more impactful data-driven culture, Omio chose Snowplow. Now, they have full ownership and control over their data and infrastructure, realize significant time savings across their analytics use cases, and have complete confidence in the quality of their data. 


As their data demands increased, Omio needed a better way to share data with data scientists, product teams and analysts—while maintaining a high standard of data quality. Their goal was to build a flexible data pipeline to support an efficient, self-serve data culture, while retaining complete control and ownership of their data and data infrastructure. 

Data quality, and completeness in particular, was paramount. Until then, Omio had spent a lot of time identifying and fixing problems that often stemmed from data quality issues such as incomplete data, duplication, anomalies, or validation errors.

Looking at the bigger picture, Omio knew they needed to evolve the data pipeline to control for data quality. This would help build the foundation for a self-serve data culture that was less reliant on the data engineering team. 


Omio chose Snowplow after an iterative selection process in which the data engineering team outlined their needs. Initially, they considered building their own data pipeline. After sketching out a homegrown solution, they realized the picture that emerged looked a lot like Snowplow. 

We often find that while companies may have the technical expertise to build a homegrown pipeline, the amount of work and subsequent maintenance can make it a time sink. At Snowplow, we’ve spent over 10 years learning about the little anomalies, faults and issues that can arise.

Omio realized they could leverage Snowplow’s flexibility and tried-and-tested customizability to serve as their general-purpose data pipeline framework—one that could push any kind of data and had all the building blocks Omio needed. As a bonus, using Snowplow would save them considerable investment in pipeline building and maintenance. 

“Snowplow enables several things that are important for Omio and for data quality. We want to be able to control and own all of our data. Snowplow is open source, which means that we can have confidence in it; we can look at the code and figure out what’s going on or change things.”



With Snowplow, Omio can tackle the data quality problem head on. Snowplow addresses it in three ways: fully configurable validation of data against predefined schemas (or structures); complete, lossless data collection; and the flexibility to break free from black-box solutions and own the data, managing and auditing it as desired.
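To make the idea of validating events against predefined schemas more concrete, here is a minimal sketch in Python. It is not Snowplow's implementation or Omio's configuration. The schema name, the field names, the simplified schema format and the functions validate_event and route_events are all hypothetical, chosen only for illustration. The sketch assumes each event names the schema it claims to follow and carries its payload under a "data" key. Events that fail validation are set aside together with their errors rather than dropped, which reflects the lossless collection described above.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schema registry: schema identifier -> expected field types.
# The identifier and all field names are illustrative assumptions.
SCHEMAS: Dict[str, Dict[str, Dict[str, type]]] = {
    "iglu:com.example/search_performed/jsonschema/1-0-0": {
        "required": {"origin": str, "destination": str, "passengers": int},
        "optional": {"travel_mode": str},
    }
}


def validate_event(event: Dict[str, Any]) -> List[str]:
    """Return validation errors for one event; an empty list means it is valid."""
    schema = SCHEMAS.get(event.get("schema", ""))
    if schema is None:
        return ["unknown schema: {!r}".format(event.get("schema"))]

    data = event.get("data", {})
    errors: List[str] = []

    # Required fields must be present and have the expected type.
    for field, expected in schema["required"].items():
        if field not in data:
            errors.append("missing required field: " + field)
        elif not isinstance(data[field], expected):
            errors.append("wrong type for field: " + field)

    # Optional fields are only type-checked when present.
    for field, expected in schema["optional"].items():
        if field in data and not isinstance(data[field], expected):
            errors.append("wrong type for field: " + field)

    # Fields the schema does not declare are rejected.
    allowed = set(schema["required"]) | set(schema["optional"])
    for field in data:
        if field not in allowed:
            errors.append("unexpected field: " + field)

    return errors


def route_events(
    events: List[Dict[str, Any]]
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split events into valid ones and failed ones; nothing is discarded."""
    good: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for event in events:
        errors = validate_event(event)
        if errors:
            # Keep the original event alongside its errors for later auditing.
            failed.append({"event": event, "errors": errors})
        else:
            good.append(event)
    return good, failed
```

In a sketch like this, analysts would query only the valid events, while the failed ones remain available for inspection and reprocessing once the tracking or the schema has been corrected.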

Further, Omio has the flexibility to customize the pipeline to meet evolving needs. As Snowplow is fully modular, our software offers the opportunity to create a custom pipeline: part Snowplow, part Omio. Omio is using Snowplow’s open source components to create a watertight pipeline for schema validation according to their own specifications and needs. 

The Omio data engineering team is working on migrating from AWS to GCP to align with the rest of the company, and Snowplow’s cross-cloud functionality will make this transition easier. In addition, Omio is already working on continued development of their own custom server-side data pipeline, using a number of Snowplow pipeline components.

“The gist is that once you have all the relevant data for each event, which is possible with Snowplow, you can do whatever you want with it. Snowplow’s importance will only continue to grow as we customize our pipeline.”


