Snowplow and Census: Delivering behavioral data for actionable outcomes
Chapter 3: Snowplow and Census: Delivering behavioral data for actionable outcomes
It is difficult to overstate the potential value of behavioral data to an organization. At its essence, behavioral data is truth. As Zach Wilson, Tech Lead at Airbnb, succinctly put it:
“Data, when correctly generated and processed, is truth. Giving your company truth to correct biases is critical for a fairer, more efficient, less error-prone world... When data reveals something counterintuitive, that’s really when data shines.”
Behavioral data is so special because it reveals the truth about how your users and customers interact with your product. The “truth” can be used within a business context in all sorts of ways, with hugely powerful implications. With it, you can improve the recommendations your customers see on your website; you can discover the types of customers that are more likely to convert. Better still, you can gain a better understanding of your customers in general, how they behave within your product domain and why they might do what they do.
Armed with this information, you can set a course for building competitive advantage. Not all of your competitors have the means to understand and serve their customers in this way. This makes working with behavioral data (capturing, managing, modeling and delivering it to key decision makers) a frontier on which there are winners and losers in the modern business world. The winners, Spotify, Airbnb, Netflix and the like, are steaming ahead. For the rest of us, it’s a battle just to get in the race.
Why driving value with behavioral data is such a challenge
Capturing and managing behavioral data to drive competitive advantage sounds great, but it’s far from easy. There are a number of complex challenges involved with driving value from behavioral data.
These range from the technical challenges:
- Building (or buying), managing and maintaining a robust data stack;
- Aggregating data sets from disparate sources into a cohesive data model that means something to the data consumers who use it;
- Handling large volumes of behavioral data.
To the people or operational challenges:
- Proficiency with data and analytical languages like SQL varies from team to team, so data literacy is not evenly distributed throughout the organization;
- Often data is broken up into ‘silos’ because individual teams prefer to work with their own packaged tools;
- Trust in the data can easily break down when data quality issues arise, which can negatively impact the whole data culture;
- Without a shared language around data, communication can break down between different stakeholders.
This is not an exhaustive list, but it hints at the challenges organizations face when building their behavioral data set. At the heart of the issue is the idea of data democratization: “How can I provide the right data, to the right people, where they need it?” While some companies such as Strava have made great strides towards a culture of democratized data, many organizations struggle to fulfil this ambition.
As the company size increases, the scale of the challenge becomes magnified. There is more data flooding in, more disparate teams to serve with their own individual needs and requirements, and the tangled knots of data supply lines only get messier.
The role of behavioral data management
Tooling, processes and frameworks around behavioral data management play an important role in addressing these challenges. With the arrival of methodologies like DataOps, which borrow the popular principles of DevOps, we have seen new thinking around how organizations can handle behavioral data more efficiently to empower key decision makers in, say, product and marketing teams.
We have seen innovations in this area, from different ways to structure data teams and their placement within organizations to new tools in the landscape, both left and right of the centralized data warehouse – the modern “brain” of the business. Better warehousing and modeling, more robust pipeline infrastructure and more intuitive BI tools have all played a part in easing the pain of distributing data to those who need it. To summarise a few areas, we have seen:
- Better data management tooling that helps data teams send behavioral data to the data warehouse, where it can be aggregated and filtered before being surfaced in BI tools like Looker, Tableau and Power BI, making it more accessible to data consumers;
- Infrastructure and processes that help organizations build assurance in the quality and robustness of their data;
- A move towards total ownership of behavioral data, enabling data teams to apply their own logic to their data, leading to more meaningful data for the data user;
- Shifts toward the centralization of the data team in modern organizational structure, which allows data teams to sit under every part of the business;
- Warehousing of data from multiple sources, which creates a single source of truth to drive richer insights and act as fuel for countless use cases.
These are all great steps forward. Even five years ago, we did not have access to the plethora of data tools available today. Nor were there thriving communities of data practitioners sharing and showcasing exciting ways they drive value with behavioral data in their industries.
The data warehouse has arguably been the most transformative influence in the recent data revolution. As a consequence, new tools and solutions have emerged to complement and capitalize on it.
But data warehouses have not solved one final step required on the data journey. We have fixed the “How do I get the right data, for the right people”, but not the “where they need it most” part of the equation. For that, we need to look at the next innovation in the data landscape, where data is not only captured and stored, but ‘operationalized’ for front-line business teams.
Operational analytics: Freeing up the data from the data warehouse
Operational analytics may seem like another opaque data phrase or buzzword, but it carries huge significance in this last aspect of delivering value from behavioral data. Making data ‘operational’ means putting it in front of the teams who need it most: the marketing, product and sales teams who drive business growth.
Or, as Boris Jabes, Census CEO and early pioneer of operational analytics, explains:
“At its core, operational analytics is about putting an organization’s data to work so everyone can make smart decisions about your business. You might’ve heard such a promise before from other technologies or platforms, but operational analytics is the only way to accomplish this at scale, because it introduces a set of fundamentals for leveraging data across your organization.” - Boris Jabes, Census CEO
If we consider the day-to-day schedule of a Product Manager, we can bring this problem to life. Let’s imagine a Product Manager, Rita, logs into a product analytics tool first thing in the morning to see the results of an A/B test the team has conducted.
In the A/B test, a small cohort of users is shown a banner in the mobile app inviting them to check out the latest promotional offers. The test is designed to determine whether users who are shown the banner are more likely to add an item to the basket than those who are not. Rita checks the numbers regularly to see how the A/B test is progressing and what it can teach her team.
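As a rough illustration of the comparison Rita is making, the sketch below applies a standard two-proportion z-test to add-to-basket rates for the two groups. It is one common way to read such a test, not necessarily what any particular product analytics tool does. The function name and all of the counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and variant (B).

    Returns (lift, z, p_value), where lift is rate_b - rate_a and
    p_value is two-sided under a normal approximation.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one user")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that the banner has no effect.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical counts: users who added an item to the basket / users in group.
lift, z, p_value = two_proportion_z_test(
    conv_a=480, n_a=4800,   # control group: no banner
    conv_b=150, n_b=1200,   # small cohort shown the banner
)
print(f"lift={lift:.3f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the difference in add-to-basket rates is unlikely to be down to chance alone. Checking the numbers continually, as Rita does, inflates the risk of a false positive, so in practice the sample size is usually fixed up front.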
Within the product analytics tool, Rita can see the users and gain information about how they behave within the app. She may be able to see what devices they’re using, what region they live in and the lifetime value of purchases they’ve made. These are all useful insights.
However, what if Rita could drill into far richer information about those users and their behavior? She knows, for instance, that the company captures a wealth of data about its users, and that it’s possible, if she signs into specific dashboards in another BI tool, to discover every search entry, button click and product selection those customers have made on an individual basis.
If these insights were available in Rita’s product analytics tool in real time, her team could orchestrate more powerful A/B tests, allowing them to answer questions like:
- What impact does a banner ad have on these types of users?
- What happens when people who search for these queries are shown these products?
If Rita could connect the dots, she could build a deeper understanding of her users to inform the company’s product roadmap. She would be able to use that information to truly enhance the user experience.
Closing the loop on the data lifecycle with Snowplow and Census
The above scenario gives us just one example of how operationalizing data can open doors to richer insights. Yet the potential for operational analytics goes even further – to equip sales, marketing, product and other teams with behavioral data in the platforms they use to drive growth. To truly operationalize data by syncing it into apps like Marketo and Salesforce with “last-mile” models, companies need to look to a new breed of data tooling: reverse ETL.
While tools like Snowplow are ideal for capturing and delivering behavioral data to the data warehouse, reverse ETL solutions like Census allow product and marketing teams to unleash that data in their platforms of choice. With this powerful combination, the organization retains total ownership and control over its valuable data set and can better ensure data freshness and quality throughout its pipeline, all without facing the challenges caused by relying on Customer Data Platforms (CDPs).
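To make the reverse ETL idea concrete, here is a minimal sketch of the pattern: query a “last-mile” model in the warehouse, then upsert the resulting rows into an operational tool. It does not use the Snowplow or Census APIs. An in-memory SQLite database stands in for the warehouse, a plain dictionary stands in for the CRM, and every table, column and function name is hypothetical.

```python
import sqlite3

# Stand-in for the data warehouse; table and column names are hypothetical.
warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
    CREATE TABLE events (user_email TEXT, event_name TEXT);
    INSERT INTO events VALUES
        ('ana@example.com', 'search'),
        ('ana@example.com', 'add_to_basket'),
        ('ben@example.com', 'search');
""")

# A "last-mile" model: one row per user, shaped for the destination tool.
MODEL_SQL = """
    SELECT user_email,
           COUNT(*) AS total_events,
           SUM(event_name = 'add_to_basket') AS basket_adds
    FROM events
    GROUP BY user_email
    ORDER BY user_email
"""


def sync_to_destination(conn, destination):
    """Upsert modeled rows into the destination; return how many records changed."""
    changed = 0
    for email, total_events, basket_adds in conn.execute(MODEL_SQL):
        record = {"total_events": total_events, "basket_adds": basket_adds}
        # Only write when the record is new or differs from what is stored.
        if destination.get(email) != record:
            destination[email] = record
            changed += 1
    return changed


crm = {}  # stand-in for a CRM or marketing platform keyed by email
first = sync_to_destination(warehouse, crm)   # initial load writes every user
second = sync_to_destination(warehouse, crm)  # nothing changed, so no writes
print(first, second, crm)
```

Because the model is defined in the warehouse, the same logic feeds every destination, which is what keeps the single source of truth intact. Writing only changed records is a design choice that keeps repeated syncs cheap.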
With data ready and available in CRMs, marketing platforms and product analytics tools, front-line teams can unlock behavioral data that would otherwise sit in the warehouse. There are a few key advantages to this:
- The data is where front-line teams need it most, where they can ‘action’ it to drive real user outcomes. For instance, a marketing team equipped with behavioral data can improve their email campaigns from their marketing platform.
- Data teams and analysts are no longer a bottleneck for data productivity because individual teams can self-serve.
- The single source of truth persists across the data journey – data silos are broken down as access to the data itself is democratized across teams.
There’s another key advantage to operational analytics, in that the data streams can be bi-directional. This means that, unlike with CDPs, behavioral data can flow both from the data warehouse into a marketing or sales tool via reverse ETL tools like Census, and back into the data warehouse again via ETL tools. In this way, there is a continuous loop of data flowing from the warehouse into operational platforms – as well as a feedback cycle where the centralized data asset can be enhanced by operational data (as seen below in a rough sketch of Boris’s).
The business intelligence of the whole company is therefore augmented by this two-way street (we’ll explore this more in our next chapter).