Snowplow and Census: Why we need to look beyond CDPs to deliver excellent experiences to customers
Chapter 2: Why we need to look beyond CDPs to deliver excellent experiences to customers
Building a single customer view is a key objective for many businesses today. Users demand an excellent, increasingly-personalized experience across digital platforms, or, at the very least, that companies tailor marketing and recommendations to them. This not only requires effective user identification across platforms and over time, but also requires companies to activate user data in downstream systems. In this chapter, we will review the tooling that companies have been using to tackle this important topic, and how we see the landscape changing.
The emergence of CDPs
Developing a better understanding of customers came into focus as early as the 1970s, when Customer Relationship Management tools (CRMs) first emerged and companies could build customer profiles based on transactions and other interactions. With the rise of big data tools and the ability to collect granular behavioural data, a new category of customer data tool emerged, the Customer Data Platform, or CDP. CDPs promise to make working with data easy for non-technical marketers, bringing all customer data together and stitching it into a single customer view.
More specifically, the CDP Institute defines a CDP as “a packaged software that creates a persistent, unified customer database that is accessible to other systems.” Customer-facing teams can leverage this software in various marketing tools, such as advertising platforms, email marketing solutions, and more. However, CDPs in this original sense generally don’t collect behavioral data from owned applications (such as websites and mobile apps) themselves, nor do they make the underlying event-level data available for querying in a data warehouse.
Today, there are more than 100 wildly different vendors claiming the title of CDP.
In a report from January 2020, Gartner stated that “Hype about customer data platforms (CDPs) as a panacea for customer-related problems is liable to confuse data and analytics leaders”.
In layman’s terms, the broad promises of the CDP market can make it seem as though anything is possible. Tools like Segment or Tealium, which originally started as tag managers, now allow companies to track behavioral data across all their platforms and channels and then send it to many third parties, as well as to the data warehouse.
Key limitations in building a single customer view using CDPs
While the adoption of CDPs enables marketing teams to personalize and optimize campaigns, there are some downsides to embracing the CDP for all your customer use cases:
- Companies relinquish ownership of and control over some of their most important data: how their users behave across all of their digital platforms and channels.
- As these tools collect this data purely as a means to an end, data quality is often an afterthought.
- Leveraging these tools can lead to data silos. Since CDPs primarily focus on the marketing use case, other teams within the business have to procure or build separate solutions for their needs.
- Companies are limited to sending raw data on to third parties, often without the ability to aggregate over data, add business logic or derive insights to forward to those third parties instead.
All of these issues will be discussed in more detail below.
1. No ownership and control over data
CDPs typically require sending all of your user behavioural data to their systems. This means you do not own your data, nor do you have control over where the data is processed and stored. This makes it increasingly difficult to comply with data privacy regulations such as GDPR and CCPA, as well as “right to be forgotten” requests that can arise due to these regulations. This also locks you into a specific vendor and makes you vulnerable to their pricing changes over time.
2. Low data quality
Because most CDPs are offered as public SaaS, they collect data about your customers and users as a third party. As a result, they’re often blocked by recently introduced tracking restrictions, including Apple’s ITP and other browser tracking prevention methods, as well as the privacy controls introduced in iOS 14. All of these make it difficult for companies to reliably identify all their users. Furthermore, ad blockers can easily pick up CDP tracker names, creating further gaps in the data.
To enable easy forwarding of the collected data to third parties, data must follow a predefined, inflexible format, with limited ability to add more information to the events. Moreover, preprocessing of the data often happens in a black box. For example, it’s difficult to know how potential bot activity is treated, how different user identifiers are stitched together, or how much data is being excluded due to tracking issues.
In essence, companies often trade data richness and data quality for the ability to easily forward data to third parties. This tradeoff makes the use of data by other teams in the organization time consuming or outright impossible.
3. Data is siloed
The lack of flexibility around data collection methods, processing, and storage generally means that other teams across the organization end up procuring their own solutions for their use cases.
For example, the product team may use a dedicated product analytics tool, such as Pendo, Amplitude or Mixpanel, while marketing may use a packaged analytics tool such as Google Analytics 360 for their reporting and attribution combined with a CDP to manage marketing automation. Adding onto an already crowded stack, data engineering and data science teams often end up building their own solution to collect exactly the data they need for use cases such as personalization or recommendations.
Not only is this a significant investment for the business, different teams will also consume different data sets, which can lead to inconsistent decisions and distrust in data. The lack of a centralized data asset and holistic approach to understanding user behaviour makes it difficult to align on common strategies and build a truly data-informed culture in an organization.
4. Events, not insights
CDPs such as Segment focus on forwarding raw events to third parties. While this method gets raw data out to the tools teams use, there are three main downsides to this data flow:
- You can’t add meaningful business context or aggregate data before sending it on. For example, as an ecommerce site, you might send purchase events to Facebook to target new users. With a CDP, you wouldn’t be able to reconcile purchases with returns ahead of time, wasting your ad budgets on customers already primed to return.
- You can’t calculate the correlation between different events or recommended next actions in your own systems. For example, Facebook might determine that users from a certain location are more likely to convert, but wouldn’t necessarily share that information back with you.
- You’re leaking valuable information about your users to third parties that could, in some ways, compete with you. For example, a recruitment platform may share information about how users look for jobs with Facebook and Google, which also operate as job platforms.
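To make the first downside above more concrete, here is a minimal Python sketch of the kind of business logic you might want to apply before forwarding purchase events to an ad platform: dropping purchases whose order was later returned. The `Event` type, its fields, and the `net_purchases` helper are hypothetical names chosen for illustration, not part of any CDP or vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event record (illustrative fields only)."""
    user_id: str
    order_id: str
    kind: str      # "purchase" or "return"
    amount: float


def net_purchases(events):
    """Keep only purchase events whose order was not returned."""
    # Collect every order that has at least one return event.
    returned = {e.order_id for e in events if e.kind == "return"}
    # Forward a purchase only if its order never shows up as returned.
    return [
        e for e in events
        if e.kind == "purchase" and e.order_id not in returned
    ]


events = [
    Event("u1", "o1", "purchase", 40.0),
    Event("u2", "o2", "purchase", 25.0),
    Event("u1", "o1", "return", 40.0),
]

kept = net_purchases(events)
print([e.order_id for e in kept])  # prints ['o2']
```

When raw events are forwarded as they happen, this reconciliation step has nowhere to run, so the purchase for order `o1` would reach the ad platform even though it was returned.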
Thus, companies are looking for new approaches to understanding their customer behaviour, and then leveraging that information to drive relevant marketing activities and meaningful user experiences across their platforms and channels.
The emergence of the modern data stack
Given the challenges discussed in the previous section, companies have started to move away from this approach toward a more modular setup, commonly referred to as the ‘Modern Data Stack’. Behavioral data platforms like Snowplow collect user data centrally and load it into the data warehouse, where it can easily be joined with other data sets such as transactional data, CRM data, and more. All of this data can then be modeled for different use cases across the business using a tool like dbt. This prepares the data for consumption in BI tools like Looker or Tableau, and positions companies well to send this high-quality customer data back out to the front line tools they use most, using a reverse ETL tool like Census.
For example, an e-commerce company can build a segment of high-value customers in their data warehouse based on all the data they’ve collected throughout the sales journey, including browsing behaviour, purchases and returns. Census can then easily forward this segment to all the relevant marketing tools, treating the data warehouse as the single source of truth for all connected apps across the company. Simultaneously, other teams across the organisation can leverage this single source of truth too. For example, the product team can use the same data to build a product recommendation engine, or the buying team can leverage that data in dashboards that analyse trends and user preferences to inform their buying strategy.
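As a rough illustration of the segment described above, the following Python sketch uses an in-memory SQLite database as a stand-in for the data warehouse. The table names, columns, sample rows, and the 500 threshold are all assumptions made for the example. In practice this logic would more likely live in a dbt model, with a reverse ETL tool syncing the resulting table; no Snowplow, dbt, or Census API is shown here.

```python
import sqlite3

# In-memory SQLite stands in for the data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (customer_id TEXT, amount REAL);
    CREATE TABLE returns   (customer_id TEXT, amount REAL);
    INSERT INTO purchases VALUES
        ('c1', 300), ('c1', 250), ('c2', 600), ('c3', 80);
    INSERT INTO returns VALUES ('c2', 550);
""")


def high_value_customers(conn, min_net_spend):
    """Return (customer_id, net_spend) rows at or above the threshold."""
    query = """
        WITH spend AS (
            SELECT customer_id, SUM(amount) AS total
            FROM purchases GROUP BY customer_id
        ),
        refunds AS (
            SELECT customer_id, SUM(amount) AS total
            FROM returns GROUP BY customer_id
        )
        SELECT s.customer_id,
               s.total - COALESCE(r.total, 0) AS net_spend
        FROM spend s
        LEFT JOIN refunds r ON r.customer_id = s.customer_id
        WHERE s.total - COALESCE(r.total, 0) >= ?
        ORDER BY s.customer_id
    """
    return conn.execute(query, (min_net_spend,)).fetchall()


segment = high_value_customers(conn, 500)
print(segment)  # prints [('c1', 550.0)]
```

Customer `c2` has the largest gross spend but falls out of the segment once returns are netted off, which is exactly the kind of context that is lost when only raw purchase events are forwarded.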
For more information on using a CDP versus collecting data through a behavioural data platform, as well as using a reverse ETL tool to activate it in downstream systems, check out this excellent post by Census on the CDP that is already sitting in your data warehouse.
Coming next
In this chapter we have reviewed CDPs and the role they have played in helping companies build a single customer view. We also discussed their limitations, and what new approaches we see to help companies deliver excellent experiences to their customers and users. In the next chapter, we will dive deeper into how you can leverage Snowplow and Census to deliver behavioral data for actionable outcomes.