Building a web analytics stack: packaged vs modular
All-in-one analytics solutions are wildly popular. And rightly so. To take Google Analytics as an example, Google’s decision to purchase Urchin in 2005 enabled them to enter the market early and bring web analytics to the masses.
According to a traffic usage survey, in 2008 Google Analytics was used by 55.1% of all websites, amounting to a tool market share of 84.3%. Despite many new players entering the industry, GA has held onto its dominance, with an eye-watering 75% of the market today. It’s fair to say that GA continues to be the go-to tool for web analytics, and for many organizations it is a hugely powerful solution that helps them get started quickly.
But despite the popularity of Google Analytics and other packaged tools, organizations run into a number of challenges when they rely only on packaged tools for their web analytics. From browser privacy challenges and data silos to a lack of control, it’s worth exploring what these challenges mean on a practical level for your business, and why a move to a more modular stack could be a better approach in the long term.
That being said, packaged tools are popular for a reason. It wouldn’t be fair – or accurate – to say that all companies should ignore packaged analytics solutions, and for many teams starting out on their data journey, packaged tools offer distinct advantages.
Packaged analytics tools are ideal for getting started
Early in the data maturity journey, it’s often not wise or necessary to build out a complex technology stack. This is where packaged analytics can shine.
- They are quick to set up and get going. A huge advantage of packaged tools is that they deliver value quickly. They can give you a quick understanding of how users are interacting with your websites and platforms, while you can always build out a wider set of use cases later.
- They offer an all-in-one solution. Packaged analytics tools are exactly that – a package, which means data collection, modeling and visualization are all included. This eliminates the need to hunt down and purchase multiple solutions, which is particularly advantageous at an early stage when resources are limited.
- They’re easy to use. While this may not be a major benefit to data teams or engineers, marketing teams and other internal data consumers can easily self-serve data from packaged analytics tools like Google Analytics, without being SQL proficient.
However, for all their advantages and simplicity, packaged solutions have their drawbacks.
Don’t stick to the packaged tools that you’re used to
It can be tempting to stay with the analytics tools you’ve grown used to. The risk here is that you’re not fulfilling the potential of one of your greatest business assets: your behavioral data. Packaged analytics solutions are limited in the following ways:
- They are one-size-fits-all. Packaged analytics tools are designed to be off-the-shelf solutions. They are not customized to your particular needs or business logic. This can have serious implications for use cases such as marketing attribution, when a tool decides what counts as a ‘conversion’ or ‘acquisition’ on your behalf. Often the assumptions a packaged tool makes on behalf of its customers are unhelpful or inaccurate.
This can be especially problematic for organizations that do not fit the mould of the typical e-commerce transaction, such as jobs boards or marketplaces with multiple users.
“With Snowplow, we discovered that we own the data, which isn’t formatted in a way that forces you to a specific use case — it’s free and open so you can do what you want with it. We collect the data, use it to build a BI dashboard and connect it to the product to help our contributors” – Timothy Carbone, Data Engineer, Unsplash
- They are black boxes. Let’s imagine you’ve set up your packaged analytics tool, and you’re beginning to explore data about your web visitors. For some of your web pages, your bounce rate looks pretty high. Why is that?
At this point, you have no control over how ‘bounce rate’, ‘time spent’ or other important web metrics are recorded. You don’t even know how your data is captured and processed. Where is it hosted? What logic goes into defining certain events?
Since packaged tools are closed off, you cannot look under the hood and discover (let alone change) the way your web data is being manipulated. It’s also often difficult (or impossible without paying large fees) to obtain and work with the raw data, before it becomes opinionated and modeled. For organizations beginning to recognize data as one of their most important assets, this is a red flag. It means you’re handing over control and ownership of your valuable behavioral data to a third party.
“Other solutions were like black boxes, and that is not the direction we wanted to take. We wanted a solution to become a core part of our business.” – Kevin James Parks, Data Engineer, Tourlane
- They are siloed. Many companies are realizing the strategic benefits to building a single customer view. That is to say, unifying data sets from all your platforms and channels to construct a cohesive understanding of your users.
But with packaged analytics tools, it’s extremely difficult to unify data in this way, because your data is siloed off and structured completely differently to data captured from, say, social media, CRM, and other channels. Without the ability to structure the data the way you’d like, or access to the raw data, your data is stuck in your packaged analytics tools where its value is limited to only a few use cases, perhaps just reporting and analysis. Which brings us to our next drawback.
- They are limiting. The way companies work with data is constantly evolving. We’ve seen companies like Spotify use behavioral data to give their listeners unique experiences such as their weekly recommended playlists. There are now a number of game-changing use cases that can be achieved with behavioral data, from personalized content to product analytics and customer journey mapping, and the list is growing.
Organizations relying solely on packaged analytics tools to capture and process their data run the risk of missing out on these opportunities, because their raw data is ‘stuck’ in the solution and it’s often either impossible or expensive to get it out. And as the competition to attract and retain customers in a post-digital world escalates, missing the potential to leverage game-changing data use cases could be fatal.
“We want to be able to control and own all of our data. Snowplow is open source, which means that we can have confidence in it; we can look at the code and figure out what’s going on or change things.” – Rahul Jain, Principal Engineering Manager, Business Intelligence Platform, Omio
- They make it difficult to build assurance in data quality. Organizations cannot build assurance that their data is accurate and complete without taking ownership of their data infrastructure.
This is often overlooked, but by relinquishing control of how their data is captured, processed and modeled, companies also lose control of the integrity of that data. Can our data be actioned effectively by key data consumers? Is it structured in a way that analysts can work with? Is the data complete, or are we missing data to ad-blockers and third-party cookies? These are questions that should be asked at the early stages of data capture, in order to ensure that the whole organization can get maximum value from their data – their most important asset.
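To make the ‘black box’ point above concrete, here is a minimal sketch of how the same raw events can produce very different bounce rates depending on the definition you choose. Everything in it is an assumption for illustration: the event tuples, the event names (`page_view`, `video_play`, `page_ping`) and the 30-second threshold are hypothetical and do not describe how any particular packaged tool calculates the metric.

```python
from collections import defaultdict

# Hypothetical raw events: (session_id, event_type, seconds_since_session_start)
EVENTS = [
    ("s1", "page_view", 0),
    ("s1", "page_view", 40),
    ("s2", "page_view", 0),
    ("s3", "page_view", 0),
    ("s3", "video_play", 25),
    ("s4", "page_view", 0),
    ("s4", "page_ping", 95),  # heartbeat showing the tab was still active
]

# Assumed list of events that count as a deliberate user interaction.
INTERACTIONS = {"video_play", "form_submit", "add_to_cart"}


def group_by_session(events):
    """Collect (event_type, offset) pairs per session id."""
    sessions = defaultdict(list)
    for session_id, event_type, offset in events:
        sessions[session_id].append((event_type, offset))
    return sessions


def count_page_views(session_events):
    return sum(1 for event_type, _ in session_events if event_type == "page_view")


def bounce_rate_single_page_view(events):
    """Definition A: a bounce is any session with exactly one page view."""
    sessions = group_by_session(events)
    bounces = sum(1 for evs in sessions.values() if count_page_views(evs) == 1)
    return bounces / len(sessions)


def bounce_rate_engagement(events, min_seconds=30):
    """Definition B: a bounce is a session with one page view, no interaction
    event, and no recorded activity at or after min_seconds."""
    sessions = group_by_session(events)
    bounces = 0
    for evs in sessions.values():
        interacted = any(event_type in INTERACTIONS for event_type, _ in evs)
        last_activity = max(offset for _, offset in evs)
        if count_page_views(evs) == 1 and not interacted and last_activity < min_seconds:
            bounces += 1
    return bounces / len(sessions)


print(bounce_rate_single_page_view(EVENTS))  # 0.75
print(bounce_rate_engagement(EVENTS))        # 0.25
```

With access to the raw events, the choice between these definitions (or a third one that fits your business) is yours to make and to document, rather than one made on your behalf.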
I want to break free – breaking out into a modular, best-in-class data stack
It’s not easy, but building a data stack to power your web analytics (and beyond) is worth the effort in the long run.
To get there, you will need to consider how to shape your end-to-end data infrastructure, from data capture, to modeling and transformation, to warehousing/storage, visualization and more. It will require investigating a number of different options, and weighing up whether to build, buy or run open source versions of the best-in-class solutions.
Your data team will likely lead the charge towards building a future-proof data stack. But that doesn’t mean they should build all their own solutions. There is a growing market of cutting-edge technologies for web analytics (and wider use cases) for you to explore.
We’ll cover more on the best tools for your web analytics in our next post, but for now, here are some key categories to consider when putting together your stack:
- Data capture and management
Capturing and managing behavioral data from web channels should be one of your first concerns when it comes to building the stack. Explore platforms that offer you complete control over your data and flexibility to decide its structure.
- Data Visualization
To provide your internal data consumers with the best insights, you’ll need a solution for visualizing and exploring the data. Look out for tools that make it possible for teams to self-serve data, without creating bottlenecks.
- Data Monitoring
Measuring and improving your data quality is a huge factor in getting the most from your web data. These tools will help you build assurance in your web data, so your internal teams can be confident their data is reliable and trustworthy.
- Tag Management
Tag management systems (TMSs) are at the heart of your web analytics and marketing. They are especially important when it comes to setting cookies and capturing key information about your users and visitors (while respecting their privacy). Consider a TMS that allows for server-side tagging and is compatible with your other technologies.
- Testing/Debugging
Testing your web analytics stack for tracking failures is not the most exciting aspect of your stack, but it’s one of the most important. We recommend making tracking checks part of your automated testing suites, so you can ensure your new builds don’t ship without properly functioning trackers ready to go.
- Data Transformation
Transforming, reformatting and modeling your data are all essential to ensuring your internal teams can action the data set that is most relevant to them. A good data transformation tool will enable you to turn raw data into actionable data sets that are understood and trusted by cross-functional teams.
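As an illustration of the testing/debugging category above, the sketch below shows one way a tracking check could sit inside an automated test suite: it takes the events captured during a test run and reports malformed payloads or expected events that never fired. The field names, event names and the `find_tracking_failures` helper are all hypothetical; a real suite would check against whatever schema your own trackers emit.

```python
# Assumed minimal schema: field name -> expected Python type.
REQUIRED_FIELDS = {
    "event_name": str,
    "page_url": str,
    "user_id": str,
    "timestamp_ms": int,
}


def validate_event(event):
    """Return a list of problems for one event; an empty list means it is valid."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    return problems


def find_tracking_failures(captured_events, expected_event_names):
    """Report invalid payloads and expected events that were never validly captured."""
    failures = {}
    seen = set()
    for index, event in enumerate(captured_events):
        problems = validate_event(event)
        if problems:
            failures[f"event[{index}]"] = problems
        else:
            seen.add(event["event_name"])
    missing = sorted(set(expected_event_names) - seen)
    if missing:
        failures["missing_events"] = missing
    return failures


# Hypothetical events captured while an end-to-end test walked through the site.
captured = [
    {"event_name": "page_view", "page_url": "/home", "user_id": "u1", "timestamp_ms": 1000},
    {"event_name": "add_to_cart", "page_url": "/product/42", "timestamp_ms": 2000},
]

failures = find_tracking_failures(captured, ["page_view", "add_to_cart"])
print(failures)
```

Here the second event lacks a `user_id`, so it is reported as invalid and `add_to_cart` is also listed as missing. A build pipeline could fail whenever the returned dictionary is not empty.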
Not all about the stack
Technology is important, but ultimately it’s your people and processes that make the difference. Having the best tools available will not, on its own, help you achieve your goals with web analytics. In fact, it’s sometimes best to start simply, build out a data team that fits your key use cases, and evolve your tech stack over time.
“A data stack will not move you along the data maturity curve if the team and processes in place aren’t already appropriate.” – Archit Goyal, Solutions Architecture Lead at Snowplow
If you’re unsure where to start, our internal experts can help you identify your immediate needs and scope out how you can realize your ambitions with behavioral data. It’s worth remembering that your organization’s experience with data is a journey – there is nothing wrong with starting small, and building as you grow.