Introducing Advanced Event Filtering: Streamline Your Snowplow Data Pipelines

By Nick Stanchenko
May 7, 2025

Data engineers and developers, we've heard your feedback. Processing irrelevant events wastes your resources and inflates your costs. Today, Snowplow announces a solution that addresses this problem at its source: pipeline-level event filtering during enrichment.

This new capability lets you define precise JavaScript conditions to identify and remove unwanted events—whether from bots, deprecated apps, or test environments—before they progress through the rest of your pipeline. By filtering at an early stage, teams eliminate unnecessary processing costs, storage fees, and downstream filtering complexity.

The Challenge: Not All Data Deserves Equal Treatment

In the world of data collection, volume doesn't always equal value. Even perfectly structured and enriched events can be irrelevant to your business goals:

  • Bot traffic ranging from search engine crawlers to malicious scanners generates events that distort your customer behavior metrics
  • Deprecated applications continue sending analytics data long after they've been phased out of your ecosystem
  • Test environments produce events structurally identical to production but irrelevant for analysis

These events don't just muddy your analytics—they cost real money through:

  • Streaming and compute resources needed for processing
  • Storage fees for data you'll never use
  • Additional workloads to filter them downstream

Previous Limitations

Until now, options for handling irrelevant events have been suboptimal:

  • Bot protection products are highly effective against known patterns, but they add cost, can hurt the user experience (for example, through CAPTCHAs), and address only bot traffic
  • Forcing unwanted events to fail validation removes them from analytics, but it pollutes data quality metrics and can mask legitimate issues
  • Downstream filtering creates clean data layers, but the processing and storage costs remain and the pipeline becomes more complex

Introducing Pipeline-Level Event Filtering

Our new feature enables you to filter out unwanted events directly during the enrichment phase of your Snowplow pipeline. This approach offers several key advantages:

1. Cost Efficiency

Events filtered at this stage:

  • Don't consume compute resources beyond the enrichment step
  • Don't incur storage costs
  • Aren't counted toward your Snowplow usage metrics

2. Developer-Friendly Configuration

Filter conditions are defined using JavaScript, giving you the flexibility to:

  • Target specific event fields
  • Leverage existing enrichment data (including bot detection signals)
  • Implement complex conditional logic for precision filtering
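
To make this concrete, here is a minimal sketch of what such a condition might look like. It is illustrative only: the event is modeled as a plain object, `bot_detected` is a hypothetical flag standing in for a bot detection signal, and the app IDs and environment suffixes are made-up values. How a condition is actually registered with the pipeline, and how event fields are accessed there, is covered in the documentation and may differ from this sketch.

```javascript
// Illustrative sketch only. Events are plain objects here; a real pipeline
// exposes event fields through its own API, so check the documentation.

// Assumed values: app IDs of applications you have retired.
const DEPRECATED_APP_IDS = new Set(["legacy-ios", "legacy-web"]);

// Assumed naming convention: non-production app IDs end with one of these.
const TEST_APP_ID_SUFFIXES = ["-dev", "-staging", "-test"];

// Targets a specific event field (app_id) to spot test environments.
function isTestTraffic(event) {
  const appId = event.app_id || "";
  return TEST_APP_ID_SUFFIXES.some((suffix) => appId.endsWith(suffix));
}

// Targets the same field to spot deprecated applications.
function isDeprecatedApp(event) {
  return DEPRECATED_APP_IDS.has(event.app_id);
}

// Combines an enrichment signal with a crude user agent check.
// `bot_detected` is a hypothetical field name used for illustration.
function isLikelyBot(event) {
  if (event.bot_detected === true) {
    return true;
  }
  const userAgent = (event.useragent || "").toLowerCase();
  return /bot|crawler|spider|headless/.test(userAgent);
}

// The filter condition itself: true means the event should be removed.
function shouldDrop(event) {
  return isTestTraffic(event) || isDeprecatedApp(event) || isLikelyBot(event);
}

// Usage: apply the condition to a small batch of sample events.
const sampleEvents = [
  { app_id: "shop-web", useragent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64)" },
  { app_id: "shop-web-staging", useragent: "Mozilla/5.0" },
  { app_id: "legacy-ios", useragent: "ShopApp/1.2" },
  { app_id: "shop-web", useragent: "Googlebot/2.1 (+http://www.google.com/bot.html)" },
  { app_id: "shop-web", useragent: "Mozilla/5.0", bot_detected: true },
];

const keptEvents = sampleEvents.filter((event) => !shouldDrop(event));
console.log(keptEvents.map((event) => event.app_id));
```

Splitting the condition into small named predicates keeps each rule easy to test against sample events before it goes anywhere near production. The user agent pattern is deliberately simple and will produce false positives, so treat it as a starting point rather than a reliable bot detector.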

Getting Started

This feature is now available to all Snowplow BDP customers. To set it up, refer to the documentation or contact us at support@snowplow.io.
