
Process More, Spend Less: A Year of Breakthrough Snowplow Pipeline Improvements

By Daniela Howard
September 29, 2025

Over the past year, we've continued our focus on advancing our core pipeline and broader customer data infrastructure (CDI) capabilities. The result? A series of meaningful improvements that enhance how organizations process, store, and derive value from their event data.


Recent 2025 Releases

Measured Performance 

The numbers speak for themselves. Enrich now processes up to 50% more events per CPU compared to previous versions. This isn't just an incremental improvement. It allows you to process significantly more events with your existing infrastructure or downsize your setup while maintaining the same throughput.
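To make that trade-off concrete, here is a minimal sketch of the arithmetic. It assumes throughput scales linearly with CPU count and takes the "up to 50%" figure at its best case. The function names are illustrative only and are not part of any Snowplow tooling.

```python
def throughput_multiplier(per_cpu_gain: float) -> float:
    """Events processed on unchanged hardware, relative to the old version."""
    return 1.0 + per_cpu_gain


def cpu_fraction_needed(per_cpu_gain: float) -> float:
    """Fraction of the original CPU count needed to keep throughput constant.

    Assumes throughput scales linearly with the number of CPUs.
    """
    return 1.0 / (1.0 + per_cpu_gain)


gain = 0.50  # best case of the "up to 50%" claim
print(f"Same hardware: {throughput_multiplier(gain):.2f}x the events")
print(f"Same throughput: {cpu_fraction_needed(gain):.0%} of the CPUs")
```

Under these assumptions, a 50% per-CPU gain means either 1.5 times the events on the same hardware or roughly two thirds of the CPUs for the same load. Real savings will depend on your workload and enrichment configuration.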

Meanwhile, the latest Collector responds to requests up to 3× faster on average. These performance gains come from architectural optimizations made by our engineering team, helping your data pipeline handle growing event volumes without degradation.

Event Filtering

With improvements to Enrich, we introduced the ability to filter out unwanted events directly in the pipeline. Bot traffic, security scans, and other irrelevant events can now be eliminated before they consume compute and storage resources in your cloud infrastructure and data warehouse.

This isn't just about data quality; it's also about cost optimization. By filtering out noise at the source, you reduce infrastructure costs while ensuring your downstream analytics focus on the events that matter to your business.
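The idea of dropping noise before it reaches storage can be sketched in a few lines. This is a conceptual illustration only, not Snowplow's configuration syntax or API: the `should_drop` and `filter_events` helpers, the `useragent` field name, and the pattern list are all assumptions made for the example.

```python
import re

# Hypothetical patterns; a real deployment would tune these to its own traffic.
BOT_UA_PATTERN = re.compile(r"bot|crawler|spider|scanner", re.IGNORECASE)


def should_drop(event: dict) -> bool:
    """Return True when the event's user agent looks like a bot or scanner."""
    user_agent = event.get("useragent") or ""
    return bool(BOT_UA_PATTERN.search(user_agent))


def filter_events(events: list) -> list:
    """Keep only the events that pass the filter, before they reach storage."""
    return [event for event in events if not should_drop(event)]


events = [
    {"event_id": "1", "useragent": "Mozilla/5.0 (Macintosh) Safari/605.1.15"},
    {"event_id": "2", "useragent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
    {"event_id": "3", "useragent": "ExampleSecurityScanner/1.0"},
]

kept = filter_events(events)
print([event["event_id"] for event in kept])
```

Every event dropped at this stage is one that never incurs streaming, loading, or warehouse storage costs downstream.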

Fast Data Loading

For organizations using Databricks, we're putting the finishing touches on our Databricks Streaming Loader, which will deliver Snowplow data to your lakehouse in one minute or less. It takes advantage of recent Databricks features such as Lakeflow Declarative Pipelines and Streaming Live Tables.

Reducing Infrastructure Costs

Later this month, we will roll out new event compression capabilities for Collector and Enrich that compress the data flowing between these two pipeline components. For high-volume customers processing billions of events monthly, data streaming can represent up to 45% of total cloud infrastructure costs. Our compression feature is designed to reduce overall costs by up to 15%, depending on event volumes.
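As a rough sanity check on how these two percentages relate, the sketch below works through the arithmetic. It assumes the saving comes entirely from the streaming line item and uses the best-case figures quoted above. The function names and the monthly bill are hypothetical.

```python
def implied_streaming_reduction(streaming_share: float, overall_saving: float) -> float:
    """Reduction in streaming spend needed to produce a given overall saving.

    Assumes the whole saving comes from the streaming line item.
    """
    if not 0 < streaming_share <= 1:
        raise ValueError("streaming_share must be in (0, 1]")
    return overall_saving / streaming_share


def monthly_saving(total_cost: float, overall_saving: float) -> float:
    """Absolute saving on a total monthly cloud bill."""
    return total_cost * overall_saving


streaming_share = 0.45  # streaming as a share of total cost (best case quoted)
overall_saving = 0.15   # overall cost reduction (best case quoted)
hypothetical_bill = 100_000.0  # illustrative monthly bill, not a real figure

reduction = implied_streaming_reduction(streaming_share, overall_saving)
print(f"Implied streaming cost reduction: {reduction:.0%}")
print(f"Saving on the hypothetical bill: {monthly_saving(hypothetical_bill, overall_saving):,.0f}")
```

In other words, a 15% overall saving when streaming is 45% of the bill corresponds to cutting streaming spend by about a third.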

Security at Scale

Security isn't an afterthought at Snowplow. We proactively monitor and resolve Common Vulnerabilities and Exposures (CVEs), including several classified as "high" or "critical" severity.

Collector eliminates vulnerabilities including CVE-2024-2961 [8.8 - high], CVE-2024-33599 [7.6 - high], and CVE-2023-44487 [7.5 - high], among others. Enrich resolves critical issues like CVE-2022-1471 [9.8 - critical] and CVE-2024-1597 [9.8 - critical].

We're also proactively deprecating third-party components that are no longer maintained, ensuring your pipeline stays secure as the threat landscape evolves.

The Upcoming Snowflake Authentication Deadline 

If you're using Snowflake as your cloud data warehouse, there's an important deadline approaching. Snowflake is deprecating password authentication for service users in summer 2026 and will require key pair authentication for all data loading operations.

Legacy Snowplow components that rely on password authentication will require special exceptions from Snowflake to continue working. Our Snowflake Streaming Loader supports the required key pair authentication, aligning with both the compliance requirement and security best practices. It also delivers the latency and cost improvements described in the 2024 releases below, making it a substantial upgrade to your data pipeline.


Looking Ahead

Our engineering team continues to ship new products and features to support performance optimizations and security enhancements. These releases are designed to help organizations extract maximum value from their behavioral data, all with underlying enterprise security support.

The data landscape continues to change, and the infrastructure powering your analytics and personalization use cases needs to evolve with it. Whether you're building real-time personalization systems, powering AI-driven applications, or scaling customer analytics, having a pipeline that can keep pace with business goals is essential.


A Look Back: Our 2024 Releases

Fast Data Loading

Our Snowflake Streaming Loader represents a complete reimagining of how Snowplow data reaches your warehouse. Compared to traditional batch loading approaches, this streaming solution delivers data in real time, with 100× lower latency and 80% lower operational costs. The architecture is simple yet effective, replacing complex multi-stage loading processes.

We also released a new BigQuery Loader, designed to reduce costs and store data more efficiently using fewer columns.

Reducing Infrastructure Costs

The Lake Loader, now available across AWS, GCP, and Azure, provides cost-effective data storage using open table formats like Iceberg and Delta, giving you maximum flexibility in how you architect your data infrastructure.

Enhanced Reliability

The enhanced RDB Loader (for Redshift and Databricks users) brings improved durability, ensuring your data pipeline remains stable even when dealing with unexpected data variations.

Snowbridge (the low-latency tool supporting our Event Forwarding) now includes OAuth2 support for enhanced security and smoother integrations with external systems.

The upcoming Enrich release will include improved failed events handling, making it easier to explore and recover from processing issues, ensuring no valuable data is lost.


Want to Learn More?

If you’re using an older version of Snowplow open source software, get in touch with our team to find out how these advancements can improve your data infrastructure, increase performance, and move your business forward. Take advantage of the competitive edge that real-time, reliable behavioral data can offer.

If you’re new to Snowplow, contact us to schedule a demo and learn about the full set of capabilities and benefits you gain with a platform designed to keep pace with AI.

