Introducing Snowplow Identities: Real-Time Identity Resolution Built into Your Data Pipeline
Someone browses your site anonymously on mobile. They come back on desktop a few days later. They log in through your app. They convert through an email campaign. Four sessions, three platforms, one person.
But your stack doesn’t know that. It sees four separate users. Your attribution is wrong, your analytics are fragmented, and your marketing team is targeting someone who already converted.
Today we’re announcing Snowplow Identities: real-time, deterministic identity resolution built directly into your Snowplow pipeline. Every event, whether from web, iOS, or Android, gets a persistent snowplowId attached before it reaches your destination. Your events arrive already stitched to a single customer profile, all without batch processing or third-party data sharing.

How it works
Most identity resolution happens after the fact. Data lands in your warehouse, a batch process runs, and profiles get stitched together hours or days after the events happened. That works for retrospective reporting. It doesn’t work if you need to know who a user is right now.
Snowplow Identities works on the event stream itself. A graph-based model links and merges profiles as new identifiers appear. When someone authenticates, navigates across domains, or switches devices, the identity graph updates in real time. So your profiles evolve with your users rather than going stale between batch runs.
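To make the idea of linking and merging concrete, here is a minimal sketch of a graph-style merge using a union-find structure. It is an illustration only, not how Snowplow Identities is implemented: the `IdentityGraph` class, its method names, and the `(type, value)` identifier tuples are all assumptions made for this example.

```python
class IdentityGraph:
    """Toy identity graph: identifiers are nodes, and identifiers that
    appear together on one event are linked into the same profile.
    Hypothetical sketch, not the Snowplow implementation."""

    def __init__(self):
        self._parent = {}

    def _find(self, node):
        # Register unseen identifiers as the root of their own profile.
        self._parent.setdefault(node, node)
        # Walk up to the root, shortening the path as we go.
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def observe(self, identifiers):
        """Link every identifier seen on one event; return the profile root."""
        if not identifiers:
            raise ValueError("an event needs at least one identifier")
        roots = [self._find(i) for i in identifiers]
        survivor = min(roots)  # deterministic choice of the surviving root
        for root in roots:
            self._parent[root] = survivor
        return survivor

    def profile_of(self, identifier):
        return self._find(identifier)


graph = IdentityGraph()
graph.observe([("cookie", "mobile-1")])                        # anonymous, mobile
graph.observe([("cookie", "desktop-7")])                       # anonymous, desktop
graph.observe([("cookie", "desktop-7"), ("user_id", "u-42")])  # login on desktop
graph.observe([("cookie", "mobile-1"), ("user_id", "u-42")])   # login on mobile

# Both cookies now resolve to the same profile.
same_profile = graph.profile_of(("cookie", "mobile-1")) == graph.profile_of(("cookie", "desktop-7"))
```

In this toy version the profile root can change when two profiles merge. A production system would presumably map each merged profile to a durable identifier instead, which is the role the snowplowId plays.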
That’s a different architecture from the one most identity tools use. Warehouse-native CDPs and standalone vendors resolve identities against data that’s already been collected and stored. Snowplow resolves at the point of collection, in the pipeline, before data hits the warehouse. If latency matters for your use case, that changes what’s possible.
What you get
Graph-based identity model. Links identifiers, sessions, and events in real time. Handles shared devices, cross-domain navigation, and users switching between personal and work emails: the edge cases that flat profile tables break on.
Persistent snowplowId. A durable identifier attached as an entity to every event in your pipeline. One consistent join key across sessions, devices, and platforms.
dbt data models. Out-of-the-box models that produce one-row-per-user tables and identifier mapping tables in your warehouse. No custom modeling required to start consuming identity-resolved data.
Customizable resolution logic. Support for standard and custom identifiers, with configurable rules for assigning unique IDs and setting identifier priorities. You control how resolution works for your context.
Data residency and compliance. Hosted in your Snowplow cloud account, not a third party. Encrypted in transit and at rest, with native GDPR deletion support. Your customer data stays in the infrastructure you control.
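As a rough picture of what "one consistent join key" and "one-row-per-user tables" mean in practice, the sketch below rolls a handful of events up by a resolved ID. It is written in Python for illustration and does not reproduce the logic of the shipped dbt models. The field names (`snowplow_id`, `platform`, `ts`) and the sample events are assumptions for this example, not the actual event schema.

```python
# Hypothetical, simplified events. Field names are assumptions for this sketch.
events = [
    {"snowplow_id": "sp-1", "platform": "web", "ts": "2024-05-01T09:00:00"},
    {"snowplow_id": "sp-1", "platform": "ios", "ts": "2024-05-05T08:00:00"},
    {"snowplow_id": "sp-2", "platform": "android", "ts": "2024-05-02T14:00:00"},
    {"snowplow_id": "sp-1", "platform": "web", "ts": "2024-05-06T12:00:00"},
]


def one_row_per_user(events):
    """Aggregate events into a single summary row per resolved ID."""
    users = {}
    for event in events:
        row = users.setdefault(
            event["snowplow_id"],
            {
                "snowplow_id": event["snowplow_id"],
                "first_seen": event["ts"],
                "last_seen": event["ts"],
                "platforms": set(),
                "event_count": 0,
            },
        )
        # ISO 8601 strings in one format sort chronologically.
        row["first_seen"] = min(row["first_seen"], event["ts"])
        row["last_seen"] = max(row["last_seen"], event["ts"])
        row["platforms"].add(event["platform"])
        row["event_count"] += 1
    return users


users = one_row_per_user(events)
```

Because every event already carries the same key, the rollup is a plain group-by with no stitching step of its own.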
Why this is harder than it looks
The concept sounds simple: link identifiers to form one profile per user. The execution is not.
Your key customer data points are often locked inside third-party tools that make them hard to export, join, or act on. When your identity graphs live inside a vendor’s system, you can’t see how profiles are being matched or merged. You’re trusting a black box.
Then there are the edge cases. Deterministic matching on exact identifiers is table stakes. But multiple users on the same device? Cross-domain navigation? Someone switching from their personal email to a work address? That needs logic that goes well beyond simple lookups.
And if your identity layer runs on a batch schedule, your profiles are always behind. You can’t personalize in real time against data that was resolved yesterday.
Snowplow Identities handles all of this. Resolution happens at the pipeline level, on first-party data, in your cloud account. No third-party data processing, which removes a whole category of governance risk.
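The shared-device case shows why simple lookups fall short. The sketch below uses one possible rule, chosen for this example rather than taken from the product: anonymous activity on a device is attributed to a logged-in user only when exactly one user has ever been seen on that device. The function name and the `device_id` and `user_id` fields are hypothetical.

```python
from collections import defaultdict


def assign_profiles(events):
    """Return one profile key per event, in order.

    Rule assumed for this sketch: a device seen with exactly one user_id
    is stitched to that user; a device seen with several is treated as
    shared, so its anonymous events stay on a device-level profile.
    """
    # First pass: collect every user_id observed on each device.
    users_on_device = defaultdict(set)
    for event in events:
        if event.get("user_id"):
            users_on_device[event["device_id"]].add(event["user_id"])

    # Second pass: assign each event to a profile.
    assigned = []
    for event in events:
        if event.get("user_id"):
            assigned.append("user:" + event["user_id"])
            continue
        users = users_on_device[event["device_id"]]
        if len(users) == 1:
            assigned.append("user:" + next(iter(users)))
        else:
            assigned.append("device:" + event["device_id"])
    return assigned


events = [
    {"device_id": "tablet"},                      # anonymous on a family tablet
    {"device_id": "tablet", "user_id": "alice"},
    {"device_id": "tablet", "user_id": "bob"},
    {"device_id": "phone"},                       # anonymous on a personal phone
    {"device_id": "phone", "user_id": "alice"},
]
profiles = assign_profiles(events)
```

A naive lookup that merged everything sharing a device would collapse alice and bob into one person. Here the anonymous tablet event stays unattributed, while the anonymous phone event is stitched to alice.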
What this changes
Once you can reliably connect events to users, everything downstream gets better.
Attribution. Without identity resolution, a user who finds you through a paid ad on mobile, researches on desktop, and converts via a retargeting email looks like three separate people. Identities ties those touchpoints together. You can see which channels actually drive conversions rather than over-crediting the last click.
Personalization and ML. Accurate profiles are a prerequisite for useful models. With Identities, you can aggregate behavioral data across platforms and sessions to build feature sets for recommendation engines, propensity models, and real-time personalization. Because the graph updates as events flow through the pipeline, your models work on the latest signals rather than yesterday’s batch output.
Audience targeting. When the same customer exists as multiple profiles, you get duplicate targeting, wasted spend, and inconsistent messaging. Identities gives you deduplicated, identity-resolved profiles you can sync to marketing tools, CRMs, ad platforms, and engagement systems.
Agentic analytics. Tools like Snowflake Intelligence and Databricks AI/BI Genie let users query the warehouse in natural language. But these agents are only as good as the data underneath. Ask “how many times did this customer visit before purchasing?” and the agent can’t connect anonymous sessions to the authenticated purchase if identities were resolved in batch, or not at all. With Identities, events land in the warehouse already resolved. The agent can give you a straight answer without complex joins or post-processing.
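The "visits before purchasing" question is a useful test of what a resolved key buys you. The sketch below answers it twice over the same hypothetical events: once keyed on a per-device ID and once keyed on a resolved ID. The field names, the sample data, and the definition of a visit (a distinct session that started before the purchase session) are assumptions for this example.

```python
# Hypothetical events for one person across three platforms.
events = [
    {"snowplow_id": "sp-1", "device_id": "mobile", "session_id": "s1",
     "event": "page_view", "ts": "2024-05-01T09:00:00"},
    {"snowplow_id": "sp-1", "device_id": "desktop", "session_id": "s2",
     "event": "page_view", "ts": "2024-05-04T18:30:00"},
    {"snowplow_id": "sp-1", "device_id": "app", "session_id": "s3",
     "event": "page_view", "ts": "2024-05-05T08:00:00"},
    {"snowplow_id": "sp-1", "device_id": "desktop", "session_id": "s4",
     "event": "page_view", "ts": "2024-05-06T12:00:00"},
    {"snowplow_id": "sp-1", "device_id": "desktop", "session_id": "s4",
     "event": "purchase", "ts": "2024-05-06T12:05:00"},
]


def visits_before_purchase(events, key, value):
    """Count distinct sessions before the first purchase session for the
    user identified by events[key] == value. Returns None if no purchase."""
    mine = sorted((e for e in events if e[key] == value), key=lambda e: e["ts"])
    seen_sessions = set()
    for event in mine:
        if event["event"] == "purchase":
            # Exclude the session in which the purchase itself happened.
            return len(seen_sessions - {event["session_id"]})
        seen_sessions.add(event["session_id"])
    return None


by_device = visits_before_purchase(events, "device_id", "desktop")
by_identity = visits_before_purchase(events, "snowplow_id", "sp-1")
```

Keyed on the device, only the earlier desktop session counts. Keyed on the resolved ID, the mobile and app sessions count as well, which is the answer the question was actually asking for.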
Better with Snowplow Signals and Event Forwarding
Snowplow Signals triggers real-time actions based on behavioral patterns, directly in session, whether through the product or through agent engagements. Event Forwarding pushes enriched events to downstream destinations as they happen. Both get significantly more useful when events carry a resolved identity.
A user who browses product pages on mobile and then opens your app on desktop isn’t two anonymous visitors generating two separate signals. With Identities, it’s one customer with full behavioral context. Snowplow Signals can trigger actions against a known profile. Event Forwarding pushes identity-resolved events straight to your CRM, personalization engine, or marketing automation platform. Your downstream systems work from the same single customer view your pipeline produces.
Get started
Snowplow Identities is now generally available. If you are an existing Snowplow customer, explore our documentation and talk to your account team to enable Identities on your pipeline. If you are evaluating Snowplow for the first time, request a demo from our team.