What’s the difference between SaaS and private-SaaS (or Bring Your Own Cloud – BYOC)?
With regulatory frameworks becoming increasingly strict on the ownership and good governance of data, alternative models to SaaS tools are becoming more prominent.
What is private SaaS/Bring Your Own Cloud (BYOC)?
SaaS can be seen as a ‘multi-tenanted environment’ or a public cloud in which the Vendor hosts and manages a data tool for many clients.
Private SaaS, on the other hand, means the data pipeline is run in a client’s private storage environment, while ongoing pipeline maintenance is managed by the Vendor.
This model offers a combination of ownership and convenience, while requiring some technical know-how on the part of the Client.
SaaS vs private SaaS vs self-hosted data tools
This article focuses only on Cloud storage locations.
“Self hosted” generally refers to open source products.
| |SaaS|Private SaaS|Self-hosted|
|---|---|---|---|
|Hosting|Vendor public cloud|Client private cloud|Client private cloud|
|Choice of storage location (GDPR)|Rare|Yes|Yes|
|Data compliance|Dependent on Vendor|Significant Client control|Full Client control|
|Data governance|No|Full Client control|Full Client control|
|Management costs/overheads|Low|Fairly low|High|
|Black-box decisions made by Vendor|Frequent|No|No|
An example of a private SaaS/BYOC data pipeline
Critically, a private-SaaS deployment sees the Client’s data move from private digital products, such as apps and websites, to a fully owned storage destination without ever leaving a private cloud environment. This offers complete transparency on the way the data pipeline works, including any logic built into it.
The costs and benefits of private SaaS/BYOC applications for digital analytics
Private SaaS often brings some technical requirements that SaaS customers don’t need to think about, primarily around hosting the service in their private cloud. Although the service itself is managed by the Vendor, the Client has to set up permissions and configure their cloud environment correctly in order to get started.
Provided these fairly simple steps are completed correctly, however, private SaaS offers a host of benefits, particularly around ownership and compliance. As the whole infrastructure is owned by the Client, every decision taken about the data, along with any associated metadata, can be fully scrutinized (great for monitoring and observability). This means audits can be carried out without hindrance, and decisions traced back all the way to first principles.
The good, the bad and the ugly of SaaS applications
SaaS tools have different degrees of transparency.
A minority are fully open-source with the option of a hosted SaaS solution. These solutions can provide full visibility into what’s going on under the hood whilst offering to host the platform to make things easier for the customer (shameless plug: we’ve just created such a solution called BDP Cloud).
The majority of SaaS tools, however, do not offer this transparency. Larger suites of data tools operate in the Vendor’s cloud and don’t allow the Client any visibility into the inner workings of the data pipeline. These inner workings are, in effect, the value provided and so are proprietary.
The benefits of SaaS
SaaS tools can still be very convenient and deliver a fast time to value, frequently offering suites of tools that integrate seamlessly with one another.
The most famous example in the world of data is Google. Users of Google Analytics can easily integrate with a massive array of tools in the Google ecosystem, such as Pub/Sub for real-time streaming, BigQuery for storage and Data Studio for visualization, not to mention Google Ads, which is very convenient for marketers.
The problems with SaaS
Since Google Analytics is a black-box SaaS tool, it has its limitations. With software of this type, which also includes Mixpanel, Segment and Adobe, users lack full governance over their data pipelines. This creates downstream issues, one very topical example being a lack of ‘data sovereignty’ (i.e. users cannot choose where their data is stored or processed). This has led to enforcement action by data protection authorities in France, Austria and Denmark over violations of GDPR.
Another black-box limitation common to packaged SaaS tools is the question of how the provider uses your data. Within Google’s walled garden, for example, your data might be sold as part of their advertising ecosystem. You cannot control how or where this data is used, even with reference to the vague terms of service. The invalidation of Privacy Shield has further undermined confidence in this type of arrangement.
A similar issue arises with how metrics are defined, such as session length or what counts as a ‘user’, with hidden decisions baked into the SaaS logic. Is Google’s 30-minute session definition, for example, optimal for all digital products, without exception?
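To make the session question concrete, here is a minimal sketch of timeout-based sessionization. It is an illustration only, not how Google Analytics or Snowplow implements sessions: the function name `count_sessions`, the `timeout_minutes` parameter and the sample events are all hypothetical. It simply shows that the same events can produce a different session count when the inactivity timeout changes.

```python
from datetime import datetime, timedelta


def count_sessions(timestamps, timeout_minutes=30):
    """Count sessions for one user.

    A new session starts whenever the gap since the previous event
    exceeds the inactivity timeout (an assumed, simplified rule).
    """
    timeout = timedelta(minutes=timeout_minutes)
    sessions = 0
    previous = None
    for ts in sorted(timestamps):
        if previous is None or ts - previous > timeout:
            sessions += 1
        previous = ts
    return sessions


# Hypothetical events for a single user of a long-form video product.
events = [
    datetime(2023, 1, 1, 9, 0),
    datetime(2023, 1, 1, 9, 10),
    datetime(2023, 1, 1, 9, 50),  # 40-minute gap, e.g. watching a long video
    datetime(2023, 1, 1, 10, 5),
]

print(count_sessions(events, timeout_minutes=30))  # 2
print(count_sessions(events, timeout_minutes=60))  # 1
```

With a 30-minute timeout the 40-minute gap splits the visit into two sessions, while a 60-minute timeout treats it as one. When the definition is hidden inside the tool, that choice is made for you.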
Users of SaaS tools should therefore plan for the potential trade-off between convenience and transparency by making a strong business case for their choice of deployment – private SaaS, SaaS or self-hosted.
Snowplow’s approach to private SaaS/BYOC
Snowplow has always been based on a private-SaaS ethos. We believe strongly that businesses should have full visibility over how their data is managed.
Snowplow manages everything from the automation, deployment and monitoring of pipelines, right up to the scaling and maintenance. Our BDP Enterprise tool is a private SaaS streaming analytics pipeline for behavioral event data. Metrics like ‘time on page’ and ‘session length’ can be tracked with incredible accuracy and used to create data products and applications, such as churn propensity or marketing attribution. Within this, our technical teams support integration with multiple cloud environments and manage integrations with existing customer systems.
“What we offer is a fully managed service, but it’s isolated in a client’s own sub-account. So essentially, what that means is that each client comes to us and gives us their own sub-account, such as their own Google Cloud project, and we set up and maintain a full data pipeline within that. This way every client has their own isolated infrastructure entirely segmented from every other client. Basically, there’s no shared tenancy across anything.”
Josh Beemster, Head of Engineering at Snowplow