June 1, 2023

Onehouse's privacy-first data architecture

When designing the data platform for your organization, security, privacy, and compliance are among the most important characteristics of the data architecture. Teams today are tasked with ensuring the architecture is designed to prevent unauthorized access, prevent data exfiltration, and stay compliant with complex and fast-changing privacy regulations. Introducing third-party vendors who take control of your data in their own environments can widen your infosec boundary and require you to place a great deal of trust in another company.

The tools you choose for data integration are a crucial part of your infrastructure, since ingestion touches all of your data and sets the stage for how it is stored. In the vendor market today, a majority of data integration tools require you to export your data out of your cloud accounts and through the vendor's networks and compute.

In such cases, data exported out of your cloud account is processed and stored in external third-party systems, where it becomes susceptible to vulnerabilities outside your control. Recent security events illustrate the risk: Datadog suffered a breach through third-party vendor CircleCI, Atlassian suffered a breach through third-party tool Envoy, and any company that depended on third-party LastPass became vulnerable. It is increasingly hard to justify why data should leave your security boundaries.

Onehouse Architecture

The Onehouse architecture is designed to mitigate data privacy and security concerns. Onehouse deploys all services that process data within your private accounts and network. This ensures data remains protected within your VPC, safeguarded by your organization’s existing security frameworks and policies, and alleviates data residency and sovereignty risks. Because data is processed on compute resources provisioned in your cloud account, you can also benefit from any negotiated pricing you already have in place.

When you first onboard to Onehouse, you can deploy our services with CloudFormation or Terraform and be up and running in minutes. This gives your IT administrators full transparency to review and audit all configurations in your environment.

The Onehouse architecture doesn’t require any inbound connection to your cloud account. Instead, our services use a secured outbound connection to relay metadata and control messages to the control plane. This means you don’t have to punch holes in your network to expose ports to your databases and other data sources. Leveraging data lakehouse technology like Apache Hudi also makes it easier to comply with regulations like GDPR. Read this article to learn how Zoom built efficient GDPR upserts for data deletion and ended up saving 80% on their compute costs and 90% on their storage costs.
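To make the GDPR point concrete, here is a minimal sketch of a record-level delete against a Hudi table using Hudi's Spark datasource options. It is an illustration only, not Onehouse's or Zoom's implementation: the table name, the column names (`user_id`, `country`, `updated_at`), and the helper functions `hudi_delete_options` and `delete_users` are all hypothetical, and the sketch assumes a Spark session that already has the Hudi bundle on its classpath.

```python
from typing import Dict, Iterable


def hudi_delete_options(table_name: str, record_key: str,
                        partition_field: str, precombine_field: str) -> Dict[str, str]:
    """Build Hudi Spark datasource options for a delete write.

    The field names passed in are assumptions about your table layout.
    """
    return {
        "hoodie.table.name": table_name,
        # "delete" tells Hudi to remove the records whose keys appear in the input.
        "hoodie.datasource.write.operation": "delete",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


def delete_users(spark, base_path: str, user_column: str,
                 user_ids: Iterable[str], options: Dict[str, str]) -> None:
    """Delete every record belonging to the given users (hypothetical helper)."""
    # Imported here so the options helper above can be used without PySpark installed.
    from pyspark.sql.functions import col

    # Select only the rows that belong to the users who asked to be forgotten.
    to_delete = (
        spark.read.format("hudi")
        .load(base_path)
        .where(col(user_column).isin(list(user_ids)))
    )
    # Writing those rows back with operation=delete removes them from the table.
    to_delete.write.format("hudi").options(**options).mode("append").save(base_path)
```

A call might look like `delete_users(spark, "s3://your-bucket/users", "user_id", ["u-123"], hudi_delete_options("users", "user_id", "country", "updated_at"))`, where the bucket path and IDs are placeholders. Because the delete is keyed on records rather than on whole files or partitions, only the affected data needs to be rewritten.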

To ensure your data lakehouse is both secure and interoperable, Onehouse writes data in open formats like Apache Hudi, and with the newly released Onetable the same data can also be represented in the Apache Iceberg and Delta Lake formats. This prevents single-vendor lock-in and gives you the flexibility to extend or replace Onehouse with another solution at any point.

If you want to learn more about Onehouse or would like to give it a try, please visit the Onehouse listing on the AWS Marketplace or contact gtm@onehouse.ai. It takes less than an hour to deploy Onehouse and start consuming analytics-ready tables in your data lakehouse.
