The Universal Data Lakehouse Delivered

Accelerate and simplify your data lake journey with our fully managed cloud service

Try it free

Our Product

Learn what Onehouse can do for you!

Continuous Data Ingestion

Effortlessly ingest data from your databases, event streams, cloud storage, and other services at low latency. Built on industry-leading change data capture technology for the lakehouse.

Automate Data Management

Onehouse eliminates tedious data chores by managing all of the table services that perform file sizing, partitioning, cleaning, clustering, Z-order/Hilbert curve ordering, compaction, masking, encryption, and more.

Low-Code Incremental Pipelines

Create declarative templates for low-latency incremental ingestion and transformation pipelines. Forget about the operational burdens of scheduling, monitoring, and data quality management.

Data at your fingertips

Full ecosystem support for all major catalogs, query engines, and table formats through XTable, so you can plug and play the analytics tool of your choice. Data is automatically synced and ready for a self-serve experience for all of your data science and analytics workloads.

SOC 2 Type II Certified

Onehouse has been certified for both SOC 2 Type I (2022) and SOC 2 Type II (2023). Our Type II audit report featured zero deviations and attests that Onehouse has met tough standards set forth by the American Institute of Certified Public Accountants (AICPA).

How Customers Use Onehouse Today

Onehouse works with a variety of customers, from large enterprises to startups that are just beginning their data journey. We have experience across verticals including technology, finance, healthcare, retail, and beyond. See what customers are doing with Onehouse today:

Full Change Data Capture

A Onehouse customer with large deployments of MySQL has many transactional datasets. With Onehouse, they extract changelogs and create low-latency CDC pipelines that produce analytics-ready Hudi tables on S3.

Real-time machine learning pipelines

An insurance company uses Onehouse to generate real-time quotes for customers on its website. Onehouse helped the company access untapped datasets and reduced the time to generate an insurance quote from days or weeks to under an hour.

Replace long batch processing time

A large tech SaaS company used Onehouse’s technology to reduce its batch processing times from 3+ hours to under 15 minutes, all while saving ~40% on infrastructure costs. Having replaced its DIY Spark jobs with a managed service, the company can now operate its platform with a single engineer.

Ingest Clickstream data

A talent marketplace company uses Onehouse to ingest all clickstream events from their mobile apps. They run multi-stage incremental transformation pipelines through Onehouse and query the resulting Hudi tables with BigQuery, Presto, and other analytics tools.

Do you have a similar story?

Meet With Us

How Does Onehouse Fit In?

You have questions, we have answers

What is a Lakehouse?

A Lakehouse is an architectural pattern that combines the best capabilities of a data lake and a data warehouse. Data lakes built on cloud storage like S3 are the cheapest and most flexible way to store and process your data, but they are challenging to build and operate. Data warehouses are turn-key solutions, offering capabilities traditionally not possible on a lake, such as transaction support, schema enforcement, and advanced performance optimizations around clustering, indexing, and more.

Now, with the emergence of Lakehouse technologies like Apache Hudi, you can unlock the power of a warehouse directly on the lake, with orders-of-magnitude cost savings.

Is Onehouse an enterprise Hudi company?

While born from the roots of Apache Hudi and founded by its original creator, Onehouse is not an enterprise fork of Hudi. The Onehouse product and its services leverage OSS Hudi to offer a data lake platform similar to what companies like Uber have built. We remain fully committed to contributing to and supporting the rapid growth and development of Hudi as the industry-leading lakehouse platform.

Does Onehouse aim to replace other tools in my stack like Databricks or Snowflake?

No. Onehouse offers services that are complementary to Databricks, Snowflake, or any other data warehouse or lake query engine. Our mission is to accelerate your adoption of a lakehouse architecture. We focus on the foundational data infrastructure that is left as a DIY struggle in today’s data lake ecosystem. If you plan to use Databricks, Snowflake, EMR, BigQuery, Athena, or Starburst, we can help accelerate and simplify your adoption of these services. Through the XTable feature, Onehouse interoperates with Delta Lake and Apache Iceberg to better support Databricks and Snowflake queries, respectively.

Where does Onehouse store my data and is it secure?

Onehouse delivers its management services on a data plane inside your cloud account. Unlike with many vendors, this ensures that no data ever leaves the trust boundary of your private networks and that sensitive production databases are not exposed externally. You maintain ownership of all your data in your own S3, GCS, or other cloud storage buckets. Onehouse’s commitment to openness ensures your data is future-proof. Onehouse has been certified for both SOC 2 Type I and SOC 2 Type II. We are also available on multiple clouds.

When would I consider using Onehouse?

If you have data in RDBMS databases, event streams, or even data lost inside data swamps, Onehouse can help you ingest, transform, manage, and make all of your data available in a fully managed lakehouse. Since we don’t build a query engine, we don’t play favorites: we focus simply on making your underlying data performant and interoperable with any and all query engines.

If you are considering a data lake architecture, either to offload costs from a cloud warehouse or to unlock data science and machine learning, Onehouse can standardize how you build your data lake ingestion pipelines. It lets you leverage battle-tested, industry-leading technologies like Apache Hudi to achieve your goals at much lower cost and effort.

What is Onehouse pricing?

Onehouse meters the compute-hours used to deliver its services and charges an hourly compute cost based on usage. Connect with our account team to dive deeper into your use case, and we can help provide total cost of ownership estimates for you. Our pricing has proven to be significantly lower than the cost of alternative DIY solutions.

How can Onehouse help existing Hudi users?

If you have large existing Hudi installations and want help operating them better, Onehouse can offer a limited, one-time technical advisory/implementation service. Onehouse engineers and developer advocates are active daily in the Apache Hudi community Slack and on GitHub, answering questions on a best-effort basis.

Sign up for a free trial to receive $1,000 in free credit for 30 days

Try It Free
We are hiring diverse, world-class talent — join us in building the future.