Merge on read (MoR)
Merge on read (MoR) appends upserts for a file group to a row-based delta log as they arrive instead of rewriting the base file, so write performance is high. Queries must then read the delta log as well as the base file, which causes a small hit to query performance. A compactor periodically merges the delta log files into the Parquet base files, restoring optimal query performance. The user configures how often this merge runs. MoR can use both Parquet and Avro files, depending on the lakehouse architecture.
Not all data lakehouse projects support merge on read, and those that do may not implement it in the same way.
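The write, read, and compaction steps described above can be sketched in a few lines of Python. This is a toy in-memory model, not the API of any lakehouse project: the `FileGroup` class and its `upsert`, `read`, and `compact` methods are hypothetical names chosen for illustration, a dictionary stands in for the columnar base file, and a list stands in for the row-based delta log.

```python
from dataclasses import dataclass, field


@dataclass
class FileGroup:
    """Toy model of one file group (hypothetical, for illustration only)."""
    base: dict = field(default_factory=dict)       # stands in for the base file
    delta_log: list = field(default_factory=list)  # stands in for the delta log

    def upsert(self, key, row):
        # Write path: append to the log only; the base is not rewritten.
        self.delta_log.append((key, row))

    def read(self):
        # Read path: overlay logged rows on the base, later entries winning.
        merged = dict(self.base)
        for key, row in self.delta_log:
            merged[key] = row
        return merged

    def compact(self):
        # Compaction: fold the log into a new base, then clear the log.
        self.base = self.read()
        self.delta_log = []


group = FileGroup(base={1: {"name": "a"}, 2: {"name": "b"}})
group.upsert(2, {"name": "b2"})  # update an existing row
group.upsert(3, {"name": "c"})   # insert a new row

before = group.read()  # merges base and log at query time
group.compact()
after = group.read()   # served from the base alone; the log is empty
```

Readers see the same rows before and after compaction. What changes is the work done per query: before compaction every read replays the log, and afterward it only copies the base.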
Related terms: Apache Parquet; copy on write (CoW); data lakehouse