What we Learned Migrating a Financial Giant from Hudi to Delta (and Why Iceberg was in the Mix)
Regular 50-minute session for SQLBits 2026
Thursday - 23 Apr 2026 - 15:10 - 16:00 - Living Lounge
TL;DR
A practical, vendor-neutral comparison of Iceberg, Delta Lake, and Hudi—grounded in real-world experience migrating a large investment-industry platform from Hudi to Delta. This session explores key architectural differences, the growing Iceberg–Delta race, and the challenges organizations face when moving off Hudi at scale.
Session Details
Having been a key member of the team that migrated a major investment-industry client from Apache Hudi to Delta Lake, I’ve seen firsthand how complex today’s lakehouse table-format landscape has become. As architectures mature, the competitive focus has sharpened, most visibly in the accelerating race between Apache Iceberg and Delta Lake to define the next generation of open data management standards. Yet despite the momentum around Iceberg and Delta, Apache Hudi remains deeply entrenched in many organizations, powering long-standing, high-throughput ingestion and CDC-driven pipelines.
This session offers a clear, vendor-neutral comparison of Iceberg, Delta Lake, and Hudi, unpacking the architectural differences and real-world trade-offs across schema evolution, ACID transactions, incremental processing, indexing, and engine interoperability. We will go beyond feature comparisons to examine a critical and often underestimated challenge: migrating production data from Hudi to Iceberg or Delta. Drawing on real enterprise experience, we will highlight practical obstacles such as metadata translation, pipeline rewrites, operational continuity, table layout changes, and governance implications.
Attendees will leave with a pragmatic understanding of why industry conversations tend to center on Iceberg vs. Delta, why many companies still rely heavily on Hudi, and what it truly takes to execute a successful migration. The session provides both a strategic decision framework and grounded lessons learned from real-world, large-scale implementations.
3 things you'll get out of this session
1. Understand the core architectural differences between Apache Iceberg, Delta Lake, and Apache Hudi—including how they handle schema evolution, ACID transactions, incremental processing, and query-engine interoperability.
2. Evaluate the strategic and practical factors driving the industry’s Iceberg–Delta competition, and why many organizations still rely heavily on Hudi for high-throughput ingestion and CDC workloads.
3. Learn the key technical and operational challenges involved in migrating from Hudi to Iceberg or Delta—such as metadata conversion, pipeline refactoring, data layout changes, and governance impacts—based on real enterprise experience.
Speakers
Anna-Maria Wykes's other proposed sessions for 2026
From “Who Wrote This ETL?” to Databricks, Claude Saves the Day via Microsoft Foundry - 2026