Build a Lakehouse in a Day with Metadata & Open-Source Tools
Proposed session for SQLBits 2026
TL;DR
During this workshop, participants will gain an in-depth understanding of how CF.Cumulus can integrate Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Microsoft Fabric, and other resources to streamline the delivery of data insights.
Session Details
Unlock the power and speed of a metadata-driven Lakehouse architecture.
In a fast-paced, data-driven world, the ability to deliver a robust data platform swiftly and efficiently is key to maintaining a competitive edge. Join us for an immersive, full-day, hands-on workshop, where we will guide you through the process of building a metadata-driven Lakehouse using the open-source product framework CF.Cumulus, which leverages and abstracts Microsoft cloud-native technologies to ease delivery challenges.
During this workshop, participants will gain an in-depth understanding of how CF.Cumulus can integrate Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Microsoft Fabric, and other resources to streamline the delivery of data insights.
Our expert instructors will provide practical insights on overcoming common data challenges, including fragmented data ingestion, change data capture, and orchestration scalability, using our proven best practices.
Attendees will learn how to use metadata, open standards, and seamless cloud integration to accelerate time-to-insight with minimal technical debt, while maintaining cost control and operational resilience. This workshop is ideal for both data engineers and data leaders who are looking to enhance their cloud data platform delivery and unlock the potential to build a Lakehouse in a day using a metadata-driven approach.
3 things you'll get out of this session
• Demonstrate how to build a metadata-driven Lakehouse using Microsoft cloud-native technologies.
• Showcase strategies to overcome data challenges such as fragmented data ingestion, change data capture, and orchestration scalability.
• Highlight the use of automation, open standards, and seamless cloud integration to achieve fast, efficient data platform delivery with minimal technical debt.
Speakers
Paul Andrew's other proposed sessions for 2026
Deciphering Data Architectures full-day workshop - 2026
An Evolution of Cloud Data Architectures - Lambda, Kappa, Delta, Mesh & Fabric - 2026
An Introduction to Delta Lake and The Lakehouse - 2026
Building Near Real-time Data Solutions in Microsoft Azure & Fabric - 2026
Data & Community: An Amazing Network Of Peers Supporting Innovation & Growth - 2026
Data Modelling: The Lost Art of Turning Inputs into Insights - 2026
Designing & Delivering Data Products: From Mesh Principles to Data Fabric Automation - 2026
Fabric Data Activator: Real-Time Data Feeds, Automated Alerts & Stock Intelligence - 2026
Fast-Track Your Lakehouse Build with a Metadata Framework - 2026
Microsoft Fabric Platform Governance - Where To Start - 2026
Paul Andrew's previous sessions
An Evolution of Data Architectures - Lambda, Kappa, Delta, Mesh & Fabric
How have advancements in highly scalable cloud technology influenced the design principles we apply when building data platform solutions?
Building an Azure Data Analytics Platform End-to-End
Based on real-world experience, let’s think about just how far the breadth of our knowledge now needs to reach when starting from nothing and building a complete Microsoft Azure Data Analytics solution.
Creating a Metadata Driven Orchestration Framework Using Azure Data Integration Pipelines
We'll explore delivering this framework within an enterprise and consider an architect’s perspective on a wider platform of ingestion/transformation workloads with multiple batches and execution stages.
ETL in Azure Made Easy with Data Factory Data Flows
What happens when you combine a cloud orchestration service with a Spark cluster?! The answer is a feature-rich, graphical, scalable data flow environment to rival any ETL tech we’ve previously had available in Azure.
Using Azure DevOps for Azure Data Factory
DevOps as a concept does not always translate to the technology when implemented. In this session we'll explore that problem when working with Azure Data Factory and what the different cloud-only CI/CD options are.
Complex Azure Orchestration w Dynamic Data Factory Pipelines
If you have already mastered the basics of Azure Data Factory (ADF) and are now looking to advance your knowledge of the tool, this is the session for you.
Building an End to End IoT Solution Using Pi Sensors & Azure
Demonstrating an end to end IoT solution providing real-time sensor data from a Raspberry Pi into an Azure IoT Hub, through Stream Analytics, then with outputs to Power BI and SQL DB. Learn how to build this simplified IoT solution from scratch.
Matt Collins
medium.com/@mc12338
Matt Collins's other proposed sessions for 2026
Fast-Track Your Lakehouse Build with a Metadata Framework - 2026
Metadata To Mermaid Diagrams: Visualising Pipeline Lineage At Runtime - 2026
Python In Microsoft Fabric: Execution Options And Scaling - 2026
Rapid Data Insight Delivery - Breaking The Barrier To Entry For SMBs - 2026