22-25 April 2026

Practical lessons in optimizing Data Engineering with Spark - Part 1

Proposed session for SQLBits 2026

TL;DR

Part 1 explores lessons from large Fabric data engineering deployments and how to optimize your platform. It focuses on Delta tables in Fabric, tuning options, how they work, and their impact on workloads, using demos to show real-world trade-offs.

Session Details

Based on Luke's experience working with some of the largest Fabric deployments, this session walks through practical tips for optimizing data-engineering-focused Fabric deployments. The session is heavily demo-centric, providing real-world examples of the trade-offs involved and showing how simple changes can have a large impact - especially as a deployment scales.

In Part 1 we will focus on Delta as the primary table format within Fabric, including:
- the impact of features like change data feed, deletion vectors and more
- enhancements in Delta 4.0, coming as part of the Fabric 2.0 runtime
- how metadata tables and logging tables in Delta can impact performance - and ways to address these concerns
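As a taste of the kind of tuning the demos cover, features like change data feed and deletion vectors are typically toggled per table via Delta table properties. A minimal PySpark-style sketch (the helper function and the table name `sales` are illustrative placeholders, not part of the session material):

```python
# Delta table features are controlled with ALTER TABLE ... SET TBLPROPERTIES.
# This helper only builds the SQL string, so the same pattern can be reused
# when testing a property's impact table by table.

def set_delta_properties(table: str, props: dict[str, str]) -> str:
    """Build a Spark SQL statement that sets Delta table properties."""
    settings = ", ".join(f"'{k}' = '{v}'" for k, v in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({settings})"

# Enabling change data feed and deletion vectors on a hypothetical table
# (delta.enableChangeDataFeed and delta.enableDeletionVectors are the
# standard Delta Lake property names):
stmt = set_delta_properties(
    "sales",
    {
        "delta.enableChangeDataFeed": "true",
        "delta.enableDeletionVectors": "true",
    },
)
print(stmt)
```

In a Fabric notebook the resulting statement would be run with `spark.sql(stmt)`; the point of wrapping it in a helper is to make enabling and disabling a property repeatable while measuring its effect.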

3 things you'll get out of this session

1 - an understanding of the different Delta table features and what they do.
2 - an understanding of when different properties make sense (and when they don't), and importantly how to test these options.
3 - detailed knowledge of when Delta tables are a poor fit and shouldn't be used.