Mastering Spark Notebooks and Capacity Optimization in Microsoft Fabric

Regular 50-minute session for SQLBits 2026

TL;DR

Learn how to get the most out of Spark Notebooks in Fabric by optimizing execution, cluster behavior, concurrency, and capacity usage. Practical patterns, real insights, and performance tips included.

Session Details

Running Spark notebooks in Microsoft Fabric opens up powerful possibilities, but it also introduces a compute model that can feel unfamiliar, especially to those coming from a traditional SQL Server or data warehouse background. By default, every notebook gets its own dedicated compute session. This provides strong isolation, but it can quickly lead to unexpected capacity consumption and throttling limits if not managed thoughtfully.

This session offers a deep dive into how Spark compute works under the hood in Fabric, with a focus on running workloads efficiently without wasting Capacity Units. We’ll explore the impact of autoscaling Spark pools, how bursting works in practice, and the new Autoscale Billing model, which charges based on actual vCore usage per second rather than maximum allocation. You’ll learn how to take control of your workloads through techniques like using small or single-node Spark pools, orchestrating notebooks with runMultiple(), and sharing sessions through High Concurrency Mode, both interactively and within Pipelines.

Whether you're building data pipelines, running exploratory work, or managing shared capacity across a team, this session will help you understand how Spark in Fabric behaves, how it’s billed, and how to optimize it for both performance and cost.

3 things you'll get out of this session

Understand how Spark Notebooks run inside Fabric (sessions, jobs, executors)
Reduce capacity consumption and avoid inefficient compute patterns
Diagnose performance problems using monitoring and Spark UI tools