Mastering Spark Notebooks and Capacity Optimization in Microsoft Fabric
Proposed session for SQLBits 2026
TL;DR
Learn how to get the most out of Spark Notebooks in Fabric by optimizing execution, cluster behavior, concurrency, and capacity usage. Practical patterns, real insights, and performance tips included.
Session Details
Running Spark notebooks in Microsoft Fabric opens up powerful possibilities—but also introduces a compute model that can feel unfamiliar, especially to those coming from a traditional SQL Server or data warehouse background. Every notebook gets its own dedicated compute session, and while this provides strong isolation, it can quickly lead to unexpected capacity consumption and limits if not managed thoughtfully.
This session offers a deep dive into how Spark compute works under the hood in Fabric, with a focus on how to run more efficiently without wasting Capacity Units. We’ll explore the impact of autoscaling Spark pools, how bursting works in practice, and the introduction of the new Autoscale Billing model that charges based on actual vCore usage per second, rather than maximum allocation. You’ll learn how to take control of your workloads through techniques like using small or single-node Spark pools, orchestrating notebooks with runMultiple(), and sharing sessions through High Concurrency Mode—both interactively and within Pipelines.
Whether you're building data pipelines, running exploratory work, or managing shared capacity across a team, this session will help you understand how Spark in Fabric behaves, how it’s billed, and how to optimize it for both performance and cost.
3 things you'll get out of this session
Understand how Spark Notebooks run inside Fabric (sessions, jobs, executors)
Reduce capacity consumption and avoid inefficient compute patterns
Diagnose performance problems using monitoring and Spark UI tools
Speakers
Just Blindbæk's other proposed sessions for 2026
Building a Unified Semantic Model - The 20-Minute Pattern - 2026
Choosing the Right Storage Mode: Import, Direct Lake, DirectQuery & Hybrid - 2026
Deep Dive: Mastering Semantic Model Size & Refresh Efficiency in Fabric Capacities - 2026
Demystifying the Data Lakehouse in Microsoft Fabric - 2026
Unified Semantic Models & KPI Reporting: Architecture That Scales - 2026
Just Blindbæk's previous sessions
Mastering Fabric: Workspace Governance and Organization
Learn how to optimize and master workspace organization in Fabric. Explore best practices for effective governance, enhancing user experience, security, and permissions management. Gain practical insights and checklists for streamlined workspace management.
The Hitchhiker's Guide to navigating Power BI
Learn to navigate Power BI seamlessly in this concise session. Explore key areas: Internal Report navigation, App transitions, and Web Portal optimization. Elevate user experience with insights on service navigation, integration, and essential tools like Drill Through, Buttons, and Bookmarks.
Architectural blueprints for the Modern Data Warehouse
This session walks you through three different ways to set up an affordable Azure-based Data Warehouse solution, covering the pros and cons of each architecture.
Power BI: From Self-Service to Enterprise
How do you grow a Power BI solution from self-service into something that is scalable, managed, and governed? A solution that can be trusted and used across the whole enterprise! Make a successful ownership transfer!
Excel Services 2013 - the BI frontend of the future
Attendees will get an overview of the possibilities of Excel Services 2013 from a Business Intelligence perspective.
(1) Overview of the publish options
(2) How to control the layout
(3) Tips & Tricks from real-world examples