22-25 April 2026

Spark Tuning & Best Practices: From Theory to Production-Ready Performance

Regular 50-minute session for SQLBits 2026

TL;DR

An in-depth, production-driven look at Apache Spark optimization at Unity. Learn practical tuning techniques—data skew handling, query and memory optimization—to diagnose issues and cut job runtimes from hours to minutes.

Session Details

An in-depth exploration of Apache Spark optimization techniques drawn from real-world production experience at Unity. This session bridges the gap between Spark fundamentals and practical performance tuning strategies that can dramatically improve your data processing pipelines.

What You'll Learn:

- Understanding Spark Architecture: Deep dive into executors, RDDs, and how Spark processes data across distributed clusters

- Tackling Data Skew: Learn to identify and resolve one of the most common performance bottlenecks, including:
  - Adaptive Query Execution (AQE) strategies
  - Manual salting techniques
  - Handling null keys in joins
  - Real case study: How we reduced a 6-hour job to under 3 hours

- Query Optimization Techniques:
  - Proper partition key usage and filtering strategies
  - Join optimization (Broadcast vs. Shuffle joins)
  - Shuffle partition tuning for optimal performance
  - Understanding Coalesce vs. Repartition trade-offs

- Memory Management: Configure executor and driver memory for stability and performance

- Production Case Studies: Real-world examples from Unity's data infrastructure, including challenges faced and solutions implemented
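
To give a flavor of the manual salting item above, here is a minimal sketch in plain Python rather than Spark. It only simulates hash partitioning to show why appending a random salt to a hot key spreads its rows across partitions. All names (`partition_of`, `max_partition_load`, `NUM_PARTITIONS`, `SALT_BUCKETS`) and the 90% skew ratio are assumptions made for illustration and are not taken from the session material.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8   # hypothetical number of shuffle partitions
SALT_BUCKETS = 8     # hypothetical number of salt values per key


def partition_of(key, num_partitions=NUM_PARTITIONS):
    # Deterministic stand-in for a hash partitioner (not Spark's actual hash).
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def max_partition_load(keys):
    # Row count of the busiest partition for the given join keys.
    loads = Counter(partition_of(k) for k in keys)
    return max(loads.values())


rng = random.Random(0)

# Assumed skewed input: 90% of the rows share a single hot key.
keys = ["hot"] * 9000 + [f"user_{i}" for i in range(1000)]

# Salting: append a random bucket id so one logical key becomes several.
salted_keys = [f"{k}#{rng.randrange(SALT_BUCKETS)}" for k in keys]

unsalted_max = max_partition_load(keys)
salted_max = max_partition_load(salted_keys)

print("busiest partition without salt:", unsalted_max)
print("busiest partition with salt:   ", salted_max)
```

In a real join, the other side of the join would also need to be replicated once per salt value so that every salted key still finds its match; that step is omitted here.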

Who Should Attend:
This session is ideal for data engineers, data scientists, and analytics engineers who work with Spark and want to move beyond basic implementations to production-grade, optimized pipelines. Whether you're struggling with slow jobs, out-of-memory (OOM) errors, or data skew, you'll leave with actionable strategies to improve your Spark applications.

Key Takeaway:
Learn how to diagnose performance issues, understand the "why" behind Spark's behavior, and apply proven optimization patterns that can reduce job execution time from hours to minutes.

3 things you'll get out of this session

* Practical strategies to diagnose and fix real-world Spark performance bottlenecks
* Proven techniques to reduce job runtimes, handle data skew, and avoid OOM failures
* A deeper understanding of Spark’s execution model to build production-grade pipelines

Speakers

Asaf Sneh
