Azure Databricks has become a staple of big data processing. Built on Apache Spark, often called the "Swiss army knife of big data", it is now mainstream for everything from complex transformations and machine learning to integrating Azure services within multi-stage processing pipelines.
In this talk Richard Conway breaks down how Apache Spark and Azure Databricks work, using a series of demos and examples to show how workloads can be optimised through partitioning, how predicate push-downs can be built seamlessly from Parquet statistics, how shuffling and sorting can be minimized, and how to work with data sampling, caching with Databricks Delta, and much more.