Azure Databricks is a PaaS offering of Apache Spark that brings blazing-fast data processing, interactive querying and hosting of ML models together in one place! Most of the buzz is around data science and AI, but what about the humble data engineer who wants to harness Spark's in-memory processing power within their ETL pipelines?

This session focuses on Azure Databricks as your data ingestion, transformation and curation tool of choice.

We will: 
Introduce the Databricks service & language options available
Discuss the hosting & compute options available
Demonstrate a sample data processing task
Compare against alternative approaches using SSIS, U-SQL and HDInsight
Demonstrate pipeline management & orchestration
Review the wider architectures and extension patterns

The session is aimed at data engineers who want to put Azure Databricks in the right context and learn how to use the service.

We will not be covering the Python programming language in detail.
Presented by Simon Whiteley at SQLBits 2019