Azure Databricks: Engineering Vs Data Science

SQLBits encompasses everything from in-depth technical immersions to the enhancement of valuable soft skills. The full agenda will be announced in the spring; in the meantime check out the timetable and content we cover below.

Presenting 2024’s selection of training days, encompassing a deep dive into a range of subjects with some of the best data trainers in the world.

08:00 Registration opens and breakfast served.

All training days run simultaneously across the venue from 09:00 – 17:00 with co-ordinated breaks.

All training days include regular refreshment breaks and a lunch stop to rest, recharge, and chat to fellow delegates.

No evening events are planned, but if you’re staying over the night beforehand, why not join us in the Aviator on Monday night to meet the training day speakers at an informal drinks reception?

Description

Have you looked at Azure Databricks yet? No? Then you need to. Why, you ask? There are many reasons. The number one reason: knowing how to use Apache Spark will earn you more money. It is that simple. Data Engineers and Data Scientists who know Apache Spark are in demand! This workshop is designed to introduce you to the skills required to do both.

In the morning we will introduce Azure Databricks, then discuss how to develop in-memory, elastic-scale data engineering pipelines. We will talk about shaping and cleaning data, the languages, notebooks, ways of working, design patterns and how to get the best performance. You will build an engineering pipeline with Python (or possibly some other stuff we are not allowed to tell you about yet). The Engineering element will be delivered by UK MVP Simon Whiteley. Simon has been deploying engineering projects with Azure Databricks since it was announced. He has real-world experience in multiple environments.

Then we will shift gears: we will take the data we moved and cleansed and apply distributed machine learning at scale. We will train a model and productionise it. We will then enrich our data with our newly predicted values. The Data Science element will be led by UK MVP Terry McCann. Terry holds an MSc in Data Science and has been working with Apache Spark for the last 5 years. He is dedicated to applying engineering practices to data science to make model development, training and scoring as easy and as automated as possible.

By the end of the day, you will understand how Azure Databricks supports both data engineering and data science, leveraging Apache Spark to deliver blisteringly fast data pipelines and distributed machine learning models. Bring your laptop, as this will be hands-on.

Pre-requisites
An understanding of ETL or ELT processing, either on-premises or in a big data environment. A basic understanding of Machine Learning would also be beneficial, but not critical.
Laptop Required: Yes

Software: In the session we will be using Azure Databricks. We will have labs and demos that you can follow if you want to. If you do want to, then you will need the following:
- An Azure Subscription
- Money on the Azure Subscription
- Enough access on the subscription to make service principals
- Azure Storage Explorer
- PowerShell
Subscriptions: Azure

Tech Covered

Deploying, performance, memory, Azure, ETL, Machine Learning, Data Science, Microsoft, Big Data, Python, Databricks

Simon Whiteley

advancinganalytics.co.uk/blog

Terry McCann