Advanced Data Engineering with Databricks on the Lakehouse
Description
In this session, you will build upon existing knowledge of Apache Spark™, Structured Streaming and Delta Lake to unlock the full potential of the lakehouse by utilising the suite of tools provided by Databricks. This session places a heavy emphasis on designs favouring incremental data processing, enabling systems optimised to continuously ingest and analyse ever-growing data. The topics in this course help learners work towards the Databricks Certified Data Engineer Professional exam.
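To make the incremental emphasis concrete, the sketch below shows the kind of pattern the course builds on: a PySpark Structured Streaming job that incrementally reads new rows committed to one Delta table and appends enriched results to another. This is a minimal sketch, assuming a Databricks notebook where spark is predefined; the table names and checkpoint path are hypothetical.

    from pyspark.sql import functions as F

    # Incrementally read only new rows committed to the source Delta table
    # ("raw_orders" is a hypothetical table name)
    raw = spark.readStream.table("raw_orders")

    # Example enrichment: stamp each row with its ingestion time
    enriched = raw.withColumn("ingested_at", F.current_timestamp())

    # Append results; the checkpoint lets the job resume exactly where it left off
    (enriched.writeStream
        .option("checkpointLocation", "/tmp/checkpoints/orders")  # hypothetical path
        .trigger(availableNow=True)  # process all available data, then stop
        .toTable("orders_enriched"))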
Learning Objectives
• Design databases and pipelines optimized for the Databricks Lakehouse Platform
• Implement efficient incremental data processing to validate and enrich data-driven business decisions and applications
• Leverage Databricks-native features for managing access to sensitive data and fulfilling right-to-be-forgotten requests (see the sketch after this list)
• Manage error troubleshooting, code promotion, task orchestration and production job monitoring using Databricks tools
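The third objective touches on right-to-be-forgotten handling. A minimal sketch of the usual Delta Lake pattern, assuming a hypothetical customer_events table and a Databricks notebook, is to delete the user's rows and then VACUUM old file versions so the deleted data cannot be recovered through time travel:

    # Delete the user's records from the Delta table (hypothetical table and id)
    spark.sql("DELETE FROM customer_events WHERE user_id = 'u-123'")

    # Remove unreferenced data files older than the retention window (168 hours
    # = the 7-day default) so the deleted rows are physically removed and can
    # no longer be read back via time travel
    spark.sql("VACUUM customer_events RETAIN 168 HOURS")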
Previous Experience
Not all of the experience below is required, but meeting at least three of the five items is recommended.
- Experience using PySpark APIs to perform advanced data transformations
- Experience using SQL in production data warehouse or data lake implementations
- Experience working in Databricks notebooks and configuring clusters
- Familiarity with creating and manipulating data in Delta Lake tables with SQL (illustrated in the sketch after this list)
- Ability to use Spark Structured Streaming to incrementally read from a Delta table
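As a reference point for the Delta Lake SQL item above, here is a minimal sketch, again assuming a Databricks notebook with hypothetical table names, that creates a Delta table and upserts changes into it with MERGE:

    # Create a Delta table (Delta is the default table format on Databricks)
    spark.sql("""
        CREATE TABLE IF NOT EXISTS customers (
            id   BIGINT,
            name STRING
        ) USING DELTA
    """)

    # Upsert incoming changes with MERGE; "customer_updates" is a hypothetical
    # staging table or view holding the new and changed rows
    spark.sql("""
        MERGE INTO customers AS t
        USING customer_updates AS s
        ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)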
Tech Covered
Databricks, Operations