Data Engineering with Databricks

Data Transformation and Integration

Description

So you’ve heard of Databricks, but after setting up a notebook and writing some code you’re still not sure what the fuss is all about.

Yes, you’ve heard it’s Spark, but then there’s this Delta thing that’s both a data lake and a data warehouse (isn’t that what Iceberg is?). And then there’s Unity Catalog, which does more than just catalog data: it handles access management, and even does surprising things like optimising your data and giving you programmatic access to lineage and billing.

But then serverless came out and now you don’t even have to learn Spark? And of course there’s a bunch of AI stuff to use or create yourself.

So why not spend a single day learning the details of what Databricks does, and how it could make you look like a rockstar Data Engineer?

Overview

This hands-on course will give you the building blocks to become a skilled Data Engineer in Databricks, allowing you to build real-time pipelines, workloads that are optimised from the get-go, and scalable architectures.

The instructor is Holly Smith, who spent half a decade delivering formidable projects for Databricks’ own consulting team.

Outline

ETL with Spark SQL and Delta

Incremental Data Processing with Structured Streaming

Medallion Architecture

Task Orchestration with Databricks Workflows

Unity Catalog and System Tables

AMA with Databricks experts

Learning Objectives

You will leave with the Databricks notebooks used in the class, the slides used for the theory sections, and $400 of compute to try in your own Databricks Express account.

Things I will need

A laptop that can run Google Chrome, and access to the internet. It’s probably best to use your personal laptop if your work laptop is locked down.

You will need basic knowledge of SQL, up to and including joins and DML statements like MERGE. Basic Python (variables, functions and control flow) is preferred. Experience with cloud Data Engineering practices using virtual machines, object storage, identity management and metastores will also help.

Tech Covered

Python, Spark, Databricks, Data Transformation and Integration, Grounded in Reality, Inspirational