SQLBits 2020

Databricks, Delta Lake and You

Databricks, Lakes & Parquet are a match made in heaven, but explode with extra power when using Delta Lake. This session will dive into the details of how Databricks Delta works and how to make the most of it.

Data Lakes & Parquet are a match made in heaven, but they’re cranked up to overdrive with the new features of Delta Lake, available as the open source Delta Lake or the premium Databricks Delta. This session will take a deeper look at why Parquet is so good for analytics, but also highlight some of the problems that you’ll face when using immutable columnstore files.

We’ll then switch over to Databricks Delta, which takes Parquet to the next level with a whole host of features – we’ll be looking at incremental merges, transactional consistency, temporal rollbacks, file optimisation and some deep and dirty performance tuning with partitioning and Z-ordering.

If you’re planning, currently building, or looking after a Data Lake with Spark and want to get to the next level of performance and functionality, this session is for you. Never heard of Parquet or Delta? You’re about to learn a whole lot more!

Simon Whiteley's previous sessions

Behind the Hype - Architecture Trends in Data

Seasoned Data Engineer and YouTube grumbler Simon Whiteley takes us on a journey through the current industry trends and buzzwords, carving through the hype to get at the underlying ideals. Which is going to last and which is a sales gimmick? Which bandwagon might actually take you in the right strategic direction?

Nose-Dive Narratives: Slide Karaoke 2024

Get ready to wrap up a serious day of learning with a dash of humor, spontaneity, and friendly competition! SQLBits presents "Slide Karaoke" where SQLBits speakers reveal their hidden talents while vying for bragging rights. This session promises to be a one-of-a-kind experience that will leave you in stitches and awe, and the speakers scrambling for their non-existent notes!

Behind the Hype - Architecture Trends in Data

In this session, seasoned data engineer and YouTube grumbler Simon Whiteley takes us on a journey through the current industry trends and buzzwords, carving through the hype to get at the underlying ideals.

Building a Lakehouse on the Microsoft Intelligent Data Platform

This session aims to give you that context. We'll look at how Spark-based engines work and how we can use them within Synapse Analytics. We'll dig into Delta, the underlying file format that enables the Lakehouse, and take a tour of how the Synapse compute engines interact with it. Finally, we'll draw out our whole Lakehouse architecture.

Bringing Data Lakes to your Purview

A short, fast dive into the specific elements of Azure Purview that work well with Data Lakes, and how you can implement them yourselves.

Value-Driven Analytics Development

Ever spent an age releasing a data model, only to find no-one uses it? There's a better way of working, driven by both technology & agile working practices. Let me tell you about Value Driven Development & DataOps.

Databricks, Delta Lake and You

Databricks, Lakes & Parquet are a match made in heaven, but explode with extra power when using Delta Lake. This session will dive into the details of how Databricks Delta works and how to make the most of it.

The Azure Spark Showdown - Databricks VS Synapse Analytics

Azure now has two slick, platform-as-a-service Spark offerings, but which one should you choose? Separate specialist tools or a one-size-fits-all solution? Join Simon as he compares and contrasts the Spark offerings.

Azure SQL DataWarehouse: 0-100 (DWUs)

Azure SQLDW - WHAT, WHERE, WHEN and HOW to use it.