SQLBits 2024

Delta and Databricks vs SQL Server

Are you a DBA trying to learn more about Lakehouses? Or a Data Engineer wanting to better understand traditional Data Warehousing? Whether you are from a DBA or Data Engineering background, understanding the differences between traditional Warehousing systems and Lakehouses is invaluable. It allows us to work with both old and new, whether that means managing both side by side, ingesting from a traditional system into a Lakehouse, or moving completely away from a traditional SQL Warehouse to a Data Lake. In this session we will do direct comparisons between features and functionality, illustrating how these different tools are ultimately the same and very different at the same time. Finally, we will talk about how these two technologies can be part of the same data platform solution!
Once upon a time, well, 1989, we had the Data Warehouse in SQL Server and life was good in the land. It did have its challenges, particularly around loading and storing complex data types, as well as the budget! As data grew larger and more varied, the warehouse became too rigid and opinionated.

In 2012 analytics use cases were growing and Microsoft launched columnstore indexes, but they were very limited. There was, however, talk of a new land with new ideas. In 2013 Databricks started a venture which brought a new approach to data warehousing, with the separation of storage and compute. As we no longer needed the controls of a transactional database, these were lost in the change, along with guarantees like ACID.

Data lakes grew with cheap storage from the cloud providers, and Databricks became our compute. Things were great in this new land, but the same protections were not in place, as ACID had been left behind in the old land. This was the frontier, where life was harsh without consistency or durability and mistakes could cost you.

Times were changing in the old lands of SQL Server too. Clustered columnstore indexes became usable, the query engine was becoming better at adapting to diverse types of queries, and in 2016 many Enterprise features were opened up to the other editions.

Things were much more liberal in the new lands, as in 2022 Databricks open-sourced the entire Delta Lake code base, including lots of advanced features that were previously Databricks-only. SQL Server also found its new offering in the cloud, which changed its place in our data platform.

This is all great, but how do those who have been using traditional Warehousing tools, in particular SQL Server, make the leap to Delta and Databricks? In this session we will explore this question. We will do direct comparisons between features and functionality, illustrating how these different tools are ultimately the same and very different at the same time. Finally, we will talk about how these two technologies can be part of the same data platform solution! Whether you are a DBA, BI Developer, Data Engineer or any other type of Data Professional, this session will compare the two and help you understand the differences.


The key areas we will explore are:

Optimization
Storage
Compute
Security