SQLBits 2022
Looking under the hood of the parquet format
Understanding how the parquet format works helps you understand why it can retrieve your data fast, or perhaps why you struggle to get the desired performance out of your design.
And since the two formats overlap significantly in architecture, this session can also help you with design choices if you use the SQL Server columnstore type rather than parquet.
This is a bit of a deeper dive into the inner workings. But it's not merely aimed at satisfying the geek inside us. The goal of this session is to give you practical guidance, based on knowing well enough how the format works internally.
feedback link: https://sqlb.it/?7142
Speakers
André Kamman's previous sessions
FinOps, how data engineers get their cloud cost under control
Managing cloud cost is no longer a "management approves the budget" type of thing. Cloud engineers need to architect their solutions in such a way that cost can be kept under control. This is not a one-time thing. Monitoring, automatic downsizing and refactoring are all part of the yearly tasks of any cloud team. We'll discuss theory, techniques, best practices and lessons learned.
Generate test data quick, easy and lots of it with the Databricks Labs Data Generator
We're not supposed to use production data in dev, right? But generating proper test data is not easy, and it gets even harder when you need quite a lot of it. I generate terabytes of it, and without much trouble. Let me show you how!
Keynote by The Community
Ben and Rob have found some wonderful folk to actually do the important parts of the community keynote, on the theme of "How to be a nonpassive member of the data community".
Building your first Metadata Driven Azure Data Factory
Let's unleash the true power of ADF: its ability to dynamically inject metadata almost anywhere. No complicated frameworks in this session; I'll show you some simple but very powerful examples.
Looking under the hood of the parquet format
Understanding how the parquet format works helps you understand why it can retrieve your data fast, or perhaps why you struggle to get the desired performance out of your design.