22-25 April 2026

Approximate functions: How do they work?

Proposed session for SQLBits 2026

TL;DR

In this session, we will look at the internals of how the "approximate processing" functions in SQL Server 2019 and 2022 are implemented, and what their benefits and drawbacks are.

Session Details

Sometimes, a close approximation is good enough. And sometimes, a close approximation is a lot faster. Microsoft has introduced “Approximate Query Processing” (the APPROX_COUNT_DISTINCT and APPROX_PERCENTILE functions) to give you exactly that benefit when you don't need exact answers.

But do you have a good response when you propose to use these functions and your manager asks you to explain how they work first? Or is your only option to claim "black magic by smart Microsoft engineers"?

The algorithms used are not a secret: HyperLogLog and KLL Sketch. And now you most likely know exactly as much as you knew before. And when you google those terms ... you end up with a headache.

Time to join me for a session where I explain the black magic in the simplest possible terms, so that you can then explain it to your manager!

3 things you'll get out of this session

Explain the algorithm used for APPROX_COUNT_DISTINCT
Explain the algorithm used for APPROX_PERCENTILE_DISC and APPROX_PERCENTILE_CONT
Explain the error margins of these approximate functions

Speakers

Hugo Kornelis

sqlserverfast.com/blog

Hugo Kornelis's previous sessions

Here’s the execution plan … now what?
This session is for those who have learned about execution plans, but find that the theory lessons have not prepared them for the messy reality of real production code and execution plans. Using examples more complex than is typical for conference sessions, I will show how execution plans can be used to pinpoint issues and fix them.
 
Parameter Sensitive Plan Optimization in SQL 2022 ... As Cool as it Sounds?
We'll provide a balanced perspective of the benefits and limitations of the new Parameter Sensitive Plan Optimization feature in SQL 2022 to help you decide if it's right for your environment.
 
Here’s the execution plan … now what?
You know where to find an execution plan. You have taken your first steps reading them. But how are you going to apply this knowledge to real-world problems?
 
Fast Focus: Scalar User-defined Functions in SQL Server 2019
SQL Server 2019 introduces FROID, a framework to inline user-defined functions, promising much better performance. What problem does it solve? And how does it work?
 
Execution plans ... where do I start?
Execution plans are key to understanding bad query performance. But they can be overwhelming to the new user. Where to start? This session will show the basics!
 
From adaptive to intelligent: query processing in SQL 2019
SQL Server 2019 includes new query processing features such as batch mode on rowstore, memory grant feedback, approximate query processing, and more. How do these work? Are they as good as Microsoft wants us to believe?
 
Normalization Beyond Third Normal Form
Many people think that normalization stops at Third Normal Form. But there are lots of higher normal forms. And they are not as complex or as irrelevant as often claimed. If you want to design better databases, then come attend this session!
 
Everything you always wanted to know about MERGE
In this demo-rich session, Hugo Kornelis shows how the full syntax of MERGE enables more than just synchronizing data. You'll get an overview of all the available options, plus a few surprising pitfalls you may not be aware of.
 
SQL Server 2012: Column store indexes
This session will present you with a fascinating behind-the-scenes deep-dive view of the new column store index feature. How do column store indexes work? How are they built? And how can they yield such enormous performance boosts to some workloads?