SQLBits 2020

Common Data Science Mistakes

Drawing on more than 20 years of field experience, this session tackles a few of the most common statistical and data science mistakes.

In the middle of deploying a model, a team of data scientists realizes that the predictions are "somewhat off". Troubleshooting is on the horizon, but what should they do? This session will guide you through the most common mistakes data scientists and statisticians make when preparing and engineering data using T-SQL or any other database system.

Furthermore, we will explore common statistical and data science mistakes made when modeling data, extracting know-how from the data, finding hidden patterns, and running different tests against the structural models, mainly using R, Python, or Spark. What-not-to-do will be replaced with what-to-do explanations using sample datasets and sample code.

Some statistical knowledge or background is a plus!