SQLBits 2024

Common Data Science Mistakes

The most common statistical and data science mistakes and traps that we all try to avoid!
In the middle of deploying a model, a team of data scientists realizes that the predictions are "somewhat off". Troubleshooting is on the horizon, but what should they do? This session will guide you through the most common mistakes data scientists and statisticians make when preparing and engineering data using T-SQL or any other database system. Furthermore, we will explore common statistical and data science mistakes made when modeling data, extracting know-how from the data, finding hidden patterns, and running different tests against structural models, mainly using R, Python, or Spark. Each what-not-to-do will be replaced with a what-to-do explanation using sample datasets and sample code. Some statistical knowledge or background is a plus!