SQLBits 2017
Machine Learning at Scale with Apache Spark
Richard will show you, starting from no knowledge of Spark, how to navigate the Spark framework ecosystem and build complex batch and near-real-time applications that use Spark's machine learning library, MLlib.
In this session Richard Conway will show you, starting from no knowledge of Spark, how to navigate the Spark framework ecosystem and build complex batch and near-real-time applications that use Spark's machine learning library, MLlib. He'll cover everything from data shaping, basic statistics at scale, and normalising through to testing, training, and building services and complex pipelines underpinned by machine learning. This is a very fast-paced, demo-heavy session that goes from nothing to big data and machine learning superstar by virtue of Apache Spark. If you're thinking of using Hadoop in the future, this is the one session you don't want to miss.
Speakers
Richard Conway's other proposed sessions for 2026
Databricks vs Fabric: When to choose, when to combine, and why it’s confusing - 2026
Richard Conway's previous sessions
Performance Optimization with Azure Databricks
Azure Databricks has become one of the staples of big data processing. See how to make the most of it by understanding how Spark works under the covers.
Machine Learning at Scale with Apache Spark
Richard will show you, starting from no knowledge of Spark, how to navigate the Spark framework ecosystem and build complex batch and near-real-time applications that use Spark's machine learning library, MLlib.