SQLBits 2022
Redesigning Scalable Machine Learning with Open Source Apache Spark
Building an end-to-end machine learning workflow at scale with open source technologies on Azure
To create good products that leverage AI, you need to run machine learning algorithms on massive amounts of data. Distributed machine learning frameworks, such as Spark ML, help simplify the development and use of large-scale machine learning. With the Apache Spark libraries, you can cover the entire basic machine learning workflow: loading and preparing data, extracting features, fitting the model, and scoring. Software and data engineers must understand this workflow to leverage what already exists and build better products; likewise, tech leads and architects need to understand the workflow and the available options to design better architectures and software. Join Adi Polak to explore the end-to-end machine learning workflow and learn how to build one with Apache Spark.
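The abstract describes that workflow in four steps: load and prepare data, extract features, fit the model, and score. A minimal sketch of what this could look like with PySpark's ML Pipeline API is shown below; the text-classification setup, the "text" and "label" column names, and the file path are illustrative assumptions, not details from the talk.

    # Minimal sketch of the loading -> features -> fit -> score workflow with Spark ML.
    # Dataset, column names, and path are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import Tokenizer, HashingTF
    from pyspark.ml.classification import LogisticRegression

    spark = SparkSession.builder.appName("ml-workflow-sketch").getOrCreate()

    # 1. Load and prepare data (assumed schema: a 'text' column and a numeric 'label' column).
    df = spark.read.parquet("/data/training.parquet")  # hypothetical path
    train, test = df.randomSplit([0.8, 0.2], seed=42)

    # 2. Extract features: tokenize the raw text and hash tokens into a feature vector.
    tokenizer = Tokenizer(inputCol="text", outputCol="words")
    hashing_tf = HashingTF(inputCol="words", outputCol="features")

    # 3. Fit the model: chain the stages into a single Pipeline and train it on the training split.
    lr = LogisticRegression(featuresCol="features", labelCol="label")
    model = Pipeline(stages=[tokenizer, hashing_tf, lr]).fit(train)

    # 4. Score: apply the fitted pipeline to held-out data.
    predictions = model.transform(test)
    predictions.select("label", "prediction").show(5)

Chaining the stages into a single Pipeline keeps feature extraction and model training in one fitted object, so the same transformations are applied consistently at scoring time.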