More and more customers looking to modernize their analytics are exploring the data lake approach in Azure. Typically, they are most challenged by a bewildering array of poorly integrated technologies and a variety of data formats and data types, not all of which are conveniently handled by existing ETL technologies. In this session, we’ll explore the basic shape of a modern ETL pipeline through the lens of Azure Data Lake. We will explore how this pipeline can scale from one to thousands of nodes at a moment’s notice to respond to business needs; how its extensibility model allows pipelines to integrate procedural code written in .NET languages, or even in Python and R; how that same extensibility model allows pipelines to handle a variety of formats such as CSV, XML, JSON, images, or any enterprise-specific document format; and finally how the next generation of ETL scenarios is enabled through the integration of intelligence into the data layer in the form of built-in Cognitive capabilities.
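As an illustrative sketch (not taken from the session itself), the extensibility model described above can be exercised in U-SQL with the Azure Data Lake Python extensions. All paths, column names, and the sample logic below are hypothetical; the sketch assumes the Python extensions are installed in the Data Lake Analytics account:

```
// Hypothetical U-SQL sketch: mix declarative extraction with procedural Python.
// Assumes the U-SQL Python extensions assembly is registered in the account.
REFERENCE ASSEMBLY [ExtPython];

// Inline Python: the extension passes each group to usqlml_main as a
// pandas DataFrame and expects a DataFrame back.
DECLARE @pyScript string = @"
def usqlml_main(df):
    df['mentions'] = df.text.str.count('Azure')
    return df
";

// Extract rows from a CSV file (path and schema are illustrative).
@input =
    EXTRACT id int,
            text string
    FROM "/input/events.csv"
    USING Extractors.Csv(skipFirstNRows : 1);

// Apply the Python code as a reducer over each id group.
@result =
    REDUCE @input ON id
    PRODUCE id int, text string, mentions long
    USING new Extension.Python.Reducer(pyScript : @pyScript);

OUTPUT @result
TO "/output/mentions.csv"
USING Outputters.Csv(outputHeader : true);
```

The same pattern extends to R scripts, .NET user-defined operators, and custom extractors for enterprise-specific formats.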
Presented by Michael Rys at SQLBits XVII