More and more customers looking to modernize their analytics are exploring the data lake approach in Azure. Typically, they are most challenged by a bewildering array of poorly
integrated technologies and by a variety of data formats and data types, not all of
which are conveniently handled by existing ETL technologies. In this session,
we’ll explore the basic shape of a modern ETL pipeline through the lens of
Azure Data Lake. We will explore how this pipeline can scale from one to
thousands of nodes at a moment’s notice to respond to business needs, how its
extensibility model allows pipelines to simultaneously integrate procedural
code written in .NET languages or even Python and R, how that same
extensibility model allows pipelines to deal with a variety of formats such as
CSV, XML, JSON, images, or any enterprise-specific document format, and finally
how the next generation of ETL scenarios is enabled through the integration
of Intelligence in the data layer in the form of built-in Cognitive
Presented by Michael Rys at SQLBits XVII