SQLBits 2018
Modernizing ETL with Azure Data Lake
Modernizing ETL with Azure Data Lake: Hyperscale, Multi-format, Multi-platform, & Intelligent
More and more customers looking to modernize their analytics are exploring the data lake approach in Azure. Typically, they are most challenged by a bewildering array of poorly
integrated technologies and by a variety of data formats and data types, not all of
which are conveniently handled by existing ETL technologies. In this session,
we’ll explore the basic shape of a modern ETL pipeline through the lens of
Azure Data Lake. We will explore how this pipeline can scale from one to
thousands of nodes at a moment’s notice to respond to business needs, how its
extensibility model allows pipelines to simultaneously integrate procedural
code written in .NET languages or even Python and R, and how that same
extensibility model allows pipelines to deal with a variety of formats such as
CSV, XML, JSON, images, or any enterprise-specific document format. Finally, we will
explore how the next generation of ETL scenarios is enabled through the integration
of intelligence in the data layer in the form of built-in cognitive
capabilities.
Speakers
Michael Rys's previous sessions
Big Data & Data W'housing Together w/ Azure Synapse Analytic
Come learn how Azure Synapse brings together big data and data warehousing through new technology and a unified development experience.
Big Data Processing with .NET and Spark
Come learn how to use .NET and Spark together in Azure Synapse and elsewhere to cook and analyze your data!
Execute your custom code in Python/.Net/R @ Scale with U-SQL
In this session, I will showcase how you can bring your Python, R, and
.NET code to Azure Data Lake and apply it at scale using U-SQL.
Modernizing ETL with Azure Data Lake
Modernizing ETL with Azure Data Lake: Hyperscale, Multi-format, Multi-platform, & Intelligent
Scaling out your Cloud Database with SQL Azure Federations
In this presentation, we provide an introduction to SQL Azure Federations and also show
some interesting patterns that provide additional capabilities such as
scale-out query processing, cross-shard schema management and multi-key sharding.