SQLBits 2023
Breaking out of the habit of bulk loading everything
Lessons learned from integrating streaming data into an existing data warehouse and analytics platform built on conventional bulk-loading patterns from on-prem systems and databases. When faced with integrating several new systems into the data environment, none of which have an accessible database, some reskilling, retooling, and rethinking of the ingestion patterns was needed. How does a team that is used to working with relational databases, Integration Services, and daily updates of dimensions and facts deal with AMQP/MQTT interfaces and the quest for near-real-time updates? Coming from an on-prem Microsoft data stack (SQL Server, SSIS, SSAS, SSRS), we look into Azure services. What are the architecture patterns that help us capture and process the business events we would like to analyse? Can we combine requirements for near-real-time data analysis and actions with requirements for long-term (10+ years) storage and analysis? How do we compare the cost of larger ready-made PaaS building blocks with custom-built code? The session describes how Event Hubs, Stream Analytics, Azure Functions, and a host of other cloud services were integrated with the existing platform and daily operations, all developed and run by a small team.