Data Engineering the Hard Way: Developing Spark Applications on Kubernetes

Self-hosting Spark in Containers
Managed services like Databricks are fantastic for getting data teams to work quickly. But what if such an option doesn’t exist in your environment?

And what if a DIY approach created other opportunities, such as enabling developers to work in a realistic environment locally on the desktop?

Using containers on top of Kubernetes has enabled us to achieve this.

Join us for a discussion of how we started on this journey, and what we learned along the way.

