Streaming at Scale: Unity's Flink Data Platform
Proposed session for SQLBits 2026
TL;DR
Flair is Unity’s SQL-first data platform built on Apache Flink and Apache Paimon to power real-time and batch processing at massive scale. It simplifies stream processing while handling 1T+ messages and 1PB of data daily across Unity Ads.
Session Details
I am a staff data engineer on the data platform team at Unity, where we are building Flair, Unity's enterprise-grade data platform. It is purpose-built to handle the massive scale of Unity's Ads division, processing over 1 trillion messages and 1 PB of new data every single day from nearly 9 billion devices worldwide.
The Challenge:
As Unity's data scale grew exponentially, we faced significant challenges with Apache Flink adoption. The Java API was too complex for most users, Python support was limited and slower, and teams were constantly reinventing the wheel with repetitive enrichments, configurations, UDFs, and I/O operations. Deployment was difficult, and maintaining these jobs at our scale became unsustainable.
The Solution:
We built Flair as a SQL-first platform that abstracts away all the complexity of distributed stream processing. By leveraging Apache Flink for real-time computation and Apache Paimon as our streaming data lake storage layer, we created a unified platform that handles both streaming and batch workloads seamlessly.
What Flair Provides:
- Developer-Friendly SQL Interface: Users write business logic in Flink SQL - no Java or Python required. They can work locally in IntelliJ IDEA or use a drag-and-drop UI that we built
- Full Infrastructure Management: Automated deployment, resource allocation, scaling, and orchestration on Kubernetes
- Built-in Observability: Comprehensive monitoring, alerting, and debugging tools
- Streamlined Workflow: Local development and testing with session clusters, then one-click deployment to production
- Enterprise Features: Role-based access control, audit logging, and governance
- Unified Storage with Paimon: Efficient handling of both streaming and batch data with a lakehouse architecture
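To make the SQL-first idea concrete, here is a minimal sketch of how a platform could generate the connector boilerplate so that users supply only the business query. It is not Flair's actual implementation: `KafkaSource`, `kafka_source_ddl`, `build_job_script`, the table names, the topic, and the broker address are all hypothetical, and the sink table is assumed to be registered already. The `WITH` option keys are those of the Apache Flink Kafka SQL connector.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class KafkaSource:
    """Hypothetical declarative description of a Kafka-backed source table."""
    table: str
    topic: str
    bootstrap_servers: str
    columns: Dict[str, str]  # column name -> Flink SQL type


def kafka_source_ddl(src: KafkaSource) -> str:
    """Render a Flink SQL CREATE TABLE statement for a Kafka source."""
    cols = ",\n".join(
        f"  `{name}` {sql_type}" for name, sql_type in src.columns.items()
    )
    options = {
        "connector": "kafka",
        "topic": src.topic,
        "properties.bootstrap.servers": src.bootstrap_servers,
        "properties.group.id": "example-group",  # hypothetical consumer group
        "scan.startup.mode": "earliest-offset",
        "format": "json",
    }
    opts = ",\n".join(f"  '{key}' = '{value}'" for key, value in options.items())
    return f"CREATE TABLE `{src.table}` (\n{cols}\n) WITH (\n{opts}\n);"


def build_job_script(src: KafkaSource, user_sql: str) -> str:
    """Prepend the generated source DDL to the user's business logic."""
    return kafka_source_ddl(src) + "\n\n" + user_sql.strip()


# The only part a user would write in this sketch: plain Flink SQL.
# The sink table `impressions_per_campaign` is assumed to exist already.
USER_SQL = """
INSERT INTO impressions_per_campaign
SELECT campaign_id, COUNT(*) AS impressions
FROM ad_events
WHERE event_type = 'impression'
GROUP BY campaign_id;
"""

source = KafkaSource(
    table="ad_events",
    topic="ad-events",                 # hypothetical topic name
    bootstrap_servers="kafka:9092",    # hypothetical broker address
    columns={"campaign_id": "STRING", "event_type": "STRING"},
)

script = build_job_script(source, USER_SQL)
print(script)
```

Keeping the source definition declarative means the repetitive connector configuration lives in one place, while the submitted job is still ordinary Flink SQL.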
The Impact:
Flair has democratized real-time data processing at Unity, enabling teams across the company to build production-grade streaming pipelines in days instead of months. Individual Flink jobs on our platform process up to 100 billion Kafka messages daily with 6TB of state, and we can backfill historical data, processing 1.5 petabytes in a single job. The platform has become the backbone of Unity's real-time analytics, powering everything from ad attribution to fraud detection at unprecedented scale.
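For a rough sense of what these daily figures mean per second, the back-of-envelope calculation below converts them into average rates. It assumes decimal units (1 PB = 10^15 bytes), treats "over 1 trillion" as exactly 10^12, and spreads traffic evenly across the day, so real peaks would be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_total: float) -> float:
    """Average per-second rate, assuming traffic is spread evenly over a day."""
    return daily_total / SECONDS_PER_DAY


# Platform-wide figures from the session description (treated as lower bounds).
platform_msgs_per_s = per_second(1e12)    # 1 trillion messages per day
platform_bytes_per_s = per_second(1e15)   # 1 PB per day, decimal units assumed
avg_msg_bytes = 1e15 / 1e12               # implied average message size

# Largest individual job: up to 100 billion Kafka messages per day.
job_msgs_per_s = per_second(100e9)

print(f"platform: {platform_msgs_per_s / 1e6:.1f} M msgs/s, "
      f"{platform_bytes_per_s / 1e9:.1f} GB/s, "
      f"{avg_msg_bytes:.0f} bytes/msg on average")
print(f"largest job: {job_msgs_per_s / 1e6:.2f} M msgs/s")
```

Under these assumptions the platform averages roughly 11.6 million messages and 11.6 GB per second, and the largest single job averages roughly 1.16 million messages per second.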
3 things you'll get out of this session
* Learn how Unity built a SQL-first streaming platform using Flink and Apache Paimon at extreme scale
* Understand how to abstract Flink complexity while supporting both streaming and batch workloads
* Gain practical insights into operating, scaling, and governing real-time data platforms in production
Speakers
Asaf Sneh