22-25 April 2026

Apache Iceberg Data Analytics with Python

Proposed session for SQLBits 2026

TL;DR

In this talk, we’ll explore how to unlock the full potential of Iceberg for data analytics in Python. From working with PyIceberg for low-level Iceberg table operations to leveraging Apache Polaris for catalog management, we’ll cover the essential tools and libraries available for Python users.

Session Details

Python continues to be a leading language for data analytics, and Apache Iceberg has emerged as a powerful table format for managing large-scale datasets. In this talk, we’ll explore how to unlock the full potential of Iceberg for data analytics in Python. From working with PyIceberg for low-level Iceberg table operations to leveraging Apache Polaris for catalog management, we’ll cover the essential tools and libraries available for Python users.

We’ll also dive into DataFusion, a high-performance query engine that integrates seamlessly with Iceberg, and the dremio-simple-query library, which simplifies querying Iceberg tables through Dremio. This session will provide hands-on examples, best practices, and real-world scenarios to help you harness Python’s flexibility and Iceberg’s scalability for analytics workloads. Whether you’re a data scientist, engineer, or analyst, you’ll leave with practical insights into building a Python-powered data analytics pipeline with Apache Iceberg.

3 things you'll get out of this session

Learn about Apache Iceberg and how to use it with Python