Apache Iceberg and Hex

Query Iceberg tables and build interactive reports on top of them with Hex

Matthew David

Data

May 15, 2023

Apache Iceberg is an open-source table format, originally developed at Netflix, for storing very large analytic datasets. Many companies use Iceberg tables to run large-scale transformations on that data.

However, many of the tools teams currently use for data work on top of Iceberg tables were built for a previous era: they don't run in the cloud and offer no collaboration features. This is where Hex comes in.

Why Use Apache Iceberg?

Teams often choose Apache Iceberg when they prefer to manage their own data, without relying on on-premises or managed solutions, and want more control over security and partitioning. They may also want a table format that multiple query engines, such as Athena, Google BigQuery, or Snowflake, can access with consistent performance.

With Iceberg, you can create and manage tables more flexibly. Its schema evolution lets you add, rename, or drop columns without complex and time-consuming schema migrations. Iceberg also tracks table snapshots, which gives you versioning and time travel, essential features for many use cases.
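As a rough illustration of time travel, the sketch below builds a query that reads a table as it existed at an earlier point in time. The table name `sales.orders` and the helper `time_travel_query` are hypothetical, and the clause syntax shown for Dremio and Spark SQL is an assumption to verify against your engine's documentation, since each engine spells time travel differently.

```python
from datetime import datetime

# Time-travel clause per engine. These templates are assumptions for
# illustration; confirm the exact syntax in your engine's documentation.
TIME_TRAVEL_CLAUSES = {
    "dremio": "AT TIMESTAMP '{ts}'",
    "spark": "TIMESTAMP AS OF '{ts}'",
}


def time_travel_query(table: str, as_of: datetime, engine: str = "dremio") -> str:
    """Build a SELECT that reads `table` as it existed at `as_of`."""
    if engine not in TIME_TRAVEL_CLAUSES:
        raise ValueError(f"no time-travel template for engine: {engine}")
    # Format as 'YYYY-MM-DD HH:MM:SS.mmm' (naive datetime, no timezone suffix).
    ts = as_of.isoformat(sep=" ", timespec="milliseconds")
    clause = TIME_TRAVEL_CLAUSES[engine].format(ts=ts)
    return f"SELECT * FROM {table} {clause}"


# Hypothetical table name; the resulting string could be run as an ordinary query.
print(time_travel_query("sales.orders", datetime(2023, 5, 1, 9, 30)))
```

Keeping the engine-specific clause in a small lookup table makes it easy to see that only the time-travel syntax changes between engines, while the underlying Iceberg table stays the same.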

Query Iceberg Tables

Many query engines run on top of Iceberg tables, and Hex can query them directly. In this example we use Dremio and query the Iceberg table like any other table.

Collaborate on Iceberg Tables

In Hex you can simply add a chart cell or use Python to visualize the data returned by your query on the Iceberg table. These visualizations can then be composed into an interactive report to share with the whole team, while respecting access controls on the data.
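For the Python route, a minimal sketch might look like the following. It assumes the query result is available as a pandas DataFrame; the column names `region` and `amount`, the sample rows, and the helper `totals_by_region` are made up for illustration.

```python
import pandas as pd


def totals_by_region(orders: pd.DataFrame) -> pd.DataFrame:
    """Sum `amount` per `region`, largest first, ready to plot as a bar chart."""
    return (
        orders.groupby("region", as_index=False)["amount"]
        .sum()
        .sort_values("amount", ascending=False)
        .reset_index(drop=True)
    )


# Stand-in for the DataFrame returned by the query on the Iceberg table.
orders = pd.DataFrame(
    {
        "region": ["EMEA", "AMER", "EMEA", "APAC"],
        "amount": [120.0, 300.0, 80.0, 50.0],
    }
)
summary = totals_by_region(orders)
print(summary)
```

The resulting `summary` frame can be plotted with `summary.plot.bar(x="region", y="amount")`, which requires matplotlib, or handed to a chart cell.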

Conclusion

Apache Iceberg provides users with the control they want over their data, and Hex provides them with a powerful interface to work with that data. This allows data teams to easily collaborate on data engineering tasks, data science, and machine learning when using Iceberg tables. You don’t have to choose between open-source storage and managed notebooks for analytics and data science.

This is something we think a lot about at Hex, where we're creating a platform that makes it easy to build and share interactive data products which can help teams be more impactful.

If this is interesting, click below to get started or to check out opportunities to join our team.

✨ Get started for free

👩‍💻 Open roles
