Quick level set so all readers are on the same page:
❄️Snowflake is a powerful and fast SQL data warehouse.
⬡ Hex is a collaborative data workspace for doing analytics and data science with Python and SQL.
There are many reasons they work well together, but in the spirit of being hex-y, we'll keep it to just six here.
Hex supports a vast selection of integrations and data sources, but we're especially proud to have recently become a Snowflake Select Technology Partner. In light of that, we figured we'd explain why we think Snowflake in particular is such a great match for Hex.
![Look at those hexagons! (src: Weatherwise magazine, AMS)](../../images/blog/ice-forming.jpeg)
The shape of water molecules causes nearly all ice crystals to form a six-sided structure, so although some snowflakes have 6 points rather than true sides, they're still pretty much hexagons. Hex loves hexagons (and all things that come in sixes). Do you really need 5 more reasons?
Many Python-based data science tools lack true SQL support, which means they also lack robust database configuration options. This presents not only user experience problems, but legitimate security concerns— there are often credentials floating around in environment variables or even worse, being copied and pasted into local notebooks and accidentally pushed to git. Either way, they're being handled regularly and by many people— this is generally Not Good.
Hex provides a secure connection process that, once configured by an administrator, allows users to easily query data without needing to handle usernames, passwords, or private keys. A Snowflake connection can be configured at the workspace level, making it available to all users, or at the project level where only the members of one project have access.
There's nothing worse than troubleshooting authentication issues across packages and tools. Configure your Snowflake connection once in Hex, and never worry about it again.
Even if you're confident that credentials are securely managed in your Python notebook, you don't want to write SQL in block quotes inside of a notebook cell, without syntax and error highlighting or any of the features you expect from a SQL IDE.
But users of notebooks don't tend to have a choice. If you want to get your data from a DB into your notebook, you're typically forced to blunder through this process. This leads to two problems.
First, SQL just takes longer to write without error highlighting, autocompletion, and proper formatting. It's less fun, you make more mistakes, and queries are harder to debug.
Second, to minimize writing SQL (see point #1), folks sometimes just run a
SELECT * FROM all_the_tables at the top of the notebook and then work with an in-memory version of the data. This, of course, is also slow— your local Python environment, Mac M1 chip and all, can't hope to compete with Snowflakes' enormous and optimized infrastructure.
This is what makes Hex such a perfect partner to Snowflake. Hex's data workspace includes true first-class support for both SQL and Python. You don't need to use a Python connector to query data from Snowflake; Just add a new SQL cell and write queries against your Snowflake data warehouse with full auto-complete, a built-in schema browser, smart formatting, and caching options. It's a night-and-day comparison.
This means you can easily and seamlessly make queries to Snowflake at any point in your project, swapping between Python and SQL wherever it makes sense. Analysts don't even need to switch between Hex and the Snowflake IDE to prototype queries; with the Hex schema browser, they can do all their work in one place.
Which leads to the next point...
Hex not only embraces SQL as a first-class citizen, it provides connection points between SQL and Python cells in a Hex project. You can parameterize SQL queries with Python variables from the rest of a project, allowing both for dynamic user input in queries and extremely complex chained queries— Or even queries that span multiple data sources.
Marketing team provide a .csv of the latest event attendees again rather than ETLing the data? No problem. Just load the file into a Hex project, read it as a dataframe, and use that as input to parameterize a Snowflake query:
You can even use complex jinja templating (h/t dbt for popularizing jinja + sql!) to build deeply complex queries with for loops, conditional execution, and more. Again, this lets data practitioners leverage the power and speed of Snowflake's distributed computing architecture, while maintaining the flexibility of a Python environment.
Hex powers complex analytical workloads, but it also serves data up to end users as curated and interactive "Data Apps". These users expect low query latency and app-like responsiveness. Snowflake's performance and scalability means data scientists can run queries on enormous datasets behind the scenes, while still exposing a performant and usable interface to that data via a Hex app.
Hex also includes powerful caching so you don't accidentally re-run slow or costly queries. You can choose whether to always cache a query for end users, cache a query on a set schedule, or let users always run a fresh query.
Hex is a perfect tool for doing complex analysis on a variety of data types. Because Hex places SQL and Python so close together, it can be frustrating when a nested dataset you know how to work with using pandas syntax doesn't work in a SQL query. Snowflake's rich support for semi-structured and nested columns makes it easy to store, query, and visualize messy or complex data.
This makes blended Python-SQL workflows in Hex feel natural and powerful. Snowflake SQL functions like
GET just make sense to have at your disposal when you're coming out of a Python cell that was using
pd.json_normalize or exploding a json with