Every day, thousands of people use Hex and Snowflake together to query, analyze, and explore their data.
Today we’re excited to announce that we’ve made Hex and the Snowflake data warehouse work even better together. We’ve built first-class support for Snowpark into Hex, giving users a new and powerful interface to their data. We’ve added Hex to Snowflake Partner Connect, so it’s only one click to set up a new trial with preconfigured access to your Snowflake instance.
With our newly minted Premier Partner status, this is just the tip of the iceberg of what Hex and Snowflake can do together. Let’s dig in!
Hex now comes with built-in support for Snowflake’s new Snowpark for Python API, which lets users write familiar Python code that’s executed on Snowflake, taking advantage of the massive scalability and distributed compute of the cloud data warehouse.
Under the hood, Snowpark operates with DataFrames just like those used by pandas or other Hex cells, with one major difference: Snowpark DataFrames are executed lazily, so data isn’t pulled into Hex until a user runs an evaluation on it. This means that for many workflows, Snowpark DataFrames are more efficient than pandas DataFrames, and can help avoid the memory limits common in traditional notebook workflows.
The example below uses the Snowpark API to calculate the min and max order date. The data isn’t actually pulled into Hex just yet.

    from snowflake.snowpark import functions as F

    # Builds a lazy query plan; nothing is executed or downloaded at this point.
    min_max_date = hex_snowpark_session.table("PUBLIC.SUPERSTORE_DATASET").select(
        F.min('ORDER_DATE'), F.max('ORDER_DATE')
    )
When we run .collect() to evaluate the DataFrame and return the results, it doesn’t matter how large the underlying dataset is; the query is quickly executed on Snowflake’s infrastructure, and only the bite-sized rolled-up final results are pulled into Hex.
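To make the idea of lazy execution concrete, here’s a toy sketch in plain Python. It is not how Snowpark is implemented (Snowpark compiles the plan to SQL and runs it on Snowflake); the `LazyFrame` class and the sample `orders` rows are made up purely for illustration. The point is only that transformations are recorded first and applied when `collect()` is called.

```python
class LazyFrame:
    """Toy stand-in for a lazily evaluated DataFrame (illustration only)."""

    def __init__(self, rows, steps=()):
        self._rows = rows            # source data, untouched until collect()
        self._steps = tuple(steps)   # recorded transformations, not yet applied

    def filter(self, predicate):
        # Record the step and return a new frame; no rows are scanned here.
        return LazyFrame(self._rows, self._steps + (("filter", predicate),))

    def select(self, *columns):
        # Same idea: remember which columns to keep, but don't touch the data.
        return LazyFrame(self._rows, self._steps + (("select", columns),))

    def collect(self):
        # Only now is the recorded plan actually executed.
        rows = self._rows
        for kind, arg in self._steps:
            if kind == "filter":
                rows = [r for r in rows if arg(r)]
            else:
                rows = [{c: r[c] for c in arg} for r in rows]
        return list(rows)


# Hypothetical sample data standing in for a warehouse table.
orders = [
    {"ORDER_ID": 1, "SALES": 120.0},
    {"ORDER_ID": 2, "SALES": 35.5},
    {"ORDER_ID": 3, "SALES": 410.0},
]

plan = LazyFrame(orders).filter(lambda r: r["SALES"] > 100).select("ORDER_ID")
result = plan.collect()
print(result)  # [{'ORDER_ID': 1}, {'ORDER_ID': 3}]
```

With Snowpark the same pattern applies, except the "plan" runs inside Snowflake and only the final rows travel back to Hex.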
With packages to install and environments to configure, the road to getting started with Snowpark can be time-consuming and technically complex. With the great power of Python also comes great
Hex sidesteps this by auto-magically installing the Snowpark API in any project connected to Snowflake, making it the easiest way to get started with Snowpark and Python. There’s no worrying about hosting a notebook, configuring environment variables, or downloading anything locally.
Here’s a step-by-step walkthrough (complete with live demo video!) of just how easy it is:
Assume I’m a business analyst for Superstore. I’ve been asked to pull a list of our top 100 customers for a special promotion. I’ll use Snowpark to solve the problem.
💡 This example uses hextoolkit, a special Python package usable only from Hex Python projects. It supports programmatic access to data connections and (in the future) other Hex functionality.
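For readers following along without the video, the top-100-customers analysis could look roughly like the sketch below. This is an assumption-laden outline, not the notebook’s actual code: the function name `top_customers` and the column names `CUSTOMER_NAME`, `SALES`, and `TOTAL_SALES` are hypothetical, and only the table name comes from the earlier example. It expects a Snowpark session such as the `hex_snowpark_session` shown above.

```python
def top_customers(session, table_name="PUBLIC.SUPERSTORE_DATASET", n=100):
    """Build a lazy Snowpark DataFrame of the top-n customers by total sales."""
    # Imported inside the function so the definition itself doesn't require
    # Snowpark to be installed until the function is actually called.
    from snowflake.snowpark import functions as F

    return (
        session.table(table_name)
        .group_by("CUSTOMER_NAME")                    # assumed column name
        .agg(F.sum("SALES").alias("TOTAL_SALES"))     # assumed column name
        .sort(F.col("TOTAL_SALES").desc())            # biggest spenders first
        .limit(n)                                     # keep only the top n rows
    )
```

In a Hex project connected to Snowflake, calling `top_customers(hex_snowpark_session).to_pandas()` would run the aggregation on Snowflake and bring back only the 100 summary rows as a pandas DataFrame.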
Prefer prototyping in SQL? As always, Hex supports a polyglot workflow for Snowpark. You can return the results of any Snowflake SQL cell into a Snowpark DataFrame.
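Returning a SQL cell as a Snowpark DataFrame is a Hex feature, but the Snowpark API has a comparable bridge of its own: `Session.sql` takes a query string and hands back a lazily evaluated DataFrame. The sketch below shows that route for the same top-customers question. As before, the helper name `top_customers_sql` and the columns `CUSTOMER_NAME` and `SALES` are assumptions for illustration.

```python
def top_customers_sql(session, table_name="PUBLIC.SUPERSTORE_DATASET", n=100):
    """Return a lazy Snowpark DataFrame for the top-n customers, defined in SQL."""
    # table_name is interpolated directly, so it should come from trusted code,
    # never from end-user input. CUSTOMER_NAME and SALES are assumed columns.
    query = f"""
        SELECT CUSTOMER_NAME, SUM(SALES) AS TOTAL_SALES
        FROM {table_name}
        GROUP BY CUSTOMER_NAME
        ORDER BY TOTAL_SALES DESC
        LIMIT {int(n)}
    """
    # session.sql only builds the DataFrame; the query runs on Snowflake
    # once an action such as .collect() or .to_pandas() is called.
    return session.sql(query)
```

Because the result is still a Snowpark DataFrame, it can be filtered or joined further with the Python API before anything is pulled into Hex.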
Watch as I find my top customers ranked by sales using Snowpark, live!
If you want to take a look at the notebook yourself, check it out here.
At Hex, we talk a lot about having a low floor and a high ceiling for our users. For folks who are new to the data science and analytics space, it’s pretty scary to think about all the overhead you might need to get started with Snowpark without Hex. We’re excited to support Snowflake SQL-savvy users who are new entrants into Python and data science, while also ensuring we provide a seamless experience for the Python jockeys of the world.
Snowflake Partner Connect gives Snowflake customers a one-click way to trial Hex with immediate, preconfigured access to their underlying data in Snowflake.
Check out our tile on the Partner Connect landing page (top right corner of the Snowflake UI). Snowflake will auto-generate objects that you can use for the Hex connection, or you can select the database(s) you want to power your Hex project. Either way, you’re off to the races.