
Curate your Hex Workspace for a better Explore experience

Curating your data for Explore makes everyone more productive


More and more people outside of the data team — from product, finance, ops, and marketing — are relying on Hex to get clearer and more specific answers on what’s happening in their world. This is exciting! It means data teams are producing powerful work in Hex that business partners want to dive into further.

We wanted these business partners (and data team members, too!) to have a first-class experience working with data in a way that didn’t require code. So we built a new no-code experience, called Explore. It lets business partners get their hands on the data directly using visual data exploration to answer their follow-up questions in Hex. No waiting on the data team or learning code.

If your business users are already using Hex or you have a hunch they’ll start to soon, we recommend doing just a bit of curation at the data warehouse level. This will ensure that the data they explore and summon with Magic AI can be trusted and is relevant to them.

Curation at the warehouse level for better Explore experiences

To start this process, head over to the Data browser and complete the following steps.

1. Create a data connection that’s dedicated to non-data team users

Chances are your business users don't need to wade through every nook and cranny of your data warehouse (cough, cough, looking at you… dev_user_42_test_table 👀). To give them access to only the data they need, set up a new data connection that houses just the databases, schemas, and tables that are relevant to them. Not only will your stakeholders feel right at home, but Magic will also serve up data insights tailored just for them. From there you can curate relevant data.


Data connection tips:

  • Use a clear, consistent, and descriptive name for your connection to make it easily identifiable to team members

  • Clearly document the purpose of this data connection, its internal owner, and notes about special configurations or limitations.

2. Within that connection, identify the tables that business partners can trust

Business users want data that can give them trustworthy information, so it’s best to prune anything that could be inaccurate, out of date, sensitive, or just too raw. Think of this connection as the “Gold layer” from Databricks’ medallion architecture concept: “organized in consumption-ready, project-specific databases.”

To create a smooth Explore experience, there are three ways to hide or remove anything in your data connection that isn't relevant to business Explorers.

Schema filtering (in the Data browser)

Using the Data browser, admins can use schema filtering to include or exclude specific databases, schemas, or tables from a data connection. On the next refresh, only your selected assets will be synced. We recommend filtering out STAGING/DEV/RAW schemas to start. Excluded objects can still be queried; they just won’t appear in the Data browser, autocomplete, or Magic AI responses. To fully remove access to certain objects, you’ll want to set up role permissions in the actual warehouse.
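To get a feel for what excluding STAGING/DEV/RAW schemas would leave behind, here is a minimal Python sketch you could run against a list of your schema names before touching the filter. It is not how Hex implements schema filtering; the pattern list, the sample schema names, and the function name are all assumptions for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns; adjust them to your own naming conventions.
EXCLUDED_SCHEMA_PATTERNS = ["STAGING*", "DEV*", "RAW*"]


def visible_schemas(schemas, excluded_patterns=EXCLUDED_SCHEMA_PATTERNS):
    """Return the schema names that do not match any exclusion pattern."""
    return [
        name
        for name in schemas
        # Compare in upper case so "dev_sandbox" and "DEV_SANDBOX" are treated alike.
        if not any(fnmatchcase(name.upper(), pattern) for pattern in excluded_patterns)
    ]


# Made-up schema names, used only to show the effect of the patterns.
schemas = ["ANALYTICS", "STAGING", "DEV_USER_42", "RAW_STRIPE", "FINANCE"]
print(visible_schemas(schemas))  # ['ANALYTICS', 'FINANCE']
```

Prefix patterns are blunt: `DEV*` would also hide a schema named `DEVICES`, so it is worth reviewing the resulting list before applying the same choices in the Data browser.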


Magic - Include/Exclude toggles and Endorsements (in the Data browser)

Think of Magic AI like a data exploration sidekick — ready to assist any Explore users who might not speak fluent SQL or Python. Adding an endorsed status to databases, schemas, or tables is the easiest way to quickly tell Magic (and your eager end users) which data is "Approved" or "Trusted" by the data team. And now you can get endorsement suggestions from Magic itself…


NEW: Magic Curation Suggestions 🪄 Magic will now suggest tables for Admins to endorse! In the Data browser, Magic will automatically surface popular tables and datasets to endorse, and you can accept or dismiss each suggestion. Magic will then prioritize any endorsed tables when answering questions and generating suggested prompts in Ask Magic. This helps your non-data team users ask the right questions and explore the right data.


If you want to keep certain databases, schemas, or tables accessible in the Data browser but never want Magic to use them, you can toggle them off with the “Include/Exclude for Magic” setting.

Warehouse permissioning (in your data warehouse, not in Hex)

If you don’t want folks in your workspace to be able to access specific tables at all — like ultra sensitive data or raw warehouse data — configure user permissions in your warehouse and your data connection to prevent business partners from querying or viewing the data.
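As a rough illustration of warehouse-side permissioning, the sketch below generates read-only grants for a role that a business-facing data connection could use. The statements follow Snowflake-style syntax as an assumption; other warehouses use different grammar, and the role, database, and schema names are hypothetical. Anything not granted to the role stays out of reach for that connection.

```python
def read_only_grants(role, database, schemas):
    """Build GRANT statements giving `role` read-only access to the listed schemas.

    Identifiers are interpolated directly into the SQL text, so only pass
    trusted, hard-coded names, never user input.
    """
    statements = [f"GRANT USAGE ON DATABASE {database} TO ROLE {role};"]
    for schema in schemas:
        qualified = f"{database}.{schema}"
        statements.append(f"GRANT USAGE ON SCHEMA {qualified} TO ROLE {role};")
        statements.append(
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {qualified} TO ROLE {role};"
        )
    return statements


# Hypothetical role and schema names for a curated, business-facing connection.
for statement in read_only_grants("HEX_EXPLORE_READER", "ANALYTICS", ["MARTS", "FINANCE"]):
    print(statement)
```

Review the generated statements with whoever administers the warehouse before running them; the point is only that access is granted schema by schema, so raw or sensitive schemas are simply never listed.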

3. Add descriptions to the remaining tables in the connection

When anyone asks Magic a question, it first uses the metadata from the Data browser to perform a semantic similarity search for tables and columns that might answer that question. You can add descriptions to any database, schema, or table. The more information you add to the Data browser, the more likely it is that Magic will match the right tables and columns.
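To see why richer descriptions help a similarity search, here is a toy Python sketch that ranks tables by word overlap between a question and each table's description. It uses bag-of-words cosine similarity as a stand-in, which is an assumption made for illustration and not a description of how Magic works internally; the table names and descriptions are made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[token] * b[token] for token in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_tables(question, descriptions):
    """Return table names ordered from most to least similar to the question."""
    question_vector = Counter(tokenize(question))
    scored = [
        (cosine_similarity(question_vector, Counter(tokenize(f"{name} {text}"))), name)
        for name, text in descriptions.items()
    ]
    return [name for _, name in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical tables and descriptions.
descriptions = {
    "dim_customers": "One row per customer, with signup date and region.",
    "fct_orders": "One row per customer order, with order status, shipped and delivered dates.",
    "raw_stripe_charges": "Raw Stripe charge events.",
}
question = "How many orders have shipped but not yet been delivered?"
print(rank_tables(question, descriptions)[0])  # fct_orders
```

With an empty description, `fct_orders` would match the question only on the word "orders" in its name. The description is what supplies "shipped" and "delivered", which is the effect you are aiming for when you write metadata.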

What should you include in metadata? It’s best to include information about what can be calculated from a table and what it should be used for. If there is company jargon or there are synonyms, explain what they mean or refer to. Dig into more metadata tips that are useful for Magic.

To reduce any potential string hallucinations:

  • Add enumerations - For low cardinality string columns that are often filtered on or used in case statements, try explicitly enumerating or describing options for these fields in the Data browser (this can reduce hallucination rates to near 0). This could be useful for a question like: “How many orders have shipped but not yet been delivered?”

  • Try explaining the pattern in natural language - For high cardinality string columns that have too many options to list but follow a consistent pattern (like a City / State combo), calling out the pattern in natural language like “City State pairs, like 'Memphis TN'” can help Magic understand.

  • Add custom metadata - Try using natural language to tell Magic when a table should and shouldn’t be referenced. For example, you could write: only use this table if the prompt explicitly requires raw stripe data, otherwise use fct_orders.
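If you maintain descriptions for many low cardinality columns, a small helper can keep enumerations in a consistent shape. The sketch below is just one way to format such a description before pasting it into the Data browser; the function name, the output format, and the status values are assumptions, not a Hex convention.

```python
def enum_description(summary, values):
    """Compose a column description that lists every allowed value and its meaning."""
    listed = "; ".join(f"'{value}' = {meaning}" for value, meaning in values.items())
    return f"{summary} Allowed values: {listed}."


# Hypothetical values for an order status column.
status_values = {
    "placed": "order received",
    "shipped": "left the warehouse",
    "delivered": "received by the customer",
}
print(enum_description("Current fulfillment status of the order.", status_values))
```

With the values spelled out like this, a question such as “How many orders have shipped but not yet been delivered?” maps onto a literal value that exists in the column, which leaves less room for a guessed string.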


Pro tip: If you want to prototype descriptions and see how Magic does with them, feel free to directly edit/update them in Hex in our Data browser UI! You can then ask Magic your question and see how it does.


4. Hook up your data connection with our dbt integration

If you use dbt Cloud, you can use metadata from your dbt project to enrich the Data browser, making the Explore experience with Magic even more useful. When you use our dbt integration with your data connection, Hex will pull in metadata for Magic to reference, including model, source, and column descriptions and tests; when each model was last updated; source freshness tests; and more. Explore users will also be able to see these descriptions in Hex.
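Since the integration surfaces dbt descriptions, it can help to check which models and columns still lack one before you connect it. The sketch below scans a dbt `manifest.json`-style dictionary for empty descriptions. It is not part of the Hex integration, and the keys it reads (`nodes`, `resource_type`, `name`, `description`, `columns`) reflect the dbt manifest artifact as we understand it, so verify them against your dbt version. The sample data is made up.

```python
import json


def load_manifest(path):
    """Load a dbt manifest file (typically target/manifest.json) into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def undocumented(manifest):
    """Return (model, column) pairs lacking a description; column is None for the model itself."""
    missing = []
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        if not (node.get("description") or "").strip():
            missing.append((node["name"], None))
        for column, meta in node.get("columns", {}).items():
            if not (meta.get("description") or "").strip():
                missing.append((node["name"], column))
    return missing


# Made-up manifest fragment; in practice use load_manifest("target/manifest.json").
sample = {
    "nodes": {
        "model.shop.fct_orders": {
            "resource_type": "model",
            "name": "fct_orders",
            "description": "One row per order.",
            "columns": {
                "order_id": {"description": "Primary key."},
                "status": {"description": ""},
            },
        },
        "model.shop.dim_customers": {
            "resource_type": "model",
            "name": "dim_customers",
            "description": "",
            "columns": {},
        },
    }
}
print(undocumented(sample))  # [('fct_orders', 'status'), ('dim_customers', None)]
```

Each pair in the result is a description worth writing in your dbt project, so that it flows through to the Data browser for both Magic and Explore users.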

5. Create a semantic model

Our last suggestion is to create a semantic model that abstracts away a lot of the complexity and predefines the metrics your business users care most about. We have more coming here soon, so stay tuned!

Taking time to create a great Explore experience makes everyone more productive

Congrats on making it through our data curation crash course! By carving out just a bit of time to spruce up your Hex workspace, you're giving your stakeholders a reliable and relevant data exploration experience to drive even more value from your data.

Have more questions? Ask our team directly during our Magic AI live event! We’ll go more in-depth on how Explore works with Magic AI and how to set up business users for success!
