Today, we’re introducing a whole new suite of versioning workflows in Hex, including a human-friendly file format, GitHub sync, and a rich in-app diff view.
Together, these bring software engineering best practices to analytics, while embracing the highly visual, UI-driven process of exploratory analysis and app building. Let’s dig in.
Over the last few years, software engineering best practices have changed how data teams transform data, configure infrastructure, and deploy pipelines. These things are now managed as code, and versioned through Git, making it easy to review work, revert changes, and collaborate without getting in each other’s way.
Teams using these workflows build trust in the assets they produce: code that is version controlled and reviewed is less error-prone than a script running on someone’s laptop. Code reviews raise the quality bar by giving people feedback on making code more efficient and readable. It feels like how it always should have been done!
Notebook .ipynb files are dense JSON that include serialized outputs, and are hard for team members to read or diff, let alone review. Analysts bookmark their queries in SQL runners, without any way for their peers to check their work. One-off dashboards float around, with no sense of governance or context.
Sometimes these things are checked into Git, without any way to understand what that code does, the artifacts that they produce, or what a stakeholder is actually consuming. And let’s be honest: a lot of data work still happens in spreadsheets, which are basically impossible to version control.
At times, this approach is ok, good even — when you’re pulling together a quick prototype, or spelunking for an insight, the ability to move quickly is a strength. But what happens when it moves past the exploratory phase? When your stakeholders are using your work to make business-critical decisions? Well, bad things can happen: bad analysis can cause bad decisions, eventually eroding trust in the work of a data team.
We believe analytical work deserves the same benefits of versioning workflows that other parts of our data stack have received. And today we’re releasing a number of features to get us there.
If you’ve ever opened a notebook file (the ones with the .ipynb extension) in a standard code editor, you would have been faced with a large blob of JSON. Notably, this JSON:
Hex has always supported importing and exporting .ipynbs as Hex Projects. But with a caveat: exporting to an .ipynb meant you lost many of the things that make Hex special, like SQL cells, input parameters, and app layouts.
We needed a file format that preserved all aspects of a Hex project, was easier for members of a data team to review, and that didn’t contain those potentially-sensitive outputs. So, we adopted a file format based on YAML.
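To make the problem with serialized outputs concrete, here is a minimal sketch in Python. It builds two copies of a one-cell notebook using the standard `.ipynb` JSON layout, with identical code but different outputs and execution counts, then diffs them with and without outputs. The helper names (`make_notebook`, `strip_outputs`, `diff_lines`) and the sample cell are made up for illustration. This is not Hex's YAML format or export logic, only a demonstration of why an output-free format is easier to review.

```python
import difflib
import json


def make_notebook(execution_count, output_text):
    """Build a minimal nbformat-4 style notebook dict with one code cell."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {"name": "stdout", "output_type": "stream", "text": [output_text]}
                ],
                "source": ["print(total_revenue)"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def strip_outputs(notebook):
    """Return a copy of the notebook with outputs and execution counts cleared."""
    stripped = json.loads(json.dumps(notebook))  # cheap deep copy via JSON
    for cell in stripped["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


def diff_lines(before, after):
    """Unified diff of two notebooks as serialized JSON, one string per line."""
    a = json.dumps(before, indent=1, sort_keys=True).splitlines()
    b = json.dumps(after, indent=1, sort_keys=True).splitlines()
    return list(difflib.unified_diff(a, b, lineterm=""))


# Same code, re-run on two different days: only the outputs differ.
monday = make_notebook(3, "1042\n")
tuesday = make_notebook(7, "1187\n")

raw_diff = diff_lines(monday, tuesday)
clean_diff = diff_lines(strip_outputs(monday), strip_outputs(tuesday))

print(f"diff lines with outputs: {len(raw_diff)}")
print(f"diff lines without outputs: {len(clean_diff)}")
```

With outputs included, a re-run alone produces a diff even though nobody touched the code. With outputs stripped, the two versions are identical and a reviewer has nothing to wade through.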
You can now export (or... hexport) your Hex projects as YAML files. The YAML file contains all the code we need to reconstruct your project from scratch.
And, you can import these files back into Hex as well: either as new projects, or new versions of your existing project.
Ok, we have a file format. Where are we putting it? Well, you can now sync your Hex projects to GitHub.
Every version of a project that you save will be represented as a commit on a branch, while the published version (i.e., what app users see) will reflect what’s on your main branch. You can either let Hex keep your GitHub branches up to date for you, or, if you’re a team that uses pull requests, you can require that team members get their work reviewed before publishing a Hex app.
Whether you choose to use pull requests or not, GitHub sync also unlocks a few other benefits:
But reviewing code can be challenging when you can’t see the context of those changes. Sure, that SQL looks fine, but does it run? What kind of results does it return? Did that refactor change the output of my chart (I didn’t want it to!)? Was that bug I was trying to fix actually fixed? Well, we can help you out with that too.
Our publish workflow contains a new tab: Logic diff.
This view does exactly what the name implies: it shows you the differences between the logic of the version that’s currently published, and the version you’re about to publish. New cells, updated outputs, markdown grammar fixes, and tweaked SQL queries are all visible and easily auditable before publishing.
This view also:
This is a powerful interface that brings together the changes to your code, and the impact of those changes. Whether you’re refactoring code and want to check that your final app hasn’t actually changed, adding new functionality to an existing project, or just can’t quite remember what exactly you did change, you can review the changes to your project directly in Hex.
You can use this as a standalone feature, or in conjunction with GitHub sync to help pull request reviewers gain an understanding of the changes you want to merge.
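For a rough feel of what a cell-level diff between a published version and a draft involves, here is a small Python sketch. It assumes each version is a plain dict mapping a cell name to its source text. That data model, the `diff_cells` function, and the sample cells are all hypothetical, and the sketch compares source text only. It is not how Hex implements Logic diff, which also surfaces changed outputs.

```python
import difflib


def diff_cells(published, draft):
    """Compare two project versions, each a dict of cell name -> source text.

    Returns a dict of cell name -> (status, lines) for cells that were
    added, removed, or changed. Unchanged cells are left out.
    """
    changes = {}
    # Walk published cells first, then any cells that only exist in the draft.
    cell_names = list(published) + [name for name in draft if name not in published]
    for name in cell_names:
        old, new = published.get(name), draft.get(name)
        if old is None:
            changes[name] = ("added", new.splitlines())
        elif new is None:
            changes[name] = ("removed", old.splitlines())
        elif old != new:
            delta = difflib.unified_diff(
                old.splitlines(),
                new.splitlines(),
                fromfile="published",
                tofile="draft",
                lineterm="",
            )
            changes[name] = ("changed", list(delta))
    return changes


published = {
    "intro": "## Weekly revenue",
    "load_orders": "SELECT * FROM orders",
}
draft = {
    "intro": "## Weekly revenue",
    "load_orders": "SELECT id, amount FROM orders",
    "revenue_chart": "chart(orders, x='week', y='amount')",
}

changes = diff_cells(published, draft)
for name, (status, lines) in changes.items():
    print(f"{name}: {status}")
    for line in lines:
        print(f"    {line}")
```

Grouping changes by cell rather than by raw file line is what lets a reviewer skip straight past untouched cells like `intro` and focus on the query that was tweaked and the chart that was added.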
If you’re a current Hex user on the Teams plan, this is all live in production now!
If you’re not on Hex yet, you can get started with a free trial below.
We can’t wait to hear what you think of it.