
We’re not building “AI data scientists”

What if we needed more data people, not fewer?


A few weeks ago we launched a major update to Hex’s integrated Magic AI features. They make it easy to go from prompt to analysis, whether you’re trying to construct a tricky query or build a beautiful visualization. They feel like, well, Magic!

Notice that Magic cleverly maps “Appetizers” to “Quick Bites” in the WHERE clause, thanks to rich metadata and dbt context.
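To make that concrete, here’s a sketch of the kind of SQL such a prompt might produce. The table and column names (orders, menu_category) are invented for illustration; the point is that the user said “Appetizers,” but the generated filter uses the value the warehouse actually stores:

-- User prompt: "Show revenue for Appetizers"
-- The warehouse stores that category as 'Quick Bites'; metadata from dbt
-- lets Magic translate the user's term into the stored value.
SELECT menu_item,
       SUM(revenue) AS total_revenue
FROM orders
WHERE menu_category = 'Quick Bites'  -- "Appetizers" in the prompt
GROUP BY menu_item
ORDER BY total_revenue DESC;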

We’re getting a ton of great feedback on how far these features have come, and how essential they are to our users’ workflows. But we also get questions about where this is going, like: “Are you building Hex to become an ‘AI data scientist’?”

The short answer is “no”: we’re long on humans, and think this could be the beginning of a new golden age for data teams – not the beginning of the end.

But I also understand where people are getting this from; there’s a lot of noise and hype right now, including products that are positioning themselves as replacements for humans.

There are three big reasons I think this is fundamentally misguided.

Data work is a lot more than writing code

Look, if all you think human data people do is write code or build charts, then yeah – I guess LLMs can replace them. But that’s missing a lot.

A great analyst digs in with stakeholders, talks to peers, brainstorms ideas, conducts experiments, and presents the results to other teams. They raise questions. They understand motivations. They influence decisions.

LLMs can simulate some of those activities but are so, so far from being able to pull them off in the way a human can. Perhaps someday we’ll have LLM agents so advanced that they can absorb and integrate all these things a human data practitioner does. But I suspect that’s a long way off.

In the meantime, it’s exciting and good for data practitioners that AI can help many of their tasks go faster. Who wants to be tracking down an errant parenthesis, fixing package imports, or futzing with types? If software can automate a bunch of this away, humans can focus on the interesting, creative, discursive parts of the job – the parts, one would assume, they got into data to do.

There’s an infinite demand for insight

Ok, but perhaps this means we need fewer people? If an AI can build a data pipeline 10x as fast… does it mean we need 1/10th the size of data teams? Can it just be one person prompt-hacking?

We don’t believe that. The demand for data work isn’t fixed – it’s actually many times larger than we realize. After all, there isn’t a finite number of stakeholder questions or decisions that could be informed with data. If you look at most organizations, they’re deeply data supply-constrained! Every data team I’ve seen has a line of stakeholders queued up, waiting for analyses and insights.

If they got through that list faster, it’s not like they’d run out of things to do. What about all the folks who aren’t even getting in line for the data team because the wait would be too long? What about the folks “going on gut” because the prospect of getting data to answer a question is too arduous?

In fact, I expect efficiencies will make data teams feel more impactful, and even produce a rebound effect, driving more demand for insight.

Humans want to talk to (and blame!) other humans

Imagine, for a moment, you worked with a Data Scientist who, while knowledgeable and sharp, was well-known for hallucinating, making up facts, and completely refusing to explain how they reached conclusions. You would… not trust this person?

The idea of “AI data scientists” is like saying we want to hire a bunch of these people and set them loose in our organization.

“But the models will get better!” “Our RAG is really great!” That’s all well and good, but when it comes to establishing critical facts or making major decisions, humans want (and still need) a human that can explain their reasoning, and answer questions.

But I don’t think it’s only about reliability or explainability – it’s culpability, too. When something goes wrong, we want someone to point at, hold accountable, and even penalize.

When a human analyst provides an analysis or recommendation, we can ask them about it, drill them with questions, push back, or fire them. But what do we do with an “AI data analyst”? What happens when it offers up something deeply unintuitive? Are you going to tell your boss, “it told us to raise prices, but it can’t explain why”? What happens when that price increase fails?

You’re going to want to point to someone!

A golden era for data teams

For all these reasons (in addition to fervent pro-human aesthetic sensibilities), I don’t think we’re entering the era of the “AI data scientist”.

But AI augmentation is a massive opportunity by itself! When I look at the work data folks do all day, I see a lot of tedium, but also a lot of creativity. And I see an opportunity to help humans focus on the things they do best, and let the computers do the rest.

It’s the dawn of a new era, with lots of opportunity. As a data person (and builder of tools for data people) I’m optimistic, and hope you’ll join me in that.

You can read more about how this relates to 19th-century coal production here.

This is something we think a lot about at Hex, where we're creating a platform that makes it easy to build and share interactive data products that help teams be more impactful.

If this is interesting, click below to get started, or to check out opportunities to join our team.