Self-serve analytics is finally real (and that's terrifying)

The data team’s job isn't to be the gatekeeper of every number. It's to be the architect of trustworthy systems.

For years, data teams have chased the dream of self-service analytics. We built dashboards. We wrote documentation. We ran workshops on how to answer your own questions. And still, every Monday morning brought another Slack: "Hey, quick question — where can I find...?"

We tried to control our data universes with our neat and clean dbt models, our docs, and our roles as intermediaries between the business and the data. But that way of working is slow. The data team becomes a bottleneck, and stakeholders get frustrated waiting for answers to questions that feel simple to them.

AI is changing that. At Hex, everyone on the team is a Hex editor with early access to our AI features (see what’s new here). Natural language questions are actually getting answered. Our Head of Revenue is exploring customer contraction patterns. Our Head of Product is analyzing pricing tier movement. Our sales team is digging into product usage data before renewals.

They're not asking us first. Sometimes we only find out these analyses happened when someone mentions them in passing.

This is both exactly what we wanted and absolutely terrifying.

The uncomfortable truth about self-serve

AI makes it possible for people to confidently answer their questions much faster, but we can’t guarantee those answers are right.

LLMs are inherently a little chaotic. They're non-deterministic, so while I can do a lot to point them in the right direction, I can't guarantee they'll deliver a correct result. They also help people work beyond their skill sets: we've got people running analyses they can't fully interpret, and deciding whether the results are right or wrong without data team review. This is very powerful, and a little bit risky.

But here's the thing: I get things wrong too. And so do my teammates. And while we might have a higher level of accuracy, we really slow things down if everyone in the company needs to wait for us to answer their questions.

You're going here whether you're ready or not

Your stakeholders are going to use AI for analytics whether you're ready or not. The choice isn't whether to enable self-serve. The choice is whether you'll build the guardrails that make it work, or leave your colleagues to figure it out on their own.

And ultimately, relinquishing this control and leaning into the chaos is worth it, because what I'm actually seeing is that suddenly the data team is delivering on our promises. Self-service is actually happening. The data team isn't the bottleneck. And our users are getting answers at the speed of the business.

We're also finding out that maybe they didn't need quite the level of rigor that we were providing. Maybe a less complete answer that they can get in a few minutes is fine. And when it's not, they loop in the data team, and we partner with them to dig deeper.

I got into data work because I wanted to help organizations make better decisions. And right now, I'm seeing more people show up with data than at any point in my career. I've never found a way to scale myself that enabled that kind of data usage. So, while it's a little humbling to have a bunch of robots achieve what I have never been able to in 15 years of data work and building data teams, I'm here for it.

Context gives AI the map

As I wrote in another post recently, analytics engineers have always been context engineers: we've created systems that help humans turn data into insights. But it's hard for humans to use our work. It's hard to remember to search the docs for a column definition, hard to intuit how to use an unfamiliar table, and hard to hold all the business context you need in your head.

It's hard for me, as an analytics engineer, to get my work in front of people so that they know what resources are available to them. Even when I'm doing a really good job building resources, there's only so much impact I can have in improving individual humans' abilities to query our data and get the right answer.

AI agents are the best users of our work. They actually read the docs. I can give them a rules file and they'll look back at it every time they're writing a query. They'll scroll through all the analyses that exist, particularly anything that I've endorsed as being highly relevant to them, and they'll use that to inform how they use the data in the next analysis.

Getting that context out of the heads of the data team and into systems that will use that context every single time is such a huge expansion of the impact that we can have. Suddenly, instead of empowering a small number of analysts, we're truly empowering the entire company.

Here’s what we’re learning about context

Every hour you spend on documentation, semantic layers, and clean data architecture now serves hundreds of analyses instead of one. The same context you'd give a new analyst helps the AI too.

It's not easy, because none of us know exactly how all these things play together yet. We know that we're gonna need them in one way or another. What we don't know yet is how to stitch it all together to get the best outcome, but here's what we're learning so far:

  • Well-modeled data is the foundation.

    Your data warehouse needs to be clean and consistently structured (see this post for ideas). Naming conventions matter more than ever when an LLM is trying to figure out which tables to join. The agent will find your data, but if it's a mess, it'll build messy analyses on top of it.

  • Documentation isn't optional.

    We used to let docs slide because we could answer questions when they came up. Now the agent is answering those questions and it needs the same context we'd give a human, like: What does this field actually measure? When was this table last updated? What are the known data quality issues? Write it down.

  • Rules files fill in the blanks about how your business works.

    This is newer territory. We're experimenting with rules files that capture things like "Q4 revenue always spikes because of year-end deals" or "exclude any accounts created before 2020 from cohort analysis because of the data migration." These are the things you'd mention to an analyst in passing, but now they need to be written down somewhere the agent can find them.

  • Semantic layers define key metrics.

    When everyone can create their own metrics, you need a clear source of truth about what "active user" or "ARR" actually means. Semantic layers aren't just for humans anymore. They're how you ensure consistency across hundreds of analyses you'll never see.

  • Reusable analyses become building blocks.

    The agent can look at past analyses and use them as starting points. If you've built a solid churn analysis, the agent can extend it rather than starting from scratch. This is powerful, but it means the analyses you create need to be clear enough for an LLM to understand and build on.
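To make the documentation point concrete, here's a rough sketch of what "write it down" can look like in a dbt-style schema file. The model and column names here are hypothetical, not from Hex's actual warehouse:

```yaml
models:
  - name: dim_accounts
    description: >
      One row per customer account. Refreshed nightly.
      Known issue: accounts created before 2020 have unreliable
      created_at values due to the data migration.
    columns:
      - name: arr
        description: Annualized recurring revenue in USD, net of discounts.
      - name: is_active
        description: True if the account had at least one session in the last 30 days.
```

The descriptions answer exactly the questions the agent would otherwise have to guess at: what a field measures, how fresh the table is, and what the known quality issues are.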
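There's no single prescribed format for a rules file; it can be as simple as a markdown document the agent is told to read before writing any query. A minimal sketch, using the caveats mentioned above:

```markdown
# Analysis rules

## Seasonality
- Q4 revenue always spikes because of year-end deals.
  Don't flag Q4 growth as anomalous on its own.

## Known data issues
- Exclude accounts created before 2020 from cohort analyses;
  the data migration corrupted their created_at values.

## Conventions
- Report revenue in USD. "Customer" means a paying account,
  not an individual user.
```

These are the in-passing remarks you'd make to a new analyst, captured somewhere the agent will actually consult every time.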
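And what a semantic layer buys you is one written-down answer to "what does active user mean?" that every analysis shares. A sketch in dbt MetricFlow-style YAML, with illustrative metric and measure names (the measures would be defined elsewhere in the semantic model):

```yaml
metrics:
  - name: active_users
    label: Active Users
    description: Distinct users with at least one session in the trailing 28 days.
    type: simple
    type_params:
      measure: distinct_active_users

  - name: arr
    label: ARR
    description: Sum of annualized recurring revenue across active subscriptions.
    type: simple
    type_params:
      measure: subscription_arr
```

Whether a human or an agent asks for ARR, they get the same definition, which is the whole point when hundreds of analyses are happening without review.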

What this means for your job

When your stakeholders can answer "How are trials trending?" themselves, analysts get to work on the questions that actually matter: the thorny, high-value problems. The analyses that shape strategy, not just satisfy curiosity.

The data team’s job isn't to be the gatekeeper of every number. It's to be the architect of trustworthy systems and the translator between business strategy and technical reality. AI doesn't replace that. It makes it more important.

Our businesses are getting the answers to questions that they really need. Our teams are working on things that are more interesting than what we've worked on in the past. The future of data work isn't about doing more analysis. It's about building the foundation that lets everyone analyze well.

That's harder. But it scales, it drives impact, and it’s a lot more interesting.

New to Hex and want to try agentic analytics?