Izzy Miller
Don't limit yourself to traditional, old-fashioned BI. Hex lets you turn even the most flexible and complex analyses into beautiful, easy to use dashboards.
Youโre running the data team at an eCommerce company, and whereas your data friends in other industries will be looking forward to winding down their Jupyter kernels for the holiday season, you know this is the most important part of the year for your business, and for your team.
Modern businesses run on data, and that data is of no use if it is locked in a data warehouse or database. You need to be able to show that data to the people who need to see it and give them the insights you and your team find from that data.
This is why metrics dashboards are a fundamental building block of data analytics.
An eCommerce company might rely on just-in-time stocking to make sure merchandise neither falls out of stock nor sits too long in warehouses, incurring unnecessary storage costs. Thus, they need to know what is in stock and what isnโt.
They might leverage customer data to tailor the homepage to display products that align with individual customer interests, send customized email campaigns with product recommendations, or offer personalized discounts. Thus, they must know past purchase history and preferences to create personalized shopping experiences.
They should focus on high-performing channels and tweak or discontinue underperforming ones to target the right customers and increase return on advertising spend. Thus, they need to know click-through rates, conversion rates, and customer acquisition costs.
Your data team plays a crucial role. By analyzing trends, predicting demand, and optimizing inventory levels, your team ensures the company can meet customer needs without overextending resources. With its surge in consumer spending, the holiday season becomes a critical time to apply these insights. Your team's work supports efficient operations and contributes significantly to the company's bottom line during this peak sales period.
Doing this type of analysis ad-hoc is an eggnog recipe for disaster. You need a way to have all of this data available and up-to-date so you can answer questions from operations, marketing, and executives when they need answers.
This is where building a business metrics dashboard comes in. A dashboard becomes a centralized, real-time platform that aggregates and visualizes key data points, enabling quick, informed decision-making. It allows different departments, from operations to marketing, to access and interpret relevant metrics effortlessly, ensuring that strategies are aligned with current trends and data insights.
With a well-designed dashboard, your team can proactively manage the business, responding to changes in customer behavior, inventory levels, and marketing performance with agility and precision, especially during critical periods like the holiday season. Letโs go through how you can create such a dashboard in Hex.
Our data is in a Snowflake warehouse. This is a standard setup for many modern businesses where data is replicated in a warehouse from production servers so the data teams can manipulate and analyze data as needed without disrupting the site or application.
We start by writing an SQL query to retrieve comprehensive details about orders from our data warehouse.
SELECT o.order_id,
o.user_id,
o.status,
o.created_at,
o.returned_at,
o.shipped_at,
o.delivered_at,
oi.product_id,
oi.inventory_item_id,
oi.sale_price,
ii.product_category,
ii.product_name,
ii.product_brand,
ii.product_distribution_center_id
FROM demo_data.ecommerce.orders o
left join demo_data.ecommerce.order_items oi on oi.order_id = o.order_id
left join demo_data.ecommerce.inventory_items ii on ii.id = oi.inventory_item_id
WHERE o.status != 'Cancelled'
GROUP BY 1,2,3,4,5,6,7,8,9,10,11,12,13,14
This returns a preview of our data:
Tabulating the data like this is helpful in terms of exploratory data analysis for the data team as they can use it to understand the structure of the data and the types of data. But it is less interesting for our final dashboard, so we wonโt have it visible.
Next, we want to roll all that data up into something more useful, such as orders per day. We donโt have to query the data warehouse again, instead we can use the data from that previous query, a pattern called Chained SQL. This is an excellent option if you are dealing with a lot of data and want to slowly dive deeper into the datasetโyou can start by pulling all the data and then filter and group as needed:
SELECT date_trunc(day,created_at) as day,
product_category,
product_brand,
product_name,
count(distinct order_id) as orders,
sum(sale_price) as sales
FROM orders_items
WHERE returned_at is null
and created_at >= DATEADD(day,-365,CURRENT_TIMESTAMP)
group by 1,2,3,4
order by 1
Weโll visualize this as a chart and include it in our published app:
Here, we use an in-built chart from Hex for visualization. But charts are an integral part of any dashboard, and you need to know how to build these visualizations no matter how you create your dashboard. To add a chart similar to this, you can use the Python matplotlib library:
thirty_day_orders["WEEK"] = thirty_day_orders["DAY"].dt.to_period("W")
agg_orders = thirty_day_orders.groupby("WEEK")["ORDERS"].sum().reset_index()
plt.scatter(agg_orders["WEEK"].astype(str), agg_orders["ORDERS"])
plt.xlabel("Week")
plt.ylabel("Sum of Orders")
plt.title("Sum of Orders by Week")
plt.show()
Which produces:
Itโs not quite as clean as the Hex chart, nor does it have interactivity, but it still allows you to start telling the story of your data.
Weโll also pull data from the previous period so we can later compute our growth:
SELECT date_trunc(day, created_at) as day,
product_category,
product_brand,
product_name,
count(distinct order_id) as orders,
sum(sale_price) as sales
FROM orders_items
WHERE returned_at is null
and created_at <= DATEADD(day,-200,CURRENT_TIMESTAMP) and created_at > DATEADD(day,-300,CURRENT_TIMESTAMP)
group by 1,2,3,4
order by 1
So far, weโve pulled and filtered data using direct SQL queries. Hex is polyglot, allowing you to combine different languages within the same notebook, so the SQL queries above output Pandas DataFrames. The next step is to compute the change in orders from the previous month. To do that, weโll use Python with our DataFrames:
orders_total = int(thirty_day_orders['ORDERS'].sum())
sales_total = thirty_day_orders['SALES'].sum()
prev_orders = int(prev_period_orders['ORDERS'].sum())
prev_total = prev_period_orders['SALES'].sum()
We could print these values to the notebook, but within Hex, we can use them as inputs to single value fields to highlight them better within the dashboard:
Again, we want to visualize this data. When creating a dashboard, you must think from a user-centric perspective rather than a data-centric view. The end-users need to understand the metrics and data at a glance. Charts, graphs, tables, and single-value visualizations like those above help with this.
Letโs add a line chart for our daily sales data:
Again, this is a Hex chart, but we can reproduce this (somewhat) using matplotlib:
import matplotlib.pyplot as plt
# Group by day and sum sales
sales_by_day = thirty_day_orders.groupby("DAY")["SALES"].sum()
# Create line chart
plt.figure(figsize=(10, 6))
plt.plot(sales_by_day.index, sales_by_day.values)
plt.xlabel("Day")
plt.ylabel("Sales")
plt.title("Sales by Day")
plt.show()
As we build our dashboard, we can see a pattern emerging. We write an SQL query to pull and filter the data we want to highlight, then sometimes follow that with some Python code to manipulate the resultant DataFrame, and then plot that data in the correct format.
So far, that format has just been tables, scatter plots, or line charts. However, the visualization depends entirely on the data type you have.
Time series data. Line charts are highly effective for data that changes over time, such as sales trends or website traffic. They clearly show trends, growth, or declines over time, making identifying patterns and anomalies easier. A variation, the area chart, can also emphasize the magnitude of change over time.
In Python: Use matplotlib or seaborn libraries. After importing, create a plot using plt.plot(date, value)
where date
is your time variable, and value
is the data point you want to plot.
Categorical data. When dealing with data that can be grouped into distinct categories, such as customer segments or product types, bar charts or column charts are ideal. They make comparisons among different categories straightforward and can effectively highlight differences or similarities.
In Python: Utilize matplotlib or seaborn. For a basic bar chart, use plt.bar(categories, values)
where categories
are your distinct groups and values
are the corresponding data points.
Distribution data. Histograms are a great choice to display the distribution of data and show how often values occur within a dataset. They help understand the underlying frequency distribution (e.g., normal distribution) of a set of continuous data points and can be pivotal in identifying patterns or outliers.
In Python: With matplotlib, you can simply use plt.hist(data)
, where data
contains the continuous values whose distribution you want to visualize.
Correlation data. If you're looking to show the relationship between two variables, scatter plots are the way to go. They are perfect for identifying correlations, trends, and potential outliers in datasets where the variables are continuous, and you want to explore how they affect each other.
In Python: In matplotlib, create a scatter plot with
plt.scatter(x, y)
where x
and y
are the two variables whose relationship you want to examine.
Hierarchical data. Treemaps or sunburst charts can be very useful for data structured hierarchically, such as organizational structures or product categories. These visualizations help understand each category's proportion at multiple levels and make complex hierarchical data more digestible.
For a tree map, you can use the squarify library. Create your plot with
squarify.plot(sizes, label=labels)
where sizes
are the values and labels
are the categories. The plotly library is a good choice for using plotly.express.sunburst(data, path=['Category', 'Subcategory'], values='Value')
.
Geographical data. Maps are the most appropriate visualization tool for data that is location-based or needs to be understood in a geographical context. This can include choropleth maps, where different shades or colors represent varying values in other regions (helpful in visualizing data like population density or sales by region), or point maps, which plot individual data points on a map (ideal for showing the location of events or store locations). Maps provide a clear and immediate geographical context to the data, making it easier to identify trends, patterns, and areas of interest based on location. They are particularly effective in scenarios where the spatial distribution of the data is as important as the data itself.
For choropleth maps, plotly or folium can be used. With plotly, use
plotly.express.choropleth(data_frame, locations, color)
. For point maps, use folium.Map(location)
to create a base map and add points using folium.Marker(location).add_to(base_map)
.
Effectively communicating data is as vital as the data itself. Two pivotal aspects of data storytelling are narration for contextualizing data through text and interactivity for enabling deeper engagement and exploration of the data through dynamic elements.
The power of a dashboard lies not just in the numbers it presents but in the story it tells. Narration plays a crucial role in this context. Adding text to your dashboard helps contextualize the data, making it more accessible and understandable to the user. This can include explanatory text, insights drawn from the data, or even questions that prompt the user to think about the data in a certain way. It's about translating data points into a cohesive story that resonates with the audience. This approach helps convey the meaning behind the numbers and guides users on how to interpret them. Compelling narration can turn a collection of charts and graphs into a persuasive narrative, leading to more informed decision-making and a deeper understanding of the underlying trends and patterns.
Clarity and Conciseness: Ensure that the text used for narration is clear and concise, avoiding jargon and overly complex language. The goal is to enhance understanding, not to overwhelm the user with information or technical terms.
Contextual Relevance: The narration should be directly relevant to the data presented. It should help in drawing connections between different data points and in explaining how individual metrics tie into the broader business goals or strategies.
Storytelling Approach: Use a storytelling approach in your narration. This could involve presenting data as a narrative over time, highlighting key events or milestones, and explaining how different data points contribute to the overall story of the business or a specific campaign.
Interactivity in dashboards elevates the user experience from passive viewing to active exploration. By incorporating interactive elements like filters, drill-downs, and sliders, users are empowered to delve deeper into the data. This hands-on approach allows for personalized exploration, where users can focus on what's most relevant to them. Interactive charts and graphs enable users to uncover the nuances and intricacies hidden within the broader trends. This makes the dashboard more engaging and fosters a deeper connection with the data. Users can test hypotheses, explore different scenarios, and gain insights that static dashboards cannot provide. Ultimately, interactivity transforms the dashboard into an exploratory tool, allowing users to navigate the data landscape and derive their own conclusions, which is essential in fostering an organization's data-driven culture.
User-Driven Exploration: Design interactive elements in a way that they encourage user-driven exploration. Users should be able to intuitively interact with the data, whether itโs through filtering, sorting, or drilling down into more detailed views.
Responsive and Fast: Ensure that interactive elements are responsive and fast. Delays or glitches in response can frustrate users and detract from the exploration experience.
Balance Between Detail and Overload: While interactivity allows for deeper exploration, itโs important to balance the level of detail. Too much complexity can be overwhelming for the user. Provide enough interactivity to facilitate meaningful exploration but not so much that it becomes confusing or cumbersome.
When all this comes together, you end up with a dashboard that allows your audience to clearly understand the data of your organization and to use that as a guide for for their decision making.
Every dashboard will be different, both between companies and teams. What a sales team needs to know in an eCommerce company differs from what a finance team needs to understand in a SaaS company. Whatever type of dashboard youโre building, here are some possible KPIs to include in the dashboard:
Total revenue. The total amount of income generated from sales of goods or services.
Revenue growth rate. Measures the rate at which a company's revenue increases over a specific period.
Average revenue per customer. The average amount of revenue generated per customer.
Conversion rate. The percentage of website visitors or campaign visitors who take a desired action (like making a purchase).
Customer acquisition cost (CAC). The cost of acquiring a new customer includes marketing and sales expenses.
Customer lifetime value (CLV). The total worth to a business of a customer over the whole period of their relationship.
Return on marketing investment (ROMI). The profitability and effectiveness of marketing campaigns.
Cost per lead. The total cost of generating one lead through marketing efforts.
Website traffic and sources. The number of visitors to a website and where they are coming from (like search engines, social media, etc.).
Social media engagement and reach. Measures the level of engagement (likes, shares, comments) and the total number of people who see social media content.
Email marketing performance. Includes open rate and click-through rate, indicating how recipients interact with email campaigns.
Lead conversion rate. The percentage of leads that convert into actual customers.
Net promoter score (NPS). A measure of customer loyalty and satisfaction based on how likely customers are to recommend a company to others.
Customer satisfaction score (CSAT). A measure of how satisfied customers are with a company's products or services.
Average resolution time. The average time taken to resolve customer issues or complaints.
Customer retention rate. The percentage of customers who remain with a company over a given period.
Customer churn rate. The percentage of customers who stop doing business with a company over a given period.
Inventory turnover. A measure of how often inventory is sold and replaced over a period.
Order fulfillment cycle time. The total time taken from receiving an order to delivering it to the customer.
Capacity utilization rate. The percentage of total capacity that is being used in production.
Supply chain efficiency. A measure of how effectively a company's supply chain is managed.
Employee productivity rates. The output of employees in a certain period.
Gross profit margin. The percentage of revenue that exceeds the cost of goods sold.
Net profit margin. The percentage of revenue that remains after all operating expenses, interest, taxes, and preferred stock dividends have been deducted from a company's total revenue.
Operating cash flow. The amount of cash generated by a company's normal business operations.
Return on investment (ROI). A measure of the profitability of an investment relative to its cost.
EBITDA. Earnings Before Interest, Taxes, Depreciation, and Amortization, a measure of a company's overall financial performance.
Cart abandonment rate. The percentage of online shopping carts that are filled but never make it through to the transaction.
Average order value. The average dollar amount spent each time a customer places an order on a website.
Product return rate. The rate at which customers return purchased products.
Customer reviews and ratings. Qualitative and quantitative assessments provided by customers about their experience with a product or service.
Selecting the right KPIs is crucial for the dashboard to be effective. They should align closely with the company's strategic goals and provide actionable insights. It's essential to regularly review and update the KPIs to ensure they remain relevant to the evolving business needs.
When building business metrics dashboards, following best practices ensures they are effective, user-friendly, and valuable in decision-making processes. Here are some key best practices to consider:
Define Clear Objectives: Understand the purpose of the dashboard. Clearly define what you want to achieve with it, whether it's tracking performance, making strategic decisions, or improving operational efficiency.
Identify Key Performance Indicators (KPIs): Start by identifying the most critical KPIs for your business. These should align with your company's strategic goals and provide clear insights into performance. As we showed above, KPIs can vary depending on the department (sales, marketing, operations, etc.) and should be carefully selected to ensure they provide actionable insights.
Ensure Data Accuracy and Consistency: The effectiveness of a dashboard depends on the accuracy and consistency of the data it displays. Ensure that the data sources are reliable and that there are processes in place to maintain data quality. Regular audits and data validation routines can help in maintaining data integrity.
Design for Clarity and Usability: The dashboard should be intuitive and easy to understand. Use clear visualizations and avoid clutter. Different user groups might require different data presentations, so consider creating multiple views or customizable dashboards to cater to varied needs.
Focus on Actionable Insights: The primary purpose of a dashboard is to facilitate decision-making. Ensure that the data presented leads to actionable insights. This means avoiding vanity metrics and focusing on data that can directly influence business strategies and actions.
Secure Your Data: Given the sensitivity of business data, security should be a top priority. Implement strong access controls and encryption, and ensure compliance with data protection regulations.
Each of these concepts plays a critical role in the success of a business metrics dashboard, ensuring it is not only functional and informative but also a key tool in strategic decision-making processes.
Discover how other data scientists and analysts use Hex for everything from dashboards to deep dives.
You can use Hex in two ways: our centrally-hosted Hex Cloud stack, or a private single-tenant VPC.