How to Build a Dashboard in Python
Turning your Python data visualizations into beautiful interactive dashboards
Imagine you’re a data scientist at a SaaS startup, and your sales team needs a basic dashboard that visualizes pipeline stages and growth. How do you build that in Python?
Python isn’t short of data visualization options. From the simplicity of Seaborn to the level of control of Matplotlib, if you want a graph, chart, map, table, plot, mesh, or spine, there’s a way to get it with Python. But a dashboard isn’t just data viz. You need to contextualize the data with titles, text, tables, and more plots. And you need to publish it so that people can act on the data and information.
In this post we are going to take you through how to set up a basic dashboard with the most common Python tools and libraries: Matplotlib, Seaborn, and Plotly for visualization, and Flask, Jupyter, Dash, and Hex for deployment.
Throughout the post, we'll be plotting a few variations on this sales pipeline dataset. Our goal is to visualize the data in a few ways: a bar chart, a line chart, a histogram, and a scatter plot. We'll also want to give our dashboard users the ability to change the data being displayed and change the data range we’re plotting.
Let’s go!
Building the basic visualizations of a dashboard
Matplotlib is the OG of Python data viz. And it looks like it. Modeled on the extortionately-priced but actually-kinda-awesome proprietary MATLAB language, Matplotlib puts data in first place and style way back in, like, nineteenth.
But it is powerful. Let’s plot some of that dataset. This code will give you a basic bar chart that splits deals by won and lost:
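As a minimal sketch, assuming the dataset has an 'Opportunity Status' column (the name used in the snippets further down), the Pandas version could look something like this. The handful of inline rows is a hypothetical stand-in so the snippet runs on its own; in practice you would load the real CSV instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line if you want an interactive window
import pandas as pd

# Stand-in rows (hypothetical). With the real data this would be:
# sales_data = pd.read_csv("Sales Dataset.csv")
sales_data = pd.DataFrame(
    {"Opportunity Status": ["Lost", "Won", "Lost", "Lost", "Won"]}
)

# Count deals per status, then let Pandas draw the bars
status_counts = sales_data["Opportunity Status"].value_counts()
ax = status_counts.plot(kind="bar")
ax.set_xlabel("Opportunity Status")
ax.set_ylabel("Number of Opportunities")

# .plot() returns a Matplotlib Axes, so we save through its parent figure
ax.figure.savefig("opportunities_won_lost.png", format="png")
```

The object that .plot() hands back is a plain Matplotlib Axes, which is the first hint of what is really doing the drawing.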
The bad news is that your sales team is losing a lot of deals.
The good news is how easy it was to plot that data with Matplotlib. But did we really plot it with Matplotlib? Yes and no. To create the plot we called a built-in Pandas plot() method working on the DataFrame, not a Matplotlib method. But the Pandas plotting methods are provided under the hood by… Matplotlib.
This is something you’ll see time and again in Python data visualization. Seaborn? Matplotlib under the hood. Cartopy? Matplotlib under the hood. Plotnine? Matplotlib under the hood. Here’s a list of all the other visualization tools using Matplotlib in some way.
To produce the same plot in pure Matplotlib, we’d replace:
sales_data['Opportunity Status'].value_counts().plot(kind='bar')
with:
plt.bar(sales_data['Opportunity Status'].value_counts().index, sales_data['Opportunity Status'].value_counts())
Note that to use Matplotlib directly, we’re passing our data as an argument to a Matplotlib function, rather than calling a method (.plot) that belongs to our Pandas data.
There are tons of different ways to “build” these kinds of Matplotlib visualizations, and they come in all shapes and sizes (sorry). Another popular one is using Figure:
import pandas as pd
from matplotlib.figure import Figure

sales_data = pd.read_csv("Sales Dataset.csv")

# Generate the figure **without using pyplot**.
fig = Figure()
ax = fig.subplots()
ax.title.set_text('Won/Lost Opportunities (FY 2019-20)')
ax.set_xlabel('Opportunity Type')
ax.set_ylabel('Number of Opportunities')

# Count the deals per status once, then draw one bar per status
status_counts = sales_data['Opportunity Status'].value_counts()
ax.bar(status_counts.index, status_counts, color=['red', 'blue'], alpha=0.8)
fig.savefig("opportunities_won_lost.png", format="png")
For the curious, Matplotlib's architecture consists of three layers:
At the top is the scripting layer, matplotlib.pyplot. Whenever you call plt.bar(), or the Pandas .plot() method like we did above, you are going through the scripting layer. As Matplotlib was built as an open-source alternative to MATLAB's plotting and MATLAB is mostly used by scientists rather than developers, the idea of this scripting layer was to mimic how MATLAB worked and give scientists a less verbose way of creating plots.
In the middle sits the artist layer. This is where the heavy lifting of Matplotlib is performed. Using this layer you can call each component, or Artist instance, that makes up a plot: the Figure, Axes, Line2D, y-label, xticks… With this layer you can finely control what appears in the final render, which is what we’re doing with the Figure calls above.
At the bottom is the backend layer. This is the low-level rendering interface that is controlling where pixels go on the screen. The idea here is that this is detached from the higher-level APIs that can then be application-specific. It is this layer that Seaborn, Plotnine, etc. build on.
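To make the bottom two layers a little more concrete, here is a rough sketch that skips pyplot entirely: we build the Artists ourselves and hand them to the Agg backend to render. The bar values are made-up stand-ins, and the variable names are just for illustration.

```python
from io import BytesIO

from matplotlib.artist import Artist
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

# Artist layer: the Figure is the top-level container for everything drawn
fig = Figure()
# Backend layer: attach the Agg canvas, which rasterizes artists into pixels
canvas = FigureCanvasAgg(fig)

ax = fig.add_subplot(1, 1, 1)
bars = ax.bar(["Won", "Lost"], [2, 3])  # stand-in counts (hypothetical)
ax.set_ylabel("Number of Opportunities")

# The figure, the axes, each bar, and the y-label are all Artist instances
all_artists = all(
    isinstance(obj, Artist) for obj in (fig, ax, bars[0], ax.yaxis.label)
)

# Ask the backend to render the whole artist tree as PNG bytes in memory
png_buffer = BytesIO()
canvas.print_png(png_buffer)
```

Nothing here touches the scripting layer, which is the property we care about once this code runs on a server.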
For our dashboard, we’ll stick with the Figure method, as using the scripting layer has been known to cause memory leaks when used on a server.
Back to some charts: our sales team wants to understand how quickly opportunities are getting handled (also called sales velocity). Let’s build a line chart:
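A minimal sketch of that line chart, using the same Figure approach and the 'Sales Velocity' column that appears in the snippets below. The inline values, the axis labels, and the output filename are assumptions for illustration; swap in the real CSV to get the actual plot.

```python
import pandas as pd
from matplotlib.figure import Figure

# Stand-in values (hypothetical). With the real data this would be:
# sales_data = pd.read_csv("Sales Dataset.csv")
sales_data = pd.DataFrame({"Sales Velocity": [35, 12, 140, 20, 64, 18]})

fig = Figure()
ax = fig.subplots()
ax.title.set_text("Sales Velocity")
ax.set_xlabel("Opportunities (sorted)")
ax.set_ylabel("Days")

# Sorting first turns the line into a readable low-to-high curve
ax.plot(sorted(sales_data["Sales Velocity"]))
fig.savefig("sales_velocity_line.png", format="png")
```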
It looks like the vast majority of opportunities are dealt with in less than 100 days, though there are a few significant outliers.
This plot won’t be very helpful in a dashboard – it should probably be a histogram. To produce a histogram we only need to change one line:
ax.plot(sorted(sales_data['Sales Velocity']))
Becomes:
ax.hist(sales_data['Sales Velocity'], np.arange(1, 200))
(I guess two lines as you probably want to save it with a different filename!)
Which produces:
This is much more useful. Now we can see clearly that there is a definite spike in deals around the 20-day mark, and a stark dropoff just before 100 days.
The same goes for the scatter plot, with the exception that a) we need another variable, color, to map the color of the points to, and b) we’re going to plot against opportunity size:
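Here is one way that could look, as a sketch rather than the exact chart. The 'Opportunity Size' column name is a guess at what the dataset calls it, coloring the points by won/lost status is our own choice of color variable, and the inline rows are stand-ins.

```python
import pandas as pd
from matplotlib.figure import Figure

# Stand-in rows (hypothetical). "Opportunity Size" is an assumed column name.
sales_data = pd.DataFrame({
    "Sales Velocity": [35, 12, 140, 20, 64],
    "Opportunity Size": [5000, 12000, 3000, 25000, 8000],
    "Opportunity Status": ["Lost", "Won", "Lost", "Won", "Lost"],
})

# Map each deal's status to a color so won and lost deals stand apart
status_colors = {"Won": "blue", "Lost": "red"}
point_colors = sales_data["Opportunity Status"].map(status_colors).tolist()

fig = Figure()
ax = fig.subplots()
ax.title.set_text("Sales Velocity vs. Opportunity Size")
ax.set_xlabel("Sales Velocity (days)")
ax.set_ylabel("Opportunity Size")
ax.scatter(
    sales_data["Sales Velocity"],
    sales_data["Opportunity Size"],
    c=point_colors,
    alpha=0.8,
)
fig.savefig("velocity_vs_size.png", format="png")
```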
This is good, but all we have so far is a bunch of individual pngs. For it to be a dashboard, we need to start contextualizing and get it deployed somewhere.
Publishing to the web
Flask is the simplest way to get your Python code onto the web – it’s a bare bones server that can serve HTTP requests. Flask comes as a Python package like everything else so you can just pip install flask. The “Hello, World” (literally) for Flask is just this:
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"

Each individual page in your Flask app is defined by a route (this one is just the root directory, /) and a function that returns the HTML for the page. Very simple.
Save the above file as “hello.py” (not “flask.py”, as this would cause a conflict with Flask internals), run flask --app hello run, and you’ll have a web page that you can easily pop on a server somewhere. Neat.
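If you'd rather check the route without starting a server at all, Flask ships with a test client that calls your view functions in-process. A quick sketch, repeating the same hello-world app so the snippet stands alone (the response and page_html names are just ours):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"

# The test client sends a fake GET request straight to the app, no server needed
response = app.test_client().get("/")
page_html = response.get_data(as_text=True)
```

This is handy later on, when the page returns a lot more than one paragraph tag and you want to sanity-check it quickly.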
The Flask app has two parts: (1) all the logic to manipulate the data and create the charts, and (2) a return statement that sends the HTML for the page. Let’s get our charts in there.
We’ll start, as always, by importing our libraries:
from flask import Flask
import pandas as pd
from matplotlib.figure import Figure
import numpy as np
import base64
from io import BytesIO
The first three imports you’ve seen before. NumPy is the main Python library for working with numerical data. We are only going to use one function, np.arange(), to output a range of numbers, but NumPy is one of the most powerful libraries you can use with Python.
Base64 lets us encode binary data as ASCII and BytesIO gives us an in-memory binary stream. We’ll see these in practice in a moment.
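As a quick standard-library preview of how the two fit together: write some bytes into a BytesIO buffer, base64-encode them, and drop the result into an image tag as a data URI. The eight bytes here are only the PNG file signature standing in for a real chart, and the variable names are ours.

```python
import base64
from io import BytesIO

# In-memory binary stream: behaves like a file, but nothing touches the disk
buf = BytesIO()
buf.write(b"\x89PNG\r\n\x1a\n")  # stand-in for real image bytes (PNG signature only)

# Encode the raw bytes as ASCII text so they can sit inside an HTML string
raw = buf.getvalue()
encoded = base64.b64encode(raw).decode("ascii")

# A data URI lets the browser read the image straight from the page source
img_tag = f"<img src='data:image/png;base64,{encoded}'/>"
```

With a real chart, the buffer would be filled by saving the figure into it instead of writing bytes by hand.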
Next, we’ll create an instance of the Flask class, which is the core of the Flask program. This instance is the WSGI (Web Server Gateway Interface) application that the web server calls to handle each request:
app = Flask(__name__)
We need to use the route() decorator next. Decorators are fantastic “syntactic sugar” in Python. When you add a decorator above a function, you add functionality to that function. Here, we add a simple @app.route() decorator that tells Flask which URL should trigger the function. In this case, the home page:
@app.route("/")
We’re mostly going to copy and paste the code we’ve written for our two charts. But since this code will be running on a server, we need a little middleware to make sure the output of the plots is optimized for web. This is where base64