Data Visualization with Python using Bokeh
Bokeh is an interactive visualization library in Python that provides visual artefacts for modern web browsers. In this course, we're going to have a look at the fundamental tools that are necessary to build interactive plots in Python using Bokeh.
Bokeh exposes two interface levels to users: bokeh.plotting and bokeh.models, and this course will focus mainly on the bokeh.plotting interface.
We'll start things off by exploring two key concepts in Bokeh: Column Data Source and Glyphs. Then we'll move on to the different ways of customizing a Bokeh plot, and to how to introduce interactivity into a Bokeh object.
You'll also learn about using inspectors to report information about the plot and we'll also investigate different ways to plot multiple Bokeh objects in one figure. We'll round off the course by looking at plot methods for categorical variables.
- Learn about Column Data Sources and Glyphs in Bokeh and how they are used
- Learn how to customize your plots and add interactivity to them
- Understand how inspectors can be added to plots to provide additional information
- Learn how to plot multiple Bokeh objects in one figure
- Understand the plot methods available for categorical variables
- Data scientists
- Anyone looking to build interactive plots in Python using Bokeh
To get the most out of this course, you should have a good understanding of Python. Before taking this course, we also recommend taking our Data Visualization with Python using Matplotlib course.
The GitHub repo for this course can be found here: https://github.com/cloudacademy/interactive-data-visualization-with-bokeh
Welcome back. In this lecture, we are going to cover a very important topic that is necessary to display multiple pieces of information in just one plot. In particular, we will investigate different ways to display such information with bokeh.
Bokeh is very dynamic in this regard: it includes several layout options for arranging plots in the same figure object. So let’s investigate all of them!
First we need to import our data, and we need the date column to hold datetime objects. We also use the method we saw in lecture 3 to compute the financial returns, namely daily_change. This has been done for you as well.
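As a rough sketch of that preparation step, the snippet below converts a date column and computes simple daily returns with pandas. The toy prices and the column names "Date" and "Close" are assumptions made for illustration, and pct_change() is only one plausible way to obtain daily_change; the course notebook may do it differently.

```python
import pandas as pd

# Hypothetical stand-in for the course data: the values and the column
# names "Date" and "Close" are assumptions, not the real dataset.
apple = pd.DataFrame({
    "Date": ["2020-01-02", "2020-01-03", "2020-01-06"],
    "Close": [100.0, 110.0, 99.0],
})

# Make sure the date column holds datetime objects rather than strings.
apple["Date"] = pd.to_datetime(apple["Date"])

# Simple daily returns: (P_t - P_{t-1}) / P_{t-1}.
# The first row has no previous price, so its return is NaN.
apple["daily_change"] = apple["Close"].pct_change()
```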
Then, we make the necessary imports from the bokeh.plotting interface: the figure, show, and output_notebook functions. We also call output_notebook() so that the plots are displayed inline in the notebook, as follows.
We are gonna use the following wrapper that I have created for you that will generate a bokeh figure object with a line glyph. The arguments in this function are the CDS from which we want to obtain the necessary information to display, the title (which is basically the annotation we want to display in our plot), and the plot width and height related to the figure object.
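The wrapper itself is not reproduced in this transcript, so here is a minimal sketch of what a function like creating_figure might look like. It assumes the CDS has "Date" and "Close" columns, and it is written for Bokeh 3.x, where the size arguments are width and height (Bokeh 2.x used plot_width and plot_height).

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure


def creating_figure(source, title, width, height):
    """Return a figure showing closing price against date as a line glyph."""
    # On Bokeh 2.x the size arguments would be plot_width / plot_height.
    p = figure(title=title, x_axis_type="datetime", width=width, height=height)
    # "Date" and "Close" are assumed column names in the data source.
    p.line(x="Date", y="Close", source=source, line_width=2)
    return p


# Toy data standing in for the course's stock prices (values are made up).
toy = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=4, freq="D"),
    "Close": [10.0, 11.0, 10.5, 12.0],
})
p_demo = creating_figure(ColumnDataSource(toy), "Demo Closing Price", 450, 300)
```

Calling show(p_demo) after importing show from bokeh.plotting would render the figure.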
We then create three different CDSs from the corresponding pandas data frame as follows. Again, this has been done for you. To create a row of plots is very easy: we just need to create two (or more) figure objects and we pass them into the row() function that is imported from the bokeh layouts.
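For reference, creating the data sources mentioned above could look roughly like this. The toy_prices helper and its values are hypothetical; the point is only that a pandas data frame can be passed straight to ColumnDataSource.

```python
import pandas as pd
from bokeh.models import ColumnDataSource


def toy_prices(closes):
    """Build a small made-up price table (hypothetical helper)."""
    df = pd.DataFrame({
        "Date": pd.date_range("2020-01-01", periods=len(closes), freq="D"),
        "Close": closes,
    })
    df["daily_change"] = df["Close"].pct_change()
    return df


# One ColumnDataSource per stock; each data frame column becomes a CDS column.
apple_cds = ColumnDataSource(toy_prices([100.0, 102.0, 101.0]))
fb_cds = ColumnDataSource(toy_prices([200.0, 198.0, 203.0]))
google_cds = ColumnDataSource(toy_prices([1400.0, 1410.0, 1395.0]))

print(apple_cds.column_names)
```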
We then create a figure, which we'll call p1, using the creating_figure wrapper. This requires four arguments: the source, in this case the Apple CDS; the title, which we set to "Apple Closing Price Jan-Sept 2020"; and the plot width and height, which we set to 450 and 300, respectively.
We repeat the same operation for Facebook by changing the CDS and the title, and we store that into a variable called p2. And finally, we create another object called p3 that is related to the Google CDS.
When we create a layout of plots, we can use the row function and this basically requires a list of figure objects we wish to display as an argument - in our case figure objects p1, p2 and p3. And we store this into the variable row_plots, and then we show it.
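A minimal sketch of the row layout, using small stand-in figures instead of the real stock plots (the toy_figure helper and its data are made up for illustration):

```python
from bokeh.layouts import row
from bokeh.plotting import figure


def toy_figure(title):
    """Stand-in for the course's creating_figure wrapper (hypothetical)."""
    p = figure(title=title, width=450, height=300)
    p.line(x=[1, 2, 3], y=[3, 1, 2])
    return p


p1 = toy_figure("Apple Closing Price Jan-Sept 2020")
p2 = toy_figure("Facebook Closing Price Jan-Sept 2020")
p3 = toy_figure("Google Closing Price Jan-Sept 2020")

# row() takes the list of figures and places them side by side.
row_plots = row([p1, p2, p3])

# from bokeh.plotting import show; show(row_plots)  # uncomment to render
```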
This is the result: we see each of them is independently defined, and therefore has its own tools, shown here as usual. Please note that here we have not specified any xlabel and ylabel arguments, but it is always good practice to do that. So maybe try that on your own as an exercise.
We can create columns of plots as well: to do so, we use the column function from the bokeh layouts. We therefore import column from bokeh.layouts. Then, we create a variable called columns_plots by passing a list of our plots - namely p1, p2 and p3 - to column().
The results are as follows: instead of showing the plots side by side in a single row, each plot is shown in its own row. Sometimes it is better to nest rows and columns in a single layout. Suppose that for instance we wish to display the Apple series in one row and in the second row two plots describing the relationship between Apple returns and two other stocks - in our case Facebook and Google.
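The column layout follows the same pattern as a row. This sketch again uses made-up stand-in figures rather than the course's plots:

```python
from bokeh.layouts import column
from bokeh.plotting import figure

# Three small stand-in figures (titles and data are placeholders).
plots = []
for name in ("Apple", "Facebook", "Google"):
    p = figure(title=f"{name} Closing Price", width=450, height=300)
    p.line(x=[1, 2, 3], y=[3, 1, 2])
    plots.append(p)

# column() stacks the figures vertically, in list order.
columns_plots = column(plots)

# from bokeh.plotting import show; show(columns_plots)  # uncomment to render
```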
To do so, we first need to create a new figure object, let’s call it p4, using the wrapper creating_figure we used a few moments ago, but now we specify plot width equal to 750 and height equal to 300.
We then compare returns between the main series - in this case Apple - and the two other series - namely Facebook and Google. To do so, I have created a method called compare_returns that is shown here.
This requires the two column data sources, namely src_df1 and src_df2, the title, which is simply the title we wish to display in our plot, and the plot width and plot height. This will return a bokeh figure object with a circle glyph.
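Since compare_returns is only shown on screen, here is one way such a function could be written. It assumes both CDSs carry a "daily_change" column of equal length, and it uses scatter(), which draws circular markers by default; the real course function may be implemented differently.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure


def compare_returns(src_df1, src_df2, title, width, height):
    """Scatter the returns of one stock against those of another."""
    # A glyph reads from a single data source, so pair the two series up.
    paired = ColumnDataSource({
        "returns_1": src_df1.data["daily_change"],
        "returns_2": src_df2.data["daily_change"],
    })
    p = figure(title=title, width=width, height=height)
    p.scatter(x="returns_1", y="returns_2", source=paired, size=6)
    return p


def toy_cds(closes):
    """Hypothetical helper: a CDS with made-up prices and their returns."""
    df = pd.DataFrame({"Close": closes})
    df["daily_change"] = df["Close"].pct_change()
    return ColumnDataSource(df)


p12 = compare_returns(
    toy_cds([100.0, 102.0, 101.0, 105.0]),
    toy_cds([200.0, 198.0, 203.0, 204.0]),
    "Returns Dependency Between Apple and FB",
    450,
    300,
)
```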
We then create an object describing the Returns Dependency between Apple and Facebook - and we assign this to the variable p12 - which is nothing more than a call of the compare_returns method between the Apple and Facebook source dataframes, the title, which is given as “Returns Dependency Between Apple and FB”, and finally plot width equal to 450 and height equal to 300.
We repeat the same logic for the return dependency between Apple and Google, and we store that object inside the variable p13.
We then nest the plots as follows: we specify a list of plots containing the returns - in our case p12 and p13 - inside the row() function. We also set the sizing_mode argument equal to “scale_width”. sizing_mode controls how the items in the layout will resize to fill the available space. If we set this equal to scale_width, the components will resize to stretch to the available width, while maintaining the original aspect ratio.
We define this object in the variable row_plots2. We then call the column function and we pass basically a list of objects, namely p4, which is the apple series, and row_plots2 containing the return dependencies between Apple and the other two stocks. And once again, we set the sizing_mode argument equal to “scale_width”. We store this into the layout variable, and we show it.
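The nesting described above can be sketched as follows, with placeholder figures standing in for p4, p12 and p13 (their data is made up):

```python
from bokeh.layouts import column, row
from bokeh.plotting import figure

# Placeholder for the wide Apple price plot.
p4 = figure(title="Apple Closing Price Jan-Sept 2020", width=750, height=300)
p4.line(x=[1, 2, 3], y=[3, 1, 2])

# Placeholders for the two return-dependency scatter plots.
p12 = figure(title="Returns Dependency Between Apple and FB", width=450, height=300)
p12.scatter(x=[0.01, -0.02, 0.03], y=[0.02, -0.01, 0.01], size=6)
p13 = figure(title="Returns Dependency Between Apple and Google", width=450, height=300)
p13.scatter(x=[0.01, -0.02, 0.03], y=[0.00, -0.03, 0.02], size=6)

# Inner row first, then a column that holds the wide plot and that row.
row_plots2 = row([p12, p13], sizing_mode="scale_width")
layout = column([p4, row_plots2], sizing_mode="scale_width")

# from bokeh.plotting import show; show(layout)  # uncomment to render
```

Building the inner row first and then passing it to column() is what produces the two-level structure: a layout object can be a child of another layout, just like a figure.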
Here we go. This is the result. We see that in the first row we have the plot of the main closing price series, and in the second row we have two plots describing the return dependency between stocks.
Bokeh also provides a gridplot() function that can be used to arrange Bokeh plots in a grid layout. Note that this function also collects all tools into a single toolbar, and the currently active tool is the same for all plots in the grid, whereas in the previous layouts each plot had its own toolbar.
I’ll now provide a wrapper method called `creating_figure_returns` that is given as follows. This basically returns a scatter plot of simple financial returns: the logic is very similar to the other wrapper introduced in this lecture, so, if you feel the need, please pause the video and take your time to digest it.
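For readers following along without the video, a wrapper in this spirit might look like the sketch below. The column names "Date" and "daily_change" are assumptions, and the body is a guess at the general shape rather than the course's exact code.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure


def creating_figure_returns(source, title, width, height):
    """Return a figure with a scatter of daily returns against date."""
    p = figure(title=title, x_axis_type="datetime", width=width, height=height)
    # "Date" and "daily_change" are assumed column names in the data source.
    p.scatter(x="Date", y="daily_change", source=source, size=6)
    return p


# Made-up prices standing in for the Apple data.
toy = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=4, freq="D"),
    "Close": [100.0, 102.0, 101.0, 105.0],
})
toy["daily_change"] = toy["Close"].pct_change()

p5 = creating_figure_returns(ColumnDataSource(toy), "Apple Returns Jan-Sept 2020", 450, 300)
```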
We now create a new plot, called p5, that contains the simple financial returns of Apple using the wrapper creating_figure_returns. We pass the Apple CDS and the title as "Apple Returns Jan-Sept 2020". We also specify again the plot_width equal to 450 and the plot_height equal to 300.
We repeat the same steps for Facebook and Google, respectively: p6 is built from the FB CDS and p7 from the Google CDS.
We now create a gridplot object: we first import gridplot from bokeh layouts. We then create a gridplot by passing a list of lists of plots - in our case, three distinct lists: [p1, p5] for Apple, [p2, p6] for Facebook, and [p3, p7] for Google. We store this into the layout object and we show it. Here is the result.
We can also create tabbed plots with bokeh. A tabbed layout is built from two Bokeh widgets, Tabs() and Panel(), from bokeh.models. Like using gridplot(), making a tabbed layout is pretty straightforward.
We import Tabs and Panel from bokeh models. We then create tab1, which is a Panel made of two plots, namely p1 and p5, and we pass them via the child argument using the row function - and note that we can pass them either as a series of bokeh objects (p1 and p5 in this way) or as a list. We also set the title associated with the panel - in this case we call it “Apple”.
We do the same for both Facebook and Google, and store them into the panels tab2 and tab3, respectively. For Facebook, we pass p2 and p6 to tab2, and for Google, we pass p3 and p7 to tab3. Finally, we create a tab object, which is nothing more than a call to Tabs with the panels given in a list, that is tab1, tab2 and tab3. We then show the tab object.
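The tabbed layout can be sketched as below, with placeholder figures. One caveat for newer installations: in Bokeh 3.x the Panel model is named TabPanel, so the import here tries the new name first and falls back to the old one.

```python
from bokeh.layouts import row
from bokeh.models import Tabs
from bokeh.plotting import figure

try:
    from bokeh.models import TabPanel  # Bokeh 3.x name
except ImportError:
    from bokeh.models import Panel as TabPanel  # Bokeh 2.x name


def toy_figure(title):
    """Hypothetical stand-in for the course's plotting wrappers."""
    p = figure(title=title, width=450, height=300)
    p.line(x=[1, 2, 3], y=[3, 1, 2])
    return p


p1, p5 = toy_figure("Apple Closing Price"), toy_figure("Apple Returns")
p2, p6 = toy_figure("Facebook Closing Price"), toy_figure("Facebook Returns")
p3, p7 = toy_figure("Google Closing Price"), toy_figure("Google Returns")

# Each panel holds one layout (here a row of two plots) and a tab title.
tab1 = TabPanel(child=row(p1, p5), title="Apple")
tab2 = TabPanel(child=row(p2, p6), title="Facebook")
tab3 = TabPanel(child=row(p3, p7), title="Google")

tabs = Tabs(tabs=[tab1, tab2, tab3])

# from bokeh.plotting import show; show(tabs)  # uncomment to render
```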
This is cool, isn’t it? We see that now each single stock is associated with a particular tab that can be easily navigated from the top of the figure.
This concludes the lecture on multiple plots.
Andrea is a Data Scientist at Cloud Academy. He is passionate about statistical modeling and machine learning algorithms, especially for solving business tasks.
He holds a PhD in Statistics, and he has published in several peer-reviewed academic journals. He is also the author of the book Applied Machine Learning with Python.