Data Visualization using Matplotlib
The course is part of this learning path
This course will guide you through the main techniques used to visualize data with the Matplotlib Python library.
In this course, we will explore the main functionalities of Matplotlib: we will look at how to customize Matplotlib objects, how to use various plotting techniques, and finally, we will focus on how to communicate results.
If you have any feedback related to this course, feel free to contact us at firstname.lastname@example.org.
Learning Objectives
- Learn the fundamentals of Python's Matplotlib library and its main features
- Customize objects in Matplotlib
- Create multiple plots in Matplotlib
- Customize plots in Matplotlib (annotations, labels, linestyles, colors, etc.)
- Understand the different plot types available
Intended Audience
- Data scientists
- Anyone looking to create plots and visualize data in Matplotlib
To get the most out of this course, you should already be familiar with using Python, for which you can take our Introduction to Python learning path. Knowledge of Python's Pandas library would also be beneficial and you might want to take our courses Working with Pandas and Data Wrangling with Pandas before embarking on this Matplotlib course.
The data used in this course can be found in the following GitHub repository: https://github.com/cloudacademy/data-visualization-with-python-using-matplotlib
Welcome back. In this lecture, we're going to cover an important concept: the matplotlibrc configuration file, which is used to customize objects in Matplotlib. I introduced it in lecture 2, and now you have the knowledge to cover this more advanced topic. Matplotlib uses matplotlibrc configuration files to customize all kinds of properties. We call these rc parameters, where rc stands for runtime configuration. You can control the defaults of almost every property in Matplotlib, from figure size and DPI to line width, color and style, axes, grid properties, and text properties.
All the rc parameters are stored in a dictionary-like variable called rcParams, which is global to the Matplotlib package. So let's import pandas and Matplotlib, and we import getting series plot, which we saw in the previous lecture, from the wrapper_plot Python file. You can find this Python file in the GitHub repo for this course. We read the Gapminder dataset again and filter on China. Let us print out all the settings included in the default matplotlibrc file.
To do so, we call the print function on plt.rcParams. This prints out all the settings that are currently in effect. To display where the currently active matplotlibrc file is loaded from, we do the following: we import matplotlib as mpl and then call mpl.matplotlib_fname(), and that will show you the path where the file is stored.
We can modify the desired parameters easily, and we do it like this. We set mpl.rcParams at the key lines.linewidth to 0.9, and then mpl.rcParams at the key figure.figsize to the tuple (6.4, 4.8). This changes the settings for the current session only; the matplotlibrc file on disk is not modified. No actions are required outside the notebook. I find this very useful, since it lets us set the configuration dynamically. Here we have an error, a misspelling. Yes, it should be mpl. And there we are.
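As a rough sketch of the inspection steps just described (it assumes nothing beyond a standard Matplotlib install, and the two keys looked up are only examples), you could write something like this:

```python
import matplotlib as mpl
import matplotlib.pyplot as plt

# rcParams is a dictionary-like object holding every runtime setting.
# plt.rcParams and mpl.rcParams are the same object.
print(len(plt.rcParams), "settings currently loaded")

# Individual settings are looked up by their dotted key.
line_width = plt.rcParams["lines.linewidth"]
fig_size = plt.rcParams["figure.figsize"]
print("lines.linewidth:", line_width)
print("figure.figsize:", fig_size)

# Path of the matplotlibrc file that is currently active.
rc_path = mpl.matplotlib_fname()
print("active matplotlibrc:", rc_path)
```

Printing plt.rcParams directly, as in the lecture, shows the full list; looking up single keys is just easier to read.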
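Here is a minimal sketch of those two assignments, with a throwaway line plot (placeholder numbers, not the Gapminder data) to check that new artists pick the values up. The Agg backend is only an assumption so that the snippet runs without a display, and the final call brings back the values that were loaded when Matplotlib was imported:

```python
import matplotlib as mpl

mpl.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Change two runtime settings held in rcParams.
mpl.rcParams["lines.linewidth"] = 0.9
mpl.rcParams["figure.figsize"] = (6.4, 4.8)

# Figures and lines created from now on use the new defaults.
fig, ax = plt.subplots()
(line,) = ax.plot([0, 1, 2], [0, 1, 4])  # placeholder values
width_used = line.get_linewidth()
size_used = list(fig.get_size_inches())
print("line width:", width_used)
print("figure size in inches:", size_used)
plt.close(fig)

# Restore the settings that were read from the rc file at import time.
mpl.rc_file_defaults()
```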
Version 1.4 of Matplotlib, released in August 2014, introduced a very convenient style module, which includes a number of default style sheets, as well as the ability to create and package your own styles. These style sheets are formatted similarly to the matplotlibrc files mentioned earlier, but must be named with a .mplstyle extension.
Even if you don't create your own style, the style sheets included by default are extremely useful. The available styles are listed as follows: you just print plt.style.available, and the result is a list of strings, each of them representing a style. We can see, for instance, the ggplot style. We can also see the Seaborn styles; Seaborn is a modern Python visualization library built on Matplotlib. Note that in Matplotlib 2.0 and later, the style in effect when you don't choose one is called default, while classic reproduces the older pre-2.0 look. The style active when you don't choose one is the one that we have used so far.
Now, the best way to switch a style sheet is to call plt.style.use and pass it the style name. In this example, we're using classic, but you can use whichever one you want. Please note that once you call the style.use method with the desired style name, this will change the style for the rest of the session.
If you only want to use a style for a specific block of code, without changing the global styling, the style package provides a context manager for limiting your changes to a specific scope. So to isolate your styling changes, you can write the following: with plt.style.context, and here we pass a Seaborn style. We just paste the code we used at the end of lecture five inside that block. Calling this now, we get the same output but with a Seaborn style. So this looks nice.
Now, we're going to end this lecture with a very important method that we have not actually covered yet: savefig. savefig is really useful, since it allows us to store the output in a file that can be shared with your colleagues, or with anyone for that matter. To save a plot as an image, you just need to call savefig on the figure object and pass the file name you wish the plot to be saved to as the fname argument. So let's have a look.
In this case, we're going to call fig.savefig and pass the file name that we wish to assign to this plot. Here we'll use my_plot.png. Note that we can control the quality of the image with the dpi argument, also known as the resolution, which is expressed in dots per inch, as we mentioned before. The arguments quality and optimize are deprecated nowadays. So in this lecture, we have covered advanced customization techniques in Matplotlib.
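To make the idea of your own style sheet concrete, here is a small sketch. The file name my_style.mplstyle, its three settings, and the temporary directory are all made up for illustration; the only real requirements are the rc-style "key: value" lines and the .mplstyle extension. It loads the file by path with plt.style.use:

```python
import os
import tempfile

import matplotlib as mpl

mpl.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical style sheet: same "key: value" syntax as a matplotlibrc file.
style_text = """\
lines.linewidth: 0.9
figure.figsize: 6.4, 4.8
axes.grid: True
"""

# Write it to a temporary folder; the file name must end in .mplstyle.
style_path = os.path.join(tempfile.mkdtemp(), "my_style.mplstyle")
with open(style_path, "w") as handle:
    handle.write(style_text)

# A style sheet can be loaded by passing its path.
plt.style.use(style_path)
width_from_style = plt.rcParams["lines.linewidth"]
grid_from_style = plt.rcParams["axes.grid"]
print("lines.linewidth:", width_from_style)
print("axes.grid:", grid_from_style)

# Go back to the settings loaded at import time.
mpl.rc_file_defaults()
```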
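A quick sketch of that listing step follows. One caveat that depends on your installation: the exact Seaborn style names vary by Matplotlib version (from 3.6 onward they carry a seaborn-v0_8 prefix), so the snippet filters by prefix instead of assuming one name:

```python
import matplotlib.pyplot as plt

# plt.style.available is a plain list of style names.
available = plt.style.available
print(available)

# Check for a couple of styles mentioned in the lecture.
has_ggplot = "ggplot" in available
has_classic = "classic" in available
print("ggplot available:", has_ggplot)
print("classic available:", has_classic)

# Seaborn-flavoured styles, whatever they are called in this version.
seaborn_styles = [name for name in available if name.startswith("seaborn")]
print(seaborn_styles)
```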
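To see that the change really sticks for the session, here is a minimal sketch. It first switches to the style named default so the comparison starts from Matplotlib's built-in settings, then to classic, and compares the figure size each one sets. The Agg backend is only an assumption for running without a display:

```python
import matplotlib as mpl

mpl.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Start from Matplotlib's built-in defaults.
plt.style.use("default")
default_size = list(plt.rcParams["figure.figsize"])

# Switch style; this stays in effect until another style is applied.
plt.style.use("classic")
classic_size = list(plt.rcParams["figure.figsize"])

# A figure created later, with no size given, inherits the classic setting.
fig = plt.figure()
later_figure_size = list(fig.get_size_inches())
plt.close(fig)

print("default figsize:", default_size)
print("classic figsize:", classic_size)
print("figure created afterwards:", later_figure_size)

# Undo the change for the rest of the session.
plt.style.use("default")
```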
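Here is a sketch of that scoped styling. Since the plotting code from lecture five is not reproduced here, a placeholder line plot stands in for it (the numbers are not Gapminder values), and the Seaborn style name is looked up at runtime because it differs between Matplotlib versions, with ggplot as a fallback:

```python
import matplotlib as mpl

mpl.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Pick whichever Seaborn style this Matplotlib version ships, else ggplot.
seaborn_styles = [n for n in plt.style.available if n.startswith("seaborn")]
style_name = seaborn_styles[0] if seaborn_styles else "ggplot"

face_before = plt.rcParams["axes.facecolor"]

# The style applies only inside the with block.
with plt.style.context(style_name):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2, 3], [10, 12, 15, 19])  # placeholder values
    ax.set_title("Styled with " + style_name)

# Outside the block, the global settings are unchanged.
face_after = plt.rcParams["axes.facecolor"]
print("global style untouched:", face_before == face_after)
plt.close(fig)
```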
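As a small sketch of saving with an explicit resolution: the figure size, the dpi value, and the temporary output folder below are arbitrary choices for illustration, and the plotted numbers are placeholders. Reading the file back shows how dpi translates into pixels:

```python
import os
import tempfile

import matplotlib as mpl

mpl.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# A 6 x 4 inch figure with placeholder data.
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

# Save to a temporary folder; dpi sets the dots per inch of the output.
out_path = os.path.join(tempfile.mkdtemp(), "my_plot.png")
fig.savefig(out_path, dpi=200)
plt.close(fig)

# Read the PNG back: 4 in x 200 dpi = 800 rows, 6 in x 200 dpi = 1200 columns.
saved_image = plt.imread(out_path)
height_px, width_px = saved_image.shape[:2]
print("saved to:", out_path)
print("pixels (height, width):", height_px, width_px)
```

Raising dpi gives a larger, sharper image at the cost of file size, while the figure size in inches stays the same.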
In the next lesson, we're going to explore different plot types in Matplotlib. If you're ready for that, then I'll see you in the next lecture.
Andrea is a Data Scientist at Cloud Academy. He is passionate about statistical modeling and machine learning algorithms, especially for solving business tasks.
He holds a PhD in Statistics, and he has published in several peer-reviewed academic journals. He is also the author of the book Applied Machine Learning with Python.