In this Course, we cover Python visualization libraries and tools, focusing particularly on Matplotlib and the Seaborn plotting library. You will learn how to use these to visualize your data in Python in a clear and effective way. We go into particular depth on Seaborn, and you'll learn about the different plots available, including regression plots, pairplots, and heat maps.
If you have any feedback relating to this Course, feel free to let us know at support@cloudacademy.com.
Learning Objectives
- Use Matplotlib to create plots to represent data, and format the plots
- Add information to plots such as labels, titles, legends, etc.
- Get acquainted with the Seaborn plotting library
- Learn how to plot data using Seaborn in a variety of different plots
Intended Audience
This Course is intended for data scientists, data engineers, or anybody interested in learning how to use Python tools to visualize data.
Prerequisites
To get the most out of this course, you should be familiar with the basics of programming: variables, scope, functions.
Resources
The dataset(s) used in this course can be found in the following GitHub repository: https://github.com/cloudacademy/practical-data-science-python
So now what I want to do is show you how we can generate a custom classification system. So far these have all been columns that are built into our data set, but we can generate categories that we want to visualize as well. Just as a demonstration, I could write a function, age_class, which has the sole purpose of grouping people into age groups. So if they're under 10, we label them "under 10", if they're under 16, "under 16", and so on and so forth. We can map this over our data frame and generate a more customized violin plot.
So I call violinplot, passing in data equal to the data frame, and X is going to be given by my height column. And then for Y, what I can pass in is this function mapped over a column of my data set. Y is going to be equal to df['age'], and I call .map on it, passing in the name of the function I want to run over it, age_class. And then what I've got is a sort of customized visualization, where I can actually see that there's no one under 10. I've got some in the over-16 and under-16 segments, and I'm comparing the heights across them.
So I'm just applying a function to my data, and then plotting the result of that. So in reality, what are we doing? We're not doing anything particularly complicated here. All we're saying is that I want my heights to be on the X axis, and then on the Y axis, I want to plot the category I've put each of these people into.
So what df['age'].map is going to do is take each of the ages and turn it into a label. Based upon that label, we're grouping the rows and then plotting the groups on the same axes. So it's just utilizing what we already know, the fact that we can do these sorts of things.
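The mapping step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the lecture: the data frame is synthetic, the column names `age` and `height` are assumptions, and the exact labels and cut-offs inside `age_class` beyond "under 10" and "under 16" are guesses at what "and so on" covers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the course data set (column names are assumptions).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(11, 20, size=60),    # ages 11..19, so nobody is under 10
    "height": rng.normal(160, 10, size=60),  # heights in cm
})

def age_class(age):
    """Return a coarse age-group label for a single age (hypothetical cut-offs)."""
    if age < 10:
        return "under 10"
    if age < 16:
        return "under 16"
    return "16 and over"

# Series.map applies age_class to every age and returns a Series of labels.
age_groups = df["age"].map(age_class)

# Heights on the x axis, one violin per age-group label on the y axis.
ax = sns.violinplot(data=df, x="height", y=age_groups)
ax.set_ylabel("age group")
```

Because the mapped Series shares its index with `df`, it lines up row by row with the `height` column. Storing the labels as a new column first and passing the column name as `y` would give the same plot.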
So now I want to talk about something called a cat plot, sns.catplot. What this does is it takes a step back, and it's designed for plotting categorical data. You can specify a number of different kinds of plots that you might want to put on a set of axes. It's a more generalized function.
So, for example, I still pass in data equal to my data frame, and I can pass in what I want on the X and the Y axes. I've got X as age, and Y is going to be, again, final judgment. So I'll just copy a few of these things in. I want the X variable to be age, and I want the Y variable to be the final judgment score. I want row to be given by gender, and we'll see what row means in a moment. I'm saying that I would like a box plot with a horizontal orientation, and then the last arguments just specify dimensions: height is the height of each facet, and aspect is the ratio of width to height, so it sets the width. And if I run this, what I get, instead of having everything on the same graphic, is the data split across different axes, one for males and one for females.
So whereas previously with the box plot I was looking at males and females in the same graphic, here I'm essentially setting up a figure and saying that I want the rows of my figure to be dictated by the gender of the person.
So remember how figures are essentially laid out as a sort of grid: this one is going to have two rows and one column. I can choose any sort of plot, really. I can do it with a violin as well, and it simply generates a violin plot for each of the groups. So it's kind of taking a step back from calling a specific plotting function: you're defining a grid and saying what kind of plot you want drawn on that grid.
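As a rough sketch of the catplot call described above, assuming a synthetic data frame: the column names `age`, `final_judgement` and `gender`, the "low"/"medium"/"high" judgement labels, and the `height` and `aspect` values are all illustrative assumptions rather than the values used in the lecture.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the course data set (all column names are assumptions).
rng = np.random.default_rng(1)
n = 80
df = pd.DataFrame({
    "age": rng.integers(11, 20, size=n),
    "final_judgement": rng.choice(["low", "medium", "high"], size=n),
    "gender": rng.choice(["male", "female"], size=n),
})

# One facet (row of the figure grid) per gender, each holding horizontal box plots.
# height is the height of each facet in inches; aspect is its width-to-height ratio.
g_box = sns.catplot(data=df, x="age", y="final_judgement", row="gender",
                    kind="box", orient="h", height=2, aspect=4)

# Same grid, different kind of plot: only the `kind` argument changes.
g_violin = sns.catplot(data=df, x="age", y="final_judgement", row="gender",
                       kind="violin", orient="h", height=2, aspect=4)
```

`catplot` returns a `FacetGrid` rather than a single set of axes, so `g_box.axes` is a 2-by-1 array here: two rows (one per gender) and one column.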
Delivering training and developing courseware for multiple aspects across Data Science curriculum, constantly updating and adapting to new trends and methods.