This Course explores how to interpret your data so that you can decide which chart type best visualizes and conveys your data analytics. Using the correct visualization techniques allows you to gain the most from your data. In this Course, we will look at the importance of data visualization, and then move on to the relationships, comparisons, distribution, and composition of data.
If you have any feedback relating to this Course, feel free to get in touch with us at firstname.lastname@example.org.
- Get an overview of what data visualization is and why it's important
- Learn how to visualize relationships within data
- Learn about comparisons, distribution, and composition of data
This Course has been designed for those who work with big data or data analytics and need to interpret data results in an effective way.
As a prerequisite to this Course, you should have a very basic understanding of the terminology used in relation to tables and graphs.
Hello and welcome to this lecture which will focus on a couple of different charts that help to show the relationships nested within your data in a clear and concise way.
The two charts I will be focusing on are scatter and bubble charts. Let’s start with a scatter chart, which is sometimes referred to as a scatter plot.
So the main purpose of a scatter chart is to show the data relationship between 2 sets of data using an X and Y axis. Let’s look at an example:
Suppose an ice-cream stall records their daily sales in addition to the average daily temperature. The following table shows this data covering a period of 2 weeks.
Even from just this small table of data, it can be difficult to see any kind of real relationship between the data sets. However, when we convert this to a scatter plot or scatter chart with one data set on the X axis (in this example, the ‘Sales’ column) and the other, ‘Temperature’, on the Y axis, we get the following results.
Straight away we can start to see a pattern or relationship forming between the 2 sets of data. From this we can clearly see that as the temperature begins to increase, the ice-cream stall generally has a greater return in sales. In other words, the temperature has a direct relationship with the amount of sales gained.
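As a rough illustration of that conversion, here is a minimal Python sketch. The rows are hypothetical stand-ins for the table (they are not the lecture's figures), the helper names `table_to_axes` and `draw_scatter` are made up for this example, and the plotting step assumes the third-party matplotlib library is installed.

```python
# Hypothetical rows standing in for the lecture's table (illustrative values only).
rows = [
    {"temperature": 58, "sales": 95},
    {"temperature": 65, "sales": 138},
    {"temperature": 70, "sales": 170},
    {"temperature": 79, "sales": 232},
    {"temperature": 84, "sales": 265},
]


def table_to_axes(rows, x_key, y_key):
    """Split table rows into two parallel lists, one per chart axis."""
    xs = [row[x_key] for row in rows]
    ys = [row[y_key] for row in rows]
    return xs, ys


def draw_scatter(xs, ys, x_label, y_label, path):
    """Save a scatter chart to `path`.

    matplotlib is imported inside the function so the rest of the sketch
    still runs on a machine where it is not installed.
    """
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    fig.savefig(path)
    plt.close(fig)


# Sales on the X axis and Temperature on the Y axis, as in the lecture.
sales, temperatures = table_to_axes(rows, "sales", "temperature")
```

Calling `draw_scatter(sales, temperatures, "Sales ($)", "Temperature (F)", "scatter.png")` would then write the chart to a file.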
Using this data, we can then add a trend line that lets us estimate values where we have gaps in our data. In our example, we could add a line like the following.
This trend line can give us a good prediction of what we could expect our sales to be based on the different temperatures. These trend lines allow us to work with both interpolation and extrapolation values. Let me explain the difference between these two. When we use the trend line inside the boundaries of our existing data set, we can find our interpolation value. For example, when we hit 67 degrees, our estimated sales would be $150.
Now if we were to extend out our linear trend line beyond our last data point, we could estimate our sales based on Extrapolation values. These are values outside of the existing data set, as you can see here:
Using the Extrapolation values we could expect sales of around $370 if the temperature hit 100 degrees Fahrenheit.
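To make the difference between the two concrete, here is a minimal Python sketch of a linear trend line fitted by ordinary least squares. The lecture does not say how its trend line was calculated, so least squares is an assumption here, and the readings below are hypothetical, so the estimates will not match the $150 and $370 quoted above. Temperature is treated as the input and sales as the value being estimated.

```python
def fit_trend_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Read a value off the trend line at x."""
    return slope * x + intercept


# Hypothetical readings (illustrative only, not the lecture's table).
temperatures = [58, 62, 65, 70, 74, 79, 84]
sales = [95, 120, 138, 170, 200, 232, 265]

slope, intercept = fit_trend_line(temperatures, sales)

# 67 lies inside the observed temperature range: interpolation.
interpolated = predict(slope, intercept, 67)

# 100 lies beyond the last data point: extrapolation.
extrapolated = predict(slope, intercept, 100)

print(f"Estimated sales at 67 degrees: ${interpolated:.0f}")
print(f"Estimated sales at 100 degrees: ${extrapolated:.0f}")
```

The same `predict` call serves both cases. The only difference is whether the temperature falls inside or outside the range of the recorded data, and extrapolated estimates are generally less reliable the further they sit from that range.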
Let’s now take a look at Bubble Charts.
These are very closely related to scatter plots or scatter charts. However, instead of looking at the relationship between 2 data sets, which in our example was Temperature and Sales, a 3rd data set is introduced and is represented by the size of each data point, or ‘bubble’, on the chart.
So let’s take a look at the 3rd set of data in our table, which is identified as ‘New Customers.’
The New Customers column records how many new customers the ice-cream stall served on each given day.
Now when we use this data in a bubble chart we again have our X and Y axis of Sales and Temperature, but the ‘New Customers’ value is represented by the size of the bubble as you can see here:
So what can we deduce from this type of bubble chart? We can see a clear relationship between the number of new customers and the increase in temperature, which in turn relates to the quantity of sales. So, the hotter it is, the more likely it is that the ice-cream stall will serve new customers.
So the bubble size adds a 3rd dimension to the existing scatter chart.
There are a couple of points to bear in mind when creating your bubble charts:
The first is to add a level of transparency to the bubbles in your visual. Without transparency, smaller bubbles could be hidden behind larger ones, and this could skew your interpretation of your data relationships.
The second is to scale your bubble size. Many applications allow you to scale your bubble size down if the points overlap too much, which makes more of your bubbles easier to see.
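Both points can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `bubble_areas` and `draw_bubble_chart` are hypothetical, the choice to map the largest value to a maximum marker area is one of several reasonable scaling schemes, and the plotting step assumes the third-party matplotlib library, where `s` sets the marker area and `alpha` sets the transparency.

```python
def bubble_areas(values, max_area=600.0):
    """Scale values so the largest one is drawn with area `max_area`.

    Lowering `max_area` shrinks every bubble by the same factor, which is
    one way to reduce overlap between neighbouring points.
    """
    largest = max(values)
    if largest <= 0:
        raise ValueError("at least one value must be positive")
    return [max_area * value / largest for value in values]


def draw_bubble_chart(sales, temperatures, new_customers, path,
                      max_area=600.0, alpha=0.5):
    """Save a bubble chart to `path`.

    matplotlib is imported inside the function so `bubble_areas` can be
    used on its own without it installed.
    """
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(
        sales,
        temperatures,
        s=bubble_areas(new_customers, max_area),  # 3rd data set as bubble size
        alpha=alpha,  # partial transparency keeps hidden bubbles visible
    )
    ax.set_xlabel("Sales ($)")
    ax.set_ylabel("Temperature (F)")
    fig.savefig(path)
    plt.close(fig)
```

With hypothetical data, a call such as `draw_bubble_chart([95, 170, 265], [58, 70, 84], [3, 8, 15], "bubbles.png", max_area=300, alpha=0.4)` would draw smaller, more transparent bubbles than the defaults.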
In the next lecture I shall be discussing how to visualize your data when you want to perform comparisons on data sets, so let’s take a look!
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.