Features from Pixels
Convolutional Neural Networks
In this course, discover convolutions and the convolutional neural networks used in Data and Machine Learning. The course introduces the concept of a tensor, which is essential for everything that follows.
Learn to apply these techniques to the right kind of data, such as images. Images store their information in pixels, but you will discover that it is not the value of each individual pixel that matters.
- Understand how convolutional neural networks are essential to the fundamentals of Data and Machine Learning.
- It is recommended to complete the Introduction to Data and Machine Learning course before starting.
Hello and welcome to this video on Features from Pixels. In this video, we will talk about images and how to use them as inputs for a deep neural net. We will also introduce a very famous dataset of handwritten digits. The question is, how does a computer see an image? Let's start with a black and white image. A black and white image can be represented as a grid of points, and each point has a binary value. These points on the grid are called pixels, and in a black and white image they can only carry two possible values, 0 and 1. To extend this to a grayscale image, we allow the pixels to carry values that are intermediate between 0 and 1. Actually, since we do not really care about the infinitely many possible shades of gray, we normally use unsigned integers with eight bits.
These are the integers from 0 to 255. So a 10 by 10 grayscale image with eight bits of resolution can be imagined as a grid of numbers, each of which is an integer between 0 and 255. This two-dimensional array with 100 numbers corresponds to one data point in our dataset. So the question is, how can we train a machine learning algorithm on such data? The MNIST dataset is a very famous dataset of handwritten digits, and it has become a benchmark for image recognition. It consists of 70,000 images of 28 pixels by 28 pixels, each representing a handwritten digit. If you are at the post office, you'd like to have a model that recognizes the digits from the image to automatically route mail using the zip code. So the target labels for this particular problem are the 10 digits from 0 to 9.
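The grid-of-numbers picture described above can be sketched in a few lines of NumPy. This is only an illustration: the pixel values are random stand-ins rather than a real photograph, and the names `gray` and `binary` and the threshold of 128 are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical 10x10 grayscale image: random values stand in for real pixel data.
rng = np.random.default_rng(seed=0)
gray = rng.integers(low=0, high=256, size=(10, 10), dtype=np.uint8)

# Thresholding at the midpoint (an arbitrary choice) gives a black and white
# version in which every pixel is either 0 or 1.
binary = (gray >= 128).astype(np.uint8)

print(gray.shape, gray.dtype)  # a 10 by 10 grid of 8-bit unsigned integers
print(gray.size)               # 100 numbers make up one data point
print(np.unique(binary))       # the distinct values in the binary image
```

The `uint8` type is what enforces the 0 to 255 range: it stores exactly eight bits per pixel.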
Each data point in the dataset is itself a two-dimensional table, and so we need to decide how to map it to features. The simplest way is to use each pixel as an individual feature. If we do this, the feature space has a dimension of 28 times 28, which means we have 784 features. Each feature is an integer between 0 and 255, or a float between 0 and 1 if we rescale each pixel. Reading each pixel in order yields a long sequence of numbers that we will use as a feature vector for our model. So we have unrolled our image into a very long vector of 784 numbers, which are the features that we will send as inputs to our machine learning model. In conclusion, in this video, we've introduced how to unroll images into a long vector and use pixels as input features for a machine learning model. And we've also introduced a famous dataset, which is called MNIST. Thank you for watching and see you in the next video.
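As a rough sketch of the unrolling step described in this video, the snippet below flattens a small batch of 28 by 28 images into 784-long feature vectors and rescales them to floats between 0 and 1. The batch is random data standing in for MNIST, and the helper name `unroll` is hypothetical, not something from the course.

```python
import numpy as np

# Random stand-in for 5 MNIST-like images (assumption: real data would be
# loaded from the MNIST dataset instead).
rng = np.random.default_rng(seed=0)
images = rng.integers(low=0, high=256, size=(5, 28, 28), dtype=np.uint8)


def unroll(batch):
    """Flatten each image row by row and rescale pixels to the range [0, 1]."""
    n_images = batch.shape[0]
    flat = batch.reshape(n_images, -1)       # (n, 28, 28) -> (n, 784)
    return flat.astype(np.float32) / 255.0   # integers 0..255 -> floats 0..1


features = unroll(images)
print(features.shape)  # (5, 784)
```

Because `reshape` reads the pixels in row order, feature number 28 of an image is the first pixel of its second row.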
About the Author
I am a Data Science consultant and trainer. With Catalit, I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends, I train people in machine learning, deep learning, and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI and graduated from the Singularity University summer program in 2011.