Beyond Pixels
1h 19m

In this course, you will discover convolutions and the convolutional neural networks involved in Data and Machine Learning. The course begins by introducing the concept of a tensor, which is essential for everything that follows.

Learn to apply these networks to the right kind of data, such as images. Images store their information in pixels, but you will discover that it is not the value of each individual pixel that matters.

Learning Objectives

  • Understand how convolutional neural networks are essential to the fundamentals of Data and Machine Learning.

Hello and welcome to this video on Beyond Pixels. In this video, we will talk more generally about feature extraction and introduce the concept of local patterns for image recognition. In the previous video, we trained a model to recognize handwritten digits by feeding it the raw pixel values of the image as one long sequence. The model performed pretty well on the training data but had some trouble generalizing to the test set. Is there a better way to proceed when dealing with images? Going from an image to a vector of pixels is just a simple case of feature extraction from an image.
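To make the "image to vector of pixels" step concrete, here is a minimal sketch using NumPy. The 28x28 image size and the random pixel values are assumptions made for illustration; the actual dataset from the previous video is not loaded here.

```python
import numpy as np

# Hypothetical 28x28 grayscale image standing in for a handwritten digit.
# The size and the random values are assumptions, not the course's data.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Feature extraction in its simplest form: unroll the grid row by row
# into one long vector and scale the values to the range [0, 1].
features = image.reshape(-1).astype(np.float32) / 255.0

print(image.shape)     # (28, 28)
print(features.shape)  # (784,)
```

Note that after flattening, the pixel directly below another one ends up 28 positions away in the vector, so the model no longer sees which pixels were neighbors.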

There are many other ways to extract features from an image, including Fourier transforms, wavelets, histograms, and many others. These are all methods that take an image as input and return a vector of numbers that we can feed to our model. The banknotes dataset we used in the previous chapter is an example of features extracted from images, so if you're curious, go read its documentation to see which features were used. Although really powerful, these methods require very deep domain knowledge, and each was developed to solve a specific problem in image recognition.
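As a rough illustration of two of the methods just mentioned, the sketch below turns an image into a feature vector with an intensity histogram and with a few Fourier coefficients. The function names, the number of bins, and the number of coefficients kept are all hypothetical choices, and the image is random placeholder data.

```python
import numpy as np

def histogram_features(image, bins=16):
    """Normalized intensity histogram: keeps how bright pixels are, not where they are."""
    counts, _ = np.histogram(image, bins=bins, range=(0, 256))
    return counts / counts.sum()

def fourier_features(image, size=4):
    """Magnitudes of a small block of low-frequency 2D Fourier coefficients."""
    spectrum = np.abs(np.fft.fft2(image))
    return spectrum[:size, :size].ravel()

# Placeholder image (an assumption; any 2D array of values in 0..255 works).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28)).astype(np.float64)

hist_vec = histogram_features(image)
fourier_vec = fourier_features(image)

print(hist_vec.shape)     # (16,)
print(fourier_vec.shape)  # (16,)
```

Both functions take an image as input and return a vector of numbers, which is all the model needs. Choosing the bins or the frequencies well is where the domain knowledge comes in.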

It would be great if we could avoid using these special methods and just learn the best features from the image problem itself. Let's consider an image in more detail. What makes an image different from a vector is that the values of the pixels are correlated both horizontally and vertically. It's the two-dimensional pattern that carries the information about what's represented in the image. These two-dimensional patterns, for example horizontal and vertical contrast lines, are specific to an image or to a set of images. So it would be great if we had a technique that could capture these patterns automatically.
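A small sketch can show why the two-dimensional arrangement matters more than the individual values. The toy image and the neighbor-difference measure below are assumptions chosen for illustration, not a method from the course.

```python
import numpy as np

# Toy 6x6 image: dark left half, bright right half (one vertical contrast line).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Differences between neighboring pixels along each direction.
horizontal_change = np.abs(np.diff(image, axis=1))  # left-to-right neighbors
vertical_change = np.abs(np.diff(image, axis=0))    # top-to-bottom neighbors

print(horizontal_change.sum())  # 6.0 (one jump in each of the 6 rows)
print(vertical_change.sum())    # 0.0 (no change when moving down a column)

# Shuffling keeps exactly the same pixel values but destroys their arrangement.
rng = np.random.default_rng(0)
shuffled = rng.permutation(image.ravel()).reshape(image.shape)
same_values = np.array_equal(np.sort(shuffled.ravel()), np.sort(image.ravel()))
print(same_values)  # True
```

The original and the shuffled image contain the same 36 values, yet only the original has a clean vertical contrast line. The information is in how neighboring pixels relate to each other.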

Also, if all we care about is recognizing an object, we should be insensitive to the position of the object in the image. Our features should rely much more on the local patterns of pixels arranged in the form and shape of the object than on the position of that group of pixels in the grid that represents the image. The mathematical operation that allows us to look for local patterns is called a convolution, but before we learn about it, we have to learn about tensors, and that's what the next lecture is going to be about. In conclusion, in this video we talked about feature extraction and how these methods require strong domain knowledge. We also mentioned convolutions as a way to automatically detect local patterns. Thank you for watching, and see you in the next video.
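The idea of position-insensitive local patterns can be sketched as follows. This is only an illustration in the spirit of the convolution mentioned above: the sliding-window function, the cross-shaped object, and the 8x8 image size are all hypothetical, and the course's own definition of convolution comes in later lectures.

```python
import numpy as np

def local_responses(image, pattern):
    """Slide `pattern` over `image`; each output is the sum of products for one patch."""
    ph, pw = pattern.shape
    rows = image.shape[0] - ph + 1
    cols = image.shape[1] - pw + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(image[i:i + ph, j:j + pw] * pattern)
    return out

def best_position(responses):
    """Row and column of the strongest response, as plain Python ints."""
    idx = np.unravel_index(np.argmax(responses), responses.shape)
    return tuple(int(k) for k in idx)

# Hypothetical object: a small cross. The pattern rewards pixels on the
# cross (+1) and penalizes pixels off it (-1).
cross = np.array([[0, 1, 0],
                  [1, 1, 1],
                  [0, 1, 0]], dtype=float)
pattern = 2 * cross - 1

def place(top, left):
    """Put the cross on an empty 8x8 image with its corner at (top, left)."""
    img = np.zeros((8, 8))
    img[top:top + 3, left:left + 3] = cross
    return img

a = place(1, 1)
b = place(4, 5)

print(np.array_equal(a.ravel(), b.ravel()))  # False
resp_a = local_responses(a, pattern)
resp_b = local_responses(b, pattern)
print(resp_a.max(), resp_b.max())            # 5.0 5.0
print(best_position(resp_a))                 # (1, 1)
print(best_position(resp_b))                 # (4, 5)
```

The two flattened pixel vectors are different, but the strongest local response is the same for both images. Only the location of that response changes, which is the kind of behavior we want from a feature meant to recognize an object wherever it appears.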

About the Author

I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at University of Padua and Université de Paris VI and graduated from Singularity University summer program of 2011.