Images as Tensors
Convolutional Neural Networks
In this course, discover convolutions and the convolutional neural networks involved in Data and Machine Learning. The course introduces the concept of a tensor, which is essential for everything that follows.
Learn to apply these networks to the right kind of data, such as images. Images store their information in pixels, but you will discover that it is not the value of each pixel that matters.
- Understand how convolutional neural networks are essential to the fundamentals of Data and Machine Learning.
- It is recommended to complete the Introduction to Data and Machine Learning course before starting.
Hello and welcome to this video on images as tensors. In this video we will introduce tensors and explain how images can be thought of as tensors. Let's start with scalars. Scalars are just numbers, the everyday numbers we are used to like zero, one, 23, and also fractions and irrational numbers like two-thirds, pi, et cetera. These numbers have no dimensions. Then there are vectors. Vectors can be thought of as lists of numbers. The number of elements in a vector is called the vector length, and sometimes the number of dimensions. So, for example, a list of three numbers is a three-dimensional vector. You may have seen these if you've studied physics or engineering. The numbers in the list are the coordinates of a point in a space with as many dimensions as there are entries in the list. The vector zero, one, zero, for example, represents a point in a three-dimensional space. Matrices are tables of numbers arranged in rows and columns.
For example, a grayscale image, as we saw, is a two-dimensional matrix where each entry in the matrix corresponds to the value of a pixel. Notice also that a matrix can be thought of as a list of vectors of the same length. In the case of the image, each vector would represent either a row or a column. In the same way, a vector can be thought of as a list of scalars, which are numbers. This recursive construction, where an object of a certain order can be thought of as a list of objects of the previous order, allows us to organize scalars, vectors, and matrices in the larger family of tensors. Tensors can be understood as nested lists of objects of the previous order, all with the same size. For example, an order three tensor can be thought of as a list of matrices, all of which have the same number of rows and columns. These matrices are tensors of order two, and since they all have the same number of rows and columns, the tensor of order three is like a cuboid of numbers and we can find numbers by going along any of the three axes. Each number is identified by the row, the column, and the depth at which it's stored. We can formalize this idea in the concept of shape.
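The recursive construction described here can be sketched with plain Python nested lists. This is only an illustration: the helper name `order` and the sample values are assumptions made for the example, and the helper assumes non-empty lists whose elements all have the same size.

```python
def order(t):
    """Count the nesting levels of a nested list (hypothetical helper).

    Assumes every list is non-empty and all elements at a level
    have the same size, as required for a tensor.
    """
    levels = 0
    while isinstance(t, list):
        levels += 1
        t = t[0]  # step into the first element of this level
    return levels


scalar = 7                        # order 0: a plain number
vector = [0, 1, 0]                # order 1: a list of scalars
matrix = [[1, 2, 3],
          [4, 5, 6]]              # order 2: a list of vectors of equal length
tensor3 = [matrix,
           [[7, 8, 9],
            [10, 11, 12]]]        # order 3: a list of matrices of equal size

orders = [order(x) for x in (scalar, vector, matrix, tensor3)]
print(orders)  # [0, 1, 2, 3]
```

Each element of `tensor3` is itself an order two tensor, which is exactly the "list of objects of the previous order" idea.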
The shape of a tensor tells us how many objects there are when counting along a particular axis. For example, a vector has only one axis, so its shape is a single number indicating the length of the vector. A matrix has two axes, indicating the number of rows and the number of columns. An order three tensor has three axes, indicating the number of rows, the number of columns, and the depth. Notice, for example, that in this case here we have an order three tensor of shape two, two, four, so we can think of this as a list of two matrices that are two by four, but also as a list of four matrices that are two by two if we go along a different axis.
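As a rough sketch of the shape idea, the snippet below builds an order three tensor of shape two, two, four from nested lists and reads it along two different axes. The helper `shape` and the fill values 0 to 15 are assumptions chosen for the example; the helper again assumes non-empty, regularly sized lists.

```python
def shape(t):
    """Return the number of elements along each axis (hypothetical helper)."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


# An order three tensor of shape (2, 2, 4), filled with the numbers 0..15.
t = [[[i * 8 + j * 4 + k for k in range(4)] for j in range(2)] for i in range(2)]

# Going along the first axis: a list of two matrices, each 2 by 4.
first_axis = [t[i] for i in range(2)]

# Going along the last axis: a list of four matrices, each 2 by 2.
last_axis = [[[t[i][j][k] for j in range(2)] for i in range(2)] for k in range(4)]

print(shape(t))                             # (2, 2, 4)
print(len(first_axis), shape(first_axis[0]))  # 2 (2, 4)
print(len(last_axis), shape(last_axis[0]))    # 4 (2, 2)
```

Array libraries typically store this information directly on the array object, so a helper like this is only needed for plain nested lists.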
Let's talk about color images. A color image is actually a set of grayscale images, each corresponding to a primary color channel. In the case of RGB we have three channels, red, green, and blue, and each of these channels contains the pixels of the image in that particular color. This can be represented by an order three tensor, and there are two major ordering conventions. If we think of the image as a list of three single-color images like the one in the figure, then the axis order will be channels first, then height, and then width. On the other hand, we can also think of this image tensor as an order two grid of pixel vectors, where each pixel contains three numbers and each of these three numbers represents a color. In this case the axis order is height, then width, then channels. This is called channels last and it's the convention we'll use in the rest of the course.
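To make the two conventions concrete, here is a minimal sketch that converts a tiny 2 by 2 RGB image from the channels-first layout to the channels-last layout using nested lists. The pixel values and the function name `to_channels_last` are made up for illustration.

```python
# A tiny 2x2 RGB image in channels-first layout: shape (3, height, width).
red   = [[255, 0],
         [0,   0]]
green = [[0, 255],
         [0,   0]]
blue  = [[0,   0],
         [255, 0]]
channels_first = [red, green, blue]


def to_channels_last(img):
    """Reorder (channels, height, width) into (height, width, channels)."""
    channels = len(img)
    height = len(img[0])
    width = len(img[0][0])
    return [[[img[c][y][x] for c in range(channels)]
             for x in range(width)]
            for y in range(height)]


channels_last = to_channels_last(channels_first)

# Each pixel is now a vector of three numbers: red, green, blue.
print(channels_last[0][0])  # [255, 0, 0]  top-left pixel is pure red
print(channels_last[0][1])  # [0, 255, 0]  top-right pixel is pure green
print(channels_last[1][0])  # [0, 0, 255]  bottom-left pixel is pure blue
```

Both layouts hold the same twelve numbers; only the order of the axes, and therefore the order of the indices used to reach a value, changes.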
This powerful representation will allow us to use convolutional networks to recognize objects in images. In conclusion, in this video we introduced the concept of tensors and saw that we can represent color images as order three tensors. Thank you for watching and see you in the next video.
About the Author
I am a Data Science consultant and trainer. With Catalit I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends I train people in machine learning, deep learning and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator and I was Chief Data Officer and co-founder at Spire, a Y-Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics at the University of Padua and Université de Paris VI and graduated from the Singularity University summer program of 2011.