This course introduces you to PyTorch and focuses on two main concepts: PyTorch tensors and the autograd module. We are going to get our hands dirty throughout the course, using a demo environment to explore the methodologies covered. We’ll look at the pros and cons of each method, and when they should be used.
- Create a tensor in PyTorch
- Understand when to use the autograd attribute
- Create a dataset in PyTorch
- Understand what backpropagation is and why it is important
This course is intended for anyone interested in machine learning, and especially for data scientists and data engineers.
To follow along with this course, you should have PyTorch version 1.5 or later.
The Python scripts used in this course can be found in the GitHub repo here: https://github.com/cloudacademy/ca-pytorch-101
In this lecture, we are going to continue discussing tensors, but in particular, here we will focus on two main concepts: first we will look at how to create tensors starting from a NumPy array. And second, we will start performing operations between tensors.
So the nice thing about Torch tensors is that they are flexible and can live on GPUs. We can even create Torch tensors from NumPy arrays. This is pretty useful, since it lets us build tensors starting from existing NumPy outputs.
So, we import NumPy as np and I also create the variable NumPy_data, which is nothing more than the call of the NumPy random dot rand method with 2 rows and 3 columns. Remember that this array lives only on the CPU.
Now, we create a torch tensor from a NumPy array using the method from_numpy. We just need to pass the NumPy_data to the from_numpy method, and we assign the result of this call to the variable torch_NumPy!
You can easily see that we have a 2 by 3 tensor made of float64 values. Please note that torch is still in my memory, and I do not need to import it again. In other cases, just import torch as we did in Lecture 2.
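As a minimal sketch of the steps just described, the conversion could look like the snippet below. The variable names are written in lowercase here, which is an assumption about how they are spelled in the course scripts.

```python
import numpy as np
import torch

# A 2 x 3 NumPy array of random values; it lives in CPU memory.
numpy_data = np.random.rand(2, 3)

# from_numpy wraps the array as a tensor and keeps the NumPy dtype (float64).
torch_numpy = torch.from_numpy(numpy_data)

print(torch_numpy.shape)
print(torch_numpy.dtype)
```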
This is fine but please note that more recent versions of PyTorch allow you to pass NumPy data directly inside the torch.tensor function. Assign this object to the variable tensor_NumPy_direct. Same values as above, right?
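Here is a small sketch comparing the two ways of building the tensor, again with lowercase variable names as an assumption. The values match, but there is one difference worth knowing: torch.tensor copies the data, while from_numpy shares memory with the original array.

```python
import numpy as np
import torch

numpy_data = np.random.rand(2, 3)

# Two ways to build a tensor from the same array.
tensor_numpy_direct = torch.tensor(numpy_data)  # copies the data
torch_numpy = torch.from_numpy(numpy_data)      # shares memory with the array

# Both tensors hold the same values at this point.
same_values = torch.equal(tensor_numpy_direct, torch_numpy)
print(same_values)

# Changing the array afterwards only affects the tensor that shares its memory.
numpy_data[0, 0] = -1.0
print(torch_numpy[0, 0].item())
print(tensor_numpy_direct[0, 0].item())
```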
Consider now the case in which you need to work with GPUs - for example, to train a complex Machine Learning model - but then you need to go back to the CPU to perform ex-post analysis - because on your local machine you do not have any GPUs.
You can move the tensor that was on the GPU to a simple NumPy array with just a single line of code. You need to call the tensor - in our case tensor_NumPy_direct - and then you apply the cpu method followed by the numpy method. In this way, you are going to get back a NumPy array, which is entirely on CPU memory, and therefore you can play with it without the need for a GPU.
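A rough sketch of this round trip is shown below. The device selection is an addition for illustration, so that the snippet also runs on a machine without a GPU; in that case the cpu call simply returns the tensor unchanged.

```python
import numpy as np
import torch

# Use the GPU if one is available, otherwise stay on the CPU (illustrative assumption).
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor_numpy_direct = torch.tensor(np.random.rand(2, 3)).to(device)

# Move the tensor to CPU memory first, then convert it to a NumPy array.
back_on_cpu = tensor_numpy_direct.cpu().numpy()

print(type(back_on_cpu))
```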
As I said, in this lecture, the objective is twofold: we have already seen how to create tensors from a NumPy array. But now I want to go further and show you how to perform operations between tensors. Being able to perform operations between tensors is absolutely crucial. We’ll first investigate the possibility to concatenate two or more tensors.
Let me proceed as follows: I am going to create another NumPy array like the one you see here - so I just copy it and paste below here - and let me check it: it is going to be different from the previous one since it is a random generation.
Now, I am going to create a new tensor using the NumPy data, and we assign it to the variable my_tns2. For simplicity, I am also going to rename the torch_NumPy tensor to my_tns1 - it is easier this way to perform operations between the two tensors.
To concatenate two or more tensors, we can use the cat method: this method is similar to the pandas concat method. You pass a list of objects you wish to concatenate - in our case, my_tns1 and my_tns2 - and also, inside the function, you need to specify the dimension. In our case, this is set to be equal to zero, meaning that we are concatenating with respect to the row dimension. If we run this, we get a 4 by 3 matrix, as expected.
We can obviously also concatenate with respect to the column dimension by setting dim equal to 1 - and in this way, the two tensors are placed side by side, i.e. we are going to get a 2 by 6 matrix. Makes sense, right?
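The two concatenations can be sketched as follows, assuming both tensors are 2 by 3 and built from random NumPy arrays as in the lecture.

```python
import numpy as np
import torch

my_tns1 = torch.from_numpy(np.random.rand(2, 3))
my_tns2 = torch.from_numpy(np.random.rand(2, 3))

# dim=0 stacks the tensors on top of each other (more rows).
by_rows = torch.cat([my_tns1, my_tns2], dim=0)

# dim=1 places the tensors side by side (more columns).
by_cols = torch.cat([my_tns1, my_tns2], dim=1)

print(by_rows.shape)
print(by_cols.shape)
```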
Another useful operation we can perform on a tensor is the following. We can clip the tensor if we believe it is better to apply a lower and an upper bound to the generated data. This makes a lot of sense if we believe there is an upper or lower bound in the true data distribution. Think about the temperature observed in a precise location. It makes sense to think that that temperature is bounded by certain values.
You can therefore transform the data using the clip method on the tensor, and you specify two elements: the lower bound and the upper bound. You can see that the above values have been bounded by this clip method consistently.
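As a quick sketch, clipping could look like this. The bounds 0.3 and 0.7 are hypothetical values picked for illustration, since the transcript does not state the ones used in the demo.

```python
import numpy as np
import torch

my_tns1 = torch.from_numpy(np.random.rand(2, 3))

# Hypothetical bounds: values below 0.3 become 0.3, values above 0.7 become 0.7.
# clip is an alias of clamp, which is the older name for the same operation.
clipped = my_tns1.clip(0.3, 0.7)

print(my_tns1)
print(clipped)
```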
Now let us go back to operations between tensors. Let’s investigate two main operations: element-wise and matrix multiplication. Let’s start with element-wise multiplication. The element-wise multiplication of two matrices is also called the Hadamard product, and is done with the method mul. You just need to apply the mul method to my_tns1 and you pass the tensor you wish to multiply - in our case my_tns2.
Since it is an element-wise multiplication, we got a tensor with the same shape as the starting tensors, where each element is the product of the two tensors’ values.
You can easily replicate this operation using the star operator between the two tensors, as follows, and you get the same result.
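The two equivalent ways of computing the element-wise product can be sketched as below, assuming the same 2 by 3 random tensors as before.

```python
import numpy as np
import torch

my_tns1 = torch.from_numpy(np.random.rand(2, 3))
my_tns2 = torch.from_numpy(np.random.rand(2, 3))

# Element-wise (Hadamard) product with the mul method.
product_mul = my_tns1.mul(my_tns2)

# The star operator gives the same result.
product_star = my_tns1 * my_tns2

print(product_mul.shape)
print(torch.equal(product_mul, product_star))
```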
Instead, matrix multiplication is done using the matmul method. The logic is the same: this method is applied to a tensor and we pass the other tensor. Let’s try to run it. We get an error: it says that mat1 and mat2 cannot be multiplied. That is because both tensors are 2 by 3, and matrix multiplication requires the number of columns of the first matrix to match the number of rows of the second. Hence we need to transpose the second matrix, so that we multiply a 2 by 3 matrix by a 3 by 2 matrix. Now you get the correct output made of a 2 by 2 matrix. The same result can also be obtained by using the @ operator between the two tensors.
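A minimal sketch of this sequence, including the failing call, might look like the following. The try/except wrapper is only there so the snippet runs from top to bottom.

```python
import numpy as np
import torch

my_tns1 = torch.from_numpy(np.random.rand(2, 3))
my_tns2 = torch.from_numpy(np.random.rand(2, 3))

# Multiplying a 2 x 3 matrix by another 2 x 3 matrix fails: inner dimensions differ.
try:
    my_tns1.matmul(my_tns2)
    shape_error = False
except RuntimeError as err:
    shape_error = True
    print(err)

# Transposing the second tensor gives (2 x 3) times (3 x 2), i.e. a 2 x 2 result.
result_matmul = my_tns1.matmul(my_tns2.t())

# The @ operator is equivalent to matmul.
result_at = my_tns1 @ my_tns2.t()

print(result_matmul.shape)
```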
Now suppose you have a scalar tensor - a tensor made of just one element - you can convert it to a Python value using the item method.
For example, let’s create the variable my_sum containing the sum of my_tns2. We can get this using the sum method on the tensor. If we inspect this variable, we can easily see that it is a scalar. As I said, we can convert it to a Python numerical value using item. If you then apply the type to this element, you see it is not a tensor anymore but it’s a float! I recommend this operation if, for some reason, you don’t want to work with tensors.
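In code, the conversion from a scalar tensor to a plain Python number can be sketched like this, assuming my_tns2 is the 2 by 3 random tensor from earlier.

```python
import numpy as np
import torch

my_tns2 = torch.from_numpy(np.random.rand(2, 3))

# sum returns a zero-dimensional (scalar) tensor.
my_sum = my_tns2.sum()
print(my_sum)

# item converts the scalar tensor to a plain Python number.
my_value = my_sum.item()
print(type(my_value))
```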
The nice thing about PyTorch is that we can create a grid of integers using the arange method, and we can specify the number of elements we wish to have - in this case, twenty.
So why is this useful? Well, once we define this new tensor, we can actually create a matrix from it using the reshape method. Reshape is pretty useful and requires you to specify the number of rows and columns you wish to have as a result of the reshape operation - in our case 5 rows and 4 columns. We assign this object to the variable matrix A. A simple inspection confirms our reshape operation.
Now, we can clone the matrix we just created with the clone method, and store this new result into the variable matrix B. This is obtained by calling the clone method on matrix A. And it’s as simple as that!
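The arange, reshape, and clone steps can be sketched as below. The names matrix_a and matrix_b are assumed spellings of the "matrix A" and "matrix B" variables mentioned in the lecture.

```python
import torch

# Twenty integers, from 0 to 19.
grid = torch.arange(20)

# Rearrange them into 5 rows and 4 columns.
matrix_a = grid.reshape(5, 4)

# clone creates an independent copy with the same values.
matrix_b = matrix_a.clone()

# Changing the copy leaves the original untouched.
matrix_b[0, 0] = 100

print(matrix_a)
print(matrix_b)
```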
Now, we can sum the two matrices, say matrix B plus matrix A. We can compute the mean using the mean method. But please note that we get an error here coming from the application of the mean method on matrix B. Why is that? Well, that’s because in PyTorch you can’t compute the mean of integer types.
Hence we need to use a floating-point type instead. So we go back to the definition of matrix A, and inside arange, we specify the dtype as equal to torch dot float32. This way we are forcing arange to generate a grid made of floats.
Let’s re-run the previous cells, and now if we run the mean method on matrix B, everything works correctly.
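Here is a sketch of the problem and the fix, with matrix_a and matrix_b as assumed variable names. The try/except wrapper is added only so that the failing call does not stop the script.

```python
import torch

# With the default integer dtype, mean raises an error.
int_matrix = torch.arange(20).reshape(5, 4)
try:
    int_matrix.mean()
    mean_failed = False
except RuntimeError as err:
    mean_failed = True
    print(err)

# Forcing a float dtype in arange fixes the problem.
matrix_a = torch.arange(20, dtype=torch.float32).reshape(5, 4)
matrix_b = matrix_a.clone()

# Element-wise sum of the two matrices.
total = matrix_b + matrix_a

# Mean over all twenty values.
overall_mean = matrix_b.mean()
print(overall_mean)
```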
By default, the mean is computed on all values, but you can specify the dimension as well; you can compute the mean by column by specifying dimension equal to zero.
The same result can be obtained by calling sum with the argument axis equal to zero, and dividing it by the number of rows of the tensor - easily obtainable by using the matrix B shape in position zero.
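Both ways of getting the column means can be sketched as follows, assuming matrix_b is the 5 by 4 float matrix built from arange as above.

```python
import torch

matrix_b = torch.arange(20, dtype=torch.float32).reshape(5, 4)

# Mean of each column: dim=0 collapses the row dimension.
col_mean = matrix_b.mean(dim=0)

# Same result by hand: column sums divided by the number of rows.
col_mean_manual = matrix_b.sum(axis=0) / matrix_b.shape[0]

print(col_mean)
print(col_mean_manual)
```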
Ok, that concludes this lecture. In the next one, we will look at another very important concept: the autograd module. See you there!
Andrea is a Data Scientist at Cloud Academy. He is passionate about statistical modeling and machine learning algorithms, especially for solving business tasks.
He holds a PhD in Statistics, and he has published in several peer-reviewed academic journals. He is also the author of the book Applied Machine Learning with Python.