This course introduces you to PyTorch and focuses on two main concepts: PyTorch tensors and the autograd module. We are going to get our hands dirty throughout the course, using a demo environment to explore the methodologies covered. We’ll look at the pros and cons of each method, and when they should be used.
- Create a tensor in PyTorch
- Understand when to use the autograd module
- Create a dataset in PyTorch
- Understand what backpropagation is and why it is important
This course is intended for anyone interested in machine learning, and especially for data scientists and data engineers.
To follow along with this course, you should have PyTorch version 1.5 or later.
The Python scripts used in this course can be found in the GitHub repo here: https://github.com/cloudacademy/ca-pytorch-101
Welcome back. In this lecture, we are going to investigate the concept of a tensor, which is a fundamental building block in PyTorch.
As you can see, I have opened a Google Colab session here on my screen. I recommend you set up the same environment on your side, so that it will be easier for you to follow all the steps I am going to show you in this lecture.
Everything in PyTorch is based on tensor operations. A tensor can have different numbers of dimensions: it can be 0-dimensional, namely a scalar; or 1-dimensional, namely a vector; or 2-dimensional, which is a matrix; or it can have three or more dimensions.
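To make the idea of dimensions concrete, here is a minimal sketch that builds a scalar, a vector, a matrix, and a higher-dimensional tensor, and asks PyTorch how many dimensions each one has. The variable names and the values are illustrative assumptions, not part of the lecture notebook.

```python
import torch

# Illustrative tensors (names and values are assumptions for this sketch)
scalar = torch.tensor(3.14)               # a single number
vector = torch.tensor([1.0, 2.0, 3.0])    # a list of numbers
matrix = torch.tensor([[1, 2], [3, 4]])   # a list of lists
cube = torch.zeros(2, 3, 4)               # three nested levels

# dim() reports the number of dimensions, shape reports the size of each one
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.dim(), tuple(t.shape))
```

In PyTorch's own counting, the scalar reports zero dimensions, the vector one, the matrix two, and the last tensor three.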
Let's have a look at how we can create a tensor in PyTorch. First, we import torch. The nice thing about Google Colab is that it comes with the most popular Python libraries already installed, so you do not need to worry about installing the specific library versions that would typically be listed in a requirements.txt file.
We then create some data as a list of lists, and we store this inside the variable my_data. In particular, we create a 2-by-2 array by creating a list of two lists, each of length two.
A Tensor is a multi-dimensional matrix containing elements of a single data type. In our case, we are going to use the 2 by 2 matrix to create a tensor. So the natural question now is: how can I create a tensor in PyTorch?
Well, a tensor can be easily constructed from a Python list or sequence using the Torch tensor constructor. You just need to pass the data inside the method, and a tensor will be created for you in the backend. The interesting thing is that Torch defines 10 different tensor types with CPU and GPU variants, depending on the data type.
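The steps described so far can be sketched as follows. The variable name my_data comes from the lecture; the actual values inside the list are an assumption, since any 2 by 2 list of integers would do.

```python
import torch

# A 2-by-2 array as a list of two lists, each of length two
# (the values themselves are an assumption for illustration)
my_data = [[1, 2], [3, 4]]

# Pass the data to the torch.tensor constructor
my_tensor = torch.tensor(my_data)

print(my_tensor)
print(my_tensor.shape)
```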
We can access the tensor type using the attribute dtype. In order to see this, let’s create a variable called my_tensor containing the tensor we created before. And then I access the attribute dtype of this tensor, and we have an int64 dtype. And notably, associated with an int64 dtype is a torch LongTensor on CPU. In order to check it, you can use the type method applied on the tensor, and you see that it is indeed a LongTensor.
This type is going to change if you are on GPU. In particular, it’s going to be a torch cuda LongTensor on GPU.
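As a quick check of the dtype and tensor type just discussed, here is a small sketch. It assumes the tensor was built from Python integers, as above; the GPU branch only runs when a CUDA device is actually available.

```python
import torch

my_tensor = torch.tensor([[1, 2], [3, 4]])  # values are an assumption

print(my_tensor.dtype)   # torch.int64
print(my_tensor.type())  # torch.LongTensor

# On a machine with a GPU, the same data reports a CUDA tensor type
if torch.cuda.is_available():
    print(my_tensor.to("cuda").type())  # torch.cuda.LongTensor
```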
There are many other dtypes available, and we will investigate them in this course. We have, however, implicitly introduced an important concept that is worth highlighting here: the concept of tensor memory.
Notably, a tensor is an array or matrix of data that is ultimately allocated on a specific device. Such devices can be of two types, CPU or GPU, depending on the machine. Hence, tensors live in what we call memory devices, and here comes the magic of PyTorch: as opposed to NumPy, where arrays can live only on the CPU, with PyTorch you can specify where the tensor is or where it will be allocated.
How do you do that? Well, that is quite easy: you can choose the memory device with torch.device, but note that by default, tensors are allocated on the CPU, since GPUs are not always available. And you can check the actual device with the attribute device: you see that the tensor is indeed on CPU.
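Here is a minimal sketch of checking and choosing the device. It assumes the default CPU allocation; the names cpu_device and explicit_tensor are hypothetical and only used for illustration.

```python
import torch

my_tensor = torch.tensor([[1, 2], [3, 4]])  # values are an assumption

# By default the tensor is allocated on the CPU
print(my_tensor.device)  # cpu

# A device can also be chosen explicitly at creation time
cpu_device = torch.device("cpu")
explicit_tensor = torch.tensor([[1, 2], [3, 4]], device=cpu_device)
print(explicit_tensor.device)  # cpu
```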
In the last lecture, I said that we would use the colab environment because it has a great advantage: it gives us the possibility to work with GPUs for free. Indeed, it gives us a GPU inside a colab environment, and that is pretty easy to access: you just need to go here to the Runtime Tab, you then scroll down to the Change Runtime Type. You click on it, and a pop-up will appear, and ask you to select the hardware accelerator: you then select GPU.
After clicking, you save and the Colab runtime will automatically restart. So now, let’s run the previous cells again: my tensor is still on CPU, but now a GPU is available, so we can move the tensor to the GPU by applying the to('cuda') method to it. Let’s wait a few seconds. And there we go! We’ve got a tensor which is exactly the same as before, but now the tensor is on cuda, namely on GPU.
But wait a second, what is that zero that comes after cuda? Well, that’s the GPU id. More precisely, a torch Tensor constructed with the cuda device is equivalent to cuda:X, where X is the result of the torch dot cuda dot current_device method. And, indeed, if you run that method on a cell, you see that you get a zero! This is the first available cuda device: note that colab has just one GPU, so you will always have this number equal to zero.
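The move to the GPU can be sketched as below. Since a GPU is not always available, this version falls back to the CPU so that it runs anywhere; the names target and gpu_tensor are assumptions made for this sketch.

```python
import torch

my_tensor = torch.tensor([[1, 2], [3, 4]])  # values are an assumption

# Use the GPU when there is one, otherwise stay on the CPU
target = "cuda" if torch.cuda.is_available() else "cpu"
gpu_tensor = my_tensor.to(target)

# Same data as before, only the device differs
print(gpu_tensor)
print(gpu_tensor.device)

# The index after "cuda:" matches the current CUDA device id
if torch.cuda.is_available():
    print(torch.cuda.current_device())
```

Note that to returns a tensor on the requested device; my_tensor itself stays where it was.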
There are many ways of building a tensor. For instance, you can build a tensor from scratch with either zeros or ones, as follows. In case you need a 2x2 tensor made of all zeros, just pass a tuple made of (2,2) inside the zeros method. Or you can do that with ones to get a tensor with all ones.
Now let’s store a new tensor inside the variable new_tensor. This is the result of the torch dot ones we showed before. We can access the dtype of this new tensor, and this is float32. Remember that associated to a dtype we have a tensor type: we can check the type of this tensor by applying type to new_tensor. And we see that we have a FloatTensor.
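The zeros and ones constructors just described can be sketched as follows. The name zeros_tensor is an assumption; new_tensor is the variable name used in the lecture. The sketch assumes PyTorch's default floating-point dtype has not been changed.

```python
import torch

# A 2x2 tensor of all zeros: pass the shape as a tuple
zeros_tensor = torch.zeros((2, 2))

# A 2x2 tensor of all ones, stored in new_tensor
new_tensor = torch.ones((2, 2))

print(zeros_tensor)
print(new_tensor)
print(new_tensor.dtype)   # torch.float32
print(new_tensor.type())  # torch.FloatTensor
```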
We can even create a new tensor starting from an existing one - in our case the tensor stored in new_tensor - using the new_tensor method. This method requires some data to be passed, say a list of lists.
Let’s take new_tensor and apply the new_tensor method, passing my_data. Not surprisingly, I still get a 2 by 2 matrix, but look at the type now: instead of int64, we have floats! This tensor has inherited the dtype of new_tensor, which we created with ones, so this is a nice feature if we want to inherit properties from existing tensors.
Let’s store this result inside the variable updated_tensor. Then I can access the dtype here, and indeed you see that it is of type float32.
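Here is a short sketch of this step, using the variable names from the lecture. The values in my_data are an assumption.

```python
import torch

my_data = [[1, 2], [3, 4]]        # integer data (values are an assumption)
new_tensor = torch.ones((2, 2))   # float32 tensor of ones

# new_tensor (the method) builds a tensor from my_data while keeping
# the dtype and device of the tensor it is called on
updated_tensor = new_tensor.new_tensor(my_data)

print(updated_tensor)
print(updated_tensor.dtype)  # torch.float32
```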
Maybe you’ve already got the answer to this, but why should I use the new_tensor method?
Well, tensors are arrays that contain data placed in either CPU or GPU memory, and they have some properties, such as the dtype and the device. Sometimes you do not want to change those properties: you just want to inherit them while updating the data. The new_tensor method allows us to reach this goal with a single line of code.
The dtype is still float32, although the new data is made of integers. Is there any way to define a specific dtype when creating zeros or ones? The answer is yes. We just need to specify the dtype inside the call.
I am going to create a new tensor, stored in the variable another_tensor, which is a tensor made of all ones. But now I am going to specify the dtype argument inside the call, and this is torch dot int8 in this case.
If you run the cell and inspect the variable, you get a tensor made of all ones, of integer type (int8). Now, if you apply new_tensor to another_tensor, passing my_data, you get the same dtype as another_tensor.
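A minimal sketch of specifying the dtype at creation time follows. The name from_int8 is hypothetical; another_tensor and my_data follow the lecture, with assumed values.

```python
import torch

my_data = [[1, 2], [3, 4]]  # values are an assumption

# Specify the dtype directly inside the call to ones
another_tensor = torch.ones((2, 2), dtype=torch.int8)
print(another_tensor.dtype)  # torch.int8

# new_tensor keeps the int8 dtype of another_tensor
from_int8 = another_tensor.new_tensor(my_data)
print(from_int8.dtype)  # torch.int8
```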
Another interesting thing about the new_tensor method is that you can also specify the device: we pass the argument device equal to cuda, and we can check that the tensor is on the GPU. Indeed, if we check the device, we see that it is on cuda, and we also have the index, which specifies the GPU id - here we have just one GPU, so the index is always zero.
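The device argument can be sketched like this. To keep the example runnable on machines without a GPU, it falls back to the CPU; the names target and on_device are assumptions for illustration.

```python
import torch

my_data = [[1, 2], [3, 4]]  # values are an assumption
another_tensor = torch.ones((2, 2), dtype=torch.int8)

# Pick cuda when available, otherwise fall back to cpu
target = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The dtype is inherited, the device is overridden explicitly
on_device = another_tensor.new_tensor(my_data, device=target)

print(on_device.device)
print(on_device.dtype)  # torch.int8
```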
Also, another interesting thing is that you can create a new tensor from the one you just created, but with different data and even a different size. To create a tensor with a similar type to another tensor but with a different size, use the tensor.new_* creation operations, such as new_tensor, new_zeros, or new_ones.
So, for instance, we are going to create new data, and we create a list of lists made of three lists - this is going to be a 3 by 2 matrix. Then, I am going to call the new_tensor method on another_tensor, which is a 2 by 2 matrix, passing the new data, that is a 3 by 2 matrix. As a result, we get a 3 by 2 matrix, but with the same properties as the ones observed in another_tensor. It’s nice, isn’t it?
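As a sketch of this step, the snippet below passes 3 by 2 data to a 2 by 2 tensor. The names new_data, bigger_tensor, and zeros_like_type are assumptions, and new_zeros is shown only as one more example of the new_* family; it is not part of the lecture notebook as far as the transcript tells.

```python
import torch

another_tensor = torch.ones((2, 2), dtype=torch.int8)

# Three lists of length two: a 3-by-2 matrix (values are an assumption)
new_data = [[1, 2], [3, 4], [5, 6]]

# The shape comes from the data, the dtype from another_tensor
bigger_tensor = another_tensor.new_tensor(new_data)
print(bigger_tensor.shape)
print(bigger_tensor.dtype)  # torch.int8

# Another new_* operation: zeros of a given size with the inherited dtype
zeros_like_type = another_tensor.new_zeros((3, 2))
print(zeros_like_type.dtype)  # torch.int8
```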
Finally, we can create a tensor with random values using the torch dot rand method, as follows. We create a variable called shape containing the desired shape, in our case 2 by 3. Then, we call the rand method by passing the shape, and we store this inside the my_tensor variable. We can inspect it to check that it is a 2 by 3 matrix of floats. The values are floats because the rand method samples each element uniformly between 0 and 1.
Obviously, you can play with it, but basically, it is pretty much the same thing as generating some random numbers in Python (or with NumPy) and passing them to the tensor constructor. But since we are working with PyTorch, it is better to use this built-in function to generate some toy data.
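The random tensor described above can be sketched as follows, using the variable names from the lecture and the shape stored in the shape variable. The exact values differ on every run, so no output is shown.

```python
import torch

# Desired shape: 2 rows and 3 columns
shape = (2, 3)

# Each element is drawn uniformly from the interval [0, 1)
my_tensor = torch.rand(shape)

print(my_tensor)
print(my_tensor.shape)
print(my_tensor.dtype)
```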
We can also create a new tensor directly from another tensor using the family of torch.*_like operations, as follows. Pick an existing tensor. A like operation retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden. To create a tensor with the same size and similar type as another tensor, use torch.rand_like, passing the tensor. Also, if we add the specification dtype equal to torch.float, this is going to override the datatype of the new tensor. We get a new tensor with the same shape as my_tensor but with the type given by the dtype argument.
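Here is a minimal sketch of the like operations. The lecture overrides the dtype with torch.float; this sketch uses torch.float64 instead, purely as an assumption, so that the override is visible against the float32 default. The names same_props and overridden are hypothetical.

```python
import torch

my_tensor = torch.rand((2, 3))  # float32 by default

# Same shape and dtype as my_tensor, with fresh random values
same_props = torch.rand_like(my_tensor)
print(same_props.shape, same_props.dtype)

# Same shape, but the dtype is explicitly overridden
# (float64 is chosen here only to make the change visible)
overridden = torch.rand_like(my_tensor, dtype=torch.float64)
print(overridden.shape, overridden.dtype)
```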
And that concludes this lecture on tensors, in which we saw how to create tensors in different ways. In the next lecture, we are going to see how to create a tensor using NumPy arrays, and we will check a few operations between tensors.
Andrea is a Data Scientist at Cloud Academy. He is passionate about statistical modeling and machine learning algorithms, especially for solving business tasks.
He holds a PhD in Statistics, and he has published in several peer-reviewed academic journals. He is also the author of the book Applied Machine Learning with Python.