Python - Under the Hood

Continuing on from Practical Data Science with Python, this Course explores a variety of Python features in a practical, hands-on way. It starts by looking at Python functions and you are given a guided walkthrough of a range of functions relating to data science. It then moves on to dictionaries in Python and how to create them. After that, you'll be guided through flow control in Python, loops, and finally, you'll be shown a technical demonstration in Python looking at classes, variables, and stringification.

Learning Objectives

The main objective of this Course is to enhance your knowledge of Python by learning about Python functions, loops, dictionaries, and flow control.

Intended audience

This Course is intended for IT professionals who already have a good knowledge of Python and who want to enhance that knowledge from a data science perspective.



Hello and welcome back. Let's now take a look under the hood of Python. If I print out simple, what I get is this thing here. How in the world does a class know what to do when I ask it to print? It's because there's a default way that a class is told to print itself out. I can define this behavior explicitly. I can define the double underscore str method, __str__, passing in self. And I can say, when you're asked to turn yourself into a string, what you're going to do is return a string saying simple class object.

If I run this so the class now has its __str__ special method, create my simple object, and then print out simple, nothing changes, so printing out the class itself is probably the wrong thing to do. If I instead print out my simple object, I get my string that says simple class object. Whereas before what I got was simple something, and then a big splurge of where it is in memory.
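A minimal sketch of the difference just described. The exact code on screen isn't in the transcript, so the class names Plain and Simple and the variable names here are assumptions for illustration.

```python
class Plain:
    """No __str__ defined, so printing falls back to the default."""
    pass


class Simple:
    """Defines how an instance turns itself into a string."""

    def __str__(self):
        return "simple class object"


plain = Plain()
my_simple_object = Simple()

# Default behavior: the class name plus where the object lives in memory,
# something like <__main__.Plain object at 0x...>
print(plain)

# Custom behavior: print uses the string returned by __str__
print(my_simple_object)
```

The address in the first line of output will differ from run to run; the second line is always the string returned by __str__.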

That's the default behavior of any class I define. I can override it so that I decide what happens when I ask an object to turn itself into a string, which is essentially what I've done here.

So what is printing? Printing is taking an object, asking it to turn itself into a string if it isn't already one, and then passing that string out to the standard output so it displays on a screen somewhere. I've just defined that behavior. I can define other things too, like how my object would add itself to another one of itself. The reason you can add two strings together is that someone has written an add method, __add__, for strings. I can write one for my class that simply defines what to do when it's given another simple. This tells my object what to do if it's asked to add itself to something else. I would return self.value plus another.value, for example.
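As a rough sketch of that description of printing, here is a hypothetical helper, rough_print, that does the two steps by hand. It is only an approximation: the real print also handles multiple arguments, separators, line endings and other output files.

```python
import sys


class Simple:
    def __str__(self):
        return "simple class object"


def rough_print(obj):
    """Hypothetical helper approximating print(obj) for one argument."""
    text = str(obj)                 # ask the object to turn itself into a string
    sys.stdout.write(text + "\n")   # pass that string to standard output
    return text                     # returned only so the result can be inspected


rough_print(Simple())
```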

This is what I could define addition to look like for my simple objects. Take my simple object plus another, which is another simple object. I just have to change that first parameter to self, then create these objects. What I've done is add these two together using the addition operator. All that means is I've written: when you're asked to add yourself, get your value, get the value of the other thing, and then add them. So I've just concatenated two strings together, for example.
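A sketch of that addition method, assuming the class holds a string in an attribute called value. The class-level default and the strings used are made up for the example.

```python
class Simple:
    value = "hello "  # assumed default, shared until an instance overrides it

    def __add__(self, another):
        # Called when Python evaluates: self + another
        return self.value + another.value


my_simple_object = Simple()
another = Simple()
another.value = "world"  # give the second object its own value

combined = my_simple_object + another
print(combined)  # hello world
```

Here the plus sign between two Simple objects ends up concatenating two strings, because that is what __add__ was written to do.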

So all object behavior in Python is just someone writing exactly what to do when asked to do certain things. I can define this for greater than, less than, minus, power of, division, everything. Then there's the initialization method, __init__. Its first parameter is always self, meaning this object, and the next one would be something like, for example, my value.

So I pass that in as another argument: whenever I create a simple object, I want my internal value to be given by my value. Now, whenever I create a simple object, I need to pass in an argument, which would be value one. And I can have another that is value two. Now I can create these. I updated this one anyway, so it's still changed. If I add them together, I get the two values combined, et cetera, et cetera.
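Putting the initialization method together with addition might look like the sketch below. The parameter name my_value follows the transcript; the attribute name value and the two strings are assumptions.

```python
class Simple:
    def __init__(self, my_value):
        # self is the object being created; my_value is what the caller passes in
        self.value = my_value

    def __add__(self, another):
        return self.value + another.value


first = Simple("value one")
second = Simple("value two")

print(first + second)  # value onevalue two

# Creating a Simple with no argument now fails, because __init__ requires one
try:
    Simple()
except TypeError:
    missing_argument_rejected = True
```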

So this is now how I can pass data into an object, this is how I would get a specific string, how to get a specific integer, how I get a specific dictionary. I just define how to pass things in and what to do with the things that I pass in.

One tip for this is that if I put in def and then an underscore and then tab, I get an idea of the sort of things I can define for this.

So I've got __str__ and __sizeof__, and you can implement how an object should calculate its length. You've got less than, and you've got less than or equal to, which just define how objects compare as less than or equal to one another. You've got all of these various special methods, and more beyond that. The online Python documentation has a list of them.
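A sketch of a few of those special methods on the same kind of class, plus one way to list the available names without relying on editor tab completion. The choice to compare objects by their value attribute is an assumption made for the example.

```python
class Simple:
    def __init__(self, my_value):
        self.value = my_value

    def __len__(self):
        # Used by len(obj)
        return len(self.value)

    def __lt__(self, other):
        # Used by obj < other
        return self.value < other.value

    def __le__(self, other):
        # Used by obj <= other
        return self.value <= other.value


short = Simple("ab")
longer = Simple("abcd")

print(len(longer))     # 4
print(short < longer)  # True
print(short <= short)  # True

# Every double underscore name the class has, inherited ones included
special_names = [
    name for name in dir(Simple)
    if name.startswith("__") and name.endswith("__")
]
print(special_names)
```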

So now let's define a class which describes a pen. The class is going to be called Pen. A pen should have a single attribute defined on initialization, called length. And what does that mean? It means we're going to have something coming in here that defines the length of the pen.

It also has an internal variable called page, which is a list. So I could define an internal variable here, self.internal is equal to an empty list, for example. This doesn't need to be passed in; this is me defining it internally. Then there's a method, draw, which is going to append length times a hyphen to the list page. So the variable is called page here. And what does that mean? I'm going to have a method which simply gets the list and appends to it a hyphen times whatever the length of the pen is.

So I hope this example isn't too contrived; I'm trying to keep it based in the real world. And then we have a stringification method, which returns the page as a string. So it would just be str, brackets, and then the name of the list, page. Okay, so let's have a look at this. We want to define a class which describes a pen. Every object just describes something. I want a class called Pen, nice and simple.

What have I said I want a pen to have? It should have a single attribute defined on initialization, called length. So this tells me that a pen should have, when I create it, a single value called length. And I define this on instantiation, or initialization. So I would say self.length is going to be given by the value being passed in, which is length. But I could equally call this self.i_length for my internal length value.

So, what we're writing here is that whenever I create a pen, it's going to be given a single value, and that value is the length of this pen, aka the number of hyphens it should draw when it adds to the list. Next is an internal variable called page, which is a list.

Now, this one I haven't said anything about yet. It's defined on initialization, which means I want it to be equal to just an empty list when I create the pen. It is a list; I'm creating a list ready to be drawn on, for example. Then I want a method draw. A method simply means a new function definition, which is going to be called draw. It always needs to have self, but thankfully it doesn't need anything else. All this is going to do is take the page and append to it the value of length, self.length, times the hyphen.

So, self tells me where to go looking. I'm going to look for the internal variable page. Once I've got this, I'm going to call the append method, which belongs to all lists. And what I want to append to this list is the value of length, so how long this pen is, times the hyphen, which gives me that many dashes to draw.
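The expression at the heart of draw relies on the fact that multiplying a string by an integer repeats it. A small sketch outside of any class, using plain variables named length and page purely for illustration:

```python
length = 7
page = []

# An integer times a string repeats the string that many times
line = length * "-"
page.append(line)

print(page)  # ['-------']
```

Inside the class, the same two names would be reached through self, as self.length and self.page.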

So the only difference between this and a normal function is having to have self at the start of everything, just to point me in the right direction. If I didn't have self anywhere here, I would get an error when it was run, because it would say, well, there's nothing called page, there's nothing called length, there's nothing called this, that or the other. So this is the method draw, which appends length hyphens to the page. And then our final bit is the stringification method.
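To see the error described here, the sketch below uses a deliberately broken class, BrokenPen (a hypothetical name), whose draw method leaves out self. It assumes no other variables named page or length exist at module level.

```python
class BrokenPen:
    def __init__(self, length):
        self.length = length
        self.page = []

    def draw(self):
        # Wrong on purpose: without self, Python looks for ordinary
        # variables named page and length and finds neither
        page.append(length * "-")


pen = BrokenPen(7)
message = None
try:
    pen.draw()
except NameError as error:
    message = str(error)
    print(message)
```

Note that the error only appears when draw is called, not when the class is defined.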

So, stringification: the special double underscore str method, __str__, which again is always going to take in self and no other arguments. And what this is going to do is return the internal list page as a string; str just turns the list into a string. Now, stringification always has to return a string. You'll get shouted at, with a TypeError, if you try to return an integer or something like that. Its sole purpose is returning a string.
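A quick sketch of what getting shouted at looks like, using a hypothetical class BadPen whose stringification method returns an integer on purpose:

```python
class BadPen:
    def __str__(self):
        return 42  # wrong on purpose: __str__ must return a string


raised_type_error = False
try:
    str(BadPen())
except TypeError as error:
    raised_type_error = True
    print(error)
```

Wrapping the value, for example return str(42), would be one way to satisfy the rule.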

So now, hopefully if I run this, nothing goes horribly wrong. I want to create this pen, which is going to be a Pen object with length seven, and I want to draw three times. So pen.draw, and I'm going to do that three times by just copying the line. And then I want to print the pen object: print pen. Because I have a length of seven, I get seven hyphens added to this list every time I draw. I do that three times, so I now have three lots of seven hyphens in the list. And when I print the pen out, because I've told my object exactly how to print itself, or turn itself into a string, I get this list of hyphens.
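Pulling the whole walkthrough together, the Pen class might look like the sketch below. It follows the description given here (length passed in, page as an empty list, draw, and __str__), but the exact code typed in the demonstration isn't in the transcript, so treat the details as assumptions.

```python
class Pen:
    def __init__(self, length):
        self.length = length  # number of hyphens drawn each time
        self.page = []        # internal list, ready to be drawn on

    def draw(self):
        # Append length hyphens to the page as one string
        self.page.append(self.length * "-")

    def __str__(self):
        # Stringification: return the list page as a string
        return str(self.page)


pen = Pen(7)
pen.draw()
pen.draw()
pen.draw()

print(pen)  # ['-------', '-------', '-------']
```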

This is, admittedly, a rather complex part of Python. But it's massively useful when it comes to thinking about pretty much how anything works. And specifically any model or any data pipeline that we define is going to be done using classes and things like that. So it's basically just good to get an idea of A, how Python actually works under the hood, and B, how we can make objects and things.

About the Author

Delivering training and developing courseware for multiple aspects across Data Science curriculum, constantly updating and adapting to new trends and methods.
