Introduction to PIP and the Python Package Index (PyPI)

Difficulty: Beginner
Duration: 51m
Description

This content is designed to introduce new developers to managing third-party packages using the Python Package Index and a tool called pip. This course is part of a series designed to help you learn to program with the Python programming language.

Learning Objectives

  • The Python Package Index
  • The tool used to manage packages called pip
  • The tool used to create isolated Python environments named venv
  • Considerations to be made before installing a third-party package

Intended Audience

This course was designed for new developers wanting to learn Python. 

Prerequisites

  • That you’re familiar with the Python runtime
  • That you’re familiar with the Python language syntax
  • This course assumes that you’re not yet familiar with how to install and manage third-party Python software

Transcript

Python’s standard library includes many modules and object types used for a variety of purposes, and they serve as a great starting point for building our own apps.

In addition to the object types included in the standard library, we can also use code created by other developers. If another developer were to share their code with us, we could use it in the same way we use the standard library.

Python has existed for decades, and over the years the community of Python developers has created many thousands of modules and a mechanism to share them. The Python community maintains a system known as the Python Package Index (PyPI).

The Python Package Index is a repository of Python code. Let’s break this down a bit. Recall that in Python a module is a collection of related code used for some shared purpose. And a package is a collection of modules used for a shared purpose.
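To make the module and package distinction concrete, here is a minimal sketch using two names from the standard library. It assumes the common convention that an imported package exposes a `__path__` attribute while a plain module does not; the helper name `kind` is made up for this illustration.

```python
import json  # a package: a directory of modules in the standard library
import math  # a single module


def kind(obj):
    """Return 'package' if the imported object has a __path__, else 'module'."""
    return "package" if hasattr(obj, "__path__") else "module"


print(json.__name__, "is a", kind(json))
print(math.__name__, "is a", kind(math))
```

Both are imported the same way. The difference is only in how the code is organized on disk.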

When referring to the Python Package Index, the term package doesn’t strictly refer to Python packages. Rather, the term package in the Python Package Index refers to a “distribution package” consisting of a collection of files.

While a package from the index will typically contain a Python package or module, it can also include other required files. For example, there might be directories containing text-based templates, images, etc.

The world of computer science has a tendency to overload words with multiple contextual meanings. This is one of those cases. So, just know that there is a distinction between a distribution package and a Python package that you’d import.

When we talk about packages in the context of the package index, know that we’re talking about a higher-level concept than Python’s importable packages. We’re talking about a mechanism for distributing Python packages, modules, and other required files.
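One way to see distribution packages for ourselves is to ask Python which ones are installed in the current environment. The sketch below uses the standard library’s `importlib.metadata` module (available in Python 3.8 and later). The helper name `summarize` and the limit of five results are choices made for this illustration, and the output depends entirely on what happens to be installed.

```python
from importlib import metadata
from itertools import islice


def summarize(dist):
    """Count the Python files and the other files recorded for a distribution."""
    files = dist.files or []  # files can be None when no file record exists
    python_files = sum(1 for f in files if f.suffix == ".py")
    return {
        "name": dist.metadata.get("Name"),
        "version": dist.version,
        "python_files": python_files,
        "other_files": len(files) - python_files,
    }


# Show a handful of installed distribution packages, if there are any.
for dist in islice(metadata.distributions(), 5):
    print(summarize(dist))
```

The non-Python files counted here are the "other required files" a distribution package can carry alongside its importable code.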

Working with the Python Package Index involves two parts: the repository of packages and a tool for downloading and installing them. We can browse the repository on the pypi.org website, which lets us search for Python packages across different use cases. And an application called pip is used to download and install packages.

If the package includes Python packages and/or modules, then we can import those into our code, giving us an easy way to gain new functionality without having to develop the code ourselves. This is the value of the Python Package Index. There are many thousands of packages published to the index, containing code to solve all kinds of problems.

For example, there are text-based template engines, web application servers, natural language processing libraries, video game engines, and so much more.

Let’s review the website used to view the package index, and pip, the installer tool. We’ll start with the website.

Now, depending on when you’re watching this, the UI may have changed. However, that shouldn’t matter too much because the core functionality will be the same. The functionality to focus on is the ability to review package details.

With this UI, we can search for a package directly or browse by category. Its search capabilities feel a bit basic. However, this search is useful if you already know the name of the package you need. 

The browse functionality is a bit more useful for exploring packages using filters. However, even this is a bit limiting. The search isn’t really the functionality worth reviewing. We’ll talk more about locating useful packages later. For now, we’ll focus on what I would consider to be the most useful aspect of this website: the detail pages for packages. 

Check this out. This is the detail page for a package called Rich. Let’s review some of the interesting bits of the detail page.

This section here includes the command to run to install this package. This also includes a useful copy button to copy the command to the clipboard. 

On the left, we have links that determine which content is displayed in this center section. By default, we see the project’s description, which in this case is well written, with a high-level overview of the project along with code samples and images.

Make a mental note here. This is an example of good documentation used to introduce a project. Documentation is at least as important as the code. So, use this as a minimum guide for when the time comes to create your own project intro documentation.

Okay, getting back on track. Using these other links, we can toggle this center content to display the release history and any downloadable files. Next, we have links to the project’s homepage and documentation. The homepage will often link to the source code for the project, or to a page that includes a link to the code.

These statistics here are pulled from the website where the code for this project resides. If you’re familiar with GitHub, then these will make sense to you. If you’re not familiar with GitHub, it’s a social coding platform designed to track changes in code. And these details relate to features of that site that we won’t get into in this content.

This next section defines the license for this project. It’s important to know that code is licensed by its creators. And the licenses define the limitations for using the code. 

For example: 

  • Can the code be used in a commercial app? 
  • Can the code be modified?
  • Can the code be redistributed?
  • Etc…

There are many different licenses. Some are very permissive and allow code to be used as-is however we choose. Others are very restrictive. If you’re developing code for a large company, there’s a good chance they already have a list of allowed or disallowed license types. 

For everyone else, take care to understand the limits of a license before using the code. This is a topic that can become confusing. However, there are a few websites that attempt to explain software licenses more clearly. I recommend checking these out and gaining at least a high-level sense for some of the common categories. 
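For packages that are already installed, the declared license can also be read from the package’s metadata. This is a minimal sketch using the standard library’s `importlib.metadata`. The helper name `license_info` is hypothetical, and which fields are filled in varies from package to package, so treat an empty result as "check the project page" rather than "no license".

```python
from importlib import metadata


def license_info(dist_name):
    """Return the license details a distribution declares, or None if not installed."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return None
    classifiers = meta.get_all("Classifier") or []
    return {
        "license": meta.get("License"),
        "license_expression": meta.get("License-Expression"),
        "license_classifiers": [c for c in classifiers if c.startswith("License ::")],
    }


# "pip" is only an example name; this prints None if it is not installed.
print(license_info("pip"))
```

This only reports what the package author declared. It doesn’t replace reading the license text itself.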

The section following the license is a list of the project’s maintainers, followed by the classification tags for the project.

Okay, so this is a high-level review of the website. It’s useful for reviewing details about a package stored in the Python Package Index. It helps us as developers to see the high-level details about a package at a glance. It also includes some basic mechanisms for locating a package.
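Much of what the detail page shows is also available as data. As a rough sketch, PyPI serves a JSON document per project at `https://pypi.org/pypi/<name>/json`. The helper names below and the sample payload are made up for illustration, and the sample only mimics a few of the fields found under the `info` key, so the example runs without a network connection.

```python
import json
from urllib.request import urlopen

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def fetch_project(name, timeout=10):
    """Download the JSON description of a project from PyPI (needs network access)."""
    with urlopen(PYPI_JSON_URL.format(name=name), timeout=timeout) as response:
        return json.load(response)


def summarize_project(payload):
    """Pick out a few of the details that the website's detail page displays."""
    info = payload["info"]
    return {
        "name": info.get("name"),
        "version": info.get("version"),
        "summary": info.get("summary"),
        "license": info.get("license"),
        "project_urls": info.get("project_urls") or {},
    }


# Hypothetical payload standing in for a real response.
sample = {
    "info": {
        "name": "example-package",
        "version": "1.0.0",
        "summary": "An illustrative package.",
        "license": "MIT",
        "project_urls": {"Homepage": "https://example.com"},
    }
}
print(summarize_project(sample))
```

With a network connection, `summarize_project(fetch_project("rich"))` would summarize the Rich project in the same way.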

Let’s shift our focus to installing one of the packages from the index. This site lists many thousands of packages that we can download and use in our code. The tool we use to install them is called pip, the package installer for Python.

Pip is bundled with most modern Python installations, making it available without having to install anything extra. Pip can be run from the command line and used to install, update, and remove packages. We’ll cover how to use pip in another lesson.
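As a small preview, the three operations map onto pip subcommands. The sketch below only builds the command lines, using the `python -m pip` form so that pip runs under the same interpreter as the script. The helper name `pip_command` and the action labels are invented for this illustration, and nothing is installed or removed when it runs.

```python
import sys


def pip_command(action, package):
    """Build the pip command line for installing, updating, or removing a package."""
    arguments = {
        "install": ["install", package],
        "update": ["install", "--upgrade", package],
        "remove": ["uninstall", package],
    }
    return [sys.executable, "-m", "pip", *arguments[action]]


for action in ("install", "update", "remove"):
    print(" ".join(pip_command(action, "rich")))
```

To actually run one of these commands, the list could be passed to `subprocess.run(command, check=True)`, ideally after the security considerations discussed in this lesson.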

For now, just knowing that pip is a tool used to download and install packages from the Python Package Index is good enough. Let’s shift gears to talk about an incredibly important aspect of installing these packages: security.

The code in these packages is what would be considered untrusted code. I want you to add that phrase to your developer vocabulary. Untrusted code is basically any code that we haven’t thoroughly reviewed to ensure its trustworthiness. 

Imagine downloading a package from the package index and running the code. Then minutes later all of your files start disappearing. The code that exists inside these packages could be intentionally or accidentally dangerous. 

The code could include bugs that result in unexpected files being deleted. Or it could be malicious; it’s not unheard of for malware to exist in packages from the index.

Now, the point isn’t to scare you away from using these packages. Rather, it’s to suggest that we need an additional layer of consideration for how we select packages, which we’ll talk about in another lesson. However, I wanted to call attention to the need for security before demonstrating how to install packages.

Okay, this seems like a natural stopping point. Here are your key takeaways for this lesson:

  • The Python Package Index is a repository of community-created Python packages.

  • Each of the packages listed falls under a license. 

    • Licenses specify how developers can use a package

  • The package installer for Python is named pip.

    • Pip is a tool used to install, update and remove packages. 

  • The code on the package index should be considered untrusted code.

    • Untrusted code is code that we haven’t fully reviewed

    • Which means it could contain buggy or malicious code.

Alright, that's all for this lesson. Thanks so much for watching. And I’ll see you in another lesson!

About the Author

Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.
