Securing the Deployment Pipeline
This course explores how to secure your deployment pipelines on GCP. We will cover the four main techniques to securely build and deploy containers using Google Cloud and you will follow along with guided demonstrations from Google Cloud Platform so that you get a practical understanding of the techniques covered.
If you have any feedback relating to this course, please contact us at email@example.com.
By completing this course, you will understand:
- The advantages of using Google managed base images
- How to detect security vulnerabilities in containers using Container Analysis
- How to create and enforce GKE deployment policies using Binary Authorization
- How to prevent unauthorized changes to production using IAM
This course is intended for:
- Infrastructure/Release engineers interested in the basics of building a secure CI/CD pipeline in GCP
- Security professionals who want to familiarize themselves with some of the common security tools Google provides for container deployment
- Anyone taking the Google “Professional Cloud DevOps Engineer” certification exam
To get the most out of this course, you should be familiar with:
- Building CI/CD pipelines
- Building containers and deploying them to Kubernetes
- Setting up IAM roles and policies
So at this point, I've introduced a few tools that you can use to build a more secure pipeline. But before I wrap things up, I wanna give you a practical demonstration of how they can all work together. I've already created a very simple build and deployment pipeline. All I had to do was create a new Cloud Source Repository and then I added a Cloud Build trigger.
You can see the Cloud Build trigger is called Build-helloworld2-docker-image and the repository is called helloworld2. The trigger will fire on any updates to the master branch. Now, when that happens, it looks for a cloudbuild.yaml file in the repository and it will execute any commands inside. This file is what actually defines the steps of my pipeline.
You can see that there are just three steps. The first step builds the new container image from my helloworld2 app. The second step pushes the new image to my Artifact Registry. And the third step will deploy the new image to Cloud Run. So the basic idea is a developer can test the code after pushing any new changes. The application is a very simple Python Flask app. And it basically just serves up a static webpage.
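The three steps described above can be sketched roughly as follows. This is not the actual file from the demo: the region, the `my-repo` repository path, and the service name are placeholders, and the sketch emits JSON, which Cloud Build accepts as an alternative to `cloudbuild.yaml`. `$PROJECT_ID` and `$SHORT_SHA` are Cloud Build's built-in substitutions.

```python
import json

# Placeholder values for illustration only; they are not taken from the demo.
REGION = "us-central1"
IMAGE = f"{REGION}-docker.pkg.dev/$PROJECT_ID/my-repo/helloworld2:$SHORT_SHA"


def build_config(image: str, service: str, region: str) -> dict:
    """Return a Cloud Build config with three steps: build, push, deploy."""
    return {
        "steps": [
            # Step 1: build the container image from the Dockerfile in the repo root.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["build", "-t", image, "."]},
            # Step 2: push the new image to Artifact Registry.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["push", image]},
            # Step 3: deploy the pushed image to Cloud Run.
            {"name": "gcr.io/google.com/cloudsdktool/cloud-sdk",
             "entrypoint": "gcloud",
             "args": ["run", "deploy", service,
                      "--image", image, "--region", region]},
        ]
    }


config = build_config(IMAGE, "helloworld2", REGION)
print(json.dumps(config, indent=2))
```

Saving the printed output as `cloudbuild.json` and pointing the trigger at that file should behave like the YAML version, assuming the Cloud Build service account has permission to deploy to Cloud Run.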
Now, once the developer checks in any changes, it will automatically build the image, store the image in Artifact Registry and then deploy that image to Cloud Run and the dev can verify that everything works. I've also created a staging and production Kubernetes cluster, but deployment to those is still manual. Like I said, this is a very simple pipeline. In the real world, you probably will use something with a lot more steps and with more automation, but this is just enough to demonstrate what I want.
I've also grouped some of my browser tabs under DEV. Now, these include the Cloud Build trigger, my Artifact Registry, and then the Cloud Run execution of my app which you can see is currently at version 1.0. I also have some PROD tabs and these contain my production and staging clusters, as well as my deployed production app which you can see is also currently at version 1.0.
Next, let me demonstrate how an update actually works. I will create and push a new version 1.1, and you will see it automatically build and deploy. After I verify everything works, I'll go ahead and deploy the new version to the production cluster. Now, I'm gonna be fast forwarding a lot because some of these steps take a while to complete. So now that I've pushed some new changes, you can see the Cloud Build trigger fire off a new build job. That's going to create a brand new image. And finally, that image will be deployed to Cloud Run.
And there we go, version 1.1 is up and running in my development environment. I still have version 1.0 deployed to my production cluster, so let me update that as well. I'm going to undeploy the old version and deploy the new version 1.1. So as you can see, it's very simple, but it works.
So now that you understand how the pipeline works, let's try to improve security. The first thing I wanna do is to see if there are any security vulnerabilities in the current application. I have a very simple app that just serves up a static HTML page. So you'd think there wouldn't be too many problems. But an easy way to verify would be for me to enable automatic scanning on Container Analysis.
Now, there are already a couple of container images in my registry that have not been scanned. I will go ahead and enable automatic scanning and you can see the results. Now, if you're still using Container Registry instead of Artifact Registry, don't worry. It works exactly the same way, and you can enable it using the same options in Container Registry.
So Container Analysis is now enabled. You might expect it to go ahead and start scanning all the images, but that's not how it works. Every time you scan an image, it costs money. So even though I enabled automatic scanning, it won't start scanning old images. Anything I push from now on will be scanned, however. This is useful in case I have many, many, many images already in the repo and I don't wanna pay to scan them all. If I really wanna scan an old image, I can just re-push it.
So now that automatic scanning is enabled, I'll make a change to the app and you can see what happens. With the new update, an automatic scan has been started. In a short while, I'll show you the results. You might be surprised at the number of vulnerabilities it found. Even in this simple helloworld app, it found over 400 issues. Let me click on that number to show you the details.
It looks like there are critical and high issues present. That's not ideal. I can get further details by clicking on the individual issue. Luckily, it appears that some of these have fixes available. Now, I realize one of the problems here was that I did not include any update commands in my Dockerfile. Let me add code to do an apt update and full-upgrade. Then we can see if that will fix some of these issues. Well, even with the full-upgrade command, I still have a lot of vulnerabilities. Plus, it did not eliminate the critical and high issues, which are the ones I actually am concerned about.
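To make the triage above concrete, here is a minimal sketch of how scan findings could be summarized by severity. The findings list is made-up sample data with a simplified shape (`id`, `severity`, `fix_available`); it is not real Container Analysis output, and the field names are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical sample findings; IDs and fields are illustrative only.
SAMPLE_FINDINGS = [
    {"id": "EXAMPLE-0001", "severity": "CRITICAL", "fix_available": True},
    {"id": "EXAMPLE-0002", "severity": "HIGH", "fix_available": False},
    {"id": "EXAMPLE-0003", "severity": "MEDIUM", "fix_available": True},
    {"id": "EXAMPLE-0004", "severity": "LOW", "fix_available": False},
    {"id": "EXAMPLE-0005", "severity": "HIGH", "fix_available": True},
]

# The severities I actually care about blocking on.
BLOCKING = {"CRITICAL", "HIGH"}


def count_by_severity(findings):
    """Count findings per severity level."""
    return Counter(f["severity"] for f in findings)


def blocking_findings(findings):
    """Return only the critical and high findings."""
    return [f for f in findings if f["severity"] in BLOCKING]


def fixable(findings):
    """Return the findings that already have a fix available."""
    return [f for f in findings if f["fix_available"]]


counts = count_by_severity(SAMPLE_FINDINGS)
blockers = blocking_findings(SAMPLE_FINDINGS)
print(dict(counts))
print(f"{len(blockers)} blocking, {len(fixable(blockers))} of them fixable")
```

A check like this could sit in a pipeline step and fail the build whenever `blocking_findings` returns anything, which is one way to act on the scan results instead of only viewing them.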
Now seems like a good time to switch to a Google-managed image. My Dockerfile is currently using the Python image. I picked it because I knew I needed Python, but I guess I should have chosen more carefully. Let's look and see if Google has a Python image I can use instead. Yeah, here is one. Let's try using this one and see if there's a difference. You can see that the Google image helped immensely. I no longer have any critical or high issues. There's definitely a difference. Remember, Container Analysis is looking for vulnerabilities in Linux. It's not gonna detect every possible issue.
For example, if I add an insecure method to my Flask app, Container Analysis is probably not gonna see that. So in a real production environment, I would also wanna do my own security testing as well. But at this point, I've greatly improved the security of my container images. I'm now starting out with a recently patched base served by a trusted host. I'm also automatically scanning and identifying any known security vulnerabilities, so if there's a problem, I know about it and can fix it.
So now I can distinguish between safe and unsafe images, but nothing is currently preventing me from deploying the unsafe ones to production. Recall that I was able to deploy version 1.1 of my app, even though it has a lot of security vulnerabilities. For this next part, I will show you how to enable Binary Authorization on the production cluster.
As you can see, turning on Binary Authorization is pretty simple. However, I'm still not done. So far, all I have done is tell my cluster to follow the default policy. I should look at what the default policy actually is. The default policy is set to allow all images. So right now, I can still deploy any image. It's important to note that you need to both enable Binary Authorization on the cluster and create a custom policy. If you skip either of these steps, nothing will actually be blocked. So to demonstrate how we would block something, let me go ahead and select block all images.
Now, even though I selected block all images, there are still some exceptions specified. By default, it's going to allow Google hosted images to be deployed, but that's fine. I wanna demonstrate blocking my own custom images, so this policy will still work. And there we go. The images from my Artifact Registry are now all blocked. Let me verify. I'll remove the previous deployment and then I'll try to redeploy. There we go, blocked by policy.
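For reference, the difference between the default policy and the block-all policy can be sketched as data. The field names and values below follow the Binary Authorization policy format as I understand it (`defaultAdmissionRule`, `evaluationMode`, `enforcementMode`, `globalPolicyEvaluationMode`); the helper function is hypothetical, and the demo itself makes this change through the console.

```python
import json


def make_policy(evaluation_mode: str, allow_google_images: bool = True) -> dict:
    """Build a minimal project policy with a single default rule."""
    policy = {
        "defaultAdmissionRule": {
            # ALWAYS_ALLOW admits every image; ALWAYS_DENY blocks every image.
            "evaluationMode": evaluation_mode,
            "enforcementMode": "ENFORCED_BLOCK_AND_AUDIT_LOG",
        },
    }
    if allow_google_images:
        # Exempts Google-maintained system images from the rule above.
        policy["globalPolicyEvaluationMode"] = "ENABLE"
    return policy


allow_all = make_policy("ALWAYS_ALLOW")  # behaves like the default policy
block_all = make_policy("ALWAYS_DENY")   # the policy selected in the demo
print(json.dumps(block_all, indent=2))
```

Because JSON is valid YAML, output like this should be usable with `gcloud container binauthz policy import`, though that is worth verifying against the current gcloud reference before relying on it.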
Blocking every image isn't a very realistic policy, so let me show you now how to block some, but allow others. In order to approve certain images, I need to create something called an attestor. An attestor will allow me to sign an image and mark it as safe. So here is what I need to do. In order to allow signed images to be deployed with Binary Authorization enabled, I need to create an attestor.
In order to create an attestor, I need a key. In order to create and store a key in GCP, I need a keyring. So I need to create a keyring first to hold the key. Then I need to create a key in the keyring. Then I can use the key to create an attestor. And finally, I can update the Binary Authorization policy to accept images signed by the new attestor.
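The same chain (keyring, then key, then attestor) can be sketched as gcloud commands. This sketch only assembles and prints the commands; it does not run them. All resource names are placeholders, the key algorithm is one reasonable choice rather than the one used in the demo, and the CLI path expects an existing Container Analysis note for the attestor, a detail the demo does not show. Check the flags against `gcloud ... --help` for your gcloud version.

```python
import shlex
from typing import List


def setup_commands(project: str, location: str, keyring: str,
                   key: str, attestor: str, note: str) -> List[List[str]]:
    """Return the gcloud commands, in dependency order, as argument lists."""
    key_flags = [
        f"--keyversion-project={project}",
        f"--keyversion-location={location}",
        f"--keyversion-keyring={keyring}",
        f"--keyversion-key={key}",
        "--keyversion=1",
    ]
    return [
        # 1. The keyring that will hold the key.
        ["gcloud", "kms", "keyrings", "create", keyring,
         f"--location={location}"],
        # 2. An asymmetric signing key inside that keyring.
        ["gcloud", "kms", "keys", "create", key,
         f"--keyring={keyring}", f"--location={location}",
         "--purpose=asymmetric-signing",
         "--default-algorithm=ec-sign-p256-sha256"],
        # 3. The attestor, backed by an existing Container Analysis note.
        ["gcloud", "container", "binauthz", "attestors", "create", attestor,
         f"--attestation-authority-note={note}",
         f"--attestation-authority-note-project={project}"],
        # 4. Attach the key's public half so the attestor can verify signatures.
        ["gcloud", "container", "binauthz", "attestors", "public-keys", "add",
         f"--attestor={attestor}", *key_flags],
    ]


# Placeholder names, not the ones used in the demo.
commands = setup_commands("my-project", "global", "binauthz-keys",
                          "attestor-key", "prod-attestor", "prod-attestor-note")
for cmd in commands:
    print(shlex.join(cmd))
```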
First, let me create the keyring. Next, I will create a key. Now, I can create a new attestor. And finally, I can update the policy to allow images approved by the attestor. Once you have an attestor, there are a few ways to actually sign an image. The easiest way is to just run a command in the terminal. This command requires that I pass in a bunch of values, such as project ID, the artifact I wanna sign, and the attestor. I also need to pass in the keyring, key name, and even the key version.
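As a rough sketch of that signing command, the function below assembles a `gcloud container binauthz attestations sign-and-create` invocation from the values just listed. It only builds the argument list and does not run it. The project, repository, attestor, and key names are placeholders, and depending on your gcloud version the command may live under `gcloud beta`. Attestations are tied to an image digest rather than a tag, so the sketch rejects artifact URLs without one.

```python
import shlex
from typing import List


def sign_command(project: str, artifact_url: str, attestor: str,
                 location: str, keyring: str, key: str,
                 key_version: str) -> List[str]:
    """Build the gcloud command that signs an image and stores the attestation."""
    if "@sha256:" not in artifact_url:
        # A tag can be moved to a different image later; a digest cannot.
        raise ValueError("artifact_url must reference the image by digest")
    return [
        "gcloud", "container", "binauthz", "attestations", "sign-and-create",
        f"--project={project}",
        f"--artifact-url={artifact_url}",
        f"--attestor={attestor}",
        f"--attestor-project={project}",
        f"--keyversion-project={project}",
        f"--keyversion-location={location}",
        f"--keyversion-keyring={keyring}",
        f"--keyversion-key={key}",
        f"--keyversion={key_version}",
    ]


# Placeholder image reference; the all-zero digest stands in for a real one.
image = ("us-central1-docker.pkg.dev/my-project/my-repo/helloworld2"
         "@sha256:" + "0" * 64)
cmd = sign_command("my-project", image, "prod-attestor",
                   "global", "binauthz-keys", "attestor-key", "1")
print(shlex.join(cmd))
```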
So now that I've signed the image, let's see if I can actually deploy it. And there we go, the signed image deploys successfully to my production cluster. By extending this principle, I can do some interesting things. I can create an attestor for the QA team to manually certify all images that pass testing. I can create a security attestor that would be used to sign images that passed an automated security scan using the results from Container Analysis.
Kubernetes clusters can be configured to require images to be signed by one or multiple attestors. I could also chain clusters and attestors together. For example, when a new image is created, it would have to pass the automated security scan before being allowed into the testing cluster. Then the image would need to be signed off by QA before being allowed into the staging cluster. And finally, the image would have to be signed by the operations team in order to be allowed into production. In this way, you can easily build and enforce as many automated and manual checkpoints as you desire. Your pipeline is no longer a suggestion, it is a requirement.
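The chained checkpoints described above can be sketched as a policy with one rule per cluster, plus a small local simulation of the admission decision. The cluster names, attestor names, project, and zone are hypothetical, and `is_admitted` only models the idea; the real enforcement is done by Binary Authorization, not by code like this. The policy fields follow the Binary Authorization policy format as I understand it.

```python
from typing import Dict, Iterable, List

# Hypothetical attestors each cluster requires, in pipeline order.
REQUIRED: Dict[str, List[str]] = {
    "testing": ["security-scan"],
    "staging": ["security-scan", "qa"],
    "production": ["security-scan", "qa", "operations"],
}


def cluster_rule(project: str, attestors: List[str]) -> dict:
    """Admission rule that requires a signature from every listed attestor."""
    return {
        "evaluationMode": "REQUIRE_ATTESTATION",
        "enforcementMode": "ENFORCED_BLOCK_AND_AUDIT_LOG",
        "requireAttestationsBy": [
            f"projects/{project}/attestors/{name}" for name in attestors
        ],
    }


def build_policy(project: str, zone: str) -> dict:
    """Deny by default, then add one rule per cluster keyed as 'zone.cluster'."""
    return {
        "defaultAdmissionRule": {
            "evaluationMode": "ALWAYS_DENY",
            "enforcementMode": "ENFORCED_BLOCK_AND_AUDIT_LOG",
        },
        "clusterAdmissionRules": {
            f"{zone}.{cluster}": cluster_rule(project, names)
            for cluster, names in REQUIRED.items()
        },
    }


def is_admitted(cluster: str, signed_by: Iterable[str]) -> bool:
    """Local simulation: admit only if every required attestor has signed."""
    return set(REQUIRED[cluster]).issubset(signed_by)


policy = build_policy("my-project", "us-central1-a")
print(is_admitted("staging", {"security-scan", "qa"}))      # True
print(is_admitted("production", {"security-scan", "qa"}))   # False
```

An image signed only by the security scan and QA gets as far as staging and no further, which is the "requirement, not suggestion" behavior described above.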
Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.
Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.
When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.