This course provides an introduction to using Kubernetes to deploy and manage containers.
Be able to recognize and explain what Kubernetes is and the problems it solves
Be able to explain and deploy containers on Kubernetes
Be able to orchestrate and manage containers with Kubernetes
This course requires a basic understanding of cloud computing. We recommend completing the Google Cloud Fundamentals course before completing this course.
Updates: At 5:53, Adam says Heapster needs to be installed for autoscaling. Kubernetes deprecated Heapster in version 1.11 and retired it in version 1.13. Heapster has been replaced by the Metrics Server, which should be used instead from version 1.11 onwards.
Hello and welcome back to the Introduction to Kubernetes course for Cloud Academy. I'm Adam Hawkins, and I'm your instructor for this lesson.
The previous lesson covered pods and service discovery, but I've got to be honest with you, we've kind of been cheating a bit so far. You're not really supposed to create pods directly. Instead, a pod is really just a building block. They should be created via higher level abstractions, such as deployments. This way, Kubernetes can add on useful features and higher level concepts.
This lesson covers deployments, scaling, and rollouts. Again, I'll be honest here: my objective is to wow you with Kubernetes features. I'll do that by showing you how to manually horizontally scale an application, then configuring autoscaling, and finally starting, pausing, and resuming rollouts. This lesson builds on the code in the previous lessons. If you jumped into this lesson and you're not familiar with pods or services, then I recommend you take the previous lessons before continuing. Also, this is the longest lesson in the course, but it's also my personal favorite because we're covering the material that got me the most excited about Kubernetes. Best of all, there are not many slides. Just hands-on work making the magic happen. Let's dive in by converting the existing pods to deployments.
Start by creating a new namespace.yml for this lesson, just like we've done in the last lesson. A deployment is a template for creating pods. A template is used to create replicas. A replica is just a copy of a pod. Applications scale by creating more replicas. This will be more clear when you see the YAML files and as we demonstrate more features throughout this lesson.
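The namespace manifest might look like the following sketch. The namespace name and label are assumptions for illustration; use whatever names match your earlier lessons.

```yaml
# namespace.yml -- a minimal sketch; the name and label are assumed
apiVersion: v1
kind: Namespace
metadata:
  name: deployments
  labels:
    app: counter
```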
Create a new data-tier-deployment.yml file with your editor. The apiVersion is actually different this time. This is the first time we've used non-stable APIs. Deployments are currently in the extensions API as of Kubernetes 1.5, but do not let the beta tag scare you. Deployments are solid and production-ready. Here, kind is set to Deployment. The labels are omitted because we do not need them on the deployment itself. Next comes the spec. The deployment spec contains deployment-specific settings and also a pod template. The replicas key sets how many pods to create for this particular deployment. Kubernetes will keep this number of pods running. Set the value to 1 because there cannot be multiple redis containers. The previous pod configuration can be copied into the template. This is the same metadata and spec from the previous lessons. Note that we do need labels here to associate pods with the service.
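Putting that together, the file might look like this sketch. The resource names, labels, and image are assumptions based on the course's counter application; the apiVersion matches the extensions API used at the time of recording.

```yaml
# data-tier-deployment.yml -- a sketch; names, labels, and image are assumed
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: data-tier
spec:
  replicas: 1              # only one redis container may run
  template:
    metadata:
      labels:              # labels associate pods with the service
        app: microservices
        tier: data
    spec:
      containers:
        - name: redis
          image: redis:latest
          ports:
            - containerPort: 6379
```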
We can complete the same process for the app and support tiers. Also set replicas to 1 for both cases. I'll fast-forward to the editing process. It's just creating the new files with the same data inside the template. So sit back, relax, and let's fast-forward through this.
Now create everything with kubectl. You can reuse the services from the past lesson. Just make sure you specify the correct namespace.
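The creation commands might look like the following. The file and namespace names are assumptions carried over from the earlier sketches.

```sh
# Assumed file and namespace names; adjust to match your setup
kubectl create -f namespace.yml
kubectl create -n deployments \
  -f data-tier-deployment.yml \
  -f app-tier-deployment.yml \
  -f support-tier-deployment.yml
# Reuse the services from the previous lesson, in the correct namespace
kubectl create -n deployments -f data-tier-service.yml -f app-tier-service.yml
```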
Once you're done, get the deployments. Kubectl displays three deployments and their scaling information. Note that they all show one replica right now. So remember that horrible scenario I described at the end of the last lesson with peppering v1 and all that onto the end of the pods? Well we can see how deployments solve the same problem by asking K8s for the pods.
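The status checks are plain kubectl get calls; the namespace name is assumed.

```sh
kubectl get -n deployments deployments
kubectl get -n deployments pods
```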
Note that each pod name ends with a unique suffix, a sort of salt. The deployment adds this uniqueness information automatically to identify pods of a particular deployment version. We can see how this works by running more than one replica. Alright, time to have some fun. Let's go a bit crazy by scaling up the support tier. And by crazy, I mean incrementing that counter really, really fast. I know, crazy, right?
Kubectl includes the scale command for modifying replica counts. The scale command does the same thing you'd do by editing the file and then running kubectl apply; it's just optimized for this one-off use case. Now we can check the pods again to see what happened. Note that the support tier pod shows two of two ready. This is because replicas replicate pods, not the individual containers inside a pod. So if you need to scale an individual container separately from the others, then you must put it in a separate pod. Deployments ensure that the specified number of replicas is kept running, so we can test this by killing some pods and watching K8s bring them back to life.
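The scale command and the resurrection test might look like this. The replica count, namespace, and label selector are assumptions for illustration.

```sh
# Scale the support tier to five replicas (names and count assumed)
kubectl scale -n deployments deployment support-tier --replicas=5
kubectl get -n deployments pods

# Kill the support tier pods by label and watch Kubernetes recreate them
kubectl delete -n deployments pods -l tier=support
watch kubectl get -n deployments pods
```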
Alright, so K8s can resurrect pods and make sure the application runs the intended number of replicas. Nice to know that all goes according to plan.
Let's go even more crazy. Scale up the app server. Look at you, just scaling containers like nothing. Kubernetes did all the heavy lifting for us. It's probably even done more than you think. Here's the rundown: K8s has created an internal load balancer for the application service, also associated all containers with that load balancer, seamlessly horizontally scaled an entire application tier, monitored the pods to restore dead pods, and also done all of the networking. This is really powerful stuff. K8s even has more tricks up its sleeve. We just scaled up the application quite a bit. Naturally, it will be hard to actually use this many containers. Really, do you think this counter application can put any load on a CPU? This scenario is just like most real-world applications. They do not see constant load, thus they don't need a constant number of containers. Instead, their load varies according to a few factors. It's just more efficient to use autoscaling.
Kubernetes supports CPU-based autoscaling. The kubectl autoscale command creates a horizontal pod autoscaler resource. Autoscaling works by providing a target CPU percentage along with a minimum and maximum number of replicas. Kubernetes will increase or decrease the number of replicas according to the CPU usage. There are two prerequisites for this feature: first, Heapster must be running on the cluster, and second, pods must request resources. Let's quickly touch on these concepts. Heapster is a metric collection tool for Kubernetes. It runs as a pod in your cluster, auto-discovers resources, and reports data via sinks such as logs, InfluxDB, and more. Resource requests are requests for a certain amount of CPU or memory from the cluster. Do not get this confused with the Kubernetes API sense of the word: in this context, resource refers to compute, meaning CPU or memory, rather than to API objects such as pods. Requests specify the minimum allocation, and limits specify the maximum allocation, though this is not necessarily a hard rule. You have to consider the context to tell which kind of resource is meant.
Installing Heapster is the first step. Heapster is available through a Minikube addon. Refer to the Heapster docs if you are not using Minikube. The minikube addon command can enable a new addon. This command creates the Heapster deployments and other Kubernetes resources. It may take a few minutes for Heapster to boot fully the first time. Here we can watch the pods in a kube-system namespace until it's ready.
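On Minikube, enabling the addon and watching it come up might look like this sketch.

```sh
minikube addons enable heapster
# Watch the kube-system namespace until the Heapster pods are ready
watch kubectl get -n kube-system pods
```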
You can test with these two commands. These commands are essentially Unix top, but for Kubernetes. We don't need them right now, but I imagine you can see how useful these commands are in the real world. You'll know it's working once you have metrics coming in. Note that it may take a few minutes for metrics to report in after the first boot. Remember to use the watch command to continually check the status if you need to. We're ready to move on to the deployment once Heapster is up and running.
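The two test commands are kubectl top for nodes and for pods; the namespace name is assumed.

```sh
kubectl top nodes
kubectl top -n deployments pods
```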
Reopen app-tier-deployment.yml in your editor. We'll request a small amount of CPU for this exercise. The value 100m refers to 0.1 CPU. You may also specify integers as well. Again, refer to the official Kubernetes docs for more info on how CPUs are measured. Kubernetes will only schedule this pod on nodes with at least 0.1 CPUs remaining. Our Minikube has a single node. This means we can use at most 10 replicas before exhausting the CPU. Set replicas to 8 for now to keep the deployment from exhausting the CPU.
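The relevant part of the updated deployment might look like the following fragment. The container name is an assumption; only the replicas and resources keys change.

```yaml
# app-tier-deployment.yml (fragment) -- container name assumed
spec:
  replicas: 8                # 8 x 0.1 CPU leaves headroom on a 1-CPU node
  template:
    spec:
      containers:
        - name: server
          resources:
            requests:
              cpu: 100m      # 100m = 0.1 CPU; minimum allocation requested
```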
Save to file and apply the changes. This completes the prereqs. Now check the deployment status before we enable autoscaling. Note the desired and current numbers. Then enable autoscaling.
The min and max options set lower and upper bounds on running replicas. The CPU percentage option sets the target CPU percentage. Kubernetes will increase the number of replicas when the average CPU usage across the pod is greater or equal to 70 percent. Conversely, Kubernetes will decrease the number of replicas when the average CPU usage across the pod is less than 70 percent.
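The autoscale invocation might look like this; the bounds and namespace are assumptions, while the 70 percent target matches the lesson.

```sh
# Target 70% average CPU across 1 to 5 replicas (bounds assumed)
kubectl autoscale -n deployments deployment app-tier \
  --min=1 --max=5 --cpu-percent=70
watch kubectl get -n deployments deployments app-tier
```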
Now we can watch the deployment until the autoscaler kicks in. Well would you look at that, the counts updated. K8s does not disappoint. Pretty slick, huh? This command created a horizontal pod autoscaler resource. We can find it via kubectl. Kubectl also accepts shorthand notation for resource types. Now, it would be painful to type out "horizontal pod autoscaler" many times. Hell, I have a hard time actually saying it. So we can use hpa instead.
We could run kubectl get for a full list of shorthand notations. Kubectl displays the targets and the current metrics. This is enough information for a quick readout on pod capacity. It's also useful when combined with kubectl top. Unfortunately, the autoscale command can only be invoked once because it does not support updating an existing resource. We can overcome this error with kubectl edit. Kubectl edit opens the specified resource in your editor, then applies the changes. It's the same as editing files yourself and calling kubectl apply.
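Finding and updating the autoscaler might look like this; the resource name and namespace are assumptions.

```sh
kubectl get -n deployments hpa
# autoscale cannot update an existing HPA, so edit it in place instead
kubectl edit -n deployments hpa app-tier
```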
Let's experiment by increasing the minimum pods. Now you can watch the deployment until the autoscaler bumps up to the new minimum number of replicas. This wraps up the scaling tour, my friend. You've officially acquired container scaling skills. Let's shift focus to deploying code or configuration changes.
A Kubernetes rollout is the process of updating or replacing replicas with replicas matching a new deployment template. Changes may be configuration, such as environment variables or labels, or code changes made by updating the image key. In a nutshell, any change to the deployment's template triggers a rollout. We've actually already triggered rollouts without even knowing it. Deployments have different rollout strategies. Kubernetes uses rolling updates by default: replicas are updated in groups, instead of all at once, until the rollout completes. Kubectl includes commands to manage rollouts: you can check their status, and pause, resume, and roll them back. Let's see how these work.
First, delete the existing autoscaling configuration. We'll need many replicas to catch rollouts in action, and we don't want the autoscaler running at the same time in this exercise. Next, open the app tier deployment with kubectl edit. Set a large number of replicas; it'll be easier to see the rollout in action that way. Also remove the resource request, otherwise Kubernetes will not be able to schedule all the replicas on our puny single-node Minikube cluster. Now watch the deployment until all the replicas are ready.
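The setup steps might look like this sketch; the resource names and namespace are assumptions.

```sh
# Remove the autoscaler so it doesn't fight the exercise
kubectl delete -n deployments hpa app-tier
# Bump replicas (e.g. to 10) and delete the resources.requests block
kubectl edit -n deployments deployment app-tier
# Watch until all replicas are ready
watch kubectl get -n deployments deployments app-tier
```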
Time to trigger a rollout. Remember that any change to the deployment's template triggers a rollout. Open the app-tier deployment with kubectl edit. Add a new environment variable. Any name value will do. This is just a non-functional change for us, but it still demonstrates the functionality. We can immediately watch the rollout status with kubectl if we're fast enough. Kubectl rollout status streams progress updates in realtime. You'll see new replicas coming in and old replicas going out. Repeat this exercise until you see the entire flow. Experiment with the number of replicas, max surge, and max unavailable as you please. Refer to the docs for how these settings control rollout speed.
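Triggering and watching the rollout might look like this; the environment variable is a deliberately non-functional example and the names are assumed.

```sh
# Add any env var to the pod template, e.g. DEMO=1, to trigger a rollout
kubectl edit -n deployments deployment app-tier
# Stream rollout progress in realtime (run quickly after saving)
kubectl rollout -n deployments status deployment app-tier
```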
Rollouts may also be paused and resumed. Here, I've split my window into two. I'll control the deployment from the left pane and watch the status on the right. I've already added another environment variable to the YAML file, so we can apply this change to trigger a new rollout. We can pause the rollout while it's in progress. Now the rollout is paused, just like a VHS tape, but pausing is not immediate. Replicas that were created before pausing will continue. However, no new replicas will be created after the rollout has paused. We can try a few things at this point. Most likely we'd inspect the pods before deciding to continue or roll back. Let's say that everything is a-okay and opt to continue. The rollout picks up just where it left off and goes about its business. Now consider that you found a bug in this new version and need to roll back. Kubectl rollout undo is just the ticket. You may also roll back to a specific version: use kubectl rollout history to get a list of all revisions, then pass a specific revision to kubectl rollout undo.
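The pause, resume, and undo commands might look like the following; the deployment name, namespace, and revision number are assumptions.

```sh
kubectl rollout -n deployments pause deployment app-tier
kubectl rollout -n deployments resume deployment app-tier

# Roll back to the previous revision
kubectl rollout -n deployments undo deployment app-tier
# Or list revisions and roll back to a specific one
kubectl rollout -n deployments history deployment app-tier
kubectl rollout -n deployments undo deployment app-tier --to-revision=1
```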
This concludes our exercise with deployments and rollouts. This has been our longest lesson, but I think it's also been the coolest. Deployments and rollouts are very powerful constructs. Their features cover a large swath of use cases. Personally, I'm really excited about what strategies the future holds for us. Is anyone else excited about perhaps a canary strategy, or even blue-green? It's a great time to be Kubernetes users, and it's only getting better.
Let's reiterate what we covered in this lesson. We learned how to manage applications using deployments, horizontally scale deployments with replicas, configure horizontal autoscaling, and most importantly and perhaps even the coolest, start, pause, resume, and undo rollouts.
There's still so much more we can do with deployments. Rollouts depend on container status: K8s assumes that created containers are immediately ready and that the rollout should continue. This does not work in all cases. We may need to wait for a web server to accept connections. Here's another scenario: consider an application using an RDBMS. The container may start, but the application will fail until the database and tables are created. These scenarios must be considered to build reliable applications. This is where probes and init containers come into the picture. We'll integrate probes and init containers in the next lesson. See you there.
About the Author
Adam is a backend/service engineer turned deployment and infrastructure engineer. His passion is building rock-solid services and equally powerful deployment pipelines. He has been working with Docker for years and leads the SRE team at Saltside. Outside of work he's a traveller, beach bum, and trance addict.