Kubernetes has become one of the most common container orchestration platforms. Google Kubernetes Engine makes managing a Kubernetes cluster much easier by handling most system management tasks for you. It also offers more advanced cluster management features. In this course, we’ll explore how to set up a cluster using GKE as well as how to deploy and run container workloads.
- What Google Kubernetes Engine is and what it can be used for
- How to create, edit and delete a GKE cluster
- How to deploy a containerized application to GKE
- How to connect to a GKE cluster using kubectl
- Developers of containerized applications
- Engineers responsible for deploying to Kubernetes
- Anyone preparing for a Google Cloud certification
- Basic understanding of Docker and Kubernetes
- Some experience building and deploying containers
Now I want to show you how to identify and fix issues with your GKE clusters. The first step in troubleshooting any Kubernetes problem is to look at the logs. Let me show you how to do that.
First, log into the web console and navigate to the Kubernetes Engine page. Here you will see a list of your clusters. If you are having problems with a particular cluster, just click on its name and then open the “Logs” tab. This provides a simplified list of events that have occurred. You can expand each entry to get more details, and you can filter by severity or search for text strings to help narrow down the list. You can also access logs for the autoscaler here.
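The same cluster events shown in the Logs tab can also be queried from the command line with the Cloud SDK. Here is a minimal sketch; the cluster name `my-cluster` and the zone are placeholders for your own values:

```shell
# Read recent error-level log entries for a GKE cluster.
# "k8s_cluster" is the Cloud Logging resource type for GKE cluster events;
# the cluster name below is a placeholder.
gcloud logging read \
  'resource.type="k8s_cluster" AND resource.labels.cluster_name="my-cluster" AND severity>=ERROR' \
  --limit=20 \
  --format="table(timestamp, severity)"
```

Dropping the `severity>=ERROR` clause widens the query to all recorded events, which mirrors the unfiltered view in the console.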
Now as I said, this is a very simplified view. You can get even more details by jumping to Logs Explorer by clicking on this link here. This gives you an expanded set of records and filters. You can drill in pretty deep to get lots of extra details. I am not going to go through all the options in Logs Explorer, but at least you now know where to find it. If something is not working, you should be able to use this to locate any error messages and identify what is breaking. Now by default, this will capture any problems with your cluster. And if your containers are outputting their logs to STDOUT and/or STDERR, then those should appear in Logs Explorer as well. So you can use this same tool to debug issues with your clusters and your workloads. Logging for your clusters is enabled by default, but it is possible to disable it. So if for some reason your cluster logs are empty or missing, make sure to review your cluster settings.
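If your cluster logs are empty or missing, a likely cause is that logging was disabled. One way to check and re-enable it from the command line (the cluster name and zone below are placeholders):

```shell
# Show which logging components are currently enabled on the cluster
gcloud container clusters describe my-cluster \
  --zone=us-central1-a \
  --format="value(loggingConfig.componentConfig.enableComponents)"

# Re-enable system and workload logging if it was turned off
gcloud container clusters update my-cluster \
  --zone=us-central1-a \
  --logging=SYSTEM,WORKLOAD
```

With `WORKLOAD` enabled, anything your containers write to STDOUT or STDERR is collected and shows up in Logs Explorer.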
Ok, so this is fine for when you know something is broken. But it’s not always obvious when a cluster is having an issue. That is what monitoring is for. Let me show you how to find that. So let me close the Logs Explorer tab and return to the Clusters page. If I click on this “Operations” button at the top right of the screen, I get a new popup. This popup includes the logs that we already saw, but it also includes Metrics, Events, and Alerts. Metrics will show you some basic charts covering things like memory and CPU utilization. Events will show you any detected problems from the logs. And Alerts are custom thresholds that you set to trigger notifications. This gives us a nice general overview, but we can get even further details by clicking on the Cloud Monitoring button.
This gives you an expanded dashboard that provides lots of details about your cluster. We can see information about our nodes, pods, and containers. We also have a bunch of filters to help identify any possible problems. And you can look at a specific cluster or view all clusters at once. So this view will help you spot if something is using too much memory or logging a large number of errors. Just like logging, cluster monitoring is enabled by default, but it can be disabled if you wish. So, if this information for your cluster is blank or missing, double-check that monitoring is enabled in your cluster settings.
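As with logging, you can verify and re-enable monitoring from the command line. A quick sketch, again with a placeholder cluster name and zone:

```shell
# Show which monitoring components are currently enabled on the cluster
gcloud container clusters describe my-cluster \
  --zone=us-central1-a \
  --format="value(monitoringConfig.componentConfig.enableComponents)"

# Re-enable system metrics collection if it was disabled
gcloud container clusters update my-cluster \
  --zone=us-central1-a \
  --monitoring=SYSTEM
```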
I also previously mentioned alerts. You can set those up by clicking on “Alerting” here. Just pick the metrics you wish to monitor. You can look at metrics for your clusters, containers, nodes, or pods, and then set the thresholds for when you wish to be notified. So I can set up an alert if my memory spikes too high, or if a pod starts spewing out a huge number of errors. Then I can get an email or some other sort of notification to let me know. Now, I’m not going to go through the entire process, since that is covered in another course. But this gives you a basic idea of how to monitor your clusters and set up alerts for Kubernetes.
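Alert policies can also be defined as files and created from the command line. This is only a sketch: the policy name, threshold, and filter below are illustrative, and at the time of writing the `monitoring policies` commands live under gcloud’s alpha component, so availability depends on your SDK version:

```shell
# Write a sample alert policy: notify when a container's memory usage
# exceeds 90% of its limit for 5 minutes (values are illustrative)
cat > memory-alert.json <<'EOF'
{
  "displayName": "GKE container memory above 90%",
  "combiner": "OR",
  "conditions": [{
    "displayName": "High memory limit utilization",
    "conditionThreshold": {
      "filter": "resource.type = \"k8s_container\" AND metric.type = \"kubernetes.io/container/memory/limit_utilization\"",
      "comparison": "COMPARISON_GT",
      "thresholdValue": 0.9,
      "duration": "300s"
    }
  }]
}
EOF

# Create the policy from the file (requires the gcloud alpha component)
gcloud alpha monitoring policies create --policy-from-file=memory-alert.json
```

Notification channels (email, SMS, and so on) are attached separately, which is part of what the console’s Alerting wizard walks you through.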
In an ideal situation, you will set up alerts that will help identify any issues. And then when you have an issue, you can search through the logs to figure out what is causing it and how to fix it. And that should cover the basics of troubleshooting any issues you have with Kubernetes Engine.
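For workload issues, the same container logs that flow into Logs Explorer can also be pulled directly with kubectl once you are connected to the cluster. A minimal triage sketch, assuming a cluster named `my-cluster` and a deployment named `my-app` (both placeholders):

```shell
# Fetch credentials so kubectl can talk to the cluster
gcloud container clusters get-credentials my-cluster --zone=us-central1-a

# List recent cluster events, oldest first, to spot scheduling or image errors
kubectl get events --sort-by=.metadata.creationTimestamp

# Tail the last 50 log lines (STDOUT/STDERR) from the deployment's pods
kubectl logs deployment/my-app --tail=50

# Inspect pod status, restarts, and conditions for a failing workload
kubectl get pods
kubectl describe pod <pod-name>
```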
Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.
Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.
When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.