In this lesson you will learn how to diagnose database issues using Google’s Cloud Monitoring and Cloud Logging services.
Learning Objectives
- Manage and minimize your system downtime
- Optimize the performance of your Google databases
Intended Audience
- Database administrators
- Database engineers
- Cloud architects
- Anyone preparing for a Google Cloud certification
Prerequisites
- Some experience working with databases
- Access to a GCP account
Setting up a production database on Google Cloud Platform can be a lot of work. But that is really only the first step. Your next challenge is maintenance. Over time, as your database grows, so too will your potential problems. You will need to be able to quickly fix any issues that might pop up. So, in this lesson, I am going to show you how to identify any potential problems with your databases.
The tool I am going to be showing you is called Cloud Monitoring. Cloud Monitoring does exactly what the name says. It allows you to monitor your cloud resources by creating dashboards to track how well everything is performing. It also can be used to create alerts, so that you are instantly notified whenever something goes wrong.
So, to access Cloud Monitoring, you first need to log into the Google Cloud console. Then do a search for “monitoring”. So here is the main monitoring page. You can see some suggestions here for getting started. These steps are all optional. However, I am going to take you through most of them.
First I want to show you the dashboards. A dashboard is a group of charts, graphs, and widgets. It is used to graphically represent the health and performance of one or more of your services. Google provides a number of default dashboards, but you can also create custom ones. My current project contains a Cloud SQL database, so I have access to a few Cloud SQL dashboards already. Let’s go ahead and look at one of them.
So this dashboard lets me track the number of queries, page reads and writes, network connections, CPU utilization, and a bunch of other stuff. All this information is graphed over time, and I can choose the exact time range that I am interested in.
Typically you are going to use this to look for strange spikes or massive drops. You also could use this to identify any long-term trends. Like perhaps you notice that your CPU utilization is slowly increasing week after week. Now ideally you should be using dashboards to identify potential issues before they impact your uptime. But you can use them for other things as well. You can figure out exactly when an issue started, and how long it lasted. You can identify patterns like maybe your database performance drops every Wednesday morning. Things like that.
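To make the idea of a slow, long-term trend concrete, here is a minimal sketch that fits a straight line to weekly averages and reports the slope. The `weekly_trend` helper and the sample numbers are made up for illustration. They are not something Cloud Monitoring produces for you, and in practice you would read the trend off the dashboard chart.

```python
def weekly_trend(values):
    """Return the least-squares slope of values against their index (change per week)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical weekly average CPU utilization (fractions of 1.0), invented for this example.
weekly_cpu = [0.30, 0.32, 0.35, 0.36, 0.40, 0.43]

slope = weekly_trend(weekly_cpu)
print(f"CPU utilization is changing by about {slope:.3f} per week")
```

A positive slope that persists week after week is the kind of pattern worth investigating before it turns into downtime.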
Now the default dashboards are fine, but they won’t always show you everything you need. So most of the time, you want to create your own. Let me show you how to do that next. You can either click on “Create Dashboard” here, or you can navigate to the Dashboards page and click on this button here.
This area on the right here is basically a blank canvas. You just drag and drop whatever graphs you want onto it. Let me start by adding a line graph. By default, it is set to track the CPU utilization for one of my virtual machines. In this case, I want to monitor my Cloud SQL instance. So I need to change the resource type here. And then I need to pick a specific metric. I will go ahead and pick CPU utilization.
Ok, so now this chart is tracking the CPU utilization for my SQL database. I can further customize it by using the options on the left. But let’s not make this demo too complicated.
I will show you how to add additional charts. This time I will add a stacked bar chart. Again, I need to update the resource and metric type. So this chart is going to track my Cloud SQL connections.
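The demo builds these two charts by clicking through the editor, but a dashboard can also be described as JSON. The sketch below builds such a description as a plain Python dictionary. The field names (`gridLayout`, `xyChart`, `timeSeriesFilter`, `plotType`) and the two Cloud SQL metric types reflect my understanding of the Cloud Monitoring dashboard format, so treat them as assumptions and check them against the current documentation. The `chart_widget` helper is hypothetical.

```python
import json

# Assumed Cloud SQL metric types; verify against the published metrics list.
CPU_METRIC = "cloudsql.googleapis.com/database/cpu/utilization"
CONNECTIONS_METRIC = "cloudsql.googleapis.com/database/network/connections"


def chart_widget(title, metric_type, plot_type):
    """Build one chart widget that plots a single Cloud SQL metric."""
    filter_text = (
        f'metric.type="{metric_type}" AND resource.type="cloudsql_database"'
    )
    return {
        "title": title,
        "xyChart": {
            "dataSets": [
                {
                    "plotType": plot_type,
                    "timeSeriesQuery": {
                        "timeSeriesFilter": {
                            "filter": filter_text,
                            "aggregation": {
                                "alignmentPeriod": "60s",
                                "perSeriesAligner": "ALIGN_MEAN",
                            },
                        }
                    },
                }
            ]
        },
    }


dashboard = {
    "displayName": "Custom Cloud SQL Dashboard",
    "gridLayout": {
        "columns": "2",
        "widgets": [
            chart_widget("Cloud SQL CPU utilization", CPU_METRIC, "LINE"),
            chart_widget("Cloud SQL connections", CONNECTIONS_METRIC, "STACKED_BAR"),
        ],
    },
}

print(json.dumps(dashboard, indent=2))
```

Keeping a definition like this in version control makes it easier to recreate the same dashboard in another project.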
Alright, that is how to create your own dashboards. Once you are satisfied with the results, you can click up here to save it. And it looks like my option is grayed out because I have autosave enabled. So now I just have to close the editor. And this is what my new dashboard looks like.
If I want to pull this dashboard up in the future, I just need to return to “Dashboards”. And here it is. I should probably give it a different name, so let me do that. I am going to call this “Custom Cloud SQL Dashboard”. And there you go.
So dashboards are cool and everything. But you probably don’t want to stare at them all day looking for problems. Instead, it would be better if you could be automatically notified when something is amiss. Now that is what alerts are for.
So next, I am going to show you how to create an alert policy. An alert policy allows you to be notified when one or more of your metrics exceeds a set threshold. Basically, you can get an alert whenever something goes too high or too low for too long.
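The "too high for too long" idea can be sketched in a few lines. This is only an illustration of the concept, not how Cloud Monitoring evaluates conditions internally. The `breaches_threshold` function and the sample readings are hypothetical.

```python
def breaches_threshold(samples, threshold, min_consecutive):
    """Return True if at least `min_consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run of high readings, or reset it on a normal reading.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical one-minute CPU readings (fractions of 1.0).
brief_spike = [0.10, 0.80, 0.12, 0.11, 0.09]
sustained_load = [0.10, 0.70, 0.72, 0.75, 0.71]

print(breaches_threshold(brief_spike, 0.65, 3))
print(breaches_threshold(sustained_load, 0.65, 3))
```

Requiring several high readings in a row is what keeps a single short spike from paging you.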
You create a policy by clicking on “Create Alert Policy”. Then you have to choose the metric to monitor. So I am going to pick “Cloud SQL metrics”. And then I will create a trigger based upon “CPU utilization”.
So this graph is showing my current utilization. It looks like it has been hovering around 6-7% for the last hour. I can also expand the time range if I wish. Now I can see that it has gone as high as 20% at one point. Once you have selected the correct metric, you just click on the “Next” button down here.
On this screen, you have to set the threshold. The threshold determines when the alert will be triggered. So if I want to be alerted when the CPU hits 75%, I just enter that here. And you can see that the chart on the right has been updated. The blue line shows my current CPU utilization. And the red line shows the threshold. I can change this threshold if I want. So let me bump it down to 65. Once you are happy with your threshold, click on the “Next” button.
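The same threshold can be written down as an alert policy definition. The sketch below builds one as a Python dictionary. The field names (`conditionThreshold`, `comparison`, `thresholdValue`, `duration`) follow my understanding of the Cloud Monitoring alert policy format and should be checked against the documentation. It also assumes the Cloud SQL CPU metric is reported as a fraction between 0.0 and 1.0, so 65% becomes 0.65. The `cpu_alert_policy` helper and the five-minute default duration are my own choices, not values from the demo.

```python
import json


def cpu_alert_policy(threshold_percent, duration_seconds=300):
    """Build an alert policy body that fires when Cloud SQL CPU stays above the threshold."""
    filter_text = (
        'metric.type="cloudsql.googleapis.com/database/cpu/utilization" '
        'AND resource.type="cloudsql_database"'
    )
    return {
        "displayName": f"Cloud SQL CPU above {threshold_percent}%",
        "combiner": "OR",
        "conditions": [
            {
                "displayName": f"CPU utilization > {threshold_percent}%",
                "conditionThreshold": {
                    "filter": filter_text,
                    "comparison": "COMPARISON_GT",
                    # Assumption: the metric is a fraction, so convert the percentage.
                    "thresholdValue": threshold_percent / 100,
                    "duration": f"{duration_seconds}s",
                    "aggregations": [
                        {
                            "alignmentPeriod": "60s",
                            "perSeriesAligner": "ALIGN_MEAN",
                        }
                    ],
                },
            }
        ],
    }


policy = cpu_alert_policy(65)
print(json.dumps(policy, indent=2))
```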
This next screen will determine how you receive your alert. You just have to pick an appropriate channel. Right now I just have one option available. But you can set up additional ones if you wish. Just click on “Manage Notification Channels”. So you can set up a pager alert. You can get a Slack message. Here is my option for getting an email. There is also SMS. Or you can publish a message to Pub/Sub. They even include a generic webhook option, in case you wish to create something custom.
Now like I said, you can pick one or multiple channels. So if you wanted, you could receive an email, a text, and a Slack message. Once you have selected your channels, scroll down. And here is where you name your policy. After that, click on next.
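Notification channels can be described in the same way. This sketch shows an email channel body and how one or more channel resource names might be attached to a policy. The `email` type with an `email_address` label, and the `notificationChannels` field, are my understanding of the format. Other channel types such as Slack or SMS use their own labels, which I have left out. The helper functions, the address, and the channel IDs are all placeholders.

```python
def email_channel(display_name, address):
    """Build the body for an email notification channel."""
    return {
        "type": "email",
        "displayName": display_name,
        "labels": {"email_address": address},
    }


def attach_channels(policy, channel_names):
    """Return a copy of the policy that notifies every listed channel."""
    updated = dict(policy)
    updated["notificationChannels"] = list(channel_names)
    return updated


channel = email_channel("DBA on-call", "dba-oncall@example.com")

# Placeholder policy and channel resource names, for illustration only.
base_policy = {"displayName": "Cloud SQL CPU above 65%", "combiner": "OR"}
notified_policy = attach_channels(
    base_policy,
    [
        "projects/my-project/notificationChannels/1111111111",
        "projects/my-project/notificationChannels/2222222222",
    ],
)

print(channel)
print(notified_policy)
```

Listing more than one channel is how a single policy can send, for example, both an email and a Slack message.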
This last page will let you review all your selections. Once you are satisfied, click on “Create Policy”. And there you go. Now I should get an email every time my Cloud SQL CPU utilization goes above 65%.
All the policies you create will be saved here in “Alerting”. Incidents will contain a list of all your recent alerts. And policies will contain a list of your recent alert policies. Ok, so now you know how to create an alert.
Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.
Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.
When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.