Please note: this course has been outdated and replaced with two new courses: An Overview of Amazon CloudWatch and Building CloudWatch Dashboards
CloudWatch is a monitoring service for cloud resources and the applications you run on Amazon Web Services. CloudWatch can collect metrics, set and manage alarms, and automatically react to changes in your AWS resources. Amazon CloudWatch can monitor AWS resources such as Amazon EC2 instances, DynamoDB tables, and Amazon RDS DB instances. You can also monitor custom metrics generated by your applications and services, as well as any log files your applications generate. You'll see how we can use Amazon CloudWatch to gain system-wide visibility into resource utilization, application performance, and operational health, and how to use these insights to keep applications running smoothly. This course includes a high-level overview of how to monitor EC2, monitor other Amazon resources, monitor custom metrics, monitor and store logs, set alarms, graph and view statistics, and monitor and react to resource changes.
Intended Audience
- Systems Admins
- Operational Support
- Solution Architects working on AWS Certification
- Anyone concerned about monitoring data or AWS recurring billing
Prerequisites
- AWS Console Login
- General knowledge of how to launch an Elastic Compute Cloud (EC2) instance on either Linux or Windows
- View CloudWatch Documentation at https://aws.amazon.com/cloudwatch/
- An operational EC2 (Windows/Linux)
Learning Objectives
- Monitor EC2 and other AWS resources
- Build custom metrics
- Monitor and store log information from Linux instances
- Set alarms for metrics to take action on an instance or auto-scaling group
- Create a dashboard to monitor EC2 instances
- React to load to trigger auto scaling horizontally within AWS.
This Course Includes
- Over 90 minutes of high-definition video
- Console demos
What You'll Learn
- Course Intro: What to expect from this course
- Getting Started: How to launch an EC2 instance
- Building a Dashboard: How to take the metrics from the instance and create a dashboard
- Monitoring EC2 Instances: How and why you should be monitoring the environment in Amazon Web Services
- Sending Log Files to CloudWatch: A lesson on the importance of sending log files to CloudWatch
- Alarms: How to specify alarms
- Course Conclusion: Course summary
Welcome back. I'm your instructor Michael Brian, and in this section, we'll be looking at CloudWatch, Monitoring EC2 Instances in More Detail.
As we found out, by default Amazon CloudWatch doesn't provide us with some key metrics that we need to manage our servers. Two of the gaps that we've already discovered are disk utilization information, including space available and total size, and memory: we have no idea what the memory utilization of our server is. Both of these items are necessary to manage your environment and plan for the future.
To get started with this activity, we're going to need to give our little instance that we created a set of credentials, otherwise known as an IAM role. In this demonstration, we're gonna assign credentials to our EC2 instance. The reason that we're going to do this is because without specific permission granted to our instance, you cannot send custom metric information from the instance we've created to the Amazon API, which then reports that data into CloudWatch.
So, the key thing to understand here is that we're going to create a role that has permission to forward custom metric data from the instance we created to CloudWatch. It will do this through the Amazon API. Largely, this will be transparent to us once the correct permissions are assigned. However, if you fail to assign permissions, you can go through all the other setup steps necessary to push custom metrics to CloudWatch and you will still be unsuccessful, as the data will not be sent to or received by CloudWatch.
In this demonstration, we are going to monitor memory and disk utilization. As always, it's an excellent idea and good discipline to check the Amazon white papers online before you begin using a service or services. In this case, Amazon has published an excellent white paper entitled Monitoring Memory and Disk Metrics for Amazon EC2 Linux Instances. For the activity we're about to embark upon, this white paper provides us with step-by-step instructions on how to set up the required packages for each flavor of Linux.
In our case, we've selected the Amazon Linux AMI, but I would call your attention to the fact that almost all the types of Linux that are available in the Amazon cloud are represented here, and there are easy copy and paste scripts, as seen here, which will help us install the prerequisites required to push custom metrics via the API to CloudWatch. On their own, these prerequisites are not sufficient to push data, or custom metrics, from our instance to CloudWatch; the instance also needs permission.
Let's begin by launching the EC2 console. And let's look at our instances. As you can see, the current server has no IAM role attached. The first thing we're going to need to do is to create a role. To do that, we'll click on the orange cube in the upper left, and this will take us to the AWS services screen. We'll then look for security identity and compliance, and select IAM. We'll then select roles, and create a new role.
We then need to select a role type. In this case, we require an AWS service role. The very top role, Amazon EC2, allows EC2 instances to call AWS services on your behalf, and this is precisely what we want to do. In order to assign permissions to a role, it's necessary to attach a policy. In this case, we'll be able to use an existing policy without having to write one ourselves. We'll attach the CloudWatch full access policy. If you're having difficulty finding the CloudWatch full access policy, you can type CloudWatch in the search box, and it will filter the list down to the policies available for CloudWatch.
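Selecting the Amazon EC2 service role type in the console corresponds to a trust policy on the role. The sketch below shows what that trust policy generally looks like, built as a Python dictionary and printed as JSON. The names EC2_TRUST_POLICY and trust_policy_json are illustrative only; the console generates the equivalent document for you, so treat this as a reference rather than a required step.

```python
import json

# Trust policy that lets the EC2 service assume the role on the
# instance's behalf. This mirrors what the console creates when the
# "Amazon EC2" service role type is selected.
EC2_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def trust_policy_json() -> str:
    """Return the trust policy as a formatted JSON string."""
    return json.dumps(EC2_TRUST_POLICY, indent=2)


if __name__ == "__main__":
    print(trust_policy_json())
```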
In this case, again, we're just going to select CloudWatch full access. In the lower right of your screen, you'll click next step. We'll need to set a role name for this role. In this case, I'm going to call it Cloud Academy. You can provide a description, and it's wise to do so. Assigning a role description will allow you to remember why you created it in the first place. In this case, I'll leave the role description and create the role. As you'll notice, we still don't have our new role with CloudWatch full access permissions assigned to this server.
Recently, Amazon has provided us the ability to add or change an IAM role on an instance. To do this, you'll need to select the instance, make sure that it has the light blue highlight, and then click on actions, instance settings, and then attach or replace an IAM role. We'll select the role, and we'll click apply. You should receive a green box with a check mark that says the IAM operation succeeded.
In my experience, it's not unusual to see an error message initially. If this happens, click retry, and it should apply. Once we've attached the IAM role, click the close button. This will return you to the EC2 console screen, and you'll note that the IAM role, Cloud Academy, has been applied to our instance. You can now click on the hyperlink, and it will take you to the role, where you can verify the current policy, the only policy attached, which is CloudWatch full access. If you need to see the JSON of the policy, you can do that by clicking show policy.
So, you'll see that for the CloudWatch actions, the policy specifies effect, allow, and resource, all. It's now time to come back to the Monitoring Memory and Disk Metrics for Amazon EC2 Linux Instances page. We have some prerequisites we need to address. The Amazon CloudWatch monitoring scripts for Amazon EC2 Linux-based instances demonstrate how to produce and consume Amazon CloudWatch custom metrics. These sample Perl scripts comprise a fully functional example that reports memory, swap, and disk space utilization metrics for a Linux instance. You can also download the Amazon CloudWatch monitoring scripts for Linux from the Amazon sample code library.
It's important to note that Amazon calls out that these scripts are examples only, and that standard Amazon CloudWatch usage charges apply for custom metrics. You'll need to check the pricing per region before you deploy these. We'll go ahead and copy this command. We'll switch back into the terminal and paste the command. And we'll enter yes to all the packages. It will quickly install, and then we can get started on our next step.
We're now going to download, install, and configure the monitoring scripts. We'll go ahead and copy this Perl command and run it at the command prompt. Once that's complete, we'll run the following commands to actually install the monitoring scripts we've downloaded.
Let's take a moment to review what we've done thus far. We identified that our Amazon EC2 instance was not provided with an IAM role during our deployment. We did that on purpose so that we could take a look at the new feature in EC2 that allows us to add or replace a role to an EC2 instance. We created an IAM role specific to CloudWatch, and then we checked the policy to see that our IAM role has full access to CloudWatch.
It's often not the best idea to give something full access. You'll notice in this white paper that Amazon lists, at a granular level, the actual permissions that you would need. You might have noticed the very next command tells us to copy awscreds.template to awscreds.conf, and then add the following content to this file: the AWS access key ID and AWS secret key. This would be the case if you were using user-based credentials to manage your permissions.
In our case, you'll remember that we created a role, so fundamentally we have a maintenance-free approach by using a role versus assigning a user whose credentials could change periodically. In this case, if the role were no longer valid for our server, there's no need to go into the actual server or instance and modify the awscreds.conf file. We would just make the change in the console to the role that we have assigned, which in my case is Cloud Academy, and it would be applied automatically.
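As an illustration of the granular alternative to full access, here is a minimal sketch of a narrower policy document. The four actions listed are the ones I recall the AWS documentation naming for these monitoring scripts; treat that list as an assumption and confirm it against the current white paper before using it. The names MONITORING_ACTIONS and monitoring_policy are hypothetical.

```python
import json

# Assumed list of permissions needed by the monitoring scripts.
# Verify against the white paper before relying on it.
MONITORING_ACTIONS = [
    "cloudwatch:PutMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "ec2:DescribeTags",
]


def monitoring_policy(actions=MONITORING_ACTIONS) -> dict:
    """Build an IAM policy document that allows only the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(monitoring_policy(), indent=2))
```

A policy like this could be attached to the role in place of CloudWatch full access, so the instance can push and read metrics but cannot, for example, delete alarms or dashboards.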
One of the things we installed earlier in the lesson was the mon-put-instance-data Perl script. You'll see that this script collects memory, swap, and disk space utilization data on the current system. It then makes a remote call to Amazon CloudWatch to report the collected data as custom metrics. You'll see here that this script has many options, and you'll utilize them by actually calling them with the flags, known here as names, and they'll provide the data as shown in the description in this table. You'll see that there's a whole lot of information that we can get about this particular instance or server that was otherwise unavailable in CloudWatch by default.
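To make the memory figures less of a black box, the following sketch shows one way such values can be derived from the contents of /proc/meminfo on Linux. It is not the Perl script's code: the formula (counting buffers and cache as available) and the function names parse_meminfo and memory_metrics are assumptions for illustration.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo style text into {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_metrics(meminfo: dict) -> dict:
    """Compute utilization (percent) and used/available memory (MB).

    Assumption: buffers and page cache are treated as available memory.
    """
    total = meminfo["MemTotal"]
    available = (
        meminfo["MemFree"]
        + meminfo.get("Buffers", 0)
        + meminfo.get("Cached", 0)
    )
    used = total - available
    return {
        "MemoryUtilization": 100.0 * used / total,
        "MemoryUsed": used / 1024.0,
        "MemoryAvailable": available / 1024.0,
    }


if __name__ == "__main__":
    # Reads the live values; this only works on a Linux system.
    with open("/proc/meminfo") as handle:
        print(memory_metrics(parse_meminfo(handle.read())))
```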
The first thing that we're going to do is perform a simple test run without posting any data to CloudWatch. We'll do this by calling the mon-put-instance-data Perl script with the flags --mem-util, --verify, and --verbose. You'll note that the flag --verify is what tells the Perl script not to report the data to CloudWatch; --verbose only prints detailed information about what the script is doing. This is very important. If you were to copy and paste this test command and use it in a production situation, the --verify flag would prevent the data from being posted to CloudWatch.
Let's go ahead and try it. We'll copy the command, go into the terminal, which I've cleared, we'll paste it, and we'll run the command. Let's study what's occurred. We've run the Perl script mon-put-instance-data with the flags --mem-util, --verify, and --verbose. You'll notice that we get an instant result. Memory utilization on this server is 8.59 percent. The script attempted to use the credential file awscreds.conf. No credentials were specified in that file, so it then defaulted to trying an IAM role.
The IAM role assigned to this server is Cloud Academy. As you'll recall, we assigned the CloudWatch rights to the IAM role Cloud Academy in this demonstration. Then, the next thing provided to us is the actual payload. You'll see the time stamp, the dimensions, the value, the name, and the instance ID. The value in this case is the 8.592 percent, and that would actually be put to CloudWatch using the IAM role, in this case Cloud Academy, to get permissions. We validated that this transaction can complete successfully, though with the flag --verify, no actual metrics were sent to CloudWatch. We want to actually send data to CloudWatch.
In this case, we'll use the next example in the white paper. We'll copy and paste it, and we're going to collect all available memory metrics and send them to CloudWatch. Again, I'll highlight the fact that we're not using the flag --verify this time, because we do intend to put the metrics to CloudWatch. We'll go ahead and run this command. You'll notice that instead of printing the metrics and the methods it tried to use to send the memory data, it reports that we've successfully reported the metrics to CloudWatch, and we're given a reference ID for the transaction.
The one problem with what we've just completed is that we've sent metrics to CloudWatch one single time. Typically, that's not very valuable in production. We want to see the metrics over time. There are a variety of ways you can accomplish this, but the most common is to edit the crontab file. Again, the purpose of this is to schedule this task. In our case, we're going to do it every minute. We'll switch to the /etc directory, and we'll run sudo nano crontab. If you're unfamiliar with crontab, this is essentially a simple scheduling engine within Linux.
You'll see here that under example of job definitions, each asterisk or position relates to a time, with the first position being minute; the second, hour; the third, day of the month; the fourth, month; and the fifth, day of the week. By manipulating these, you could essentially schedule anything, any time you want. Notice that in all the examples, the pound sign or hashtag occurs before the line. This comments out that line so that it's not considered in this file. I've gone ahead and created a crontab entry that will run every minute.
Notice, I've done this by saying star slash one for one minute. This will run this task every minute of the day, every hour, every day of the month, every month, every day of the week. We also need to specify the user this will run as. In this case, just to make it simple, I've assigned the user root. The actual command is the Perl script mon-put-instance-data. We're running it from the aws-scripts-mon directory under the home directory, so the path ends in slash mon-put-instance-data, the Perl script. And then come the variables we want: --mem-util, --mem-used, and --mem-avail.
Admittedly, there are other ways to accomplish this task. You could write a script and then call the script from crontab, or there are other strategies and other software applications you can use to schedule a job. However, cron is included as part of Linux, it's a generally accepted scheduling engine, and it's quite simple to implement. After you've made your changes in here, make sure that you save when you exit.
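For readers who want to see the shape of the payload and the difference between a verify-style dry run and a real send, here is a minimal sketch. It is not the Perl script's implementation. The namespace System/Linux and the metric name MemoryUtilization follow what the monitoring scripts report; the functions build_datum and report, the instance ID, and the idea of passing in the client are illustrative assumptions. The client is expected to offer put_metric_data in the way a boto3 CloudWatch client does.

```python
from datetime import datetime, timezone

NAMESPACE = "System/Linux"


def build_datum(instance_id, name, value, unit="Percent", when=None):
    """Build one metric datum: name, dimensions, time stamp, value, unit."""
    return {
        "MetricName": name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Timestamp": when or datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }


def report(client, datum, verify=False):
    """Send the datum to CloudWatch unless verify is set.

    With verify=True nothing is sent and None is returned, which
    mimics a dry run.
    """
    if verify:
        return None
    return client.put_metric_data(Namespace=NAMESPACE, MetricData=[datum])


if __name__ == "__main__":
    # Hypothetical instance ID; a dry run needs no AWS credentials.
    datum = build_datum("i-0123456789abcdef0", "MemoryUtilization", 8.592)
    print(datum)
    print(report(client=None, datum=datum, verify=True))
```

To send for real, one option is to create the client with boto3.client("cloudwatch") on the instance and call report(client, datum). With the role attached, credentials are picked up from the instance without any credentials file.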
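The layout of a line in the system crontab (five time fields, then the user, then the command) can be sketched with a small parser. The example line below is an approximation of the entry described in the lesson: the /home/ec2-user path is an assumption, and parse_system_cron_line is a hypothetical helper rather than part of cron.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week", "user")


def parse_system_cron_line(line: str) -> dict:
    """Split an /etc/crontab style line into its named fields."""
    parts = line.split(None, 6)
    if len(parts) < 7:
        raise ValueError("expected five time fields, a user, and a command")
    entry = dict(zip(CRON_FIELDS, parts[:6]))
    entry["command"] = parts[6]
    return entry


if __name__ == "__main__":
    # Assumed install location of the monitoring scripts.
    line = (
        "*/1 * * * * root "
        "/home/ec2-user/aws-scripts-mon/mon-put-instance-data.pl "
        "--mem-util --mem-used --mem-avail"
    )
    for field, value in parse_system_cron_line(line).items():
        print(f"{field}: {value}")
```

Reading the output field by field is a quick way to confirm that the schedule, the user, and the command each landed in the position cron expects.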
I'm now back in the Amazon services console. Under management tools, let's select CloudWatch. To see the metrics that we've now sent to CloudWatch, on the left-hand side, click metrics. We notice that a custom namespace named Linux system is available with three metrics. We'll click that, click instance ID, and we will see that memory used, memory utilization, and memory available are now available. We can click one of these, and we'll see the trend graph beginning to form.
As you could see, we have a few points already available. We can also change the graphing option to number, either by selecting the tab, or we now have a convenient drop down option. The other thing that we can do is we can conveniently assign this new metric to a dashboard that we've created in an earlier lesson. All we have to do is click actions, add to dashboard, and then select the Cloud Academy dashboard I created earlier. We'll add this metric to the dashboard, and you'll see that we have the memory used metric now on our dashboard. Let's go ahead and click save dashboard, and let's repeat that process just for practice.
Under metrics, we have several metrics that we're now pushing from our instance to CloudWatch. I want to include memory utilization and memory available. With all three selected, we'll go to actions and click add to dashboard. We'll make sure that it's the correct dashboard, and click add to dashboard. We now have memory available, memory used, and memory utilization in a single block. Because the memory used block we made earlier is redundant, we'll delete it.
One thing I like to do is always have trend lines so that I can look back over time. The easiest way to do this for the block that we just created is simply to duplicate it. And then, we'll edit the block, click graph options, click line, and we'll update the widget.
At this point, you wanna remember to save your dashboard, but we still haven't answered anything about the metrics related to our disk. To get our disk metrics, we're going to simply add a line to crontab. So, we'll run sudo nano crontab, and what I'm going to do is add a line right underneath the previous line we have.
Now, I should point out, this actually isn't necessary. You could simply add the additional variables to the line above. I've only added an additional line here to keep it separate for illustrative purposes. The important thing about setting up disk monitoring is that you must specify the path of the disk. The variable --disk-path is not optional. If you leave it out, the script will not know which disk you want to monitor, and it will monitor none.
After that, the variables that are selected are disk space utilization, disk space used, and disk space available, which are the flags --disk-space-util, --disk-space-used, and --disk-space-avail. Once you've typed this in, you can save it, and we'll go back to the CloudWatch console. These are considered custom metrics, just like the memory metrics that we've already set up and added to our dashboard. So, you'll find them by going to the metrics page.
Once you're on the metrics page, click the all metrics tab, then click all, and under the custom namespace you should see six metrics under Linux system. That's because I've specified three memory metrics and three disk metrics, totalling six. If you only see three metrics, you may need to be patient, as it will not update until CloudWatch receives data from our instance about the new disk metrics. In my case, I did pause the video and wait for the metrics to show up, and it did take a couple of minutes. We'll click on Linux system, and then file system, instance ID, and mount path.
In here, we'll find that we have disk space used, disk space available, and disk space utilization. We'll select all three of these metrics and we'll see that we're using 1.01 gig, we have 6.65 available, and we're using 13 percent of the disk.
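The relationship between these three disk numbers can be checked with a short sketch. It assumes utilization is computed as used divided by used plus available, which may differ slightly from what the Perl script reports on file systems with reserved blocks. The functions utilization_percent and disk_metrics are hypothetical names, and the path check mirrors the point that a disk path must be given.

```python
import shutil


def utilization_percent(used: float, available: float) -> float:
    """Return used space as a percentage of used plus available."""
    total = used + available
    if total <= 0:
        raise ValueError("used plus available must be positive")
    return 100.0 * used / total


def disk_metrics(path: str) -> dict:
    """Collect used and available space (GiB) and utilization for a path."""
    if not path:
        raise ValueError("a disk path is required")
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        "DiskSpaceUsed": usage.used / gib,
        "DiskSpaceAvailable": usage.free / gib,
        "DiskSpaceUtilization": utilization_percent(usage.used, usage.free),
    }


if __name__ == "__main__":
    # Figures from the lesson: 1.01 GB used and 6.65 GB available.
    print(round(utilization_percent(1.01, 6.65), 1))
    print(disk_metrics("/"))
```

With the figures from the console, 1.01 divided by 7.66 is roughly 13 percent, which matches the disk space utilization shown.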
Let's now go to our dashboard. The dashboard I'm using is called Cloud Academy, and we're going to add a widget. I'm going to add a number widget, and I want to configure it. I'm going to select Linux system, file instance ID and mount path, and I will select disk space used, disk space available, and disk space utilization. All of those are things I want to know. I'm going to create the widget, and now I have a much more comprehensive view of this server.
I now am tracking my CPU utilization, I can see my disk read bytes, write bytes. I can see my utilization over time on a custom scale. I can now see my current memory utilization, and I'm tracking memory available versus the memory in use. I'm also able to track my current disk space available, disk space used, and disk space utilization.
Now that CloudWatch is tracking all of this information, we can come in and check on the server any time we want by simply launching the dashboard. We could also create custom permissions to allow a customer to enter the environment and check on the dashboard, potentially restricting those permissions so they can't make any changes to the environment, or only limited changes as makes sense in your production environment. You'll note that this process and using CloudWatch will save you from having to log in locally to the server to sift through log files, or look at htop or another Linux-based tool, to find out some of this information. In many environments, we've used this strategy to limit the number of people who have local access to our Linux servers, particularly with web servers.
The amount of information that we can extract into CloudWatch is essentially limitless. If there's a metric that we can track, we can push it to CloudWatch and track it within the dashboard, or we could even push custom log files, or any log file on the server, and we'll look at that shortly. You may want to use this strategy in any variety of production situations. It's beyond the scope of this course to figure out all of those situations, but it is something to consider as you take this information and apply it in your environment.
This was a lengthy lesson, and I encourage you to go back and review it several times. We provided an instance the ability to talk to CloudWatch by creating a specific IAM role that allowed it to communicate to the Amazon API and push data to CloudWatch. We also set up memory and disk monitoring on our EC2 instance. We also reported all those metrics to CloudWatch, and added them to our dashboard.
When you're ready, join me in the next lesson where we'll talk about sending log files to CloudWatch.
Network engineer and program analyst.