Monitoring and debugging
This course enables you to identify and implement best practices for monitoring and debugging in AWS, and to understand the core AWS services, uses, and basic architecture best practices for deploying apps on AWS.
The first course shows you how to use Amazon CloudWatch to monitor and troubleshoot environments and applications.
The second course reviews some of the AWS sample questions, to help us work through question scenarios and prepare for sitting the Certified Developer exam.
If you have thoughts or suggestions for this course, please contact Cloud Academy at firstname.lastname@example.org.
In this lecture, we'll explore how we can monitor the status and health of AWS resources using CloudWatch. When we use CloudWatch, we can monitor how the EC2 compute instances that are running our apps are behaving. We can also monitor and track other AWS resources that are active in our accounts, and if there's a particular monitoring metric that you're after and it doesn't exist, you can create a custom metric that suits your requirements. CloudWatch is also useful for both monitoring and storing the logs generated by your AWS resources. You can use CloudWatch to create alerts that automatically notify you of events that might concern you, through email, SMS, or another notification channel. CloudWatch will, by default, present its data to you through really well organized graphs.
There are many ways to access CloudWatch metrics. You can go here, for example, in the EC2 console, and check what CloudWatch has provided in relation to specific instances. Now, for the purpose of this lecture, I launched the SharePoint infrastructure with a CloudFormation template that's been provided by AWS. If we select an instance and then click on its Monitoring tab, we will be able to see its monitoring data. We can see all the standard things that are automatically monitored by AWS, and notice that there aren't all that many categories currently being monitored. Things like memory utilization and free disk space are not monitored by default.
Now, we call each of these data categories metrics. There are three status check metrics. Status Check Failed (Instance) monitors the instance itself, to see if it's working as it should and if it can be reached through the network. Status Check Failed (System) is related to the underlying AWS system. If, for example, a physical server or Availability Zone goes down, then this count would change. Status Check Failed (Any) is related to both. If either status check fails, it will also show up here. You can see more details by clicking on an individual metric.
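The three status check metrics described above can also be read outside the console. Here is a minimal sketch, assuming Python with the boto3 SDK: the helper names `status_check_request` and `failed_checks` are made up for illustration, and the `cloudwatch` argument is assumed to be a client created with `boto3.client("cloudwatch")`. The metric names and the `AWS/EC2` namespace are the ones CloudWatch uses for EC2 status checks.

```python
from datetime import datetime, timedelta, timezone

# The three EC2 status check metrics published in the AWS/EC2 namespace.
STATUS_CHECK_METRICS = (
    "StatusCheckFailed_Instance",  # problems with the instance itself
    "StatusCheckFailed_System",    # problems with the underlying AWS system
    "StatusCheckFailed",           # either of the two checks failed
)


def status_check_request(instance_id, metric_name, end_time, hours=1, period=300):
    """Build keyword arguments for a get_metric_statistics call (hypothetical helper)."""
    return {
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": period,  # aggregation window in seconds
        "Statistics": ["Maximum"],
    }


def failed_checks(cloudwatch, instance_id, end_time=None):
    """Return {metric_name: bool}, True if that check failed within the window."""
    end_time = end_time or datetime.now(timezone.utc)
    result = {}
    for name in STATUS_CHECK_METRICS:
        response = cloudwatch.get_metric_statistics(
            **status_check_request(instance_id, name, end_time)
        )
        # A value above zero in any datapoint means the check failed at least once.
        result[name] = any(dp["Maximum"] > 0 for dp in response["Datapoints"])
    return result
```

With boto3 installed and credentials configured, calling `failed_checks(boto3.client("cloudwatch"), instance_id)` with one of your own instance IDs should return one True or False flag per status check.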
By default, CloudWatch monitors instances in five-minute intervals; however, you can change this to one-minute intervals by enabling Detailed Monitoring. Let's go to the CloudWatch console to see what else we can do. CloudWatch monitors many AWS services by default. We can check all of the metrics for this region here. So, for example, these are the metrics for our S3 bucket. For each AWS service, there will be defined metrics. We can select them and have them visualized with a graph showing the values of the metric over a period of time that we select. We can also see the values at specific times by hovering the mouse over the graph line. Now, all metrics are organized under categories called namespaces and, within those, by Dimensions. Namespaces usually have names that include the name of the service being monitored, such as AWS/EC2.
CloudWatch can also be used to monitor anything specific that we want. We can send custom metrics to it via the command line interface or through an SDK. These metrics will appear here, organized by the namespace and Dimensions that we set. We will be able to interact with them in the same way that we do with the standard metrics, with the same graphing capabilities. CloudWatch also stores logs, and this can be very useful for collating logs from many instances into a single place, or for archiving logs after an instance is deleted.
CloudWatch can also be used for creating alarms. When you create an alarm, it monitors a given metric. When that metric crosses a defined threshold, the state of the alarm changes to ALARM. Once the metric returns to normal, the state automatically changes back to OK. The coolest part is that you can specify actions to be performed on these changes of state. So let me create an alarm using the EC2 console, to show you that I can do it from more than one place. Select the instance and go to Monitoring. Select the metric and configure the threshold.
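Going back to custom metrics for a moment, here is a minimal sketch of sending one through an SDK, assuming Python and boto3. The namespace `Custom/Demo`, the metric name `MemoryUtilization`, and the helper names are hypothetical choices for illustration; `put_metric_data` is the boto3 CloudWatch client call, and the `cloudwatch` argument is assumed to come from `boto3.client("cloudwatch")`.

```python
def memory_datum(instance_id, used_percent, timestamp=None):
    """Build one MetricData entry for a custom memory metric (hypothetical name)."""
    datum = {
        "MetricName": "MemoryUtilization",  # assumption: any name you choose
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Value": float(used_percent),
        "Unit": "Percent",
    }
    if timestamp is not None:
        # If omitted, CloudWatch uses the time the data point is received.
        datum["Timestamp"] = timestamp
    return datum


def publish_memory(cloudwatch, instance_id, used_percent, namespace="Custom/Demo"):
    """Send a single memory-utilization data point to CloudWatch."""
    if not 0 <= used_percent <= 100:
        raise ValueError("used_percent must be between 0 and 100")
    cloudwatch.put_metric_data(
        Namespace=namespace,  # custom namespaces must not start with "AWS/"
        MetricData=[memory_datum(instance_id, used_percent)],
    )
```

The Dimensions are what let the console group the data per instance, which is why the sketch sets `InstanceId` even though the value is measured on the instance itself. How the memory percentage is collected is left out here, since that depends on the operating system.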
We can also set actions like Send a Notification or, in the case of EC2 alarms, we can trigger actions on the EC2 instance itself. We can recover, stop, terminate, or reboot the instance, and this can be helpful for ensuring high availability, and also for scripting and cost-saving purposes. Now, let's check it out on the CloudWatch console, where we can see our alarm and customize it even more. To make a change, select the alarm and click Modify. Here we have a much more complete set of actions. We can send more than one notification, for example, or if we are using Auto Scaling, we can add some scaling actions, or we can just update the threshold for the alarm.
Another useful feature is the ability to check the alarm state history. We just have to select the alarm and go to the History tab. Here we can see the latest 50 state changes of the alarm, and also the reasons behind each state change. Very useful.
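The same alarm setup can be scripted. What follows is a rough sketch, assuming Python and boto3, of an alarm that sends a notification and stops an idle instance, plus a helper that reads the alarm's state history. The alarm name, the thresholds, and the helper names are hypothetical, and the SNS topic ARN is something you would supply yourself. `put_metric_alarm` and `describe_alarm_history` are boto3 CloudWatch client calls, and the `cloudwatch` argument is assumed to come from `boto3.client("cloudwatch")`.

```python
# EC2 actions that an alarm can trigger on the instance it watches.
EC2_ACTIONS = ("recover", "stop", "terminate", "reboot")


def ec2_action_arn(region, action):
    """Return the ARN CloudWatch uses for a built-in EC2 alarm action."""
    if action not in EC2_ACTIONS:
        raise ValueError(f"unsupported EC2 action: {action}")
    return f"arn:aws:automate:{region}:ec2:{action}"


def idle_instance_alarm(instance_id, region, topic_arn, threshold=5.0):
    """Build put_metric_alarm arguments for a cost-saving alarm (illustrative values)."""
    return {
        "AlarmName": f"idle-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,           # seconds per evaluation period
        "EvaluationPeriods": 3,  # threshold must be breached three times in a row
        "Threshold": threshold,
        "ComparisonOperator": "LessThanThreshold",
        # On ALARM: notify the SNS topic, then stop the instance.
        "AlarmActions": [topic_arn, ec2_action_arn(region, "stop")],
    }


def create_alarm(cloudwatch, request):
    """Create or update the alarm described by request."""
    cloudwatch.put_metric_alarm(**request)


def recent_state_changes(cloudwatch, alarm_name, limit=50):
    """Return (timestamp, summary) pairs for the alarm's latest state changes."""
    response = cloudwatch.describe_alarm_history(
        AlarmName=alarm_name,
        HistoryItemType="StateUpdate",
        MaxRecords=limit,
    )
    return [
        (item["Timestamp"], item["HistorySummary"])
        for item in response["AlarmHistoryItems"]
    ]
```

Swapping "stop" for "recover" would be the high-availability variant, although the recover action is meant to be paired with the system status check metric rather than CPU utilization.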
About the Author
Head of Content
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession" as everything AWS starts with the customer. Passions around work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start ups.