The services within the AWS Management Fundamentals course focus on maintaining and monitoring AWS applications and systems, to ensure they are compliant, properly configured, operating at required utilization thresholds, and protected from any potential external threats.
This course covers a range of different services, including:
- Describe the basic functions that each service in this course performs within a cloud solution
- Recognize basic components and features of each AWS management service in this course
- Understand the role each service plays to maintain a properly operating application on AWS
This course is designed for:
- Anyone preparing for the AWS Certified Cloud Practitioner exam
- Managers, sales professionals, and other non-technical roles
Before taking this course, you should have a general understanding of basic cloud computing concepts. If you are familiar with common compliance requirements for IT systems, this will also help.
If you have thoughts or suggestions for this course, please contact Cloud Academy at email@example.com.
Hello, and welcome to this lecture, where I shall provide an overview of the Amazon CloudWatch service.
The primary function of Amazon CloudWatch is to provide a means of monitoring the resources you are running within AWS, via a series of metrics that are specific to each service you use. This allows you to react quickly to events, and to diagnose and dynamically address any availability or scalability issues that you might be experiencing.
Each service and resource sends data to CloudWatch as metrics, which you can view on your CloudWatch dashboard. These metrics vary depending on the service. For example, when monitoring EC2 you could have metrics such as CPUUtilization and NetworkIn, whereas for the S3 service your metrics could be BucketSizeBytes and NumberOfObjects. The metrics differ because each service is used differently, and as such reports different values. Amazon CloudWatch also lets you create custom metrics for your applications if you need to measure specific components of your infrastructure.
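As a rough illustration of the custom metrics mentioned above, the sketch below builds a request for CloudWatch's PutMetricData API and, separately, sends it with the boto3 SDK. The namespace "MyApp/Orders", the metric name "PendingOrders", and the "Environment" dimension are hypothetical names chosen for this example, and the `publish` step assumes boto3 is installed and AWS credentials are configured.

```python
from datetime import datetime, timezone


def build_custom_metric(namespace, name, value, unit="Count", dimensions=None):
    """Build the keyword arguments for a CloudWatch PutMetricData request."""
    datum = {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": datetime.now(timezone.utc),
    }
    if dimensions:
        # Dimensions are name/value pairs that identify what was measured.
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in dimensions.items()
        ]
    return {"Namespace": namespace, "MetricData": [datum]}


def publish(request):
    """Send the request to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("cloudwatch").put_metric_data(**request)


# Hypothetical namespace, metric name, and dimension, for illustration only.
request = build_custom_metric(
    "MyApp/Orders", "PendingOrders", 42, dimensions={"Environment": "prod"}
)
```

Calling `publish(request)` would then record a single data point for the custom metric.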
CloudWatch offers you two modes of recording your metric data: basic monitoring and detailed monitoring. Basic monitoring is the default monitoring type when configuring Amazon CloudWatch, and it records metrics every five minutes. However, if you want a more precise timeframe for your monitoring, then you would use detailed monitoring.
Detailed monitoring for EC2 instances records metric data at one-minute intervals, as opposed to five minutes with basic monitoring. However, detailed monitoring comes at an additional cost. It's important to remember that any data captured by Amazon CloudWatch is retained even if your AWS resources have been terminated. Metric data was originally kept for two weeks; CloudWatch now retains it for up to 15 months, at coarser granularity as the data ages.
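To make the difference between the two modes concrete, here is a small sketch that compares how many data points each interval produces per day for a single metric. The `enable_detailed_monitoring` helper is an optional extra that assumes boto3 and AWS credentials are available; the instance ID you pass to it would be your own.

```python
BASIC_PERIOD_SECONDS = 300    # basic monitoring: one data point every five minutes
DETAILED_PERIOD_SECONDS = 60  # detailed monitoring: one data point every minute


def datapoints_per_day(period_seconds):
    """Number of data points one metric produces in 24 hours."""
    return 24 * 60 * 60 // period_seconds


def enable_detailed_monitoring(instance_id):
    """Switch an EC2 instance to detailed monitoring (needs boto3 and credentials)."""
    import boto3  # imported lazily so the arithmetic above works without it

    boto3.client("ec2").monitor_instances(InstanceIds=[instance_id])


basic = datapoints_per_day(BASIC_PERIOD_SECONDS)        # 288
detailed = datapoints_per_day(DETAILED_PERIOD_SECONDS)  # 1440
```

Detailed monitoring therefore gives five times as many data points per metric, which is where both the extra precision and the extra cost come from.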
It's great that CloudWatch is constantly monitoring the environment, but you also need to create alarms to respond to events that occur within your environment and across your resources. You should think of alarms as predefined thresholds. Let's take a look at an example of how you could use these alarms. Let's say we have an application server, and when this server reaches 90% CPU utilization, a sub-process in our customer application stops responding. As CloudWatch already tracks CPU utilization, we can use this information by creating an alarm to notify us of the impending disaster.
As a result, we could configure an alarm to trigger at 75% CPU utilization, where the response to the triggered alarm would be to launch another server to even out the load. This alarm could have an Auto Scaling action assigned to it to launch this additional server automatically for you. Or the alarm could be configured to send you a message via the Simple Notification Service (SNS) when it is triggered.
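The 75% CPU alarm described above could be sketched with boto3 roughly as follows. This is one possible configuration rather than a recommended one: the instance ID, the SNS topic ARN, the alarm name, and the choice of a single five-minute evaluation period are all assumptions made for illustration.

```python
def build_cpu_alarm(instance_id, threshold_percent, action_arns):
    """Build the keyword arguments for a CloudWatch PutMetricAlarm request."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,           # evaluate five-minute averages
        "EvaluationPeriods": 1,  # one breaching period is enough to alarm
        "Threshold": float(threshold_percent),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": list(action_arns),  # e.g. an SNS topic to notify
    }


def create_alarm(alarm):
    """Create or update the alarm (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without it

    boto3.client("cloudwatch").put_metric_alarm(**alarm)


# Placeholder instance ID and SNS topic ARN, for illustration only.
alarm = build_cpu_alarm(
    "i-0123456789abcdef0",
    75,
    ["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
```

Passing the ARN of a scaling policy in `action_arns` instead of an SNS topic is how the same alarm would drive an Auto Scaling action.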
An alarm has three possible states, the first one being 'OK'. This simply means that the metric associated with the alarm is within the predefined threshold. In our example on the previous slide, a CPU usage of 50% would have an 'OK' state, as it has not reached the 75% threshold limit.
'Alarm': this status means that the metric is outside of the threshold level, and the alarm is activated. Therefore, in our example, a CPU usage of 75% would trigger the alarm.
And, finally, 'insufficient data'. This indicates that the alarm does not yet have enough metric data to determine its state. This is usually the case when the alarm has just been configured, and you are waiting for metric data to be sent to CloudWatch.
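The three states can be modelled with a deliberately simplified sketch. It looks only at the most recent data point, whereas CloudWatch itself evaluates a statistic over one or more periods; the `alarm_state` function is a hypothetical helper, not part of any AWS SDK. The returned strings match the state values the CloudWatch API uses.

```python
def alarm_state(datapoints, threshold):
    """Return the alarm state for a list of metric values (oldest first).

    Simplified model: only the latest data point is compared with the threshold.
    """
    if not datapoints:
        # Nothing has been reported yet, e.g. the alarm was just created.
        return "INSUFFICIENT_DATA"
    latest = datapoints[-1]
    return "ALARM" if latest >= threshold else "OK"


print(alarm_state([], 75))        # INSUFFICIENT_DATA
print(alarm_state([50], 75))      # OK
print(alarm_state([50, 75], 75))  # ALARM
```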
CloudWatch also provides a repository for logging. This is a very effective way to capture all of the logs across your application and web servers. For example, let's say you manage a popular WordPress site with 12 front-end web servers. You then experience an unexpected event which requires you to view the logs. Wouldn't it be nice to simply go to a single place to look at the system and log data of all your servers? With CloudWatch logging you can direct all of your services and applications to send their logs to CloudWatch, allowing you to review them from a single place. You can then also export the logs, and use your favorite third-party tool to perform additional detailed analysis.
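As a sketch of what reviewing logs from a single place might look like in code, the example below searches one log group for recent error messages using the CloudWatch Logs FilterLogEvents API via boto3. It assumes all 12 web servers write to one log group, here given the hypothetical name "wordpress/frontend", each as its own log stream. Pagination of results is left out for brevity.

```python
from datetime import datetime, timedelta, timezone


def build_log_search(log_group, pattern, since_minutes, now=None):
    """Build the keyword arguments for a CloudWatch Logs FilterLogEvents request."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(minutes=since_minutes)
    return {
        "logGroupName": log_group,
        "filterPattern": pattern,
        # The API expects timestamps in milliseconds since the Unix epoch.
        "startTime": int(start.timestamp() * 1000),
        "endTime": int(now.timestamp() * 1000),
    }


def search(request):
    """Run the search (needs boto3 and AWS credentials); first page of results only."""
    import boto3  # imported lazily so the builder above works without it

    response = boto3.client("logs").filter_log_events(**request)
    # Each event records which log stream (here, which server) it came from.
    return [(event["logStreamName"], event["message"]) for event in response["events"]]


# Hypothetical log group name, for illustration only.
request = build_log_search("wordpress/frontend", "ERROR", since_minutes=30)
```

Calling `search(request)` would return matching lines from every server's stream in that group, paired with the stream they came from.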
By utilizing Amazon CloudWatch and its available features, you can ensure the following points:
- Your customers are receiving a good end-user experience
- You understand how changes are impacting the overall performance of your environment
- You are scaling your environment efficiently
- You can perform effective root cause analysis of service interruptions, and identify how to resolve future problems
That now brings me to the end of this short lecture providing an overview of Amazon CloudWatch.
About the Author
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 80+ courses relating to Cloud reaching over 100,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.