Monitoring AWS Infrastructure with CloudWatch

CloudWatch uses the mountains of data constantly being generated by your AWS resources to help you monitor and understand what’s really going on.

Closely monitoring your infrastructure is an integral part of any cloud deployment, and AWS CloudWatch offers a rich set of tools to help. The basic function of any monitoring tool is to collect and help you visualize data so you can take quick and effective action. We should give the same priority to application and server monitoring that we do to High Availability for our applications.

CloudWatch provides infrastructure-level monitoring and, to some extent, application monitoring. You can access CloudWatch from the AWS Console, through its APIs, or from the command line.

CloudWatch monitors metrics describing the behavior of core AWS services within your account. These metrics show you the state and performance of your AWS infrastructure. Every metric can be made to trigger an alarm, which sends notifications to specified end users through AWS’s Simple Notification Service (SNS).
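As a rough illustration of the metric-to-alarm-to-SNS chain described above, the sketch below builds the arguments for a CPU alarm whose action is an SNS topic. The instance ID, topic ARN, alarm name, and the 80% threshold are hypothetical placeholders, not values from this post. The dictionary keys follow the parameters of CloudWatch's PutMetricAlarm operation as exposed by boto3.

```python
def build_cpu_alarm(instance_id, topic_arn, threshold=80.0):
    """Return keyword arguments for a CloudWatch PutMetricAlarm request."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",      # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                # seconds; matches basic five-minute monitoring
        "EvaluationPeriods": 2,       # assumed: two consecutive breaching periods
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic that receives the notification
    }


# Placeholder identifiers for illustration only.
params = build_cpu_alarm(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",
)

# With boto3 installed and credentials configured, the alarm could be created with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the request as plain data makes it easy to review or test before anything is sent to AWS.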

CloudWatch offers two levels of monitoring: basic (no charge) and detailed. Basic metrics for some services are collected automatically, and seven pre-selected metrics are freely available for EC2 instances should you choose to enable them. Basic monitoring will generally provide checks every five minutes.

Detailed monitoring increases the checking frequency to once a minute and, at the time of writing, costs $3.50 per instance per month. The rest of this post will familiarize you with some of CloudWatch’s great features.
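To make the trade-off between the two levels concrete, here is a small arithmetic sketch. It uses only the figures quoted above (five-minute versus one-minute checks, $3.50 per instance per month). The function names are made up for this example, and the price should be checked against current AWS pricing before you rely on it.

```python
BASIC_PERIOD_SECONDS = 300      # basic monitoring: one check every five minutes
DETAILED_PERIOD_SECONDS = 60    # detailed monitoring: one check every minute
DETAILED_PRICE_PER_INSTANCE_MONTH = 3.50  # figure quoted in this post


def datapoints_per_hour(period_seconds):
    """Number of datapoints a metric produces per hour at the given period."""
    return 3600 // period_seconds


def detailed_monthly_cost(instance_count):
    """Monthly cost of enabling detailed monitoring on instance_count instances."""
    return instance_count * DETAILED_PRICE_PER_INSTANCE_MONTH


print(datapoints_per_hour(BASIC_PERIOD_SECONDS))     # 12
print(datapoints_per_hour(DETAILED_PERIOD_SECONDS))  # 60
print(detailed_monthly_cost(10))                     # 35.0
```

In other words, detailed monitoring gives five times as many datapoints per hour. With boto3, detailed monitoring can be switched on for existing instances through the EC2 client's `monitor_instances(InstanceIds=[...])` call.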


AWS resources covered by CloudWatch:

  • Amazon EC2
  • EBS volumes
  • Auto Scaling groups
  • Elastic Load Balancers
  • Amazon Route 53
  • RDS DB instances
  • DynamoDB tables
  • ElastiCache clusters
  • Redshift clusters
  • SQS queues
  • SNS topics
  • Storage Gateways

CloudWatch features:

CloudWatch – Auto Scaling integration

Auto Scaling lets you automatically scale your servers up and down according to need. You can scale based on schedule, demand, or server hardware utilization. CloudWatch metrics like CPU usage and network utilization can be used to trigger scaling events. For example, suppose your application is running on two instances: you can, say, require one instance to terminate whenever your CPU utilization drops below 60%.
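The scale-in example above can be modelled in a few lines. This is only a simplified sketch of the decision, not how the Auto Scaling service is implemented: in practice you attach scaling policies to CloudWatch alarms and AWS makes the change for you. The 60% scale-in threshold comes from the example; the 80% scale-out threshold and the group size limits are assumptions added for illustration.

```python
def desired_capacity(current, avg_cpu, scale_in_below=60.0, scale_out_above=80.0,
                     min_size=1, max_size=4):
    """Return the instance count a simple CPU-based policy would ask for."""
    if avg_cpu < scale_in_below:
        # Low utilization: remove one instance, but never go below the minimum.
        return max(min_size, current - 1)
    if avg_cpu > scale_out_above:
        # High utilization: add one instance, but never exceed the maximum.
        return min(max_size, current + 1)
    return current


print(desired_capacity(2, 45.0))  # 1: CPU below 60%, one of two instances is terminated
print(desired_capacity(2, 70.0))  # 2: within the band, no change
print(desired_capacity(2, 95.0))  # 3: CPU above the assumed scale-out threshold
```

The minimum size guards against the group scaling in to zero instances when load is low for a long time.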

Reboot failed EC2 instances

We’ve shown how CloudWatch can send you notifications using AWS’s SNS. It can also be told to automatically reboot a failed EC2 instance on a failed status check due to loss of network connectivity, system power, or other software/hardware issues.
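A reboot alarm of this kind might look like the sketch below. It assumes the alarm watches the EC2 `StatusCheckFailed_Instance` metric and uses the built-in EC2 reboot action ARN. The instance ID, region, alarm name, and the choice of three one-minute evaluation periods are illustrative assumptions.

```python
def build_reboot_alarm(instance_id, region):
    """Return PutMetricAlarm arguments that reboot an instance on failed status checks."""
    return {
        "AlarmName": f"reboot-on-failed-check-{instance_id}",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_Instance",  # reports 0 (passed) or 1 (failed)
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 3,   # assumed: three failed minutes in a row
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # EC2 action ARN: asks AWS to reboot the instance when the alarm fires.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:reboot"],
    }


reboot_params = build_reboot_alarm("i-0123456789abcdef0", "us-east-1")

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**reboot_params)
```

An SNS topic ARN can be added to `AlarmActions` alongside the reboot action if you also want to be notified.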

Integrate CloudWatch with third-party monitoring and logging tools

You can integrate CloudWatch with third-party monitoring tools like CopperEgg, Stackdriver, and New Relic. These tools provide very fine-grained performance monitoring, giving you a clear view of the status of your system processes. You can feed CloudWatch metrics to these services, which can then display them on a custom dashboard.

Create Custom Metrics

Besides the default CloudWatch metrics like CPU utilization, network traffic, and disk reads/writes, you may want to monitor more metrics, like memory utilization. You can easily define your own custom metrics. Once these metrics are available in CloudWatch, you can create alarms that trigger new actions.
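Publishing a custom metric comes down to sending CloudWatch a namespace plus one or more datapoints. The sketch below builds such a request for a memory utilization value. The `Custom/System` namespace, the metric name, and the instance ID are hypothetical choices for this example. The dictionary keys follow CloudWatch's PutMetricData operation as exposed by boto3.

```python
from datetime import datetime, timezone


def build_memory_datapoint(instance_id, percent_used, when=None):
    """Return keyword arguments for a CloudWatch PutMetricData request."""
    return {
        "Namespace": "Custom/System",  # hypothetical custom namespace
        "MetricData": [
            {
                "MetricName": "MemoryUtilization",
                "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                "Timestamp": when or datetime.now(timezone.utc),
                "Value": float(percent_used),
                "Unit": "Percent",
            }
        ],
    }


payload = build_memory_datapoint("i-0123456789abcdef0", 42.5)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(**payload)
```

Once datapoints arrive under the custom namespace, alarms can reference the metric in the same way as the built-in ones.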

AWS provides some additional monitoring scripts for adding custom metrics. When you install the scripts you can choose to report any desired combination of the following metrics:

  • Memory Utilization – Memory allocated by applications and the operating system, exclusive of caches and buffers, as a percentage.
  • Memory Used – Memory allocated by applications and the operating system, exclusive of caches and buffers, in megabytes.
  • Memory Available – System memory available for applications and the operating system, in megabytes.
  • Disk Space Utilization – Disk space usage as a percentage.
  • Disk Space Used – Disk space usage in gigabytes.
  • Disk Space Available – Available disk space in gigabytes.
  • Swap Space Utilization – Swap space usage as a percentage.
  • Swap Space Used – Swap space usage in megabytes.
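To show how the three memory figures in the list relate to each other, here is a minimal sketch that derives them from text in the Linux `/proc/meminfo` format. It is not the AWS monitoring scripts themselves. It assumes that "available" means free memory plus buffers and cache, which matches the "exclusive of caches and buffers" wording above, and the sample values are invented.

```python
SAMPLE_MEMINFO = """\
MemTotal:        4096000 kB
MemFree:         1024000 kB
Buffers:          204800 kB
Cached:           819200 kB
"""


def parse_meminfo(text):
    """Map each /proc/meminfo key to its value in kilobytes."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            values[key.strip()] = int(rest.split()[0])
    return values


def memory_metrics(meminfo_kb):
    """Compute utilization (%), used (MB) and available (MB), excluding buffers and cache."""
    total = meminfo_kb["MemTotal"]
    available = meminfo_kb["MemFree"] + meminfo_kb["Buffers"] + meminfo_kb["Cached"]
    used = total - available
    return {
        "MemoryUtilization": 100.0 * used / total,
        "MemoryUsed": used / 1024,            # kB to MB
        "MemoryAvailable": available / 1024,  # kB to MB
    }


metrics = memory_metrics(parse_meminfo(SAMPLE_MEMINFO))
print(metrics)
# {'MemoryUtilization': 50.0, 'MemoryUsed': 2000.0, 'MemoryAvailable': 2000.0}
```

On a real instance you would read `/proc/meminfo` instead of the sample string and publish the results as custom metrics.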

VPC Flow Logs

Flow Logs have been available on AWS for only a couple of months. You can tell Flow Logs to track all inbound and outbound traffic moving through selected interfaces attached to your VPC. VPC Flow Logs make it much easier to debug issues, such as why you are unable to reach a particular instance.

You can also create CloudWatch metrics and alarms tied to network Flow Logs.
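As an example of the debugging use case, the sketch below parses flow log records and picks out rejected traffic to one address. It assumes the default space-separated record format (version, account, interface, source and destination address and port, protocol, packets, bytes, start, end, action, log status). The sample records and addresses are made up for illustration.

```python
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)

SAMPLE_RECORDS = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]


def parse_record(line):
    """Split one default-format flow log record into a dict of string fields."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def rejected_to(records, dstaddr):
    """Return parsed records where traffic to dstaddr was rejected."""
    parsed = (parse_record(line) for line in records)
    return [r for r in parsed if r["action"] == "REJECT" and r["dstaddr"] == dstaddr]


blocked = rejected_to(SAMPLE_RECORDS, "172.31.9.12")
for record in blocked:
    print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"])
# 172.31.9.69 -> 172.31.9.12 port 3389
```

A rejected record like this usually points to a security group or network ACL rule blocking the port in question.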

Written by

My professional IT career began nine years ago, when I was just out of college. I worked with a great team as an infrastructure management engineer, managing hundreds of enterprise application servers. I found my passion when I got the opportunity to work with Cloud technologies: I'm addicted to AWS Cloud Services, DevOps engineering, and all the cloud tools and technologies that make engineers' lives easier. Currently, I am working as a Solution Architect at SixNines IT. We are an experienced team of engineers who have helped hundreds of customers move to the cloud responsibly. I hold five AWS certifications and happily help fellow engineers across the globe through my blogs and by answering questions in various forums.
