CloudWatch uses the mountains of data constantly being generated by your AWS resources to help you monitor and understand what’s really going on.
Closely monitoring your infrastructure is an integral part of any cloud deployment, and AWS CloudWatch offers a rich set of tools to help. The basic function of any monitoring tool is to collect and help you visualize data so you can take quick and effective action. We should give the same priority to application and server monitoring that we do to High Availability for our applications.
CloudWatch provides infrastructure level monitoring and, to some extent, application monitoring. You can access CloudWatch either from the AWS Console or through API and the command line.
CloudWatch monitors metrics describing the behavior of core AWS services within your account. These metrics give you the state of your AWS infrastructure and performance. Every Metric can be made to trigger an alarm, which sends notifications to specified end users through AWS’s Simple Notifications Service (SNS).
Cloudwatch offers two levels of monitoring: basic (no charge) and detailed. Basic metrics for some services are automatically applied, and seven pre-selected metrics are freely available for EC2 instances should you choose to enable them. Basic monitoring will generally provide checks every five minutes.
Detailed monitoring offers increased checking at a frequency of every minute, and costs $3.50 per instance per month. We will familiarize you with some of CloudWatch’s great features.
AWS resources covered by CloudWatch:
- Amazon Ec2
- EBS Volumes
- AutoScaling Groups
- Elastic Load Balancers
- Amazon Route 53
- RDS DB instances
- DynamoDB tables
- ElastiCache clusters
- RedShift clusters
- SQS queues
- SNS topics
- Storage Gateways
CloudWatch – Auto Scaling integration
Auto Scaling lets you automatically scale your servers up and down according to need. You can scale based on schedule, demand, or server hardware utilization. CloudWatch metrics like CPU usage and network utilization can be used to trigger scaling events. For example, suppose your application is running on two instances: you can, say, require one instance to terminate whenever your CPU utilization drops below 60%.
Reboot failed EC2 instances
We’ve shown how CloudWatch can send you notifications using AWS’s SNS. It can also be told to automatically reboot a failed EC2 instance on a failed status check due to loss of network connectivity, system power, or other software/hardware issues.
Integrate CloudWatch with third-party monitoring and logging tools
You can integrate CloudWatch with third-party monitoring tools like Copperegg, stackdriver, and New Relic. These third-party monitoring tools provide very fine performance monitoring, giving you a clear view of the status of your system processes. You can feed CloudWatch metrics to these services when can then be displayed on a custom dashboard.
Create Custom Metrics
Besides the default CloudWatch metrics like CPU Utilization, Network traffic, and disk read/writes, you may want to monitor more metrics, like memory utilization. You can easily define your own custom metrics. Once these metrics are available in CloudWatch, you can create alarms that trigger new actions.
AWS provides some additional monitoring scripts for adding custom metrics. When you install the scripts you can choose to report any desired combination of the following metrics:
- Memory Utilization – Memory allocated by applications and the operating system, exclusive of caches and buffers, in percentages.
- Memory Used – Memory allocated by applications and the operating system, exclusive of caches and buffers, in megabytes.
- Memory Available – System memory available for applications and the operating system, in megabytes.
- Disk Space Utilization – Disk space usage as percentages.
- Disk Space Used – Disk space usage in gigabytes.
- Disk Space Available – Available disk space in gigabytes.
- Swap Space Utilization – Swap space usage as a percentage.
- Swap Space Used – Swap space usage in megabytes.
VPC Flow Logs
Flow Logs have been available on AWS for only a couple of months. You can tell Flow Logs to track all inbound and outbound traffic moving through selected interfaces attached to your VPC. VPC flow Logs make it much easier to debug issues, like why you are not able to reach a particular instance.
You can also create CloudWatch metrics and alarms tied to network Flow Logs.
If you’re interested to learn more about Amazon CloudWatch, the Cloud Academy’s Getting Started to CloudWatch Course is your go-to course. Watch this short video taken from the course.
New Content: AWS Terraform, Java Programming Lab Challenges, Azure DP-900 & DP-300 Certification Exam Prep, Plus Plenty More Amazon, Google, Microsoft, and Big Data Courses
This month our Content Team continues building the catalog of courses for everyone learning about AWS, GCP, and Microsoft Azure. In addition, this month’s updates include several Java programming lab challenges and a couple of courses on big data. In total, we released five new learning...
Where Should You Be Focusing Your AWS Security Efforts?
Another day, another re:Invent session! This time I listened to Stephen Schmidt’s session, “AWS Security: Where we've been, where we're going.” Amongst covering the highlights of AWS security during 2020, a number of newly added AWS features/services were discussed, including: AWS Audit...
AWS re:Invent: 2020 Keynote Top Highlights and More
We’ve gotten through the first five days of the special all-virtual 2020 edition of AWS re:Invent. It’s always a really exciting time for practitioners in the field to see what features and services AWS has cooked up for the year ahead. This year’s conference is a marathon and not a...
WARNING: Great Cloud Content Ahead
At Cloud Academy, content is at the heart of what we do. We work with the world’s leading cloud and operations teams to develop video courses and learning paths that accelerate teams and drive digital transformation. First and foremost, we listen to our customers’ needs and we stay ahea...
Excelling in AWS, Azure, and Beyond – How Danut Prisacaru Prepares for the Future
Meet Danut Prisacaru. Danut has been a Software Architect for the past 10 years and has been involved in Software Engineering for 30 years. He’s passionate about software and learning, and jokes that coding is basically the only thing he can do well (!). We think his enthusiasm shines t...
New Content: AWS Data Analytics – Specialty Certification, Azure AI-900 Certification, Plus New Learning Paths, Courses, Labs, and More
This month our Content Team released two big certification Learning Paths: the AWS Certified Data Analytics - Speciality, and the Azure AI Fundamentals AI-900. In total, we released four new Learning Paths, 16 courses, 24 assessments, and 11 labs. New content on Cloud Academy At any ...
New Content: Azure DP-100 Certification, Alibaba Cloud Certified Associate Prep, 13 Security Labs, and Much More
This past month our Content Team served up a heaping spoonful of new and updated content. Not only did our experts release the brand new Azure DP-100 Certification Learning Path, but they also created 18 new hands-on labs — and so much more! New content on Cloud Academy At any time, y...
AWS Certification Practice Exam: What to Expect from Test Questions
If you’re building applications on the AWS cloud or looking to get started in cloud computing, certification is a way to build deep knowledge in key services unique to the AWS platform. AWS currently offers 12 certifications that cover major cloud roles including Solutions Architect, De...
Overcoming Unprecedented Business Challenges with AWS
From auto-scaling applications with high availability to video conferencing that’s used by everyone, every day — cloud technology has never been more popular or in-demand. But what does this mean for experienced cloud professionals and the challenges they face as they carve out a new p...
Constant Content: Cloud Academy’s Q3 2020 Roadmap
Hello — Andy Larkin here, VP of Content at Cloud Academy. I am pleased to release our roadmap for the next three months of 2020 — August through October. Let me walk you through the content we have planned for you and how this content can help you gain skills, get certified, and...
New Content: Alibaba, Azure AZ-303 and AZ-304, Site Reliability Engineering (SRE) Foundation, Python 3 Programming, 16 Hands-on Labs, and Much More
This month our Content Team did an amazing job at publishing and updating a ton of new content. Not only did our experts release the brand new AZ-303 and AZ-304 Certification Learning Paths, but they also created 16 new hands-on labs — and so much more! New content on Cloud Academy At...
Blog Digest: Which Certifications Should I Get?, The 12 Microsoft Azure Certifications, 6 Ways to Prevent a Data Breach, and More
This month, we were excited to announce that Cloud Academy was recognized in the G2 Summer 2020 reports! These reports highlight the top-rated solutions in the industry, as chosen by the source that matters most: customers. We're grateful to have been nominated as a High Performer in se...