Instructor: Carlos Rivas
Overview of the AWS Health Dashboard
The Health Dashboard is divided into two main sections:
- Events that affect everyone (top left), and
- Events that affect your account’s resources, right below that.
Let’s go over each one.
Service Health
First, you have Open and recent issues. This is where you can see current issues happening on the AWS platform. More often than not, this option will show as disabled because there's nothing of interest going on.
Service history, on the other hand, shows a historical view of issues. This is really helpful if something happened over a weekend or holiday and you want details about which services and regions were affected.
Let’s look at an example of a possible outage:
Each one of these tickets will have a header showing the latest status, in this case "Resolved", and a short description of the issue, in this case "Increased API Error rates."
You want to pay close attention to the affected services. In this case it's a list of 20 services, and chances are that if you were using one of them in the US-EAST-1 region when this event happened, your application would have experienced similar problems.
Information like this is useful for shortening troubleshooting times. It also helps you decide whether a multi-region solution is worthwhile if an issue like this, in this particular AWS region, has a significant impact on your business.
Your account health
If we switch over to Your account health, this is where the Health Dashboard becomes really useful, because it correlates global AWS issues with the resources and services that you are currently using. This way, you can see if there's any impact on your business.
For example, let’s say you are running an EC2 instance and it’s been running nonstop for 12 months…
You may go here under the Scheduled events tab (or you may get an email from AWS) and see something like this:
Essentially, this means that the physical hardware running your EC2 server may need to be taken down for repairs, upgrades, or simply maintenance. The solution is simple, by the way: stop your EC2 instance and then start it again (a full stop and start, not just a reboot), and it will come online on a different physical computer. This allows AWS to perform the maintenance without further interruption to you or any other customers.
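If you would rather check for these scheduled events from a script than from the console, here is a minimal sketch of the idea. It assumes you already have a response shaped like the one returned by the EC2 DescribeInstanceStatus API (for example, via boto3's describe_instance_status), and it only reads the InstanceStatuses, Events, Code and NotBefore fields. The function name and the sample data are hypothetical and only here for illustration.

```python
from datetime import datetime, timezone


def pending_scheduled_events(response):
    """Return (instance_id, event_code, not_before) for every scheduled event.

    `response` is assumed to look like a DescribeInstanceStatus response.
    """
    found = []
    for status in response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            found.append(
                (status["InstanceId"], event["Code"], event.get("NotBefore"))
            )
    return found


# Hypothetical sample data, not real output from an AWS account.
sample_response = {
    "InstanceStatuses": [
        {
            "InstanceId": "i-0123456789abcdef0",
            "Events": [
                {
                    "Code": "instance-retirement",
                    "Description": "Sample description",
                    "NotBefore": datetime(2030, 1, 15, tzinfo=timezone.utc),
                }
            ],
        },
        # An instance with nothing scheduled.
        {"InstanceId": "i-0fedcba9876543210", "Events": []},
    ]
}

for instance_id, code, not_before in pending_scheduled_events(sample_response):
    print(f"{instance_id}: {code} scheduled, not before {not_before}")
```

Any instance that shows up in this list is a candidate for the stop and start described above, ideally before the NotBefore date.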
Integration with EventBridge
It's totally understandable if you don't want to manually visit a web page to find out whether an outage is affecting your AWS infrastructure. For this, there's a solution: EventBridge.
EventBridge can be used to monitor and react to AWS Health Dashboard events, and take certain actions including:
- Sending a notification to the Ops team
- Identifying affected resources, and
- Executing custom Lambda functions to perform pretty much any task, such as creating a Zendesk or Jira ticket related to an AWS scheduled maintenance event.
We will look at this in more detail later, but here's a pattern that catches events related to account notifications, scheduled changes, or issues sent to your account via the Health Dashboard.
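To give you an idea of what such a Lambda function might look like, here is a rough sketch rather than a finished integration. It assumes the incoming event follows the usual AWS Health event layout (a detail object with service, eventTypeCode, eventTypeCategory, eventDescription and affectedEntities). The build_ticket helper and the ticket fields are hypothetical, and the actual call to Zendesk or Jira is deliberately left out.

```python
def build_ticket(event):
    """Turn an AWS Health event into a simple ticket-like dictionary."""
    detail = event.get("detail", {})

    # eventDescription is assumed to be a list of localized descriptions.
    descriptions = detail.get("eventDescription", [])
    summary = descriptions[0].get("latestDescription", "") if descriptions else ""

    # Collect the identifiers of the affected resources, if any are listed.
    entities = [e.get("entityValue") for e in detail.get("affectedEntities", [])]

    service = detail.get("service", "UNKNOWN")
    type_code = detail.get("eventTypeCode", "AWS Health event")
    return {
        "title": f"[{service}] {type_code}",
        "category": detail.get("eventTypeCategory"),
        "region": event.get("region"),
        "affected_resources": entities,
        "body": summary,
    }


def lambda_handler(event, context):
    """Lambda entry point: build the ticket and hand it to your own tooling."""
    ticket = build_ticket(event)
    # Placeholder: this is where a ticketing or notification call would go.
    print(ticket)
    return ticket


# Hypothetical sample event for a local dry run.
sample_event = {
    "source": "aws.health",
    "detail-type": "AWS Health Event",
    "region": "us-east-1",
    "detail": {
        "service": "EC2",
        "eventTypeCode": "AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED",
        "eventTypeCategory": "scheduledChange",
        "eventDescription": [
            {"language": "en_US", "latestDescription": "Sample description"}
        ],
        "affectedEntities": [{"entityValue": "i-0123456789abcdef0"}],
    },
}

ticket = lambda_handler(sample_event, None)
```

Keeping the ticket-building logic in its own function means you can test it locally with a sample event before wiring it up to EventBridge.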
{
  "detail": {
    "eventTypeCategory": [
      "issue",
      "accountNotification",
      "scheduledChange"
    ],
    "service": [
      "AUTOSCALING",
      "VPC",
      "EC2"
    ]
  },
  "detail-type": [
    "AWS Health Event"
  ],
  "source": [
    "aws.health"
  ]
}
With this pattern in EventBridge, you can quickly react to potential issues without human intervention and notify the right folks so they can decide what to do. Also note the service filter here, which includes AUTOSCALING, EC2 and VPC. This is important because if you are not using Amazon S3, for example, you don't want to send out alerts for a service that won't impact you directly.
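To see how the service filter plays out, here is a small sketch that mimics the matching logic locally. This is a simplified stand-in written for illustration, not EventBridge's actual implementation: it only handles exact-value lists and nested objects like the ones in the pattern above, and the matches function and both sample events are hypothetical.

```python
PATTERN = {
    "detail": {
        "eventTypeCategory": ["issue", "accountNotification", "scheduledChange"],
        "service": ["AUTOSCALING", "VPC", "EC2"],
    },
    "detail-type": ["AWS Health Event"],
    "source": ["aws.health"],
}


def matches(pattern, event):
    """Return True if every field in the pattern matches the event.

    Simplified: a list in the pattern means "any of these exact values",
    and a dict means "apply the same check to the nested object".
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif isinstance(actual, list):
            # An array in the event matches if any element is listed.
            if not any(item in expected for item in actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical sample events: one for EC2 and one for S3.
ec2_event = {
    "source": "aws.health",
    "detail-type": "AWS Health Event",
    "detail": {"service": "EC2", "eventTypeCategory": "scheduledChange"},
}
s3_event = {
    "source": "aws.health",
    "detail-type": "AWS Health Event",
    "detail": {"service": "S3", "eventTypeCategory": "issue"},
}

print(matches(PATTERN, ec2_event))
print(matches(PATTERN, s3_event))
```

The EC2 event passes every filter, while the S3 event is dropped by the service list, so no alert goes out for a service you are not using.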
This section provides detail on the AWS management services relevant to the Solutions Architect Associate exam. These services help you audit, monitor and evaluate your AWS infrastructure and resources, and they form a core component of running resilient and performant architectures.
Learning Objectives
- Understand the benefits of using Amazon CloudWatch and audit logs to manage your infrastructure
- Learn how to record and track API requests using AWS CloudTrail
- Learn what AWS Config is and its components
- Manage your accounts with AWS Organizations, including single sign-on with AWS SSO
- Learn how to carry out logging with CloudWatch, CloudTrail, CloudFront, and VPC Flow Logs
- Understand how to design cost-optimized architectures in AWS
- Learn about AWS data transformation tools such as AWS Glue and data visualization services like Amazon Athena and QuickSight
Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.
To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.
Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.
He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.
In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.
Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.