Reviewing Past Issues in the Health Dashboard

This course provides detail on the AWS Management & Governance services relevant to the AWS Certified DevOps Engineer - Professional exam.


Learning Objectives

  • Learn how AWS AppConfig can reduce errors in configuration changes and prevent application downtime
  • Understand how the AWS Cloud Development Kit (CDK) can be used to model and provision application resources using common programming languages
  • Get a high-level understanding of Amazon CloudWatch
  • Learn about the features and use cases of the service
  • Create your own CloudWatch dashboard to monitor the items that are important to you
  • Understand how CloudWatch dashboards can be shared across accounts
  • Understand the cost structure of CloudWatch dashboards and the limitations of the service
  • Review how monitored metrics go into an ALARM state
  • Learn about the challenges of creating CloudWatch Alarms and the benefits of using machine learning in alarm management
  • Know how to create a CloudWatch Alarm using Anomaly Detection
  • Learn what types of metrics are suitable for use with Anomaly Detection
  • Create your own CloudWatch log subscription
  • Learn how AWS CloudTrail enables auditing and governance of your AWS account
  • Understand how Amazon CloudWatch Logs enables you to monitor and store your system, application, and custom log files
  • Explain what AWS CloudFormation is and what it’s used for
  • Determine the benefits of AWS CloudFormation
  • Understand what the core components are and what they are used for
  • Create a CloudFormation Stack using an existing AWS template
  • Learn what VPC flow logs are and what they are used for
  • Determine options for operating programmatically with AWS, including the AWS CLI, APIs, and SDKs
  • Learn about the capabilities of AWS Systems Manager for managing applications and infrastructure
  • Understand how AWS Secrets Manager can be used to securely encrypt application secrets

All right, let's go ahead and take a look at the Health dashboard. We'll go ahead and type health in the console search, and it should be the first one that pops up here. That brings us to this landing page. The first item on your menu here is going to be service health, and this is going to cover the AWS global infrastructure. So, you're going to get a listing of pretty much everything that's going on in the entire AWS platform, regardless of availability zone or region. Hopefully, there are no recent issues when you're looking at this screen. This is going to be the case most of the time. But let's go ahead and click 'Service history' here, and you're going to get a listing of AWS services by region, and of course, various dates going back as far as 12 months. In fact, I selected a date here, so I'm going to type it in: 2021, and I believe it was December 7. Let's take a look. Yes, this is the date that I was looking for. You can see that it says Amazon API Gateway, Northern Virginia, which I believe is us-east-1, and it has an issue here.

We're going to click on this. And looking at the activity, you can see it says 3:00 PM, 4:00 PM, 5:00 PM. So, it was quite a long outage. Of course, in the end, it was resolved, and it says here elevated errors and latencies. In an event like this, keep in mind that AWS uses its own services. So, if API Gateway is down, you might say, well, I'm not using API Gateway in my own AWS account, but some other services that depend on API Gateway may also be affected, and in that case, you may see issues in your account for that reason. So, it's always a good idea to check in here first, okay? Now, if you don't want to be looking around global issues, you can go to the second portion of the Health dashboard, which is your account health. This is going to be the same kind of information, but correlated to your own environment so that you don't have to waste time looking around. It answers the question, am I affected directly or not, right? So, here we are, this is going to be your own account. It looks exactly the same, and it says no recent issues because nothing is going on affecting you at this time.
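If you'd rather check your account health from a script than from the console, the same information is exposed through the AWS Health API. The following is only a minimal sketch under a few assumptions: the helper names `list_account_events` and `summarize` are made up for illustration, the client is expected to be a boto3 `health` client, and the account needs a support plan that includes AWS Health API access (Business level or above, as far as I know).

```python
def list_account_events(health_client, statuses=("open", "upcoming")):
    """Return account-specific AWS Health events with the given status codes.

    `health_client` is assumed to behave like boto3.client("health").
    """
    events = []
    kwargs = {"filter": {"eventStatusCodes": list(statuses)}}
    while True:
        page = health_client.describe_events(**kwargs)
        events.extend(page.get("events", []))
        token = page.get("nextToken")
        if not token:
            return events
        # More results are available, so ask for the next page.
        kwargs["nextToken"] = token


def summarize(event):
    """One-line summary of an event: service, region, status, event type."""
    return (
        f"{event['service']} {event['region']} "
        f"{event['statusCode']} {event['eventTypeCode']}"
    )


def make_health_client():
    """Build a real client (assumption: boto3 is installed and credentials
    are configured). The AWS Health API is served from us-east-1."""
    import boto3

    return boto3.client("health", region_name="us-east-1")
```

With real credentials you would call something like `for e in list_account_events(make_health_client()): print(summarize(e))`. An empty result corresponds to the "no recent issues" message in the console.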

But if we go to the event log, which is the equivalent of the history, we can see, let's click on one of these, Operational Issue, EC2 (Ohio), which I believe is us-east-2. So, in this case, there was a small situation here, and it shows you the name of the service, in this case EC2, and the status. If it was still being worked on, it would say something other than closed here. And of course, it shows the affected region. Now, the key here is to click on affected resources. If I had anything in this region, us-east-2 in this case, it would show up here saying, hey, you know, this resource that you have in this particular account is affected by a situation that's going on. By the way, if you're running an EC2 instance and you get an alert like this, that something is happening, all you need to do is go ahead and stop it and then start it again (a stop and start, not just a reboot), because what's going to happen is the instance is going to come online on a different physical box. Of course, it's going to stay in the same availability zone and region, it's just a different physical machine that probably, and hopefully, doesn't have the issue that they're working on at the time. So, keep that in mind, that's a quick solution when you run into an EC2 issue because of maintenance or an outage. One more thing to look at here would be your organization health.
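The affected-resources check and the stop-and-start workaround for EC2 can also be scripted. This is a hedged sketch, not an official procedure: `affected_instance_ids` and `stop_then_start` are hypothetical helper names, and the clients are assumed to behave like boto3 `health` and `ec2` clients. It also assumes the instances are EBS-backed, since those are the ones that can be stopped. Keep in mind that instance store data is lost on stop, and the public IP usually changes unless you use an Elastic IP.

```python
def affected_instance_ids(health_client, event_arn):
    """Return EC2 instance IDs listed as affected entities for one event.

    Assumption: affected EC2 instances are reported with their instance ID
    (starting with "i-") in the entityValue field.
    """
    ids = []
    kwargs = {"filter": {"eventArns": [event_arn]}}
    while True:
        page = health_client.describe_affected_entities(**kwargs)
        for entity in page.get("entities", []):
            value = entity.get("entityValue", "")
            if value.startswith("i-"):
                ids.append(value)
        token = page.get("nextToken")
        if not token:
            return ids
        kwargs["nextToken"] = token


def stop_then_start(ec2_client, instance_ids):
    """Stop the instances, wait until stopped, then start them again.

    A full stop and start (unlike a reboot) normally lands the instance
    on different underlying hardware in the same availability zone.
    """
    if not instance_ids:
        return
    ec2_client.stop_instances(InstanceIds=instance_ids)
    ec2_client.get_waiter("instance_stopped").wait(InstanceIds=instance_ids)
    ec2_client.start_instances(InstanceIds=instance_ids)
    ec2_client.get_waiter("instance_running").wait(InstanceIds=instance_ids)
```

In real use, the event ARN would come from the event you clicked on in the event log, and the EC2 client would be created for the affected region, for example `boto3.client("ec2", region_name="us-east-2")`.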

Of course, the organization health view is not useful to everybody. You would need to have AWS Organizations set up with multiple accounts, such as an account for development, production, test, and so on. And if you have that kind of setup, you can use this view, and it will show you the status of all your accounts in a consolidated way. So, it's a really helpful thing to have, because this way you don't have to keep switching from dev, to prod, to test just looking for issues. In this case, you get the global view from a single dashboard. And that's going to be it for the dashboard overview. Again, it's a really helpful tool when you're trying to figure out an issue with your AWS resources and you can't quite figure out what's going on. So, be sure to check here first.
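For the organization view, the consolidated picture can be approximated in code by grouping events per member account. Again, this is just a sketch under assumptions: `events_by_account` is a hypothetical helper, the client is assumed to be a boto3 `health` client created in the organization's management account, and organizational view must already be enabled for AWS Health.

```python
from collections import defaultdict


def _affected_accounts(health_client, event_arn):
    """Return the account IDs affected by one organization-level event."""
    accounts = []
    kwargs = {"eventArn": event_arn}
    while True:
        page = health_client.describe_affected_accounts_for_organization(**kwargs)
        accounts.extend(page.get("affectedAccounts", []))
        token = page.get("nextToken")
        if not token:
            return accounts
        kwargs["nextToken"] = token


def events_by_account(health_client, statuses=("open", "upcoming")):
    """Map each affected account ID to short labels of its Health events."""
    by_account = defaultdict(list)
    kwargs = {"filter": {"eventStatusCodes": list(statuses)}}
    while True:
        page = health_client.describe_events_for_organization(**kwargs)
        for event in page.get("events", []):
            label = f"{event['service']} {event['region']} {event['statusCode']}"
            for account_id in _affected_accounts(health_client, event["arn"]):
                by_account[account_id].append(label)
        token = page.get("nextToken")
        if not token:
            return dict(by_account)
        kwargs["nextToken"] = token
```

The result is a plain dictionary, so one pass over it shows which of your dev, prod, or test accounts currently has something going on, without switching between accounts.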


About the Author

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.

To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.

Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.

He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.

In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.

Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.