Reviewing Past Issues in the Health Dashboard

Difficulty: Intermediate
Duration: 19m
Students: 13
Ratings: 5/5
Description

This course looks at the AWS Health Dashboard, a tool that can help you plan for and work around events in AWS. These events include outages, scheduled maintenance, and service degradation. So, let’s get you prepared for these events with this awesome tool.

Learning Objectives 

By the end of this course, you should have a greater understanding of the AWS Health Dashboard, its features, and how to integrate it into your high-availability solutions. Some of the key points we’ll be covering in this course include the following:

  • Service history and open issues
  • Service events specific to your AWS accounts
  • EventBridge integration for the Health Dashboard
  • Enterprise-level offerings (such as the AWS Health API)

Intended Audience

  • Systems Administrator
  • DevOps Engineer
  • AWS Student learning for certification reasons

Prerequisites 

  • Have a general understanding of Amazon EventBridge and AWS Infrastructure
  • General knowledge about AWS services currently in use by your organization that could be impacted during scheduled maintenance and outages

 

Transcript

All right, let's go ahead and take a look at the Health Dashboard. We'll go ahead and type "health", and it should be the first one that pops up here. That brings us to this landing page. The first item on your menu here is going to be service health, and this is going to cover AWS global infrastructure. So, you're going to get a listing of pretty much everything that's going on in the entire AWS platform, regardless of availability zone or region. Hopefully, there are no recent issues when you're looking at this screen. This is going to be the case most of the time. But let's go ahead and click 'Service history' here, and you're going to get a listing of AWS services by region, and of course, various dates going back as far as 12 months. In fact, I picked a date ahead of time, so I'm going to type it in: 2021, and I believe it was December 7. Let's take a look. Yes, this is the date that I was looking for. You can see that it says Amazon API Gateway, Northern Virginia, which I believe is us-east-1, and it has an issue here.

We're going to click on this. And looking at the activity, you can see it says 3:00 PM, 4:00 PM, 5:00 PM. So, it was quite a long outage. Of course, in the end, it was resolved, and it says here elevated errors and latencies. In an event like this, what happens is that AWS uses its own services. So, if API Gateway is down, you might say, well, I'm not using API Gateway in my own AWS account, but other services that depend on API Gateway may also be affected, and in that case you may see issues in your account for that reason. So, it's always a good idea to check in here first, okay? Now, if you don't want to be looking around global issues, you can go to the second portion of the Health Dashboard, which is your account health. This is going to be the same kind of information, but it's correlated to your own environment so that you don't have to waste time looking around. It answers the question: am I affected directly or not, right? So, here we are, this is going to be your own account. It looks exactly the same; it says no recent issues because nothing is affecting you at this time.
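The same "am I affected?" check can also be scripted. What follows is a minimal sketch, not the course's own method: it assumes the AWS Health API is available to the account (it requires a Business or Enterprise-level support plan) and that boto3 is installed. The helper names `list_account_events` and `make_health_client` are made up for illustration; `describe_events` and its `filter` keys are the real Health API call.

```python
def list_account_events(health_client, regions=None, statuses=("open", "upcoming")):
    """Return (service, region, status, category) tuples for account events.

    health_client is expected to behave like boto3.client("health").
    """
    event_filter = {"eventStatusCodes": list(statuses)}
    if regions:
        event_filter["regions"] = list(regions)

    events = []
    kwargs = {"filter": event_filter}
    while True:
        page = health_client.describe_events(**kwargs)
        for event in page.get("events", []):
            events.append(
                (
                    event.get("service"),
                    event.get("region"),
                    event.get("statusCode"),
                    event.get("eventTypeCategory"),
                )
            )
        # Follow pagination until the API stops returning a token.
        token = page.get("nextToken")
        if not token:
            return events
        kwargs["nextToken"] = token


def make_health_client():
    """Hypothetical helper: build a real client (needs boto3 and credentials)."""
    import boto3  # imported lazily so the function above has no hard dependency

    # Assumption: the Health API is reached through the us-east-1 endpoint.
    return boto3.client("health", region_name="us-east-1")
```

With real credentials, something like `list_account_events(make_health_client(), regions=["us-east-2"])` would return the open and upcoming events for Ohio, which is roughly what the account health page shows.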

But if we go to event log, which is the equivalent of the history, we can see, let's click on one of these, Operational Issue, EC2 (Ohio), which I believe is us-east-2. So, in this case, there was a small situation here, and it shows you the name of the service, in this case EC2; the status, which would say something other than closed if it was still being worked on; and of course, the affected region. Now, the key here is to click on affected resources. If I had anything in this region, us-east-2 in this case, it would show up here saying, hey, you know, this resource that you have in this particular account is affected by a situation that's going on. By the way, if you're running an EC2 instance and you get an alert like this, that something is happening, all you need to do is go ahead and stop it and then start it again (a full stop and start, not just a reboot), because what's going to happen is the instance is going to come online on a different physical box. Of course, it's going to stay in the same availability zone and region; it's just a different physical machine that probably, and hopefully, doesn't have the issue that they're working on at the time. So, keep that in mind, that's a quick solution when you run into an EC2 issue because of maintenance or an outage. One more thing to look at here would be your organization health.
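The stop-and-start tip above can be sketched with boto3 as follows. This is an illustration under assumptions, not a drop-in tool: it assumes an EBS-backed instance (instance store data does not survive a stop) and accepts that an auto-assigned public IP may change. The function name `stop_and_start` and the instance ID in the usage note are hypothetical; the EC2 calls and waiter names are the standard boto3 ones.

```python
def stop_and_start(ec2_client, instance_id):
    """Stop an instance, wait, then start it again so it can land on new hardware.

    ec2_client is expected to behave like boto3.client("ec2").
    A reboot would keep the instance on the same host, so it is not used here.
    """
    ids = [instance_id]

    ec2_client.stop_instances(InstanceIds=ids)
    # Block until the instance is fully stopped before starting it again.
    ec2_client.get_waiter("instance_stopped").wait(InstanceIds=ids)

    ec2_client.start_instances(InstanceIds=ids)
    # Block until the instance reports the running state.
    ec2_client.get_waiter("instance_running").wait(InstanceIds=ids)
```

Usage would look something like `stop_and_start(boto3.client("ec2", region_name="us-east-2"), "i-0123456789abcdef0")`, where the instance ID is a placeholder.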

Of course, this is not useful to everybody; you would need to have an AWS Organizations setup where you have multiple accounts, such as accounts for development, production, test, and so on. And if you have that kind of setup, you can use this view; it will show you the status of all your accounts in a consolidated way. So, it's a really helpful thing to have, because this way you don't have to keep switching from dev, to prod, to test just looking for issues. In this case, you get the global view from a single dashboard. And that's going to be it for the dashboard overview. Again, it's a really helpful tool when you're trying to figure out an issue with your AWS resources and you can't quite figure out what's going on. So, be sure to check here first.
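For the consolidated organization view, a rough programmatic equivalent might look like the sketch below. It assumes organizational view has been enabled for AWS Health, that the caller is the management account (or a delegated administrator), and that the support plan allows Health API access. The function name `open_events_by_account` is hypothetical, and pagination is left out to keep the example short.

```python
def open_events_by_account(health_client):
    """Map each affected account ID to the ARNs of open organization events.

    health_client is expected to behave like boto3.client("health").
    Only the first page of each response is read in this sketch.
    """
    by_account = {}
    page = health_client.describe_events_for_organization(
        filter={"eventStatusCodes": ["open"]}
    )
    for event in page.get("events", []):
        arn = event["arn"]
        affected = health_client.describe_affected_accounts_for_organization(
            eventArn=arn
        )
        for account_id in affected.get("affectedAccounts", []):
            by_account.setdefault(account_id, []).append(arn)
    return by_account
```

The result groups open events per account, so there is no need to switch between dev, prod, and test to see which one is affected.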

 

About the Author
Carlos Rivas
Sr. AWS Content Creator
Students: 280
Courses: 10
Learning Paths: 1

Software Development has been my craft for over 2 decades. In recent years, I was introduced to the world of "Infrastructure as Code" and Cloud Computing.
I loved it! -- it re-sparked my interest in staying on the cutting edge of technology.

Colleagues regard me as a mentor and leader in my areas of expertise and also as the person to call when production servers crash and we need the App back online quickly.

My primary skills are:
★ Software Development ( Java, PHP, Python and others )
★ Cloud Computing Design and Implementation
★ DevOps: Continuous Delivery and Integration