Understanding SLAs Conceptually

This section of the AWS Certified Solutions Architect - Professional learning path introduces the AWS management and governance services relevant to the AWS Certified Solutions Architect - Professional exam. These services are used to help you audit, monitor, and evaluate your AWS infrastructure and resources and form a core component of resilient and performant architectures. 

Learning Objectives

  • Understand the benefits of using Amazon CloudWatch and audit logs to manage your infrastructure
  • Learn how to record and track API requests using AWS CloudTrail
  • Learn what AWS Config is and its components
  • Manage multi-account environments with AWS Organizations and Control Tower
  • Learn how to carry out logging with CloudWatch, CloudTrail, CloudFront, and VPC Flow Logs
  • Learn about AWS data transformation tools such as AWS Glue, and data query and visualization services like Amazon Athena and Amazon QuickSight
  • Learn how AWS CloudFormation can be used to represent your infrastructure as code (IaC)
  • Understand SLAs in AWS

A service level agreement (SLA) is basically a handshake. You agree to use a company’s services and they agree to a certain level of performance for those services. If they don’t meet the level of performance they publicly committed to, you get compensated.

So, let's break down both your agreement and the service's commitment. 

The first thing we’ll talk about is the service’s commitment to a certain level of performance. With AWS SLAs, the performance we’re talking about here refers to availability - not durability or even reliability. Availability measures the amount of time that a service is available for you to use. And the way AWS indicates availability for SLAs is through uptime. 

For every paid, generally available AWS service, AWS publishes an uptime percentage that usually has a bunch of 9s in it. For example, let’s take the AWS Key Management Service. AWS publishes a commitment that they will do their best to achieve at least 99.999% uptime - that’s five 9s.

This percentage dictates how much uptime, and therefore, how much downtime a company might experience in any given billing cycle if they were to use this service. Let’s take a look at a table that has a few examples of percentages of uptime and the degree of downtime it corresponds with over a year, a month, a week, and a day. 

The first row is the Key Management Service SLA - five 9s, which corresponds to about 5.256 minutes of downtime per year, and 0.438 minutes (roughly 26 seconds) per month. This is very high uptime and generally speaking, this is the goal we’d all like to achieve, as this would certainly make any boss proud.

Now let’s go a little lower to 99.99%, which is four 9s. This corresponds to about 52.6 minutes of downtime per year, and 4.38 minutes of downtime per month. Still relatively high uptime, and the boss is still probably happy.

Then we have the lowest uptime in the chart, which is 99% or two 9s. This corresponds to 87.6 hours of downtime per year, and 7.3 hours of downtime per month. This is significantly more downtime and I think the boss probably wouldn’t be so proud of this metric.

So for every AWS service SLA, you’ll see an uptime that looks like the number in the first column, which informs how much downtime you’ll have. And as a result, this additionally informs how happy or unhappy your boss will be. 
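If you want to check the numbers in that table yourself, the arithmetic is short. Here is a minimal sketch in Python, assuming a 365-day year and an average month of one twelfth of a year; the function and constant names are made up for illustration and are not part of any AWS tooling.

```python
# Assumption: a 365-day year, and a "month" that is 1/12 of that year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

# Length of each period in minutes (names are illustrative only).
PERIODS = {
    "year": MINUTES_PER_YEAR,
    "month": MINUTES_PER_YEAR / 12,
    "week": 7 * 24 * 60,
    "day": 24 * 60,
}


def downtime_minutes(uptime_percent: float, period_minutes: float) -> float:
    """Downtime, in minutes, that a period can contain at the given uptime."""
    return period_minutes * (100.0 - uptime_percent) / 100.0


def downtime_table(uptimes):
    """Map each uptime percentage to its downtime per period, in minutes."""
    return {
        uptime: {name: downtime_minutes(uptime, minutes)
                 for name, minutes in PERIODS.items()}
        for uptime in uptimes
    }


if __name__ == "__main__":
    for uptime, row in downtime_table([99.999, 99.99, 99.0]).items():
        cells = ", ".join(f"{name}: {mins:.3f} min" for name, mins in row.items())
        print(f"{uptime}% -> {cells}")
```

The yearly and monthly values it prints line up with the figures quoted above, for example 87.6 hours per year at 99% is 5,256 minutes.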

So what happens when the service or AWS doesn’t meet this level of expectation and you have more downtime than you expected? Well, you are entitled to compensation in the form of service credits. Let’s see how this works using AWS Lambda as an example.

Lambda commits to a monthly uptime percentage of 99.95% for each region for any billing cycle. Let’s say they miss that number - how much service credit do you get? Well, it depends on how far they miss this number by and how much you spent on the service in that billing cycle.

For example, let’s say you spent $60 on AWS Lambda in the us-east-1 Region.

The service then goes down for more than 5% of that month. In that case, the SLA dictates you would get 100% of your money, in this case $60, in service credits that you can apply to future charges for AWS Lambda in the us-east-1 region.

If Lambda is down between 1% and 5% of the month, you get 25% of your money back, in this case, $15 in service credits.

And if the service is down between 0.05% and 1% of the month, you get 10% of your $60, which is $6 in service credits.
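The three tiers above can be written down as a small lookup. This is only a sketch of the example in this lecture, not an official calculator: the function name is hypothetical, and how the exact boundary values (99.95%, 99.0%, 95.0%) are treated is an assumption, so check the current Lambda SLA for the precise wording.

```python
def lambda_service_credit(monthly_spend: float, monthly_uptime_percent: float) -> float:
    """Service credit in dollars for the tiers described in this lecture.

    Assumption: a boundary value falls into the more favorable-to-AWS tier,
    e.g. exactly 99.0% uptime earns the 10% credit rather than 25%.
    """
    if monthly_uptime_percent >= 99.95:
        credit_percent = 0      # commitment met, no credit
    elif monthly_uptime_percent >= 99.0:
        credit_percent = 10     # down between 0.05% and 1% of the month
    elif monthly_uptime_percent >= 95.0:
        credit_percent = 25     # down between 1% and 5% of the month
    else:
        credit_percent = 100    # down more than 5% of the month
    return monthly_spend * credit_percent / 100


if __name__ == "__main__":
    # The $60 us-east-1 example, at one uptime value per tier.
    for uptime in (99.99, 99.5, 97.0, 94.0):
        print(f"{uptime}% uptime -> ${lambda_service_credit(60, uptime):.2f} credit")
```

Notice that the credit is a percentage of what you spent on that one service in that one region, which is why a small Lambda bill leads to a small credit no matter how long the outage was.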

Now, is a service credit the same thing as cash in your pocket? Nope, it’s kind of like store credit: you get your money back, but it applies to future bills that you owe for the same service. You cannot transfer service credits or apply them to any other AWS account. So if your friend has a huge Lambda bill in their AWS account, you cannot use your $6 in service credits to help them out.

So how do you get these service credits? Surely AWS automatically provides them when a service does not meet its committed percentage of uptime, right? Well, not exactly. Instead, they wait for you to notice and submit a case to AWS Support. And it’s not enough to say “hey, your service was down, please provide me service credits.” Instead, you have to provide proof. For example, for AWS Lambda, your support case must include the specific dates, times, and availability for each 5-minute interval with less than 100% availability in that AWS region for that billing cycle. You also have to submit logs that detail the errors for your claimed outage.

Not exactly a trivial amount of information to provide AWS. 
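To get a feel for what gathering that evidence involves, here is a rough sketch that groups request records into 5-minute intervals and lists the ones with less than 100% availability. Everything about the input is an assumption for illustration: the records are hypothetical (timestamp, succeeded) pairs you would have to extract from your own logs, and availability is taken to be the share of requests in the interval that succeeded. It is not an AWS API and not the official SLA calculation.

```python
from collections import defaultdict
from datetime import datetime, timezone

INTERVAL_SECONDS = 5 * 60  # the 5-minute interval mentioned above


def degraded_intervals(records):
    """Return (interval_start_utc, availability_percent) for each 5-minute
    interval that contains at least one failed request.

    `records` is an iterable of (timezone-aware datetime, succeeded: bool)
    pairs; this record shape is hypothetical.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for timestamp, succeeded in records:
        epoch = int(timestamp.timestamp())
        bucket = epoch - epoch % INTERVAL_SECONDS  # start of the interval
        totals[bucket] += 1
        if not succeeded:
            failures[bucket] += 1

    report = []
    for bucket in sorted(failures):
        ok = totals[bucket] - failures[bucket]
        availability = 100.0 * ok / totals[bucket]
        start = datetime.fromtimestamp(bucket, tz=timezone.utc)
        report.append((start, availability))
    return report


if __name__ == "__main__":
    utc = timezone.utc
    sample = [
        (datetime(2024, 1, 1, 12, 1, tzinfo=utc), True),
        (datetime(2024, 1, 1, 12, 3, tzinfo=utc), False),
        (datetime(2024, 1, 1, 12, 7, tzinfo=utc), True),
    ]
    for start, availability in degraded_intervals(sample):
        print(start.isoformat(), f"{availability:.1f}%")
```

Even with a script like this, you still need the underlying request logs for the whole billing cycle, which is the time-consuming part.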

The last thing we’ll talk about here is your part of the handshake. You, as a customer, enter into a customer agreement with AWS when you use their services. This customer agreement is boring, but certainly worth a read, as it does mention some information about SLAs. For example, it states that AWS can change, add, or remove SLAs. If an SLA changes for the worse, AWS will provide at least 90 days’ notice, but it is still your responsibility to check the AWS site regularly for modifications to the SLAs. If you continue to use the service after the new SLA is effective, AWS takes that as your agreement to the terms.

In summary, the AWS SLAs are pretty generous, as the services are expected to meet a high level of availability. However, if a service does go down for a long period of time, it can be time-consuming to gather proof that it was, in fact, AWS that was down and not your own software or internet connection. When you provide that proof, and if AWS finds the proof to be sufficient, you’ll get service credits that will apply to your bill for the service during the next billing cycle. That’s it for this one - see you next time!

About the Author

Danny has over 20 years of IT experience as a software developer, cloud engineer, and technical trainer. After attending a conference on cloud computing in 2009, he knew he wanted to build his career around what was still a very new, emerging technology at the time — and share this transformational knowledge with others. He has spoken to IT professional audiences at local, regional, and national user groups and conferences. He has delivered in-person classroom and virtual training, interactive webinars, and authored video training courses covering many different technologies, including Amazon Web Services. He currently has six active AWS certifications, including certifications at the Professional and Specialty level.