How to Architect with a Design for Failure Approach

Introduction to designing for failure


The gold standard for high availability is five nines, meaning uptime of 99.999%. That allows just over five minutes of downtime in an entire year. Achieving this kind of reliability requires some advanced knowledge of the many tools AWS provides for building a robust infrastructure.
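As a quick check of the arithmetic behind the five-nines figure, the sketch below converts an availability percentage into a yearly downtime budget. The function name and the 365.25-day average year are assumptions made for illustration; they do not come from the course.

```python
# Assumption: an average year of 365.25 days (accounts for leap years).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare three, four, and five nines side by side.
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% -> {minutes:.2f} minutes of downtime per year")
```

Under that assumption, five nines works out to about 5.26 minutes per year, and each additional nine cuts the budget by a factor of ten.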

In this course, expert Cloud Architect Kevin Felichko will show one of the many possible ways to create a high availability application, designing the whole infrastructure with a Design for Failure approach. You'll learn how to use Auto Scaling, load balancing, and VPC to run a standard Ruby on Rails application on an EC2 instance, with data stored on an RDS-backed MySQL database, and assets stored on S3. Kevin will also touch on some advanced topics like using CloudFront for content delivery and how to distribute an application across multiple AWS regions.

Who should take this course

This is an intermediate/advanced course, so you will need some experience with EC2, S3, and RDS, and at least a basic knowledge of Auto Scaling, ELB, VPC, Route 53, and CloudFront.


Welcome to the series titled AWS Regions and Availability Zones: How to Architect with a Design for Failure Approach. In this course, we will walk through just one of the many how-tos aimed at building a flexible, available, and highly resilient application in the Amazon Web Services environment. To get the most out of this series, it is recommended that you have a basic understanding of AWS. Let's get started.

But before we get into our high availability goals and application architecture, we need to get level on our terminology. As of this recording, there are 10 AWS regions: eight are available to the general public, one is limited to government agencies in the United States, and another is in limited preview. A region is a geographically separate location which hosts multiple availability zones. Pricing in AWS is based on the region you choose.

Choosing a region is an important decision that should be based on factors relevant to your AWS usage. Typically this will be the region where the majority of your user base is located. Within each region there are distinct data centers known as availability zones.

Availability zones in the same region can share AWS resources such as databases, snapshots, queues and more.

Because availability zones within the same region can share resources, it makes designing for high availability within a region pretty easy.

It is important to note that an availability zone can become too congested for more resources to be launched. When AWS locks down an availability zone, new accounts in the region do not have an option to select that zone and existing customers will be prevented from launching new resources. Because of this possibility, we will need to ensure our applications are not availability zone dependent.

A virtual private cloud, or VPC for short, is a logically isolated section of AWS. This is your network in the cloud that you have full control over. You launch and configure resources within a VPC. It is important to note that VPCs can span availability zones but not regions. VPC is one of the AWS services that can be used free of charge.

RDS, or Relational Database Service, is a service that makes it easy to create and manage database instances. It strips away the typical database administrative tasks by handling backups on a schedule you choose and maintenance within a window you specify. For most database engines, high availability is already baked in, making it a no-brainer for many organizations that do not have an on-site DBA. Currently, RDS supports a variety of database engines such as MySQL and Microsoft SQL Server, to name just two.

CloudFront is AWS's content delivery network designed to speed up requests for end users. This is accomplished through a series of origin and edge locations scattered around the world. Content can be pulled from EC2 instances, S3 buckets, and custom resources. CloudFront can be used to cache static and dynamic content for a specific amount of time.

Auto Scaling is a service that adjusts the number of application instances based on minimum and maximum desired values and/or application load. For example, you can configure a new instance of your application to be created when the CPU load of your instances is sustained at 80% utilization over a five minute period, or terminate an instance when the same CPU is running at 30% utilization or lower over a 20 minute period. The goal of Auto Scaling is to reduce human intervention while providing the most cost-effective and best user experience possible. It is one of the AWS services that does not cost extra. You only pay for the resources launched by Auto Scaling.

Simply put, Route 53 is a Domain Name System (DNS) service. It offers many high availability features such as latency-based routing, health checks and failover policies.
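To make the Auto Scaling example above concrete, here is a minimal sketch of the decision logic only. It is a simplification written for this transcript, not the AWS API: the `ScalingRule` class, the `desired_capacity` function, and the assumption of one CPU sample per minute are all hypothetical. It uses the thresholds from the example (80% over five minutes to add an instance, 30% or lower over 20 minutes to remove one).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScalingRule:
    """Hypothetical threshold rule; this is not an AWS API object."""
    threshold_percent: float
    window_minutes: int
    scale_out: bool  # True: fires when CPU stays at/above threshold; False: at/below

    def triggered(self, cpu_by_minute: Sequence[float]) -> bool:
        """True if every sample in the trailing window breaches the threshold."""
        if len(cpu_by_minute) < self.window_minutes:
            return False  # not enough history to judge a sustained breach
        window = cpu_by_minute[-self.window_minutes:]
        if self.scale_out:
            return all(sample >= self.threshold_percent for sample in window)
        return all(sample <= self.threshold_percent for sample in window)


def desired_capacity(current: int, minimum: int, maximum: int,
                     cpu_by_minute: Sequence[float],
                     scale_out_rule: ScalingRule,
                     scale_in_rule: ScalingRule) -> int:
    """Return the new instance count, clamped to the min/max bounds."""
    if scale_out_rule.triggered(cpu_by_minute):
        return min(current + 1, maximum)
    if scale_in_rule.triggered(cpu_by_minute):
        return max(current - 1, minimum)
    return current


if __name__ == "__main__":
    scale_out = ScalingRule(threshold_percent=80, window_minutes=5, scale_out=True)
    scale_in = ScalingRule(threshold_percent=30, window_minutes=20, scale_out=False)
    busy = [85.0] * 5   # five minutes of sustained high CPU
    idle = [20.0] * 20  # twenty minutes of low CPU
    print(desired_capacity(2, 1, 4, busy, scale_out, scale_in))
    print(desired_capacity(2, 1, 4, idle, scale_out, scale_in))
```

Making the scale-in window longer than the scale-out window, as in the example, means capacity is added quickly under load but removed cautiously, which helps avoid instances being launched and terminated in rapid succession.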

Our high availability design goal for this series is five nines of availability. That is roughly five and a quarter minutes of downtime per year. It is considered the gold standard of high availability applications. In order to hit our five nines goal, we need to design a self-healing architecture, build a fault-tolerant infrastructure, eliminate all single points of failure and provide a graceful degradation of services during outages. This can be accomplished with many of the AWS service offerings.

We have some challenges we have to identify before we begin. These challenges center around moving our high availability design to multiple regions. When initiated, services are scoped to the region they are launched within. What may be incredibly simple to configure and use within a single region can quickly turn into a headache when spanning multiple regions.

This will most likely force changes to the hosted application. Adding to the complexity is that some services are not available in all regions. At the time of this recording, Simple Email Service, or SES for short, is available in us-east-1 but not in us-west-1. Obviously this poses quite the conundrum. Do you use a different service? Maybe you use a VPN to access the other region's SES? Or maybe you select some other approach entirely? Therefore it is extremely important to be aware of regional limitations when making design decisions.

Latency and data synchronization pose their own sets of challenges. When data has to be shared across regions, it suffers from delays as it traverses continents and oceans. A newly inserted record that is immediately available in the US East region cannot be seen for some time in the Singapore region. Therefore, an application's data access routines have to be designed with an intelligent read/write methodology in mind. There's no single right way to do this. It depends on the complexity and goals of the individual application. Our sample application will offer just one possible solution to this dilemma.
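To illustrate the replication delay described above, here is a small in-memory sketch. It is not the approach used by the sample application, and it does not call any AWS service. The `ReplicatedStore` class, its method names, and the `consistent` flag are hypothetical; the region identifiers are used only as labels. Writes go to a primary region, replicas catch up only when `replicate()` runs, and a caller that must see its own write can ask to read from the primary.

```python
from typing import Dict, List, Optional, Tuple


class ReplicatedStore:
    """Hypothetical store: one primary region, replicas that lag behind it."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        # One independent key/value map per region.
        self._data: Dict[str, Dict[str, str]] = {
            region: {} for region in [primary, *replicas]
        }
        # Writes accepted by the primary but not yet copied to replicas.
        self._pending: List[Tuple[str, str]] = []

    def write(self, key: str, value: str) -> None:
        """All writes land on the primary first."""
        self._data[self.primary][key] = value
        self._pending.append((key, value))

    def replicate(self) -> None:
        """Simulate the delayed copy of pending writes to every replica."""
        for key, value in self._pending:
            for region, rows in self._data.items():
                if region != self.primary:
                    rows[key] = value
        self._pending.clear()

    def read(self, key: str, region: str, consistent: bool = False) -> Optional[str]:
        """Read locally (fast, possibly stale) or from the primary (fresh, slower)."""
        source = self.primary if consistent else region
        return self._data[source].get(key)


if __name__ == "__main__":
    store = ReplicatedStore("us-east-1", ["ap-southeast-1"])
    store.write("post:1", "hello")
    print(store.read("post:1", "ap-southeast-1"))                   # not replicated yet
    print(store.read("post:1", "ap-southeast-1", consistent=True))  # served by the primary
    store.replicate()
    print(store.read("post:1", "ap-southeast-1"))                   # replica has caught up
```

The trade-off this sketch exposes is the one the lesson describes: local reads are fast but may miss recent writes, while reads routed to the primary are current but pay the cross-region latency.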

Let's take a look at the application. Our application will be the sample from a great book called "Rails Tutorial" by Michael Hartl. It is a simple Twitter-like application built in Ruby on Rails 4. It runs in a Linux environment using MySQL as its data store. Without tweaking the application, we will leverage the built-in power of AWS to create a highly available system. Our architecture will look like this. This architecture diagram shows a pretty standard high availability design for web applications. Users will direct their browser to our URL.

CloudFront will deliver the user a cached version of the object requested. If this does not exist, the request will be forwarded to the elastic load balancer in our us-east-1 region's subnets. If us-east-1 is unavailable, CloudFront will fall back to the eu-west-1 region. Regardless of the region, the elastic load balancer will direct traffic to one of the availability zones. Our web EC2 instance will build the response using RDS if the request involves a database store. When the response works its way back out, CloudFront will cache the response for the next request to the same URL. Lessons in this series will call back to this diagram as needed to show what we are targeting.

A few notes before we start building our solution. We are not going to go over security best practices in this series. That will be demonstrated in a different series.

However, we have not skipped security in building our environment. We're just not showing all of the steps necessary.

Remember, the security of your account and of the applications running within it is very important and should not be taken lightly. We are selecting the low cost, low configuration options. In a true production environment, we would not select a t1.micro instance for our RDS database. Our focus is on high availability. We have pre-built and configured our AMI to save time.

This is just one of many possible high availability scenarios. AWS has an architecture blog that shows different scenarios for different uses. The concepts we go over in this series are the same regardless of our design decisions.

Everything we are doing is from the AWS console. You can also do this from the AWS command line interface. Lastly, we could use Elastic Beanstalk or OpsWorks to get us most of the way to a high availability solution. While these are great to have, we believe it's important to build from scratch to facilitate an understanding of the core concepts. Now that we have introduced you to the series, let's build our solution. On to the next lesson, building our VPC environment.