How to design high availability and fault tolerant architectures
Designing solutions for elasticity and scalability
Designing Resilient Architectures. In this module, we explore the concepts of business continuity and disaster recovery, the AWS Well-Architected Framework, and the AWS services that, when used together, help us design resilient, fault-tolerant architectures.
We will first introduce the concepts of high availability and fault tolerance and show how we go about designing highly available, fault-tolerant solutions on AWS. We will learn about the AWS Well-Architected Framework and how it can help us make design decisions that deliver the best outcome for end users. Next, we will introduce and explain the concept of business continuity and how AWS services can be used to plan and implement a disaster recovery plan.
We will then learn to recognize and explain the core AWS services that, when used together, can reduce single points of failure and improve scalability in a multi-tier solution. Auto Scaling is a proven way to enable resilience by allowing an application to scale up and down to meet demand. In a hands-on lab we create and work with Auto Scaling groups to add elasticity and durability. Amazon Simple Queue Service increases resilience by acting as a messaging service between other services and applications, thereby decoupling layers and reducing dependency on state. Amazon CloudWatch is a core component of maintaining a resilient architecture - essentially it is the eyes and ears of your environment, so we next learn to apply the Amazon CloudWatch service in a hands-on environment.
We then learn to apply the Amazon CloudFront CDN service to add resilience to a static website that is served out of Amazon S3. Amazon CloudFront is tightly integrated with other AWS services such as Amazon S3, AWS WAF and Amazon GuardDuty, making Amazon CloudFront an important component in increasing the resilience of your solution.
- [Instructor] The Amazon Elastic Load Balancer. There are currently three types of load balancer you can deploy: the Classic, the Application and the Network Load Balancers, but more on these later. Let's get the basics of what the ELB does for us first in the context of the Solutions Architect Associate certification. The Amazon Elastic Load Balancer is an effective way to increase the availability of a system. You get improved fault tolerance by placing your compute instances behind an elastic load balancer, as it can automatically balance traffic across multiple instances and multiple availability zones to ensure that only healthy EC2 instances receive traffic. Most important for the EC2 instances is that elastic load balancers can balance load across multiple availability zones by using cross-zone load balancing. Now, the Elastic Load Balancer is a managed service, so availability and scalability are managed for us, just like with Amazon S3 and the Amazon Simple Queue Service. An elastic load balancer can be internal or external facing, and load balancers ensure that requests are distributed equally to your backend instances regardless of the availability zone in which they are located. So, combined with its built-in fault tolerance, the elastic load balancing service can keep your application running in spite of any issues within a single availability zone. Applications can take advantage of this to become fault tolerant and self-healing. Okay, this is really important. The elastic load balancer does not terminate or start instances. Okay? The elastic load balancer does not manage any actual scaling. That is done by Auto Scaling. Okay? An elastic load balancer detects the health of an instance by checking a specified port, and if the load balancer does not receive confirmation through that port that the instance is running, then the load balancer will direct traffic to another instance.
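As a rough illustration of the health check behaviour just described, here is a minimal Python sketch. It is a toy model and not how the ELB service is implemented: the class name LoadBalancerSketch, the Instance fields, the round-robin choice and the probe callback are all assumptions made for this example. It only shows the idea that traffic goes to healthy instances while the fleet itself is left untouched.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    # Hypothetical stand-in for an EC2 instance registered with a load balancer.
    instance_id: str
    zone: str
    healthy: bool = True


class LoadBalancerSketch:
    """Toy model: routes requests only to instances that pass a health check."""

    def __init__(self, instances: List[Instance]) -> None:
        self.instances = list(instances)
        self._next = 0  # round-robin counter (an assumption, not ELB's real algorithm)

    def health_check(self, probe: Callable[[Instance], bool]) -> None:
        # Record the probe result for every instance.
        # Instances are never started, stopped or removed here.
        for inst in self.instances:
            inst.healthy = probe(inst)

    def route(self) -> Instance:
        # Pick the next healthy instance, regardless of its availability zone.
        healthy = [inst for inst in self.instances if inst.healthy]
        if not healthy:
            raise RuntimeError("no healthy instances registered")
        target = healthy[self._next % len(healthy)]
        self._next += 1
        return target


if __name__ == "__main__":
    fleet = [
        Instance("i-a1", "zone-a"),
        Instance("i-a2", "zone-a"),
        Instance("i-b1", "zone-b"),
    ]
    lb = LoadBalancerSketch(fleet)
    # Simulate every instance in zone-a failing its health check.
    lb.health_check(lambda inst: inst.zone != "zone-a")
    for _ in range(3):
        print(lb.route().instance_id)
    # The fleet size is unchanged: replacing instances is Auto Scaling's job.
    print(len(lb.instances))
```

In this sketch the unhealthy instances stay registered and simply stop receiving requests, which mirrors the point that starting, terminating and replacing instances is left to Auto Scaling.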
Cross-zone load balancing can also reduce the likelihood that client caching of DNS information results in requests being distributed unevenly. However, a sticky session may be something that you do want to support. The Elastic Load Balancer supports the ability to stick user sessions to specific EC2 instances using cookies. Traffic will be routed to the same instances as the user continues to access your application. Sticky sessions are one possible way to maintain client state across a fleet of servers where session data is not being managed by something like ElastiCache or, say, DynamoDB. When the elastic load balancer detects unhealthy EC2 instances, it no longer routes traffic to those unhealthy instances. If all of your EC2 instances in a particular availability zone are unhealthy, but you have set up EC2 instances in multiple availability zones, elastic load balancing will route traffic to your healthy EC2 instances in those other zones. Another benefit of load balancers is that they can manage your Secure Sockets Layer connections. The Classic and Application Load Balancers improve availability and durability by allowing you to offload SSL. So, the Elastic Load Balancer supports SSL termination, including the offloading of the SSL decryption, the management of the SSL certificate, which you can do from inside your load balancer using AWS Certificate Manager, and encryption to backend instances using optional public key authentication should you require it. So, ELBs as a family support HTTP, HTTPS, TCP and SSL, and an HTTPS request uses the SSL protocol to establish secure connections over the HTTP layer. Now, you can also use the SSL protocol to establish secure connections over the TCP layer with the Classic and Network Load Balancers. If the front end connection uses TCP or SSL, then your backend connections can use either TCP or SSL as well.
If the front end connection uses HTTP or HTTPS, then your backend connections can use either HTTP or HTTPS. There are currently three types of elastic load balancer: the Classic Load Balancer, the Network Load Balancer and the Application Load Balancer. So, what is the difference, I hear you ask? The Network Load Balancer is designed for connection-based load balancing, and it's quite new. Until now, if you anticipated extremely spiky workloads or required instantaneous failover between regions, you'd ask AWS to provision the load balancer in preparation for the spike or surge in traffic. So, this meant the load balancer was pre-warmed for you by AWS, which required a few steps like logging a support ticket and so on. The Network Load Balancer reduces some of these dependencies. The Network Load Balancer has been designed to handle sudden and volatile traffic patterns, so it's ideal for load balancing TCP traffic. It is capable of handling millions of requests per second while maintaining low latencies and without the need to be pre-warmed before traffic arrives. With the Network Load Balancer we have a simple load balancing service specifically designed to handle unpredictable bursts of TCP traffic. The Network Load Balancer makes available a single static IP address per availability zone and operates at the connection level, which is layer four, routing inbound connections to AWS targets. Now, those targets can be EC2 instances, containers or an IP address, and the Network Load Balancer is tightly integrated with other AWS managed services such as Auto Scaling, ECS, which is Amazon Elastic Container Service, and CloudFormation. It also supports static and elastic IP addresses and load balancing to multiple ports on the same instance, big tick.
So, the best use cases for the Network Load Balancer are when you need to seamlessly support spiky or high-volume inbound TCP requests, when you need to support a static or elastic IP address, and if you are using container services and/or want to support more than one port on an EC2 instance. The Application Load Balancer is arguably the most protocol-oriented load balancing service because the service enforces the latest SSL/TLS ciphers and protocols. It is ideal for negotiating HTTP and HTTPS requests. The Application Load Balancer operates at the request level, which is layer seven, and provides more advanced routing capabilities than the Classic and Network Load Balancers. Additionally, its support for host-based and path-based routing, X-Forwarded-For headers, Server Name Indication (SNI) and sticky sessions makes the Application Load Balancer ideal for balancing loads to microservices and container-based applications. Another good reason why it's a great choice for containers: the Application Load Balancer enables load balancing across multiple ports on a single Amazon EC2 instance. Now, this is really powerful when you are using ECS, which is the Elastic Container Service, as you can specify a dynamic port in the ECS task definition. So, this assigns an unused port on the container instance when a task is scheduled, and the ECS scheduler automatically adds the task to the load balancer using this port, which is one less thing for you to worry about. The best use cases for the Application Load Balancer are containerized applications and microservices when you don't need to support a static or elastic IP address. If you do, then you would want to use the Network Load Balancer for that. Now, that brings us to our third choice, the Classic Load Balancer, our old friend. It is still a great solution, and if you just need a simple load balancer with multiple protocol support, the Classic Load Balancer is perfect.
It supports many of the same layer four and layer seven features as the Application Load Balancer: sticky sessions, IPv6 support, monitoring, logging and SSL termination. Both the Classic and Application Load Balancers support offloading SSL decryption from application instances, the management of SSL certificates and encryption to backend instances with the option of public key authentication. So, one plus with the Classic Load Balancer is that it permits flexible cipher support, which allows you to control the ciphers and protocols the load balancer presents to clients. So, this makes the Classic Load Balancer a great choice if you have to use or are limited to using a specific cipher. Best use cases for the Classic Load Balancer? Simple load balancing or flexible cipher support. So, do all these cost the same, I hear you mutter? Costs do vary per region, so always check the AWS pricing page before using or changing a load balancer. Currently all three load balancers attract a charge for each hour or partial hour the load balancer is running, but the Application and the Network Load Balancers also incur an additional charge for the number of load balancer capacity units, or LCUs, used per hour. Now, this cost is very well explained on the AWS load balancer pricing page. It's not something you really need to be an expert in for the Associate exam. The Classic Load Balancer, classic and simple as it sounds, has just a simple additional charge for each gigabyte of traffic transferred through the load balancer. So, each load balancing use case is gonna be unique, but here are a few rules of thumb I like to use when considering which one to choose. If you need to support a static or elastic IP address, use the Network Load Balancer. If you need control over the SSL cipher, use the Classic Load Balancer.
If using container services and specifically Amazon Elastic Container Service, use the Application Load Balancer or the Network Load Balancer, and if you need to support SSL offloading, use the Application Load Balancer or the Classic Load Balancer. Okay, let's take a look at this sample question, shall we? So, the question is: your web application front end consists of multiple EC2 instances behind an elastic load balancer. You configured your elastic load balancer to perform health checks on these EC2 instances. Lots of keywords in there already, aren't there? If an instance fails to pass health checks, which statement will be true? So, the keywords we've got in this question are application front end, multiple EC2 instances, and it's behind an elastic load balancer. Better still, we're told that we've configured the elastic load balancer to perform health checks on those EC2 instances. Now, if an instance fails to pass a health check, which statement will be true? First option, option A: the instance is replaced automatically by the ELB. Now, if we think back to when we were talking about health checking, remember we talked about how elastic load balancers detect the health of an instance, but it's the Auto Scaling group that will add or remove the instances. So if an elastic load balancer does a health check and determines that the instance isn't healthy, then it simply re-routes traffic to another healthy instance. So, it doesn't do any replacing; that's done by the Auto Scaling group. So, we discount option A. Option B: the instance gets terminated automatically by the elastic load balancer. Once again, it's outside the realm of what the elastic load balancer does. It really does simply check for healthy instances and route traffic to those that are sending back healthy signals, so it's not responsible for stopping or starting or terminating.
Again, that's the role of the Auto Scaling group and its launch configuration, which has all of the parameters for what that machine will be and what format it will be in if it does get started by the Auto Scaling group. So, we'll discount option B too. Option C: the elastic load balancer stops sending traffic to the instance that failed its health check. Now, that so far sounds like the closest option we have to what we think the elastic load balancer's role is. The instance has failed a health check and therefore the load balancer is simply going to stop sending traffic to that instance. So, we'll earmark that one as a possible correct answer. Let's look at option D: the instance gets quarantined by the elastic load balancer for root cause analysis. Well, this is quite an interesting option. It would be fantastic if we had a service that could do that. First of all, the idea that we could quarantine instances, that would be interesting. I'm not sure how we would determine if it was quarantined or not, what the level of performance requirement would be and how long it would stay under quarantine for, and root cause analysis is something that doesn't generally get done by software, and I don't think it's something that could realistically be done by an elastic load balancer. So once again, it comes back to what is the elastic load balancer's job? It is simply to detect which instances in your group are healthy and then route traffic to the healthy ones. So, while it's a nice aspirational idea to have root cause analysis done by elastic load balancers, I wouldn't want to choose that as an option. So, I think our correct option for this sample question is option C. Okay, that concludes our elastic load balancing lecture.
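The rules of thumb given earlier in the lecture for choosing between the three load balancers can be sketched as a small Python function. This is only an illustration: the function name suggest_load_balancers and its parameter names are made up for this example, and combining several requirements by intersecting the candidate sets is an assumption, not something stated in the lecture.

```python
from typing import List

CLASSIC = "Classic"
APPLICATION = "Application"
NETWORK = "Network"


def suggest_load_balancers(
    *,
    static_ip: bool = False,
    cipher_control: bool = False,
    containers: bool = False,
    ssl_offload: bool = False,
) -> List[str]:
    """Return the load balancer types that satisfy every stated requirement."""
    candidates = {CLASSIC, APPLICATION, NETWORK}
    if static_ip:
        # Static or elastic IP address: Network Load Balancer.
        candidates &= {NETWORK}
    if cipher_control:
        # Control over the SSL cipher: Classic Load Balancer.
        candidates &= {CLASSIC}
    if containers:
        # Container services such as Amazon ECS: Application or Network.
        candidates &= {APPLICATION, NETWORK}
    if ssl_offload:
        # SSL offloading: Application or Classic.
        candidates &= {APPLICATION, CLASSIC}
    # An empty list means the rules of thumb conflict for these requirements.
    return sorted(candidates)


if __name__ == "__main__":
    print(suggest_load_balancers(static_ip=True))
    print(suggest_load_balancers(containers=True, ssl_offload=True))
```

An empty result, for example when both a static IP address and cipher control are requested, signals that the rules of thumb alone do not settle the choice and the requirements need a closer look.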
Head of Content
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession" as everything AWS starts with the customer. Passions around work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start ups.