Designing for high availability, fault tolerance and cost efficiency
AWS Services That Enable High Availability
Knowledge Check Point
The AWS exam guide outlines that 60% of the Solutions Architect–Associate exam questions could be on the topic of designing highly-available, fault-tolerant, cost-efficient, scalable systems. This course teaches you to recognize and explain the core architecture principles of high availability, fault tolerance, and cost optimization. We then step through the core AWS components that can enable highly available solutions when used together so you can recognize and explain how to design and monitor highly available, cost efficient, fault tolerant, scalable systems.
- Identify and recognize cloud architecture considerations such as functional components and effective designs
- Define best practices for planning, designing, and monitoring in the cloud
- Develop to client specifications, including pricing and cost
- Evaluate architectural trade-off decisions when building for the cloud
- Apply best practices for elasticity and scalability concepts to your builds
- Integrate with existing development environments
This course is for anyone preparing for the Solutions Architect–Associate for AWS certification exam. We assume you have some existing knowledge and familiarity with AWS, and are specifically looking to get ready to take the certification exam.
Basic knowledge of core AWS functionality. If you haven't already completed it, we recommend our Fundamentals of AWS Learning Path. We also recommend completing the other courses, quizzes, and labs in the Solutions Architect–Associate for AWS certification learning path.
This Course Includes:
- 11 video lectures
- Detailed overview of the AWS services that enable high availability, cost efficiency, fault tolerance, and scalability
- A focus on designing systems in preparation for the certification exam
What You'll Learn
|Lecture Group||What you'll learn|
|Designing for High availability, fault tolerance and cost efficiency||How to combine AWS services together to create highly available, cost efficient, fault tolerant systems.|
|Designing for business continuity||How to recognize and explain Recovery Time Objective and Recovery Point Objectives, and how to recognize and implement AWS solution designs to meet common RTO/RPO objectives|
|Ten AWS Services That Enable High Availability||Regions and Availability Zones, VPCs, ELB, SQS, EC2, Route53, EIP, CloudWatch, and Auto Scaling|
If you have thoughts or suggestions for this course, please contact Cloud Academy at email@example.com.
- [Instructor] Hi, Cloud Academy ninjas. We get a lot of questions about bastion hosts and NAT instances: what's the difference, how do you maintain high availability, et cetera? So I thought I'd do a quick chalk talk on these two. Let me explain bastion hosts first. A bastion host generally enables you to connect into instances in your VPC. A bastion does this by acting as a jump host, so you connect to the bastion host via a secure protocol, such as SSH or RDP. And once authenticated on that bastion host, you can then connect, or jump, to other instances in your VPC. So why would you want to do this? Well, we don't want to have all our resources in the VPC accessible from the internet, right? So it's generally a best practice to restrict access to resources by placing them in private subnets, or by limiting access via security group rules. Now keep in mind that a subnet that doesn't have an internet gateway, or a route to that internet gateway, is a private subnet. So private subnets and security groups block public access to resources, and they do that very well. But what if we want an authenticated user, or someone known to us, to be able to access resources within a private subnet? Now remember, a public subnet is a subnet that has an internet gateway, and a route to that internet gateway. Which is why a bastion host is so useful. A bastion host generally resides within a public subnet, and has ingress rules for the SSH or RDP protocols. This means we can connect to the bastion host using one of these secure protocols, and then, if we're authenticated correctly, and assuming the bastion host has the correct routes enabled, we will be able to connect to other resources within the VPC from that bastion host. So a bastion host allows secure connections into your VPC.
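The ingress rules mentioned above can be pictured as a default-deny allow list. The following is a minimal sketch in plain Python, not an AWS API call: the `IngressRule` class, the `is_allowed` helper, and the admin address `203.0.113.10` (taken from a documentation-only range) are all hypothetical names chosen for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """One allow rule, loosely modelled on a security group ingress entry."""
    protocol: str     # e.g. "tcp"
    port: int         # 22 for SSH, 3389 for RDP
    source_cidr: str  # address range that is allowed to connect


def is_allowed(rules, protocol, port, source_ip):
    """Return True if any rule permits the connection; otherwise deny."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        rule.protocol == protocol
        and rule.port == port
        and addr in ipaddress.ip_network(rule.source_cidr)
        for rule in rules
    )


# Assumption: a single trusted admin address is allowed to reach SSH.
BASTION_RULES = [IngressRule("tcp", 22, "203.0.113.10/32")]

print(is_allowed(BASTION_RULES, "tcp", 22, "203.0.113.10"))  # trusted admin, SSH
print(is_allowed(BASTION_RULES, "tcp", 22, "198.51.100.5"))  # unknown address
print(is_allowed(BASTION_RULES, "tcp", 80, "203.0.113.10"))  # port not opened
```

Only the first check passes. Anything that does not match a rule is rejected, which is the behaviour that makes a bastion host in a public subnet reasonable to expose.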
Now Linux bastion hosts are generally deployed in two availability zones, so they support immediate access across the VPC, and you can configure the number of bastion host instances when you launch them. You can deploy an Auto Scaling group to ensure that the number of bastion host instances always matches the desired capacity you specify during the launch. And you can set that to be a maximum of one, and a minimum of one, if you just want to ensure a host is always available. Now the bastion host ports need to be limited to allow only the necessary access to that host. So for Linux bastion hosts, that's going to be TCP port 22, for SSH connections, and that will typically be the only port allowed. You can also limit access to only your IP address, for example. And you can do this by creating a security group, and associating the bastion instances with that security group. You can and should associate an Elastic IP address with the bastion instance. That way, the existing Elastic IP addresses are reassociated with the new instances if an instance is terminated and the Auto Scaling group launches a new instance in its place. This ensures that the same trusted Elastic IP addresses are used at all times, and it makes it easier to allow these IP addresses through on-premises firewalls. Now, a NAT instance, on the other hand, generally gives hosts in a private subnet within your VPC outbound access to the internet. So a bastion host allows inbound access from known IP addresses and authenticated users, while a NAT instance allows instances within your VPC to go out to the internet. Now as you can imagine, this is a really useful service. And you can end up with a number of instances within that private subnet using your NAT instance to download patches, get updates, or just synchronize files or time. And updates and patching often occur at specific times, when things are released.
So if you have 10 instances, all going out and hitting the same GitHub repository or patch site, that's likely to create quite a lot of burst network traffic. Now for the NAT instance, we have a number of options. First, you can create your own NAT instance from an AMI. You can find those AMIs inside Marketplace, or you might find some pre-baked public AMIs, which are very useful. Or, you can use the AWS NAT gateway service. The NAT gateway service is a managed service that you pay for by the hour. So the benefit of using the NAT gateway service, over creating your own NAT instance, is that the NAT gateway service is designed to be highly available. You don't have to worry about that. So let's delve into this a little further. A NAT instance will generally have two network interfaces. One network interface might be bound to an IP address in the private subnet, and the other network interface might be bound to an IP address in the public subnet, or just as likely, be bound to an Elastic IP address, or EIP. So these network interfaces should and would operate independently of each other under any other given scenario. However, with a NAT instance, the network address translation table on that instance is configured to allow traffic from one of the IP address ranges to be translated to an IP address associated with the other network interface. So the NAT instance forwards traffic from instances in the private subnet to the public EIP interface, which in turn has egress out via our internet gateway. The NAT instance maintains some state, so it will also send any responses you get back from the public domain to your private instances. When traffic goes to the internet, the source IPv4 address is replaced with the NAT device's address, and similarly, when the response traffic comes back, the NAT device translates the address back to those instances' private IPv4 addresses. So here's how it might look in numbers.
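Before the concrete addresses, the translation step itself can be sketched in a few lines of Python. This is a toy model rather than how any particular NAT AMI or the NAT gateway service is implemented: the `NatSketch` class, the port-based translation table, and every address used here (all from private or documentation-only ranges) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# A packet is reduced to (src_ip, src_port, dst_ip, dst_port).
Packet = Tuple[str, int, str, int]


@dataclass
class NatSketch:
    """Toy source NAT: rewrites the source on the way out, restores it on the way back."""
    public_ip: str
    next_port: int = 1024
    # Translation table: public port -> (private ip, private port).
    table: Dict[int, Tuple[str, int]] = field(default_factory=dict)

    def outbound(self, packet: Packet) -> Packet:
        """Replace the private source address with the NAT device's public address."""
        src_ip, src_port, dst_ip, dst_port = packet
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (src_ip, src_port)  # remember who asked
        return (self.public_ip, public_port, dst_ip, dst_port)

    def inbound(self, packet: Packet) -> Optional[Packet]:
        """Translate a response back to the private address, or drop it if unknown."""
        src_ip, src_port, _dst_ip, dst_port = packet
        private = self.table.get(dst_port)
        if private is None:
            return None  # no matching state: unsolicited traffic is not forwarded
        return (src_ip, src_port, private[0], private[1])


nat = NatSketch(public_ip="198.51.100.7")             # hypothetical Elastic IP
request = ("10.0.1.25", 40000, "203.0.113.80", 443)   # private host -> patch site
translated = nat.outbound(request)
reply = (translated[2], translated[3], translated[0], translated[1])
restored = nat.inbound(reply)

print(translated)
print(restored)
```

The stored table entry is the "state" the instance keeps: a reply is forwarded only when it matches a connection that a private host started, which is why a NAT device gives outbound access without allowing inbound connections.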
We might have a VPC of 10.0.0.0, with a CIDR of /16, and a public subnet of 10.0.0.0/24. We'd have a route in that public subnet of 0.0.0.0/0 to igw-id, which is our internet gateway. So this directs traffic from the hosts within our public subnet out to the internet gateway, and thereby out to the public domain. Now the NAT instance might have an Elastic IP address of 188.8.131.52, and it will sit inside this public subnet. Now a private subnet might have a CIDR block of 10.0.1.0/24, and let's say it's got a route of 0.0.0.0/0, pointing to NAT-GATEWAY-ID. So this directs traffic outbound from our hosts within the private subnet to the NAT gateway, where the translation occurs, and traffic is forwarded out through our internet gateway. So that's how it basically passes traffic around, which is very, very useful. Now both bastion hosts and NAT instances tend to be single or small instances, right? Which means the instance running either of these services can become a single point of failure. So if we're designing a network to be highly available, we need to think through how we can maintain availability if our host becomes unavailable. Now as a rule of thumb, to enable high availability, I would configure the bastion instance in an Auto Scaling group, and I'd specify the Auto Scaling group to include multiple AZs. And I'd set a minimum size of one, and a maximum size of one. That way, the Auto Scaling group will launch a new instance of the bastion host if the running instance fails. We'd want to ensure our Auto Scaling launch configuration provisions new instances with the correct ENI, or NAT configuration, and we'd also want to disable source and destination checking on the Elastic network interface. Now an EIP is optional, but absolutely preferred. I mean, we could register the instances with Route 53, and use a domain name record rather than an Elastic IP address. However, a public IP address is easier to manage and configure, all right.
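The two route tables in this walkthrough can be checked with Python's standard `ipaddress` module. This is a sketch under stated assumptions: the private subnet is taken to be 10.0.1.0/24 so that it falls inside the VPC's /16 range, the `local` route for traffic within the VPC is added for completeness, and `igw-id` and `nat-gateway-id` are placeholder targets, not real resource IDs. The `next_hop` helper is a hypothetical name.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
PUBLIC_SUBNET = ipaddress.ip_network("10.0.0.0/24")
PRIVATE_SUBNET = ipaddress.ip_network("10.0.1.0/24")  # assumed; must sit inside the VPC range

# Route tables as {destination CIDR: target}. Targets are placeholders.
PUBLIC_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-id"}
PRIVATE_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gateway-id"}


def next_hop(routes, destination):
    """Pick the most specific (longest-prefix) route that contains the destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(cidr)
        for cidr in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[str(best)]


# Both subnets have to be carved out of the VPC's address range.
print(PUBLIC_SUBNET.subnet_of(VPC_CIDR), PRIVATE_SUBNET.subnet_of(VPC_CIDR))

# Traffic that stays inside the VPC uses the local route.
print(next_hop(PRIVATE_ROUTES, "10.0.0.15"))

# Internet-bound traffic (a documentation-range address stands in for a patch site).
print(next_hop(PUBLIC_ROUTES, "198.51.100.20"))
print(next_hop(PRIVATE_ROUTES, "198.51.100.20"))
```

The only difference between the two tables is the target of the default route, and that single entry is what makes one subnet public and the other private.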
Now just to do a quick summary, a bastion host acts as a jump host, or a reverse proxy, and a jump host will generally be an instance running in a public subnet within your VPC. The bastion host will allow you to connect to it via a secure protocol, like SSH, or RDP if we're using Windows. And it will then allow you to jump to another instance within your VPC. A NAT instance, on the other hand, or NAT gateway, generally allows traffic to flow out of your VPC, an example being, perhaps, a database server that needs to connect to a GitHub repository to get a patch or an update. Now you can create your own NAT instance from a pre-baked AMI, or you can simply take the easy route and use the AWS NAT gateway service, which is designed to provide high availability, and is something you pay for by the hour. NAT devices are not supported for IPv6 traffic. If you have an IPv6 network interface, you need to use an egress-only internet gateway instead of a NAT device. Okay, so there's a link here to a post that we did on this, if you want to know more. And I hope that solves the problem for you.
- Amazon SQS versus Amazon SNS - What is the Difference?
- Autoscale Limits - what are they and how do I change them if I need to?
About the Author
Head of Content
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession" as everything AWS starts with the customer. Passions around work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start ups.