
Design Components - Auto Scaling

Course Introduction
Domain One of the AWS Solutions Architect Associate exam guide (SAA-C02) requires us to be able to design a multi-tier architecture solution, so that is our topic for this course.
The objective of this course is to prepare you to answer questions related to this domain. We'll cover the need-to-know aspects of how to design multi-tier solutions using AWS services.

Learning Objectives
By the end of this course, you will be well prepared to answer questions related to Domain One in the Solutions Architect Associate exam.

Architecture Basics 
You need to be familiar with a number of technology stacks that are common to multi-tier solution design for the Associate certification: LAMP, MEAN, serverless, and microservices are all relevant patterns to know for the exam.

What is Multi-Tier Architecture?
A business application generally needs three things: something to interact with users, often called the presentation layer; something to process those interactions, often called the logic or application layer; and somewhere to store the data from that logic and those interactions, commonly called the data tier.

When Should You Consider a Multi-Tier Design?
The key thing to remember is that the benefit of multi-tier architecture is that the tiers are decoupled, which enables each tier to be scaled up or down independently to meet demand. Handling this burst activity is a major benefit of building applications in the cloud.

When Should We Consider Single-Tier Design?
Single-tier generally implies that all your application services are running on one machine or instance. A single-tier deployment is generally cost-effective and easy to manage, but speed and cost are about all it offers in terms of benefits. Single-tier suits development or test environments where small teams need to build and test quickly.

Design a Multi-Tier Solution 
First, we review the design of a multi-tier architecture pattern using EC2 instances and Elastic Load Balancers. Then we'll review how we could create a similar solution using serverless services or a full microservices design.

AWS Services We Use

The Virtual Private Cloud
Subnets and Availability Zones 
Auto Scaling 
Elastic Load Balancers 
Security groups and NACLs
Amazon CloudFront
AWS WAF and AWS Shield 

Serverless Design 
AWS Lambda 
Amazon API Gateway 

Microservices Design 
AWS Secrets Manager 

Sample Questions
We review sample exam questions to apply and solidify our knowledge. 

Course Summary 
Review of the content covered to help you prepare for the exam. 




Auto Scaling is a core component of increasing durability and availability on AWS. Auto Scaling enables you to provision based on actual demand rather than on estimated demand. Auto Scaling has health checks built in, and it can drop EC2 instances that are not responding and replace them with newly launched, healthy ones. All of this functionality works across multiple Availability Zones within a Region, helping you achieve high availability with minimal manual intervention.
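The health-check-and-replace behavior described above can be sketched in a few lines. This is an illustrative Python simulation, not the AWS API; the instance structure and function names are invented for the example.

```python
# Illustrative sketch (not the AWS API): an Auto Scaling group's health
# checks drop unresponsive instances and launch replacements until the
# group is back at its desired capacity.
import itertools

_ids = itertools.count(1)

def launch_instance():
    """Stand-in for launching a new EC2 instance from the launch configuration."""
    return {"id": f"i-{next(_ids):04d}", "healthy": True}

def reconcile(instances, desired_capacity):
    """Drop instances failing health checks, then launch replacements."""
    healthy = [i for i in instances if i["healthy"]]
    while len(healthy) < desired_capacity:
        healthy.append(launch_instance())
    return healthy

group = [launch_instance() for _ in range(3)]
group[1]["healthy"] = False                    # one instance stops responding
group = reconcile(group, desired_capacity=3)   # it is dropped and replaced
```

The point of the sketch is the reconciliation loop: capacity is restored automatically, with no manual intervention.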

Auto Scaling should be used when you want to ensure that you have, say, at least one EC2 instance running at all times, or perhaps one or two EC2 instances always running in each of your Availability Zones. Another important use case is when you need to scale based on increased demand, or when you know of an upcoming event where your EC2 instances may be under a heavy workload. You can also ensure you scale back down to a normal EC2 presence and not waste money on idle resources. Auto Scaling makes horizontal scaling easy, based on demand or on predetermined schedules, and you can increase or decrease the number of EC2 instances running based on CloudWatch metrics. In the example on my screen, the scaling policy will add an instance if the average CPU utilization equals or exceeds 70% for more than 300 seconds, which is five minutes. The other part of the policy will remove an instance from the Auto Scaling group if the average CPU for the group equals or is less than 20% for five minutes.

Your group uses a launch configuration as a template for its EC2 instances. When you create a launch configuration, you specify information such as the AMI ID, instance type, key pair, security groups, and block device mapping for your instances. An Auto Scaling group enables you to specify a minimum, maximum, and desired number of EC2 instances in that group. A scaling plan tells Auto Scaling when and how to scale, and you can base a scaling plan on the occurrence of specified conditions: dynamic scaling, for example.
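The CPU-based policy just described can be expressed as a simple decision function. The thresholds and five-minute period come from the example on screen; the function itself is an illustrative sketch, not an AWS API.

```python
# Sketch of the scaling policy described above: add an instance when
# average CPU is >= 70% sustained for 300 seconds, remove one when it
# is <= 20% for 300 seconds; otherwise do nothing.
SCALE_OUT_CPU, SCALE_IN_CPU, PERIOD_SECONDS = 70.0, 20.0, 300

def scaling_decision(avg_cpu, seconds_in_breach):
    if seconds_in_breach < PERIOD_SECONDS:
        return "no_action"        # the breach must persist for five minutes
    if avg_cpu >= SCALE_OUT_CPU:
        return "scale_out"        # add one instance to the group
    if avg_cpu <= SCALE_IN_CPU:
        return "scale_in"         # remove one instance from the group
    return "no_action"

print(scaling_decision(75.0, 300))   # scale_out
print(scaling_decision(15.0, 360))   # scale_in
print(scaling_decision(50.0, 600))   # no_action
```

In AWS, the "seconds in breach" part is handled by a CloudWatch alarm evaluating the metric over consecutive periods; the function above just makes that logic explicit.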

Consider an Auto Scaling group that spans two Availability Zones, has a desired capacity of two instances, and has scaling policies that increase and decrease the number of instances by one when certain thresholds are met. When the threshold for the scale-out policy is met, the policy takes effect and Auto Scaling launches a new instance; the Auto Scaling group now has three instances. When the threshold for the scale-in policy is met, the policy takes effect and Auto Scaling terminates one of the instances. If the group does not have a specific termination policy assigned to it, Auto Scaling uses the default termination policy: it selects the Availability Zone with two instances and terminates the instance launched from the oldest launch configuration. If the instances were launched from the same launch configuration, then Auto Scaling selects the instance that is closest to the next billing hour and terminates it. This helps you maximize the use of your EC2 instances while minimizing the number of hours you are billed for Amazon EC2 usage. If there is more than one unprotected instance closest to the next billing hour, Auto Scaling selects one of those instances at random.
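The default termination policy's selection order can be sketched as a filter chain. This is an illustrative Python model, assuming a made-up instance structure; it is not how AWS implements it, but it follows the same order of tie-breakers described above.

```python
# Sketch of the default termination policy order described above:
# 1) pick the Availability Zone with the most instances, 2) prefer the
# instance from the oldest launch configuration, 3) break ties by
# proximity to the next billing hour, 4) otherwise choose at random.
# Field names are invented for illustration.
import random
from collections import Counter

def choose_instance_to_terminate(instances):
    # Step 1: the busiest (imbalanced) Availability Zone.
    az_counts = Counter(i["az"] for i in instances)
    busiest_az = max(az_counts, key=az_counts.get)
    candidates = [i for i in instances if i["az"] == busiest_az]

    # Step 2: oldest launch configuration (lower version = older here).
    oldest = min(i["lc_version"] for i in candidates)
    candidates = [i for i in candidates if i["lc_version"] == oldest]

    # Step 3: closest to the next billing hour.
    closest = min(i["secs_to_billing_hour"] for i in candidates)
    candidates = [i for i in candidates if i["secs_to_billing_hour"] == closest]

    # Step 4: random choice among any remaining ties.
    return random.choice(candidates)

group = [
    {"id": "i-a", "az": "us-east-1a", "lc_version": 1, "secs_to_billing_hour": 900},
    {"id": "i-b", "az": "us-east-1a", "lc_version": 2, "secs_to_billing_hour": 120},
    {"id": "i-c", "az": "us-east-1b", "lc_version": 1, "secs_to_billing_hour": 300},
]
victim = choose_instance_to_terminate(group)
print(victim["id"])  # i-a: busiest AZ is us-east-1a, and i-a has the older launch configuration
```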

All right, so how about creating your own termination policy? You have the option of replacing the default policy with a customized one. When you customize the termination policy, Auto Scaling first assesses the Availability Zones for any imbalance. If an Availability Zone has more instances than the other Availability Zones used by the group, then Auto Scaling applies your specified termination policy to the instances in the imbalanced Availability Zone. If the Availability Zones used by the group are balanced, then Auto Scaling simply applies the termination policy that you specified. Note that, by default, instance protection is disabled, and instance protection does not protect Auto Scaling instances from manual termination through the Amazon EC2 console, the terminate-instances command, or the TerminateInstances API. So Auto Scaling enables horizontal scaling, i.e., adding more instances, rather than vertical scaling, which traditionally means increasing the size or capacity of a single instance or machine.

When an auto scaling rule is triggered, the scaling plan you define launches new instances based on the launch configuration. Auto Scaling can be scheduled to scale up or down based on estimated usage, or you can set alarms to change the size of an Auto Scaling group based on CPU, network, or memory usage. So when the instances in your group reach a certain threshold, an alarm fires and the Auto Scaling group launches more instances to meet that demand. This is way more resilient. So if you have a question that mentions high availability, then most likely it is going to need a decoupled multi-tier architecture, and it is going to need to be able to scale parts of that application quickly to meet burst activity. So before we move into design mode, there are a few questions to ask yourself if you're presented with a scenario. Does it need to be highly available? Does it need to be scalable? Does it need to be fault tolerant? And in practical terms, does the system need to always be able to answer customer requests? If so, then it needs a multi-tier architecture.
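Scheduled scaling, mentioned above, boils down to mapping a point in time to a desired capacity. Here is a minimal sketch of that idea; the schedule window and the capacities are invented for illustration (in AWS this would be a scheduled action on the Auto Scaling group, not your own code).

```python
# Sketch of scheduled scaling: scale out ahead of a known busy window
# and scale back in afterwards, rather than waiting for a metric alarm.
# The business-hours window and capacities are hypothetical.
from datetime import time

SCHEDULE = [
    (time(8, 0), time(20, 0), 6),   # busy window: run six instances
]
BASELINE = 2                        # off-peak desired capacity

def desired_capacity(now):
    for start, end, capacity in SCHEDULE:
        if start <= now < end:
            return capacity
    return BASELINE

print(desired_capacity(time(12, 0)))  # 6 (inside the busy window)
print(desired_capacity(time(23, 0)))  # 2 (back to baseline overnight)
```

The design point: scheduled scaling anticipates known load, while alarm-driven dynamic scaling reacts to measured load; real groups often combine both.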

About the Author

Head of Content

Andrew is an AWS certified professional who is passionate about helping others learn how to use and benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession", as everything AWS starts with the customer. Outside of work, his passions are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start-ups.
