Exam Prep - Auto Scaling
3h 8m

Course Description

The AWS exam guide indicates that 60% of the Solutions Architect–Associate exam questions could cover designing highly available, fault-tolerant, cost-efficient, scalable systems. This course teaches you to recognize and explain the core architecture principles of high availability, fault tolerance, and cost optimization. We then step through the core AWS components that, used together, enable highly available solutions, so you can recognize and explain how to design and monitor highly available, cost-efficient, fault-tolerant, scalable systems.

Course Objectives

  • Identify and recognize cloud architecture considerations such as functional components and effective designs
  • Define best practices for planning, designing, and monitoring in the cloud
  • Develop to client specifications, including pricing and cost
  • Evaluate architectural trade-off decisions when building for the cloud
  • Apply best practices for elasticity and scalability concepts to your builds
  • Integrate with existing development environments

Intended Audience

This course is for anyone preparing for the Solutions Architect–Associate for AWS certification exam. We assume you have some existing knowledge and familiarity with AWS, and are specifically looking to get ready to take the certification exam.


Prerequisites

Basic knowledge of core AWS functionality. If you haven't already completed it, we recommend our Fundamentals of AWS Learning Path. We also recommend completing the other courses, quizzes, and labs in the Solutions Architect–Associate for AWS certification learning path.

This Course Includes:

  • 11 video lectures
  • Detailed overview of the AWS services that enable high availability, cost efficiency, fault tolerance, and scalability
  • A focus on designing systems in preparation for the certification exam

What You'll Learn

Lecture Group | What you'll learn
Designing for high availability, fault tolerance, and cost efficiency | How to combine AWS services to create highly available, cost-efficient, fault-tolerant systems.
Designing for business continuity | How to recognize and explain Recovery Time Objective (RTO) and Recovery Point Objective (RPO), and how to recognize and implement AWS solution designs that meet common RTO/RPO objectives.
Ten AWS services that enable high availability | Regions and Availability Zones, VPCs, ELB, SQS, EC2, Route 53, EIP, CloudWatch, and Auto Scaling

If you have thoughts or suggestions for this course, please contact Cloud Academy at


Okay, Cloud Academy ninjas, let's review Auto Scaling. The elements of an Auto Scaling group that are mandatory are a minimum size and a launch configuration. Okay? Minimum size, launch configuration. Health checks and desired capacity are optional. All right, but you have to have a minimum size and a launch configuration. You only need the launch configuration name, the AMI, and the instance type to create an Auto Scaling launch configuration. Things like the key pair, the security group, and the block device mapping are all optional elements when you're creating your launch configuration.

There are a few common errors when we're talking about Auto Scaling. A classic one is the default limit for launch configurations, which is 100 launch configurations per region. If you've reached this limit, your call to create a launch configuration is going to fail. Okay? You can check this limit by running aws autoscaling describe-account-limits at the command line. It will tell you what your current limit is. The default number of instances that you can launch per region is another classic way of stopping or stalling your Auto Scaling, right? The current default number of instances you can launch per region is 20. So if an Auto Scaling group tries to exceed that number, it will stop expanding. You can increase your EC2 limit by logging a ticket with AWS Support and requesting a higher one.

Elastic Load Balancers just check the health of an instance, and if the instance is not healthy they stop routing traffic to it. They do not terminate the instance. That job is done by the Auto Scaling group. Right? The ELB just checks the health: if the instance is healthy it sends traffic, and if it's not healthy it sends the traffic elsewhere.

Auto Scaling is designed to scale out based on an event like increased traffic, and to scale in whenever traffic drops off.
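As a rough illustration of the launch configuration points above, here is a minimal Python sketch. It assumes the boto3 Auto Scaling client, whose create_launch_configuration and describe_account_limits calls use the parameter and response keys shown. The helper names (build_launch_configuration_params, launch_configurations_remaining, create_if_room) are hypothetical and exist only for this example. Note that when you go on to create the group itself, boto3 also asks for a group name and a maximum size alongside the minimum size.

```python
from typing import Any, Dict, List, Optional


def build_launch_configuration_params(
    name: str,
    image_id: str,
    instance_type: str,
    key_name: Optional[str] = None,
    security_groups: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for boto3's create_launch_configuration call."""
    # The three elements the review calls mandatory.
    params: Dict[str, Any] = {
        "LaunchConfigurationName": name,
        "ImageId": image_id,
        "InstanceType": instance_type,
    }
    # Optional elements are only sent when the caller supplies them.
    if key_name is not None:
        params["KeyName"] = key_name
    if security_groups:
        params["SecurityGroups"] = list(security_groups)
    return params


def launch_configurations_remaining(limits: Dict[str, Any]) -> int:
    """How many more launch configurations fit under the regional limit.

    `limits` is assumed to look like a describe_account_limits response.
    """
    return (
        limits["MaxNumberOfLaunchConfigurations"]
        - limits["NumberOfLaunchConfigurations"]
    )


def create_if_room(client: Any, params: Dict[str, Any]) -> bool:
    """Create the launch configuration only when the limit leaves room.

    `client` is assumed to be boto3.client("autoscaling") or a stand-in
    object exposing the same two methods.
    """
    if launch_configurations_remaining(client.describe_account_limits()) <= 0:
        return False
    client.create_launch_configuration(**params)
    return True
```

With real credentials you would pass boto3.client("autoscaling") as the client, plus a name, AMI ID, and instance type of your own. Checking the limit first simply avoids the failed create call described above.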
Scaling in when traffic drops keeps things as cost effective as possible, right? However, you can also use Auto Scaling for steady-state workloads that need a consistent number of EC2 instances. You can use it to monitor and keep that specific number of EC2 instances running using the minimum and maximum value settings.

Scheduled scaling is where you set a date and time and have Auto Scaling increase the size of the group based on predictable traffic patterns. For example, if people run a lot of business reports on Fridays, or you expect traffic spikes because you run a retail site, you can set scheduled actions to increase the number of instances for that particular period. Just keep in mind that an Auto Scaling launch configuration cannot add an already running instance to an Auto Scaling group.

Okay, we can never review Auto Scaling enough, so let's quickly review the methods that are supported by Auto Scaling. Remember, Launch adds a new EC2 instance to the Auto Scaling group, increasing its capacity. If you suspend Launch, this affects your other processes. As an example, you can't return an instance in a standby state to service if the Launch process is suspended, because the group can't scale. Another action is Terminate, which removes an EC2 instance from the group, decreasing its capacity. If you suspend Terminate, that also affects other processes.

Health check, the very, very important health check, checks the health of the instances. Auto Scaling marks an instance as unhealthy if Amazon EC2 or Elastic Load Balancing tells Auto Scaling that the instance is unhealthy. This process can override the health status of an instance that you've set manually. Replace unhealthy terminates instances that are marked as unhealthy and then creates new instances to replace them. This process works with the health check process and uses both the Terminate and Launch processes.

Another one is availability zone rebalance, or AZ rebalance.
It balances the number of EC2 instances in the group across the availability zones available in your region. If you remove an availability zone from your Auto Scaling group, or an availability zone becomes unhealthy or unavailable, Auto Scaling launches new instances in an unaffected availability zone before terminating the unhealthy or unavailable instances. Let's just underline that and repeat. Okay? If an availability zone becomes unhealthy or unavailable for whatever reason, Auto Scaling launches new instances in an unaffected availability zone first, before terminating the unhealthy or unavailable instances. When the unhealthy availability zone returns to a healthy state, Auto Scaling automatically redistributes the instances evenly across the availability zones in the group. All right?

If you suspend AZ rebalance and a scale-out or scale-in event occurs, Auto Scaling still tries to balance the availability zones. During scale out, for example, Auto Scaling launches the instance in the availability zone with the fewest instances. Now, if you suspend the Launch process, AZ rebalance doesn't launch new instances or terminate existing instances. That's because AZ rebalance terminates instances only after launching the replacement instances.

If you suspend Terminate, your Auto Scaling group can grow up to 10% larger than its maximum size, because Auto Scaling allows this temporarily during the rebalancing activity. So if you suddenly see your Auto Scaling group going above the maximum you set, that can be a possible reason why. If Auto Scaling cannot terminate instances, your Auto Scaling group could remain above its maximum size until you resume the Terminate process. So if you've got a larger Auto Scaling group than you should have, possibly you've suspended the Terminate action.

Another action is alarm notification, which accepts notifications from CloudWatch alarms that are associated with the group.
If you suspend alarm notification, Auto Scaling doesn't automatically execute policies that would be triggered by an alarm. And if you suspend Launch or Terminate, for example, Auto Scaling would not be able to execute scale-out or scale-in policies. Scheduled actions performs the scheduled actions that you create. Again, if you suspend Launch or Terminate, scheduled actions that involve launching or terminating instances are going to be affected.

Another key method is add to load balancer, which adds instances to the attached load balancer or target group when they're launched. If you suspend add to load balancer, Auto Scaling launches the instances but does not add them to the load balancer or target group. If you then resume the add to load balancer process, Auto Scaling resumes adding instances to the load balancer or target group when they are launched. However, Auto Scaling does not add the instances that were launched while this process was suspended. You've got to register those instances manually.
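Going back to the AZ rebalance behavior covered earlier in this review, the two rules worth memorizing can be sketched in a few lines of Python. This is only a toy model, not how the service is implemented. The function names are hypothetical, and the rounding rule (10% rounded down, with at least one extra instance allowed) is an assumption made for the sketch; the review itself only states the 10% figure.

```python
from typing import Dict


def pick_zone_for_launch(instances_per_zone: Dict[str, int]) -> str:
    """Return the zone with the fewest instances.

    Ties go to the alphabetically first zone name (an arbitrary choice
    made for this sketch).
    """
    return min(sorted(instances_per_zone), key=lambda zone: instances_per_zone[zone])


def temporary_ceiling(max_size: int) -> int:
    """Largest size the group may briefly reach while rebalancing.

    Assumption for this sketch: 10% of max_size rounded down, with a
    floor of one extra instance.
    """
    return max_size + max(1, max_size // 10)
```

For example, with three instances in one zone, one in a second, and two in a third, pick_zone_for_launch returns the second zone. A group with a maximum size of 30 could briefly reach 33 under the assumed rule, and it would stay there if Terminate were suspended.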
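The add to load balancer point is easy to get wrong on the exam, so here is a small Python model of it. It is a simplified sketch, not the service's real logic. The GroupModel class and its fields are hypothetical. The process names Launch and AddToLoadBalancer match the names the service uses, and in boto3 the real calls for suspending and resuming are suspend_processes and resume_processes on the Auto Scaling client.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class GroupModel:
    """Toy model of an Auto Scaling group with suspendable processes."""

    suspended: Set[str] = field(default_factory=set)
    instances: List[str] = field(default_factory=list)
    registered: Set[str] = field(default_factory=set)

    def launch(self, instance_id: str) -> bool:
        """Launch an instance unless the Launch process is suspended."""
        if "Launch" in self.suspended:
            return False
        self.instances.append(instance_id)
        # Registration happens only at launch time, and only while
        # AddToLoadBalancer is active.
        if "AddToLoadBalancer" not in self.suspended:
            self.registered.add(instance_id)
        return True

    def resume(self, process: str) -> None:
        """Resume a process; earlier launches are not registered retroactively."""
        self.suspended.discard(process)

    def needs_manual_registration(self) -> List[str]:
        """Instances launched while AddToLoadBalancer was suspended."""
        return [i for i in self.instances if i not in self.registered]
```

If you launch one instance normally, suspend AddToLoadBalancer, launch a second, then resume and launch a third, only the second instance shows up in needs_manual_registration. That is the instance you would have to register with the load balancer yourself.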

About the Author

Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.