EC2 Auto Scaling
Start course
2h 49m

This section of the Solution Architect Associate learning path introduces you to the core computing concepts and services relevant to the SAA-C03 exam. We start with an introduction to the AWS compute services, understand the options available and learn how to select and apply AWS compute services to meet specific requirements. 

Want more? Try a lab playground or do a Lab Challenge

Learning Objectives

  • Learn the fundamentals of AWS compute services such as EC2, ECS, EKS, and AWS Batch
  • Understanding how load balancing and autoscaling can be used to optimize your workloads
  • Learn about the AWS serverless compute services and capabilities

Hello and welcome to the first of the lectures that will be covering EC2 Auto Scaling. So what exactly is EC2 Auto Scaling? Put simply, Auto Scaling is a mechanism that automatically allows you to increase or decrease your EC2 resources to meet the demand based off of custom defined metrics and thresholds. 

In AWS, there is EC2 Auto Scaling which focuses on the scaling of your EC2 fleet, but there's also an Auto Scaling service. This service allows you to scale Amazon ECS tasks, DynamoDB tables and indexes, in addition to Amazon Aurora replicas. For this course I will just be focusing on EC2 Auto Scaling. Let's look at an example of how EC2 Auto Scaling can be used in practice. 

Let's say you had a single EC2 instance acting as a web server receiving requests from the public users across the Internet. As the requests and demand increases, so does the load on the instance. Additional processing power will be required to process the additional requests and therefore the CPU utilization would also increase. To avoid running out of CPU resource on your instance, which would lead to poor performance experienced by your end users, you would need to deploy another EC2 instance to load balance the demand and process the increased requests. With Auto Scaling, you could configure a metric to automatically launch a second instance when the CPU utilization got to 75% on the first instance. By load balancing traffic evenly, it would reduce the demand put upon each instance and reduce the chance of the first web server failing or slowing due to high CPU usage. Similarly, when the demand on your web server reduces, so would your CPU utilization. So you could also set a metric to scale back. In this example, you could configure Auto Scaling to automatically terminate one of your EC2 instances when the CPU utilization dropped to 20% as it would no longer be required due to the decreased demand. 

By scaling your resources back helps to optimize the cost of your EC2 fleet as you only pay for resources when they are running. Through these customizable and defined metrics, you can increase, scale out, and decrease, scale in, the size of your EC2 fleet automatically with ease. This has many advantages and here are some of the key points. Firstly, automation. As this provides automatic provisioning based off of custom defined thresholds, your infrastructure can elastically provision the required resources, preventing your operations team from manually deploying and removing resources to meet demands put upon your infrastructure. Greater customer satisfaction. If you are always able to provision enough capacity within your environment when the demand increases, then it's unlikely that your end users will experience performance issues, which will help with user retention. And cost reduction. With the ability to automatically reduce the amount of resources you have when the demand drops, you will stop paying for those resources. You only pay for an EC2 resource when it's up and running, which is based on a per second basis. When you couple Auto Scaling with an Elastic Load Balancer, you get a real sense of how beneficial building a scalable and flexible architecture for your resources can be. 

In the next lecture, I shall be explaining the different components of Auto Scaling before providing a demonstration on how to configure it.

About the Author
Learning Paths

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.

To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.

Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.

He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.

In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.

Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.