EC2 Auto Scaling
Start course
1h 21m

This course covers the core learning objective to meet the requirements of the 'Designing Compute instances solutions in AWS - Level 1' skill

Learning Objectives:

  • Understand there are different Amazon EC2 compute families
  • Understand the different services that provide compute resources, such as AWS Lambda compared to Amazon EC2, or the Amazon Elastic Container Service, etc
  • Understand that elasticity can be achieved through AWS Auto Scaling
  • Understand the purpose of AWS Elastic load balancers



Hello and welcome to the first of the lectures that will be covering EC2 Auto Scaling. So what exactly is EC2 Auto Scaling? Put simply, Auto Scaling is a mechanism that automatically allows you to increase or decrease your EC2 resources to meet the demand based off of custom defined metrics and thresholds. 

In AWS, there is EC2 Auto Scaling which focuses on the scaling of your EC2 fleet, but there's also an Auto Scaling service. This service allows you to scale Amazon ECS tasks, DynamoDB tables and indexes, in addition to Amazon Aurora replicas. For this course I will just be focusing on EC2 Auto Scaling. Let's look at an example of how EC2 Auto Scaling can be used in practice. 

Let's say you had a single EC2 instance acting as a web server receiving requests from the public users across the Internet. As the requests and demand increases, so does the load on the instance. Additional processing power will be required to process the additional requests and therefore the CPU utilization would also increase. To avoid running out of CPU resource on your instance, which would lead to poor performance experienced by your end users, you would need to deploy another EC2 instance to load balance the demand and process the increased requests. With Auto Scaling, you could configure a metric to automatically launch a second instance when the CPU utilization got to 75% on the first instance. By load balancing traffic evenly, it would reduce the demand put upon each instance and reduce the chance of the first web server failing or slowing due to high CPU usage. Similarly, when the demand on your web server reduces, so would your CPU utilization. So you could also set a metric to scale back. In this example, you could configure Auto Scaling to automatically terminate one of your EC2 instances when the CPU utilization dropped to 20% as it would no longer be required due to the decreased demand. 

By scaling your resources back helps to optimize the cost of your EC2 fleet as you only pay for resources when they are running. Through these customizable and defined metrics, you can increase, scale out, and decrease, scale in, the size of your EC2 fleet automatically with ease. This has many advantages and here are some of the key points. Firstly, automation. As this provides automatic provisioning based off of custom defined thresholds, your infrastructure can elastically provision the required resources, preventing your operations team from manually deploying and removing resources to meet demands put upon your infrastructure. Greater customer satisfaction. If you are always able to provision enough capacity within your environment when the demand increases, then it's unlikely that your end users will experience performance issues, which will help with user retention. And cost reduction. With the ability to automatically reduce the amount of resources you have when the demand drops, you will stop paying for those resources. You only pay for an EC2 resource when it's up and running, which is based on a per second basis. When you couple Auto Scaling with an Elastic Load Balancer, you get a real sense of how beneficial building a scalable and flexible architecture for your resources can be. 

In the next lecture, I shall be explaining the different components of Auto Scaling before providing a demonstration on how to configure it.

About the Author
Learning Paths

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.

To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.

Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.

He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.

In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.

Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.