EC2 Auto Scaling
Start course
2h 7m

This course provides detail on the AWS Compute services relevant to the Developer - Associate exam.  We shall be looking at Amazon EC2, AWS Elastic Beanstalk, and AWS Lambda.

Want more? Try a lab playground or do a Lab Challenge!

Learning Objectives


  • Understand when you use Amazon EC2
  • Learn about the components of Amazon EC2
  • How to create and deploy EC2 services
  • Understand what EC2 auto scaling is 
  • Be able to configure auto-scaling launch configurations, launch templates, and auto-scaling groups
  • The ability to explain what AWS Elastic Beanstalk is and what it is used for
  • The knowledge of the different environments that Elastic Beanstalk provides, allowing you to select the most appropriate option for your needs
  • An explanation of how to configure the service and some of the parameters that you can alter to meet your application requirements
  • The knowledge of the different monitoring options available for assessing your environment and resources health
  • Be able to explain what AWS Lambda is and what its uses are
  • Define the components used within Lambda
  • Explain the different elements of a Lambda function through its creation
  • Understand the key differences between policies used within Lambda
  • Recognize how event sources and event mappings are managed for both synchronous and asynchronous invocations
  • Discover how Amazon CloudWatch can monitor metrics and logs to isolate issues with your functions
  • Learn how to check for common errors that might be causing your functions to fail

Hello and welcome to the first of the lectures that will be covering EC2 Auto Scaling. So what exactly is EC2 Auto Scaling? Put simply, Auto Scaling is a mechanism that automatically allows you to increase or decrease your EC2 resources to meet the demand based off of custom defined metrics and thresholds. 

In AWS, there is EC2 Auto Scaling which focuses on the scaling of your EC2 fleet, but there's also an Auto Scaling service. This service allows you to scale Amazon ECS tasks, DynamoDB tables and indexes, in addition to Amazon Aurora replicas. For this course I will just be focusing on EC2 Auto Scaling. Let's look at an example of how EC2 Auto Scaling can be used in practice. 

Let's say you had a single EC2 instance acting as a web server receiving requests from the public users across the Internet. As the requests and demand increases, so does the load on the instance. Additional processing power will be required to process the additional requests and therefore the CPU utilization would also increase. To avoid running out of CPU resource on your instance, which would lead to poor performance experienced by your end users, you would need to deploy another EC2 instance to load balance the demand and process the increased requests. With Auto Scaling, you could configure a metric to automatically launch a second instance when the CPU utilization got to 75% on the first instance. By load balancing traffic evenly, it would reduce the demand put upon each instance and reduce the chance of the first web server failing or slowing due to high CPU usage. Similarly, when the demand on your web server reduces, so would your CPU utilization. So you could also set a metric to scale back. In this example, you could configure Auto Scaling to automatically terminate one of your EC2 instances when the CPU utilization dropped to 20% as it would no longer be required due to the decreased demand. 

By scaling your resources back helps to optimize the cost of your EC2 fleet as you only pay for resources when they are running. Through these customizable and defined metrics, you can increase, scale out, and decrease, scale in, the size of your EC2 fleet automatically with ease. This has many advantages and here are some of the key points. Firstly, automation. As this provides automatic provisioning based off of custom defined thresholds, your infrastructure can elastically provision the required resources, preventing your operations team from manually deploying and removing resources to meet demands put upon your infrastructure. Greater customer satisfaction. If you are always able to provision enough capacity within your environment when the demand increases, then it's unlikely that your end users will experience performance issues, which will help with user retention. And cost reduction. With the ability to automatically reduce the amount of resources you have when the demand drops, you will stop paying for those resources. You only pay for an EC2 resource when it's up and running, which is based on a per second basis. When you couple Auto Scaling with an Elastic Load Balancer, you get a real sense of how beneficial building a scalable and flexible architecture for your resources can be. 

In the next lecture, I shall be explaining the different components of Auto Scaling before providing a demonstration on how to configure it.

About the Author
Learning Paths

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation.

To date, Stuart has created 150+ courses relating to Cloud reaching over 180,000 students, mostly within the AWS category and with a heavy focus on security and compliance.

Stuart is a member of the AWS Community Builders Program for his contributions towards AWS.

He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape.

In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community.

Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.