Predictive Scaling

AWS Auto Scaling Policies
Description

This course explores the various auto scaling policies that exist within AWS. We'll cover what each policy does, its strengths and weaknesses, and when best to use it. Understanding the ins and outs of these policies will help you save a lot of money and keep your customers happy by reducing latency and downtime.

Learning Objectives

By the end of this course, you will understand how each of the AWS auto scaling policies works and in what situations they perform best.

Intended Audience

This course is recommended for solutions architects and developers who are working on creating highly available systems within AWS.

Prerequisites

To get the most out of this course, you should already have a basic working knowledge of AWS.

Transcript

Both of the dynamic scaling methods we talked about in the previous section are examples of reactive scaling. The scaling system has a metric that it is trying to keep optimized, either by keeping the metric within an acceptable range or by holding it at a particular number. When an outside force adds load to the system (a bunch of users checking their socials on their lunch break, for example), the system reacts by adding more instances after the load has already increased.
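To make the reactive idea concrete, here is a minimal sketch. The function name and the simple proportional rule are assumptions for illustration only, not the exact algorithm AWS uses.

```python
import math

def reactive_desired_capacity(current_instances: int,
                              metric_value: float,
                              target_value: float) -> int:
    """Hypothetical proportional rule: resize the group by the ratio of
    the measured metric to its target value."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # The rule only sees load that has already arrived; it cannot look ahead.
    return max(1, math.ceil(current_instances * metric_value / target_value))
```

With 4 instances at 75% CPU against a 50% target, this rule asks for 6 instances, but only once the 75% reading has been observed.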

With predictive scaling, the goal is for the system to get ahead of the load. A predictive system will try to scale out before an event happens so that you are always on target. Now the question is, how do we determine ahead of time when new load is coming?

Well, this service uses machine learning to understand your workloads. It is able to learn when your traffic normally rises and falls throughout the day. Based on that knowledge it will provision new instances just before they are needed and will start to get rid of them as traffic trails off.

This type of scaling is particularly good for cyclical traffic, the kind where your users are always on at certain times (like normal business hours, or nights and weekends). It is also a good fit for recurring on-and-off workloads, such as batch processing or analytics jobs that run on a regular schedule.
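To see why cyclical traffic is easy to predict, consider a deliberately naive stand-in for the machine learning model: average the load seen at each hour of the day and expect the same tomorrow. The function names and the averaging approach are assumptions for illustration, not how the AWS service actually builds its forecast.

```python
from collections import defaultdict
from statistics import mean

def hourly_profile(samples):
    """Average the observed load for each hour of the day.

    samples: iterable of (hour_of_day, load) pairs from past traffic.
    """
    by_hour = defaultdict(list)
    for hour, load in samples:
        by_hour[hour % 24].append(load)
    return {hour: mean(loads) for hour, loads in by_hour.items()}

def forecast_load(profile, hour, default=0.0):
    """Naive forecast: expect the load seen at this hour on previous days."""
    return profile.get(hour % 24, default)
```

If traffic really does repeat every day, even this crude profile tells you roughly how much load to expect at 9 a.m. before it arrives.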

Since this service relies on machine learning, it needs some time to learn the patterns in our traffic. The good news is that it does not have to start from scratch: predictive scaling can use historical data already stored in CloudWatch to create its scaling model. As long as at least 24 hours of historical data is available, you can start using predictive scaling.

The service can look for patterns in your CloudWatch metrics as far back as 14 days. With this data, it creates a forecast of your system's future needs, and the forecast is updated daily based on the most recent CloudWatch metric data.
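The two numbers above (a 24-hour minimum and a 14-day lookback) can be sketched as a simple filter over metric timestamps. The function and constant names are hypothetical, and measuring "24 hours of data" as the span between the oldest and newest datapoint is a simplifying assumption.

```python
from datetime import datetime, timedelta

MIN_HISTORY = timedelta(hours=24)   # minimum history mentioned above
MAX_LOOKBACK = timedelta(days=14)   # oldest data considered

def usable_history(timestamps, now):
    """Return the timestamps inside the 14-day lookback window, or an
    empty list if they span less than 24 hours."""
    window = sorted(t for t in timestamps if now - MAX_LOOKBACK <= t <= now)
    if not window or window[-1] - window[0] < MIN_HISTORY:
        return []
    return window
```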

If this sounds interesting and you want to try it without possibly jeopardizing your users' experience, you can run predictive scaling in forecast-only mode. This allows the system to make predictions based on your data without taking any actions.

Having this option lets you see how well it would do if you were to let it take full control. You can see the predictions yourself and compare them to reality by checking the graph it creates for you in the EC2 Auto Scaling console.

If you are happy with how the forecast looks, you can switch predictive scaling into forecast and scale mode. It will then take over scaling and provision new instances based on the forecast.
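As a sketch of how the two modes differ in practice, the helper below builds the configuration that a predictive scaling policy expects. The helper name, the CPU metric choice, and the 40% target are assumptions for illustration; the group and policy names in the commented call are placeholders.

```python
def predictive_policy_config(target_cpu_percent: float, scale: bool = False) -> dict:
    """Build a predictive scaling configuration.

    Defaults to forecast-only mode; pass scale=True for forecast and scale.
    """
    return {
        "MetricSpecifications": [
            {
                "TargetValue": target_cpu_percent,
                "PredefinedMetricPairSpecification": {
                    "PredefinedMetricType": "ASGCPUUtilization"
                },
            }
        ],
        "Mode": "ForecastAndScale" if scale else "ForecastOnly",
    }

# Applying it with boto3 (requires AWS credentials; names are placeholders):
# import boto3
# boto3.client("autoscaling").put_scaling_policy(
#     AutoScalingGroupName="my-asg",
#     PolicyName="cpu-predictive",
#     PolicyType="PredictiveScaling",
#     PredictiveScalingConfiguration=predictive_policy_config(40.0),
# )
```

Starting with the default and flipping `scale=True` later mirrors the try-before-you-trust workflow described above.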

Something to keep in mind, however, is that when you scale on the forecast, EC2 Auto Scaling adjusts the number of instances at the start of every hour. So it might not be as real-time as you might be hoping for.
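The hourly granularity can be pictured with a small lookup: capacity is chosen for the hour that contains the current time, so nothing changes between hour boundaries. The function name and the dictionary-based forecast are assumptions for illustration.

```python
from datetime import datetime

def capacity_for(now: datetime, hourly_forecast: dict) -> int:
    """Return the forecast capacity for the hour containing `now`.

    hourly_forecast maps the start of each hour to an instance count.
    """
    hour_start = now.replace(minute=0, second=0, microsecond=0)
    return hourly_forecast[hour_start]
```

A spike that begins at 9:40 is served with the 9:00 capacity until the 10:00 value takes effect.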

You can use predictive scaling and dynamic scaling at the same time to more closely match what your users actually require. This will take a bit of tuning to get just right, but I think it will give you quite a nice bit of coverage and should keep your users happy because of your high availability. It will tend to cost a little bit more, so whether that trade-off is worth it is up to you.
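One way to picture the combination is that each policy proposes a capacity and the group follows the larger proposal, within its size limits. This is a simplified sketch; the function name and the take-the-maximum rule are assumptions for illustration.

```python
def combined_capacity(predictive: int, dynamic: int,
                      min_size: int, max_size: int) -> int:
    """Pick the larger of the two proposals, clamped to the group limits."""
    desired = max(predictive, dynamic)
    return max(min_size, min(desired, max_size))
```

Under this assumption the forecast sets a baseline and dynamic scaling covers any load the forecast missed, which is also why the combination tends to cost a little more.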


About the Author

William Meadows is a passionately curious human currently living in the Bay Area in California. His career has included working with lasers, teaching teenagers how to code, and creating classes about cloud technology that are taught all over the world. His dedication to completing goals and helping others is what brings meaning to his life. In his free time, he enjoys reading Reddit, playing video games, and writing books.