Autoscaling on Azure App Service

This brief course explores how to use autoscaling on Azure App Service to optimize the resources necessary for running your app.

Learning Objectives

  • Set up autoscaling rules using a variety of metrics and scaling actions
  • Set up scheduled autoscaling

Intended Audience

This course is intended for anyone who knows the basics of how to create an app using Azure App Service and who now wants to understand how to implement autoscaling to manage their workloads.


Prerequisites

To get the most out of this course, you should have a basic understanding of Azure App Service. Take our Introduction to Azure App Service course if you are completely new to the service.


Transcript

Welcome to “Autoscaling on Azure App Service”. To get the most from this course, you should already have some basic experience using App Service. If you don’t, then you can take our “Introduction to Azure App Service” course.

All right, let’s get started. If you need to add more resources to an app running on App Service, one way is to scale up its resources by switching its App Service Plan to a higher pricing tier. This is very easy to do, and it only takes a few seconds, but it’s not a very dynamic solution. For example, what if your app gets used a lot more on weekdays than on weekends? You wouldn’t want to scale up and scale down your app’s service plan every week.

A more elegant solution is to have Azure add more virtual machines when your app is busier and remove VMs when it’s less busy. This is known as scaling out and scaling in rather than scaling up and scaling down. Scaling out is only available on Basic plans and higher, not on Free or Shared plans.

Keep in mind, though, that even the Basic tier probably won’t give you what you need. That’s because it only allows you to scale in and out manually. If you want Azure to scale without manual intervention, then you need autoscaling, which is only available on Standard plans and higher.

The service that actually handles autoscaling is Azure Monitor. So, you can get to the autoscale settings either by selecting “Scale out” from the menu in your Service Plan or by going to Azure Monitor and selecting “Autoscale” and then selecting your Service Plan.

The way it works is that you can create one or more scale conditions that specify exactly what will trigger Azure to add or remove virtual machines. In each scale condition, you need to add one or more rules. For example, you could say that if the average CPU percentage of the VMs in your Service Plan is over 75% for at least 10 minutes, then it should increase the number of VM instances by one.
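To make the CPU example concrete, here is a minimal Python sketch of how a single scale-out rule could be evaluated. It is a toy model, not Azure’s implementation or API: the `ScaleRule` class, the `evaluate` function, and the one-sample-per-minute data are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ScaleRule:
    """Hypothetical model of one scale-out rule (not an Azure type)."""
    threshold: float      # metric threshold, e.g. 75 (% CPU)
    window_minutes: int   # how many minutes of samples to aggregate
    change: int           # instances to add when the rule fires


def evaluate(rule, samples_per_minute, current_instances):
    """Return the instance count after evaluating the rule once.

    Assumes one averaged CPU sample per minute, newest last.
    """
    window = samples_per_minute[-rule.window_minutes:]
    if len(window) < rule.window_minutes:
        return current_instances          # not enough data yet
    if mean(window) > rule.threshold:
        return current_instances + rule.change
    return current_instances


rule = ScaleRule(threshold=75, window_minutes=10, change=1)
busy = [80, 82, 78, 90, 85, 79, 77, 88, 91, 84]   # average is 83.4
quiet = [40] * 10

print(evaluate(rule, busy, 2))    # one instance added
print(evaluate(rule, quiet, 2))   # unchanged
```

The rule only fires once a full 10-minute window of samples is available, which mirrors the “for at least 10 minutes” wording in the example.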

There are a huge number of variations in the rules that you can create. First, there are different metrics to choose from. Aside from CPU percentage, there’s also memory percentage, data in, data out, disk queue length, and a wide variety of network metrics, such as “Socket count for inbound requests”.

You can also tell it to look for the minimum or maximum value of a metric rather than the average. For example, by using the maximum, you could tell it to scale out if the CPU percentage reaches 75% at any time during a 10-minute period.

It doesn’t have to be a 10-minute period either. You can set it to aggregate the metric over any number of minutes you want. And, of course, it doesn’t need to be 75%. You can set that to whatever percentage you want, too.
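The effect of choosing average, minimum, or maximum can be sketched in a few lines of Python. This is an illustrative model only: `rule_fires`, the `AGGREGATIONS` table, and the sample data are hypothetical, and it assumes one CPU sample per minute with the newest sample last.

```python
from statistics import mean

# Hypothetical lookup from an aggregation name to the function applied
# to the samples in the window.
AGGREGATIONS = {"average": mean, "minimum": min, "maximum": max}


def rule_fires(samples, aggregation, threshold, window_minutes):
    """True if the aggregated metric over the window reaches the threshold."""
    window = samples[-window_minutes:]
    return AGGREGATIONS[aggregation](window) >= threshold


# Ten minutes of CPU readings with a single short spike to 78%.
cpu = [30, 35, 40, 32, 78, 36, 31, 34, 33, 30]

print(rule_fires(cpu, "average", 75, 10))   # the spike is averaged away
print(rule_fires(cpu, "maximum", 75, 10))   # the spike alone is enough
print(rule_fires(cpu, "maximum", 75, 5))    # spike falls outside a 5-minute window
```

The same readings produce different decisions depending on the aggregation and the window length, which is why both settings are worth choosing deliberately.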

Then there’s the scaling action that you want it to take. You can tell it to scale out by exactly a certain number of VM instances, such as 1 in our example, or scale out to a total number of instances, such as 3, or you can tell it to scale out by a percentage. For example, you can tell it to scale out the number of instances by 50% if the condition is met.
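The three kinds of scale-out action can be compared side by side with a small Python sketch. The action names and the `apply_action` function are hypothetical, not Azure identifiers. Two behaviors are assumptions made for illustration rather than documented Azure behavior: fractional results of a percentage increase are rounded up, and “scale out to” never lowers the current count.

```python
import math


def apply_action(current, action, value):
    """Return the instance count after a scale-out action (toy model)."""
    if action == "increase_count_by":
        return current + value
    if action == "increase_count_to":
        # Assumption: a scale-out action never reduces the count.
        return max(current, value)
    if action == "increase_percent_by":
        # Assumption: round fractional instances up.
        return current + math.ceil(current * value / 100)
    raise ValueError(f"unknown action: {action}")


print(apply_action(2, "increase_count_by", 1))     # add exactly one instance
print(apply_action(2, "increase_count_to", 3))     # go to a total of three
print(apply_action(4, "increase_percent_by", 50))  # grow by half
```

Note that a percentage action adds more instances as the plan grows, whereas a fixed count adds the same number regardless of the current size.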

You can also set something called a “cool down” period. This is the number of minutes to wait after a scaling operation before it can scale again. By default, it’s set to 5 minutes. This gives the metrics a chance to stabilize again after the scaling operation.

For example, suppose you have a rule saying that if the average CPU percentage is over 70% for 5 minutes, then add an instance. Then suppose you have 2 instances, and the average CPU percentage reaches 90%, which causes it to add another instance. If you didn’t have a cool-down period, the CPU percentage might still be above 70% for a little while until the new instance has spun up and taken some load off of the other instances, which could trigger the unnecessary addition of yet another instance.
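A short simulation shows why the cool-down matters. This is a simplified sketch, not how Azure Monitor is implemented: the `Autoscaler` class and `simulate` function are hypothetical, and each reading is assumed to be the already-aggregated average CPU for that minute.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Autoscaler:
    """Toy autoscaler with a single scale-out rule and a cool-down."""
    instances: int
    threshold: float = 70.0
    cooldown_minutes: int = 5
    last_scaled_at: Optional[int] = None   # minute of the last scale operation

    def tick(self, minute, avg_cpu):
        in_cooldown = (
            self.last_scaled_at is not None
            and minute - self.last_scaled_at < self.cooldown_minutes
        )
        if avg_cpu > self.threshold and not in_cooldown:
            self.instances += 1
            self.last_scaled_at = minute
        return self.instances


def simulate(cooldown_minutes, readings):
    scaler = Autoscaler(instances=2, cooldown_minutes=cooldown_minutes)
    for minute, avg_cpu in enumerate(readings):
        scaler.tick(minute, avg_cpu)
    return scaler.instances


# CPU stays above 70% for a few minutes while the new instance spins up.
readings = [90, 85, 80, 72, 65, 60]

print(simulate(5, readings))   # with a 5-minute cool-down
print(simulate(0, readings))   # with no cool-down
```

With the cool-down, only the first reading triggers a scale-out. Without it, every reading above 70% adds another instance even though the first new instance would have brought the load down on its own.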

So far, I’ve only mentioned scaling out, but you can create rules for scaling in as well. They work the same way except that everything is reversed. For example, you could tell it to scale in by one instance if the average CPU percentage is below 25% during a 10-minute period.

Now, things really get interesting when you have multiple rules. For example, suppose that you have these rules:

  • Scale out by one instance if CPU utilization is above 65 percent
  • Scale out by one instance if disk queue length exceeds 1700
  • Scale in by one instance if CPU utilization drops below 35 percent

Now suppose that your instances are at 30 percent CPU utilization, and the disk queue length is 1900. Will it scale in, scale out, or stay the same? The answer isn’t obvious because the rules conflict. The CPU utilization is below 35 percent, so maybe it will scale in by one instance. But the disk queue length is more than 1700, so maybe it will scale out by one instance. Do the two rules cancel each other out, or does one of them have priority? Well, Azure autoscaling has a very sensible policy in these situations. Scale-out rules always win over scale-in rules: it scales out if any scale-out rule is met, and it scales in only if every scale-in rule is met and no scale-out rule is. That’s because if any aspect of the system needs more resources, then it should scale out. Scaling in would just make performance worse.
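The precedence policy can be expressed as a short Python sketch using the three rules from this example. The `RULES` table and the `decide` function are hypothetical names for illustration, not part of any Azure SDK. The model assumes that any matching scale-out rule triggers a scale-out and that a scale-in happens only when every scale-in rule matches.

```python
import operator

# (direction, metric, comparison, threshold) for the three example rules.
RULES = [
    ("out", "cpu", operator.gt, 65),
    ("out", "disk_queue", operator.gt, 1700),
    ("in", "cpu", operator.lt, 35),
]


def decide(metrics, rules=RULES):
    """Return 'scale out', 'scale in', or 'no change' for the given metrics."""
    fired = {"out": [], "in": []}
    for direction, metric, compare, threshold in rules:
        fired[direction].append(compare(metrics[metric], threshold))

    if any(fired["out"]):                  # scale-out always wins
        return "scale out"
    if fired["in"] and all(fired["in"]):   # every scale-in rule must agree
        return "scale in"
    return "no change"


print(decide({"cpu": 30, "disk_queue": 1900}))   # the conflicting case
print(decide({"cpu": 30, "disk_queue": 100}))
print(decide({"cpu": 50, "disk_queue": 100}))
```

In the conflicting case, the disk queue rule fires first in the decision order, so the low CPU reading never gets a chance to remove an instance.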

Okay, that was a lot of information, but believe it or not, there are a couple of other important autoscaling options that I haven’t mentioned yet: scaling to a specific instance count and scheduling.

Previously, I showed scaling based on a metric, such as CPU percentage, but it’s also possible to create a scale condition that doesn’t involve metrics. To do this, you select “Scale to a specific instance count” instead of “Scale based on a metric” when you create a scale condition. This is much simpler than the metric-based process. You just tell it the number of instances you want it to scale to.

Of course, this doesn’t really sound like autoscaling, does it? Without a metric, how will it know when to scale to a specific instance? That’s where scheduling comes in. You simply set a start time and an end time for when this condition should be in effect. If you want it to happen on a recurring basis, you can select “Repeat specific days” and tell it which days you want it to happen every week. For example, you could scale out to a specific instance count on weekdays.

It’s also possible to use a schedule for a metric-based condition. For example, you could configure it to scale based on CPU percentage only on weekends but scale out to a specific instance count during weekdays. In this example, you’d have both metric-based and specific instance-based conditions in the same Service Plan, but since you scheduled them for different days, there wouldn’t be a conflict.
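Here is a rough Python sketch of how scheduled and metric-based conditions might coexist, using the weekday and weekend split described above. Everything in it is assumed for illustration: the function names, the fixed count of 4 instances, and the 75% threshold are not from the course. It also ignores start and end times and time zones, which a real schedule would need.

```python
from datetime import datetime

WEEKDAYS = {0, 1, 2, 3, 4}   # Monday=0 ... Friday=4 in datetime.weekday()


def active_condition(now):
    """Pick the scale condition in effect at the given time (toy model)."""
    if now.weekday() in WEEKDAYS:
        # Scheduled condition: scale to a specific instance count.
        return {"mode": "fixed", "instances": 4}
    # Weekend condition: scale based on a metric.
    return {"mode": "metric", "threshold": 75, "change": 1}


def target_instances(now, current, avg_cpu):
    """Return the instance count the active condition asks for."""
    condition = active_condition(now)
    if condition["mode"] == "fixed":
        return condition["instances"]
    if avg_cpu > condition["threshold"]:
        return current + condition["change"]
    return current


monday = datetime(2024, 1, 8, 9, 0)
saturday = datetime(2024, 1, 13, 9, 0)

print(target_instances(monday, 2, 20))     # fixed count, CPU is ignored
print(target_instances(saturday, 2, 90))   # metric rule adds an instance
print(target_instances(saturday, 2, 20))   # metric rule does nothing
```

Because exactly one condition is active at any moment, the fixed-count and metric-based conditions never compete with each other.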

And that’s it for autoscaling on Azure App Service. Please give this course a rating, and if you have any questions or comments, please let us know. Thanks!

About the Author

Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).