DEMO: Implementing a VM Scale Set with Autoscaling

This course offers an in-depth look at VM scale sets, VM configuration management, VM storage options, and VM monitoring within Azure. We kick things off by looking at VM scale sets, vertical scaling, and horizontal scaling.

After that, you'll learn about the tools used for configuration management, as well as how to deploy software using VM extensions and how to deploy an Azure PowerShell DSC Configuration.

The course will then cover the wide range of VM storage options available in Microsoft Azure and show you how to use them. Finally, you'll learn about Azure Monitor, a service that allows you to monitor the performance and health of your VMs and VM scale sets.

This course is packed full of step-by-step demonstrations that you can follow along with, allowing you to see all of the above topics put into practice in real-life Azure environments.

For any feedback relating to this course, please feel free to contact us at

Learning Objectives

  • Scale VMs using scale sets and understand the difference between vertical and horizontal scaling
  • Learn about the tools used for managing VM configurations
  • Deploy software using VM extensions and PowerShell DSC
  • Understand the various VM storage options available in Azure
  • Learn about Azure Monitor and its uses

Intended Audience

  • Anyone interested in learning about scale sets, configuration management, storage, and monitoring for Azure VMs


Prerequisites

To get the most from this course you should have a basic understanding of Microsoft Azure and of the Azure portal.


Hello, and welcome back. In this lesson, we're going to walk through a demonstration where I'll show you how to deploy a virtual machine scale set that uses autoscaling.

On the screen here, you can see I'm logged in to my Azure Portal, and I'm in a resource group called VMStuff that I created ahead of this lab. To create our VM scale set, which will be load balanced by the way, we're going to click Add to kick things off. From the Marketplace, we'll search for scale set, and we can see we have Virtual machine scale set here. We'll go ahead and create the scale set.

The creation process for a VM scale set, if you look here, looks rather similar to a VM creation process, and that's largely because what we're doing is creating a VM configuration that's going to be used to spin up VMs within that scale set. So it stands to reason that it would look quite similar to a VM deployment.

On this Basics page, we need to provide some basic information for our scale set. We're going to deploy into our lab subscription and then into my VMStuff resource group. I'll give my scale set a name, and we'll deploy into East US. We're not going to use an Availability zone here. The image we're going to use is a Windows Server 2016 Datacenter image. We'll leave Spot instance at its default of No since we're not using spot instances, and we'll leave the default Size here of one vCPU and 3.5 GB of RAM. Let me specify my admin information here. We'll leave the option for the Hybrid Benefit turned off, and we'll stick with the default Disk options. You can see that the default is Premium SSD for the OS disk, and the Encryption type is Encryption at rest with a platform-managed key.

We'll go into Networking next. On this Networking page, I need to define the virtual network that's going to host the VMs in the scale set. I can either select the dropdown and use an existing virtual network, or, if I don't have one created already, which you can see here I don't, it's going to create a virtual network for me. So I'll allow it to create the virtual network for me. Then, under Network interface, this is the NIC that the virtual machines that get spun up will use to connect to our virtual network.

Now, remember, the virtual machine configuration we're using here is essentially what I could call a gold image. This is the configuration that new VMs are spun up with when we scale out. Whatever I configure here is how those machines will be configured when they're added through the autoscale process.

Now, what I'm going to do for this demonstration is edit my new NIC, which it's calling VMStuff-vnet-nic01, and give it a public IP. That isn't strictly necessary, because we're going to load balance our scale set, and in production you would normally access the application on the VMs through the load balancer. But to demonstrate the process of autoscale, I want to be able to get into an individual VM to put a load on it so I can show you how autoscale works.

What I'll do here is edit my Network interface and allow port 3389 (RDP) in on the public IP. And we'll OK it. With this configuration, when our scale set is deployed, its VMs will have a public IP attached, so I can get to them from outside of the scale set.

Now, what we're going to do for this exercise is use a load balancer, so I'll turn load balancer on. When I do that, I have two options. If I'm going to be load balancing web traffic, for example if I'm using this for IIS, I would more than likely use an application gateway. If I'm going to be load balancing other traffic, then I'd want to use an Azure load balancer, which is what we're going to use here. In my dropdown, I have the option for gateway or load balancer, so I'll leave it at Azure load balancer. I'll accept the default load balancer name and the default backend pool. If I had an existing load balancer, I could select it from the dropdown, but you can see here we don't have any existing load balancers configured, so I'll leave that alone. The backend pool is what's going to include the backend VMs that are part of the scale set.

Now, at this point, we can click on Next for Scaling, and this is where we can configure the autoscale options for our VM scale set. If we hover over the instance count here, we can see that this is the number of virtual machines that are initially deployed as part of the scale set. And we'll leave this at the default of two.

If we hover over here again, we can see that we can set this anywhere from zero to 1,000. I don't want to go broke, so we're certainly not going to set it to 1,000. Two is sufficient for this demonstration. Then, in the Scaling policy, we can configure either manual or custom scaling. The manual scale option, as you can see here in the pop-up, maintains a fixed instance count. Using custom autoscale allows the scale set to scale up and down based on a schedule or on metrics. So we'll go ahead and select Custom here because manual's no fun. Now, when we select Custom, we need to specify the minimum number of VMs for this scale set and, of course, the maximum as well. The defaults here are one and 10, and we'll leave them at their defaults. You can think of these values as the lower and upper bounds of where the scale set will operate. The options here for Scale out and Scale in allow us to tell Azure how to control the autoscale for this VM scale set.

Now, the CPU threshold here is the CPU usage percentage threshold for triggering the Scale out autoscale rule. If we hover over the icon for the threshold in Scale in, it's the same thing but for the Scale in rule. So this percentage tells Azure when to scale the scale set out, or add VMs, and this threshold down here tells Azure when to remove VMs from the scale set.

Now, with this duration, you're telling Azure what length of time to look at. For example, on this screen, where we're looking at a 75% CPU threshold for 10 minutes, Azure is going to look at the performance of the existing scale set, and if the scale set has been at 75% CPU utilization for at least 10 minutes, it will add this number of VMs to the scale set. Now, what I'm going to do for this demonstration is change this to 50% and five minutes. Notice that if I change the duration to three, it won't let me. It has to be between five and 60 minutes.

Now, what I'm telling Azure here is that if my scale set hovers at 50% CPU for at least five minutes, it should add another VM to the scale set. And it will keep doing that. After it adds the first VM, it will keep tracking, and if my CPU utilization is still at least 50% for another five-minute stretch, it will add another VM. It will keep doing that up to the maximum number of VMs, which is 10. Now, for Scale in, I'll leave this at its default because this will work for our demonstration. I'm telling Azure that once my CPU utilization for the scale set falls to 25%, again for five minutes since the duration is the same, it should decrease the number of VMs by one.
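To make the rules from this demo concrete, here is a minimal Python sketch of how a single autoscale evaluation could be modeled. It is not Azure's implementation. The class and function names are made up for illustration, the per-minute CPU readings are hypothetical, and treating "at least 50%" and "falls to 25%" as inclusive comparisons on a simple average is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AutoscaleRule:
    """One CPU-based rule, loosely mirroring the fields on the Scaling tab."""
    threshold_pct: float  # CPU percentage that triggers the rule
    duration_min: int     # how long the condition is evaluated over
    change: int           # +1 to scale out by one VM, -1 to scale in by one

    def __post_init__(self):
        # In the demo, the portal rejected 3 and required 5 to 60 minutes.
        if not 5 <= self.duration_min <= 60:
            raise ValueError("duration must be between 5 and 60 minutes")


def next_instance_count(current, cpu_by_minute, scale_out, scale_in,
                        minimum=1, maximum=10):
    """Evaluate both rules once and return the new instance count.

    cpu_by_minute holds scale-set-wide average CPU readings, one per
    minute, most recent last (a simplification of real metric data).
    """
    def window_average(minutes):
        window = cpu_by_minute[-minutes:]
        if len(window) < minutes:
            return None  # not enough history to evaluate the rule yet
        return sum(window) / len(window)

    out_avg = window_average(scale_out.duration_min)
    if out_avg is not None and out_avg >= scale_out.threshold_pct:
        return min(maximum, current + scale_out.change)

    in_avg = window_average(scale_in.duration_min)
    if in_avg is not None and in_avg <= scale_in.threshold_pct:
        return max(minimum, current + scale_in.change)

    return current


# Values chosen in the demo: 50% scale out, 25% scale in, 5 minutes, 1 to 10 VMs.
scale_out = AutoscaleRule(threshold_pct=50, duration_min=5, change=+1)
scale_in = AutoscaleRule(threshold_pct=25, duration_min=5, change=-1)

print(next_instance_count(2, [60, 62, 65, 70, 68], scale_out, scale_in))  # 3
print(next_instance_count(3, [5, 4, 6, 5, 3], scale_out, scale_in))       # 2
print(next_instance_count(1, [5, 4, 6, 5, 3], scale_out, scale_in))       # 1
```

The last call shows the lower bound at work: with one instance already running, a scale-in decision leaves the count at the minimum of one.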

We'll leave the Diagnostic logs disabled for now. We're not worried about diagnostic logs at this point. In this dropdown, I have a few different options for my Scale in policy. If I select the dropdown, I can see I have three options. I have the default option, which will balance my scale set across my availability zones and fault domains and then delete the VM with the highest instance ID, or I can tell it to balance across the availability zones and then delete the newest VM. The last option is the same thing but deletes the oldest VM. I'll just leave it at its default setting here, and then we'll click Management.
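The three scale-in choices can be sketched in a few lines of Python. This is an illustration only: it skips the balancing across availability zones and fault domains that happens first, the function name is hypothetical, and the creation timestamps are invented so that the three policies pick different instances.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    instance_id: int
    created_at: int  # hypothetical creation time; a larger value means newer


def pick_instance_to_delete(instances, policy="Default"):
    """Choose which VM a scale-in would remove under each policy.

    Zone and fault domain balancing is deliberately left out of this sketch.
    """
    if policy == "Default":
        return max(instances, key=lambda vm: vm.instance_id)  # highest ID
    if policy == "NewestVM":
        return max(instances, key=lambda vm: vm.created_at)   # newest VM
    if policy == "OldestVM":
        return min(instances, key=lambda vm: vm.created_at)   # oldest VM
    raise ValueError(f"unknown scale-in policy: {policy}")


# Instance IDs match the demo (0, 3, 4); the timestamps are made up.
instances = [Instance(0, created_at=10),
             Instance(3, created_at=30),
             Instance(4, created_at=20)]

print(pick_instance_to_delete(instances, "Default").instance_id)   # 4
print(pick_instance_to_delete(instances, "NewestVM").instance_id)  # 3
print(pick_instance_to_delete(instances, "OldestVM").instance_id)  # 0
```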

The Upgrade policy here, if we hover over the icon next to Upgrade mode, this is the setting that controls how my VM instances in the scale set are brought up to date with the latest scale set model. And what that means is if I make changes to my scale set, VMs that were deployed within that scale set before those changes need to be upgraded to include those new settings so they know how to behave.

In this dropdown, I have three options. I can make it automatic, I can manually perform the upgrades, or I can tell Azure to do upgrades in a rolling manner, in batches. If we select Automatic, instances will automatically upgrade in random order. If I do it manually, those instances will remain out of date until I manually upgrade them. For this exercise, I'm going to choose the Manual option. As for the rest of the options here, Monitoring should look familiar to you if you've deployed a VM because it does exactly what you would expect it to with a VM.
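As a rough way to picture the three upgrade modes, the Python sketch below plans which out-of-date instances would be brought to the latest model and in what grouping. It is a simplified model, not Azure's behavior: the function name, the batch size, and the idea of passing in a list of manually selected instances are all assumptions made for illustration.

```python
import random


def plan_upgrade(outdated, mode, selected=(), batch_size=1, seed=None):
    """Return a list of batches of instance names to upgrade.

    outdated: names of instances not yet on the latest scale set model.
    selected: instances an operator chose to upgrade (Manual mode only).
    """
    if mode == "Automatic":
        order = list(outdated)
        random.Random(seed).shuffle(order)  # upgraded in random order
        return [[name] for name in order]
    if mode == "Manual":
        chosen = [name for name in outdated if name in selected]
        return [chosen] if chosen else []   # nothing happens until selected
    if mode == "Rolling":
        names = list(outdated)
        return [names[i:i + batch_size]
                for i in range(0, len(names), batch_size)]
    raise ValueError(f"unknown upgrade mode: {mode}")


outdated = ["MyScaleSet_0", "MyScaleSet_3", "MyScaleSet_4"]

print(plan_upgrade(outdated, "Manual"))
print(plan_upgrade(outdated, "Manual", selected=["MyScaleSet_0"]))
print(plan_upgrade(outdated, "Rolling", batch_size=2))
print(plan_upgrade(outdated, "Automatic", seed=1))
```

With Manual and no selection, the plan is empty, which matches the point that instances stay out of date until you upgrade them yourself.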

We also have the Identity option here for System assigned managed identity, which we'll leave off. We don't have any use for that here. Then, if we hover over the icon next to automatic OS upgrades, we can see that we can enable automatic OS image upgrades for the VMs in my scale set. What this does is allow us to automatically upgrade the OS disk for all instances within the scale set. We'll leave this off for this demonstration. The instance termination notification here is just the option to receive notifications through the Azure Metadata Service, letting me know that instances are about to be terminated. I'm not worried about getting notifications here, so we'll click Next for Health, where we have the option to configure Health monitoring on our instances.

Now, if I were going to use automatic OS upgrades or automatic VM instance upgrades, which I did not choose earlier, I would need to turn this on. Since I'm doing everything manually, I'll leave this off. Then we'll go ahead and click Next for Advanced, and under the allocation policy here, we can see that we can set a cap on the allocation of our instances.

Basically, what we can tell Azure is whether we want to allow scaling beyond 100 instances. We're never going to get to that point, so we'll leave this at No. And then the Spreading algorithm here determines how the VMs within that scale set are going to be balanced across our fault domains.

Max spreading spreads the VMs across as many fault domains as possible within each zone. With Fixed spreading, as you can see here, the VMs are always spread across exactly five fault domains. We'll leave this at the default here. We're not interested in any kind of high availability here because we're going to have maybe three VMs running at the same time.
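To visualize the difference between the two spreading options, here is a small Python sketch that places VMs on fault domains round-robin. The real placement logic is internal to Azure, so the round-robin order is an assumption, as are the VM names and the figure of 10 available fault domains used for the Max spreading case.

```python
from collections import Counter

FIXED_SPREADING_FAULT_DOMAINS = 5  # "exactly five fault domains" in the demo


def assign_fault_domains(vm_names, fault_domain_count):
    """Place VMs on fault domains round-robin (an assumed placement order)."""
    if fault_domain_count < 1:
        raise ValueError("need at least one fault domain")
    return {name: index % fault_domain_count
            for index, name in enumerate(vm_names)}


def vms_per_domain(assignment):
    """Count how many VMs landed on each fault domain."""
    return dict(sorted(Counter(assignment.values()).items()))


vms = [f"vm_{n}" for n in range(7)]  # hypothetical instance names

# Fixed spreading: always five fault domains, so some domains hold two VMs.
fixed = assign_fault_domains(vms, FIXED_SPREADING_FAULT_DOMAINS)
print(vms_per_domain(fixed))  # {0: 2, 1: 2, 2: 1, 3: 1, 4: 1}

# Max spreading: assume (hypothetically) the zone offers 10 fault domains,
# so each of the seven VMs ends up on its own domain.
max_spread = assign_fault_domains(vms, 10)
print(vms_per_domain(max_spread))
```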

We're using a Windows image, so we don't have any need for Cloud init; it's a Linux feature, so it's not supported here. And we don't need any changes to our Proximity placement group or to the generation of our VM. We'll go ahead and click Next for Tags. We're not going to tag anything here, so we'll just go to Next, and we'll review our settings.

Let it do a validation. At this point, we can review our settings here, and we'll go ahead and click Create. What this is going to do is spin up the load balancer. It's going to spin up the virtual network. It's going to spin up the VMs within the scale set. It's going to configure the frontend and backend for the load balancer to allow everything to work through that load balancer. We will also get a public IP that we could use.

For example, let's say we were going to put IIS on our VMs. What we would do is create the same IIS application or website on the image that we created the VM from, and that's where you would get into custom images, which we're not doing here. But let's assume we were going to put IIS on that. You would then have that application on all of your VMs, and then users would access that application through the load balancer. I'm not too interested in testing the application for this demonstration. I just wanted to show you how to deploy a basic VM scale set with autoscale.

We'll let this deployment run. This will take a little bit because, again, it's creating quite a bit of stuff. If we go back home and we go into VMStuff here, we can see we're starting to create resources. We've got the virtual network and the vmstuffdiag storage account, which was created because boot diagnostics is enabled. We've got the public IP. We've got our load balancer and the public IP for our load balancer, along with the basic network security group. I'm going to pause the video right now. We'll let this run, and then when it's up and everything's been deployed, we'll bounce back in and I'll show you how this autoscale works.

Welcome back. So the deployment of our scale set has completed. On the screen here, you can see all of the resources that were deployed as part of that VM scale set. If we click on my scale set here, we can take a look at the Overview page of the scale set, along with all of the different settings. If I click on instances, I can see the instances that are currently running. I have the MyScaleSet_3 instance, which is running and is the latest model, and we have MyScaleSet_0, which is also running, but it's currently updating.

At some point, this updating VM will actually show a running state, and there we go. So we have our scale set with two instances that are running. If we click on Overview for our scale set, we can see that our CPU is averaging about 40%. So this isn't going to kick off any new instances because our threshold is 50%.

Now, what I've done before coming back to you is log in to instance three, I think it is. Yeah, MyScaleSet_3. I'm logged into that via RDP, and that's over on my other screen. Let me bring it up here. Since these are really low-end VMs, it doesn't take much to drive up their CPU utilization. What I'm going to try to do is run a Windows Defender virus scan on this VM on my other laptop screen, and we'll see if we can get this to kick off another instance.

Offscreen here, I'm searching for Defender. I'll bring up Windows Defender, and I'm going to kick off a full scan. And let me take a look at Task Manager on this VM over here. All right, what I'm going to do is drag my VM over just for a quick second, and you can see I have 84% CPU utilization now going on this VM.

So what I'm going to do is drag this back over onto my other screen, and we'll wait for the threshold to pass. Now, remember, if we click on Scaling here for our scale set, we can see that we're looking for an average CPU over 50%, and that average needs to hold for at least five minutes.

So let's bounce back over into Instances here, and we'll do a refresh. We can see now that we have a new instance spinning up because the utilization of our scale set over the last five minutes is averaging over 50%. That's because of the massive jump in CPU utilization that I kicked off by performing a scan, which skews the average because we're only looking at five minutes of time. On the chart here, we can see where we went over 50%. This drop here is the drop that was incurred when we spun up this new instance. And we can see it's updating, but it's still running.
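The arithmetic behind that jump and the later drop is simple averaging across instances. In the Python sketch below, only the 84% reading comes from the demo. The other per-instance values are hypothetical and picked purely to show the effect.

```python
SCALE_OUT_THRESHOLD = 50  # percent, as configured in the demo


def scale_set_average(cpu_by_instance):
    """Average CPU across all instances, which is what the rule looks at."""
    return sum(cpu_by_instance.values()) / len(cpu_by_instance)


# One busy VM (84% from the demo) and one quieter VM (30% is made up).
before = {"MyScaleSet_3": 84, "MyScaleSet_0": 30}
print(scale_set_average(before))                         # 57.0
print(scale_set_average(before) > SCALE_OUT_THRESHOLD)   # True

# A freshly added, nearly idle instance (6% is made up) pulls the average down.
after = {**before, "MyScaleSet_4": 6}
print(scale_set_average(after))                          # 40.0
```

A single busy VM can push a two-instance average over the threshold, and adding an idle instance lowers the average even though the busy VM is still just as busy.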

Now, what I'm going to do is switch back over to my VM, and I've canceled the scan that was occurring. Now, what that's going to do is drop the CPU utilization way down because there's really nothing going on on the VM. We'll give this a few minutes and what should happen is we should start to see a reduction in the number of instances within our scale set. We'll go ahead and refresh here.

Now, you'll notice on that refresh, the status for MyScaleSet_4 has changed from Running, Updating, to Deleting. So the scale set has detected that drop in CPU utilization, and it's removing the instance that it added when I began that scan.

Let's try and refresh again here. And now MyScaleSet_4 is going away. And it's also taking down MyScaleSet_3. What it's going to do is take MyScaleSet down to the minimum of one instance because it's sensing that's all it needs to support the current load, which is none, on the scale set itself. And we'll refresh here. If I drag over my RDP session, we can see that I've lost connectivity to MyScaleSet_3. That's because the VM has gone away. And we'll refresh one more time here. And now we can see neither 3 nor 4 is running.

They're currently just going through the deletion process. While this is deleting, I'll just reiterate here: if we click on Networking under Settings and select Load balancing, we can see the load balancer that we created as part of the deployment. We can see it has a Frontend IP address. We don't have a Frontend DNS Address for it. What you would typically do in a production environment is assign a public or maybe even a private DNS entry for it, and it would point to this frontend IP address for the load balancer. Then your users would access the application through the DNS name, which points to the load balancer IP, and the load balancer distributes the traffic across the underlying VM instances in the scale set.
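To illustrate how traffic arriving at one frontend IP can end up on different backend instances, here is a hedged Python sketch of hash-based distribution over a connection's five-tuple (source IP, source port, destination IP, destination port, protocol). Azure Load Balancer's default distribution mode is documented as a five-tuple hash, but the actual hash function is not something this transcript covers, so SHA-256 is only a stand-in. The function name and the IP addresses (taken from documentation-reserved ranges) are hypothetical.

```python
import hashlib


def pick_backend(src_ip, src_port, dst_ip, dst_port, protocol, backends):
    """Map a flow's five-tuple to one backend instance.

    The same flow always hashes to the same backend, while different
    flows tend to be spread across the pool.
    """
    flow = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(flow).digest()  # stand-in hash, not Azure's
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backend_pool = ["MyScaleSet_0", "MyScaleSet_3", "MyScaleSet_4"]
frontend_ip = "203.0.113.10"  # hypothetical load balancer frontend IP

first = pick_backend("198.51.100.7", 50123, frontend_ip, 80, "TCP", backend_pool)
again = pick_backend("198.51.100.7", 50123, frontend_ip, 80, "TCP", backend_pool)

print(first in backend_pool)  # True
print(first == again)         # True: the same flow lands on the same instance
```

In the production setup described above, a DNS name would resolve to the frontend IP, and this kind of selection would then decide which instance in the backend pool serves each connection.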

So I'll bounce back into Instances here, and we'll refresh one more time. And we can see now, we only have the one instance left in our scale set. With that, you now know how to deploy a VM scale set that's load balanced and that leverages autoscale.

About the Author

Tom is a 25+ year veteran of the IT industry, having worked in environments as large as 40k seats and as small as 50 seats. Throughout the course of a long and interesting career, he has built an in-depth skillset that spans numerous IT disciplines. Tom has designed and architected small, large, and global IT solutions.

In addition to the Cloud Platform and Infrastructure MCSE certification, Tom also carries several other Microsoft certifications. His ability to see things from a strategic perspective allows Tom to architect solutions that closely align with business needs.

In his spare time, Tom enjoys camping, fishing, and playing poker.