Load Balancing

Microsoft Azure supports a variety of options for both internal and external networking. In this course, you will learn how to design a network implementation using the appropriate Azure services.

Some of the highlights include:

  • Configuring virtual networks to connect Azure resources to each other
  • Deploying public and private load balancers to distribute incoming traffic to a pool of backend VMs
  • Load balancing across multiple regions using Azure Traffic Manager
  • Connecting on-premises networks to Azure either directly using ExpressRoute or over the internet through a site-to-site or point-to-site VPN
  • Overriding system default routes to meet your own custom routing needs
  • Protecting your applications from attacks with a web application firewall
  • Using network security groups to create a demilitarized zone (DMZ)
  • Building hybrid applications that include both Azure and on-premises resources using Azure Relay
  • Copying on-premises data to Azure using Data Factory, the Self-hosted Integration Runtime, and the On-Premises Data Gateway

Learning Objectives

  • Design Azure virtual networks
  • Design external connectivity for Azure virtual networks
  • Design network security strategies for Azure
  • Design connectivity for hybrid Azure applications

Intended Audience

  • People who want to become Azure cloud architects
  • People preparing for a Microsoft Azure certification exam


Prerequisites

  • General knowledge of IT infrastructure and networking

Scaling an application horizontally by adding more VMs is a great solution, but how do you distribute requests to all of those VMs? You have two options: Azure Load Balancer and Azure Application Gateway.

Azure Load Balancer is the right choice for the most common scenarios. It acts as a frontend that distributes incoming traffic to a pool of backend VMs. It supports TCP and UDP applications. By default, it spreads connections roughly evenly across the VMs. If the application layer served by these VMs is stateless, then this distribution method will work well. If it’s stateful, though, then you’ll need to use source IP affinity mode. In this mode, all requests from the same IP address will always go to the same VM. You should avoid designing this sort of solution, though, because if a VM goes down, then every client pinned to that VM loses its session.
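To make the difference between the two distribution modes concrete, here is a minimal Python sketch. It assumes the default mode hashes the full five-tuple of a connection (source IP, source port, destination IP, destination port, protocol) and that source IP affinity hashes only the source and destination IPs. The `pick_backend` function and the use of SHA-256 are illustrative assumptions, not Azure's actual implementation.

```python
import hashlib


def pick_backend(backends, src_ip, src_port, dst_ip, dst_port, protocol,
                 affinity=False):
    """Pick a backend VM for one connection (hypothetical helper).

    affinity=False: hash the full five-tuple, so separate connections from
    the same client may land on different VMs.
    affinity=True: hash only the source and destination IPs, so every
    connection from one client IP lands on the same VM.
    """
    if affinity:
        key = f"{src_ip}|{dst_ip}"
    else:
        key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}"
    digest = hashlib.sha256(key.encode()).digest()
    # Map the hash onto the pool; the same key always maps to the same VM.
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

With `affinity=True`, a client that opens a second connection from a new source port still reaches the same VM, which is what a stateful application layer needs.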

If you have a stateless application layer and a VM goes down, then a health probe will detect that the instance is no longer available and it will remove it from the pool. Since no client sessions are tied to that specific instance, all of their requests will go to healthy instances.

A health probe is typically an HTTP request to an instance. By default, the load balancer sends a probe every 15 seconds, but you can change the timing if you want. If your VMs are hosting something other than a web application, then you can configure the health probe to try to establish a TCP session on a specific port.
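The TCP variant of a health probe can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a probe counts as successful when a TCP session can be opened. The function names, the 2-second timeout, and the single-attempt behavior are assumptions; the real load balancer probes on a schedule and applies its own thresholds.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP session can be established to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the instance as unhealthy.
        return False


def healthy_backends(backends, probe=tcp_probe):
    """Keep only the (host, port) backends whose probe succeeds."""
    return [(host, port) for host, port in backends if probe(host, port)]
```

Running `healthy_backends` on a fixed interval and routing only to the returned list mirrors how an unhealthy instance drops out of the pool.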

If you’re using the load balancer for an internet-facing application, then you need to assign a public IP address to it. In this scenario, it’s called a public load balancer. All of the VMs behind it still use private IP addresses, though, and all incoming traffic from the internet must go through the load balancer.

When a VM initiates an outbound connection, the load balancer performs network address translation (NAT). In other words, it translates the VM’s private IP address to the public IP address on the load balancer plus a port that it maps to that VM. Because it has this capability, the load balancer can be useful even in situations where you don’t need load balancing. If you need a NAT gateway on Azure, then you can just deploy a public load balancer that only manages outbound connections.
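The address translation described above amounts to a lookup table that maps each private address and port to a port on the shared public IP. The following Python sketch shows that bookkeeping. The `SnatTable` class, its port range, and its allocation order are hypothetical and chosen for readability; they are not how Azure allocates ports.

```python
class SnatTable:
    """Toy source-NAT table for one public IP (hypothetical helper)."""

    def __init__(self, public_ip, first_port=1024, last_port=65535):
        self.public_ip = public_ip
        self._free_ports = iter(range(first_port, last_port + 1))
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Return the (public_ip, public_port) that the flow appears to use."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            public_port = next(self._free_ports, None)
            if public_port is None:
                raise RuntimeError("no free ports left on the public IP")
            self._outbound[key] = public_port
            self._inbound[public_port] = key
        return self.public_ip, self._outbound[key]

    def translate_inbound(self, public_port):
        """Return the private (ip, port) for a reply, or None if unmapped."""
        return self._inbound.get(public_port)
```

Replies arriving at a mapped public port are sent back to the VM that opened the flow, while traffic to unmapped ports has no VM to go to.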

If you have an internal application that shouldn’t be exposed to the internet, then don’t assign a public IP address to the load balancer. Instead, it will use a private IP address. In this configuration, it’s known as an internal load balancer.

You can also use both types of load balancers for the same application. For example, you could use a public load balancer for the web tier and an internal load balancer for the business logic tier.

The other way to provide load balancing is to use Azure Application Gateway. Unlike Azure Load Balancer, which operates at layer 4 of the network stack, Application Gateway operates at layer 7. This gives you more flexibility in how you can route requests. Instead of just routing based on IP addresses and ports, it can route based on the URL. For example, if the URL of the request begins with the path to your videos folder, then you could tell the application gateway to route the request to your pool of video servers.
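As a rough illustration of URL-based routing, the Python sketch below maps path prefixes to backend pools. The pool names, the `route_by_path` function, and the longest-prefix tie-break are assumptions made for this example and are not Application Gateway's actual rule-matching logic.

```python
def route_by_path(path, rules, default_pool):
    """Return the pool whose path prefix matches, preferring the longest.

    rules maps a path prefix such as "/videos" to a pool name.
    """
    best = None
    for prefix, pool in rules.items():
        # Match the prefix itself or anything underneath it, so "/videos"
        # matches "/videos/intro.mp4" but not "/videos-archive".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, pool)
    return best[1] if best else default_pool


rules = {"/videos": "video-pool", "/images": "image-pool"}
pool = route_by_path("/videos/intro.mp4", rules, default_pool="web-pool")
```

Here a request for `/videos/intro.mp4` goes to the video servers, and any path without a matching rule falls through to the default pool.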

You can do lots of other handy things with it too. A few examples are redirecting all HTTP requests to HTTPS, serving traffic to multiple websites, and supporting WebSocket connections.

One especially useful feature is SSL or TLS termination. Because encrypting and decrypting traffic is computationally expensive, you can offload that task to the application gateway so the web servers don’t have to do it. However, if your security requirements won’t allow any unencrypted traffic, then you shouldn’t do this, because the traffic between the gateway and the backend web servers would be unencrypted.

Speaking of security, one of the biggest reasons for using an application gateway is the built-in web application firewall. It protects your applications from common exploits, such as SQL injection and cross-site scripting attacks.

Both of Azure’s load balancing options work within a single region. If you want to provide load balancing for VMs or web apps that are distributed across multiple regions, then you’ll need to use either Azure Traffic Manager or Azure Front Door.

Azure Traffic Manager redirects traffic at the DNS level. It doesn’t act as a gateway. It simply tells the client which address it should connect to.

For example, suppose you have an application that’s deployed in three regions: East US, West Europe, and Southeast Asia. To minimize the network latency for every user, you could configure Traffic Manager to check where each request is coming from and direct it to the closest region. This is called performance routing. If one of the regions goes down, then Traffic Manager redirects requests to the next closest healthy region. You can also designate one region as the primary and the others as ordered backups, so that traffic moves to a backup only when the primary is unavailable. This is called priority routing.

Two other commonly used routing methods are weighted and geographic. With weighted routing, Traffic Manager distributes requests either evenly among the endpoints or according to weights, such as 50% to endpoint 1, 30% to endpoint 2, and 20% to endpoint 3. Geographic routing is similar to performance routing because it looks at the client’s location. The difference is that you can specify exactly which region you want a client to connect to based on their location. For example, if your European customers require that their data stays in Europe, then you could always route them to a European region.
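The selection logic behind priority, weighted, and geographic routing can be sketched in Python as follows. The endpoint dictionaries, field names, and function names are hypothetical, and the sketch ignores DNS entirely; it only shows how each method would choose an endpoint name to hand back to the client.

```python
import random


def priority_route(endpoints):
    """Return the healthy endpoint with the lowest priority number."""
    healthy = [e for e in endpoints if e["healthy"]]
    if not healthy:
        return None
    return min(healthy, key=lambda e: e["priority"])["name"]


def weighted_route(endpoints, rng=random):
    """Pick a healthy endpoint at random, in proportion to its weight."""
    healthy = [e for e in endpoints if e["healthy"]]
    if not healthy:
        return None
    names = [e["name"] for e in healthy]
    weights = [e["weight"] for e in healthy]
    return rng.choices(names, weights=weights, k=1)[0]


def geographic_route(client_location, geo_map):
    """Return the endpoint explicitly assigned to the client's location."""
    return geo_map.get(client_location)
```

For instance, weights of 50, 30, and 20 send about half of all lookups to the first endpoint, and a `geo_map` entry such as `{"Europe": "west-europe"}` keeps European clients on a European endpoint.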

Since Traffic Manager doesn’t load balance within each region, in most cases, you’ll need to use it in conjunction with Azure Load Balancer, which will distribute requests to the VMs in a cluster.

Azure Front Door is similar to Application Gateway because it works at the web layer. But like Traffic Manager, it can distribute traffic across multiple regions. If you have a multi-regional application, in most cases, you should use Front Door if it’s a web application or Traffic Manager if it’s not. It rarely makes sense to use both.

Like Application Gateway, Front Door offers TLS termination, path-based routing, and a web application firewall, but it also provides a Content Delivery Network (or CDN). It caches your web content on Microsoft’s edge network, which contains hundreds of points of presence around the world. Your application will have a faster response time because your users will retrieve the application’s cached web data from the nearest point of presence.

There’ll be plenty of application data that’s not cached in the CDN, though, so Front Door still has to distribute requests to your application’s backend web servers. Like Traffic Manager, it has different routing methods you can choose from for how it distributes requests to the backends.

The four methods are:

  • Latency routing, which sends requests based on the lowest latency;

  • Priority routing, which you can use as a failover mechanism if the primary region goes down;

  • Weighted routing, which lets you set a percentage of traffic for each backend to serve; and

  • Session affinity, which will always send requests from a particular user to the same backend.
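Latency routing and session affinity can be sketched together in Python. This is a simplified illustration: it assumes latency measurements are already available as a dictionary and that users are identified by an ID such as a cookie value. The `latency_route` function and `AffinityRouter` class are hypothetical names, not part of any Azure SDK.

```python
def latency_route(latencies_ms):
    """Return the backend with the lowest measured latency.

    latencies_ms maps a backend name to its latency in milliseconds.
    """
    return min(latencies_ms, key=latencies_ms.get)


class AffinityRouter:
    """Pin each user to the backend chosen for their first request."""

    def __init__(self, choose):
        self._choose = choose  # callable used only for a user's first request
        self._pinned = {}      # user id -> backend name

    def route(self, user_id):
        if user_id not in self._pinned:
            self._pinned[user_id] = self._choose()
        return self._pinned[user_id]
```

Once a user is pinned, later requests keep going to the same backend even if another backend becomes faster, which is the trade-off session affinity makes.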

Like Traffic Manager, Front Door only load balances across regions, so it needs to use another service to load balance within each region. If your web application is hosted on Azure App Service, then you don’t have to worry about it because App Service provides its own load balancing. But if your application is hosted on containers or virtual machines, then you’ll need to use Azure Application Gateway or Azure Load Balancer to distribute traffic in each region.

Finally, both Front Door and Traffic Manager will also work with non-Azure endpoints, such as applications that are on-premises or in another public cloud. 

And that’s it for load balancing.

About the Author

Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).