Alibaba Server Load Balancer
This course introduces the Alibaba Server Load Balancer (SLB) service and its features, components, and settings. You'll also learn how to use SLB through a guided demonstration from the Alibaba platform.
- Get a basic understanding of Alibaba Cloud SLB
- Learn about the features, components, and additional settings of SLB
- Learn how to set up a server load balancer
This course is intended for anyone looking to use server load balancer to manage their Alibaba Cloud workloads, as well as anyone studying for the ACP Cloud Computing certification exam.
To get the most out of this course, you should have a basic understanding of the Alibaba Cloud platform.
Okay, next, let's take a deeper look at some of SLB's key components. So we'll start with the Load Balancer instance, the listener, and the backend server. These are the three critical components of Alibaba's Server Load Balancer system. The Load Balancer instance is a running copy of the Load Balancing service under your account, bound to a particular region. What it does is receive incoming requests and distribute them to your backend servers.
So how does it listen for those incoming requests and how does it distribute them? Well, the rules for that are contained in the listener. Every SLB instance must have at least one listener. This is the component that decides what to do with incoming connections and where to send them. It also performs health checks on the backend servers to make sure that they're healthy and that traffic can safely be distributed to them.
The backend servers themselves are nothing but ECS instances. They are ECS virtual machines that are waiting and listening for the requests that are coming in from the Load Balancer. Okay, let's talk a little bit about cost optimization. So currently, Alibaba Cloud SLB only supports purchasing in Pay As You Go mode. If you purchase what we call a public SLB instance, then you will pay for instance usage and for public network traffic. You'll be billed for the exact volume of resources that you use per hour.
There's no long-term commitment or upfront fees here. You pay only for what you use. There's another mode as well. We have a private network SLB and that is free. So if your Server Load Balancer is internal to an Alibaba Cloud VPC, if it doesn't accept incoming connections from the internet, then it's totally free. So you can use internal Server Load Balancers for free. However, if your Load Balancer is accessible from the public internet, then you need to pay for it on a Pay As You Go basis.
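As a rough illustration of the billing rule just described, here is a minimal Python sketch. The function name and both rate parameters are hypothetical placeholders supplied by the caller; they are not Alibaba Cloud prices, and real billing may include components the lecture does not mention.

```python
def estimate_slb_charge(is_public, hours, traffic_gb,
                        instance_rate_per_hour, traffic_rate_per_gb):
    """Estimate a Pay As You Go charge for one SLB instance.

    Both rates are caller-supplied placeholders, not real prices.
    """
    if not is_public:
        # A private (internal) SLB is free according to the lecture.
        return 0.0
    # A public SLB pays for instance usage plus public network traffic.
    return hours * instance_rate_per_hour + traffic_gb * traffic_rate_per_gb
```

For example, `estimate_slb_charge(False, 10, 4, 0.5, 0.25)` is `0.0` because the instance is private, while the same call with `is_public=True` adds the hourly instance fee to the traffic fee.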
There are two general instance types. There's the classic network SLB and the VPC network SLB. Classic network, you don't need to worry about that. No new instance classes support the classic network. This is something from before Alibaba implemented the VPC system. Essentially, in the classic network, all Alibaba Cloud clients share one big private IP address space, and so sometimes in our documentation or in the console, you'll see references to this old classic network system.
Whenever you're setting up a new Load Balancer though, or a new ECS, you should always choose VPC network. So the Load Balancer that you want is the VPC Load Balancer, and if you give it a public IP address, then what you've done is set up a public Server Load Balancer and you'll be charged for that. If you don't give your SLB a public IP address, then it's a private Server Load Balancer and you can run it for free. One advantage of having this public/private network SLB system is that you can use it to create a multi-tiered architecture. So you can have multiple internal private Server Load Balancers sitting in front of internal tiers of your application and then a single public Load Balancer sitting in front of your web server tier.
So in this way, you can achieve elasticity and high availability for the internal tiers of your application as well as for your external web tier. So you get a great trade-off here between performance and cost. So let's talk about listeners. Listeners define the protocol and port on which the Load Balancer listens for incoming connections. Each Load Balancer needs at least one listener to accept incoming traffic. The listener defines three things. First, the routing or forwarding rules, so how traffic is distributed among backend instances. Second, whether or not session stickiness is enabled. This is also set at the listener level. Essentially this means that an ongoing web session with a particular user will always be forwarded to the same backend server for the duration of the session. That's session stickiness. And third, health check configurations are also done at the listener level.
So checking whether or not the backend servers are functioning is done at the listener level. There's one thing that's not set at the listener level, that's peak bandwidth. Peak bandwidth is set at the instance level. So when you purchase your SLB instance, you're deciding what its specifications will be. How many queries per second it can respond to, how many concurrent sessions it can handle, peak bandwidth, those are all set at the instance level but the listener defines routing rules, session stickiness and health check.
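To make the split between instance-level and listener-level settings concrete, here is a small Python sketch of a data model. All class and field names are hypothetical and chosen for illustration only; they do not correspond to the Alibaba Cloud API or SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Listener:
    """Settings the lecture places at the listener level."""
    protocol: str                  # "tcp", "udp", "http", or "https"
    port: int                      # port the listener accepts traffic on
    scheduler: str = "wrr"         # forwarding rule: "rr", "wrr", or "wlc"
    sticky_sessions: bool = False  # session stickiness on or off
    health_check: bool = True      # health check on or off


@dataclass
class LoadBalancerInstance:
    """Settings the lecture places at the instance level."""
    region: str
    peak_bandwidth_mbps: int
    max_concurrent_sessions: int
    max_queries_per_second: int
    listeners: List[Listener] = field(default_factory=list)

    def can_accept_traffic(self) -> bool:
        # Every SLB instance needs at least one listener.
        return len(self.listeners) >= 1
```

Note that peak bandwidth lives on `LoadBalancerInstance`, not on `Listener`, mirroring the point made above.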
So on the listener there are different types of forwarding rules or scheduling algorithms. There are three: round robin, weighted round robin, and weighted least connections. So how do these work? In round robin, requests are distributed evenly across all of the backend ECS servers. So if you have three servers, let's call them A, B, and C, then incoming requests will be distributed like this: A, B, C, A, B, C, A, B, C. So requests will always be distributed sequentially and evenly across the backend instances. In weighted round robin, again, requests are distributed sequentially, but they may not be distributed evenly. With weighted round robin, you can set a weight for each backend server. This is a great thing if you have backend servers that have different hardware configurations, because you can set higher weights for servers that have more CPU and memory available. That way those servers will be allocated more connections. Then there is WLC, weighted least connections.
In addition to setting a weight for each backend ECS server, the number of active connections is also considered when new requests are distributed among the backend servers. This is good if your application has long-lived sessions or connections, because it avoids a pileup. So if an instance already has multiple active old sessions, then WLC will avoid giving new sessions to that server and will give them to a different backend server that has fewer active connections instead.
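The three scheduling algorithms above can be sketched in a few lines of Python. This is a simplified illustration, not SLB's actual implementation: the function names are made up, the weighted round robin here simply repeats each server `weight` times per cycle, and the weighted least connections rule is assumed to pick the lowest ratio of active connections to weight.

```python
from itertools import cycle


def round_robin(servers):
    """Yield servers in strict rotation: A, B, C, A, B, C, ..."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Yield servers in rotation; each appears `weight` times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def weighted_least_connections(weights, active):
    """Pick the server with the lowest active-connections-to-weight ratio."""
    # Servers with a weight of zero receive no new connections.
    candidates = [name for name, weight in weights.items() if weight > 0]
    return min(candidates, key=lambda name: active.get(name, 0) / weights[name])


rr = round_robin(["A", "B", "C"])
print([next(rr) for _ in range(6)])    # ['A', 'B', 'C', 'A', 'B', 'C']

wrr = weighted_round_robin({"A": 2, "B": 1})
print([next(wrr) for _ in range(6)])   # ['A', 'A', 'B', 'A', 'A', 'B']

# A has twice the weight of B but already holds 10 connections, so B wins.
print(weighted_least_connections({"A": 2, "B": 1}, {"A": 10, "B": 3}))  # B
```

With weights of 2 and 1, server A receives two of every three requests under weighted round robin, which matches the idea of giving more connections to better-provisioned servers.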
Limitations of the listeners. You can have up to 50 listeners per Server Load Balancer instance, and typically each listener corresponds to one application that's deployed on your backend ECS servers. Why is that? That's because each listener has one port number that it's listening on. Backend servers. So the backend servers that sit behind your listener are just ECS instances. These are the servers that you've added to your Server Load Balancer server pool or VServer group in order to handle incoming requests. The SLB service will forward external requests to these servers according to the rules defined in your listener, and ECS instances can be registered with the same Load Balancer using multiple ports. What that means is multiple listeners can send requests to the same backend ECS instance on different ports. That is allowed.
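Here is a minimal Python sketch of the listener rules just described: one port per listener, at most 50 listeners per instance, and the same ECS instance reachable from several listeners on different backend ports. The function, the dictionary layout, and the ECS ID are hypothetical, and keying listeners by port number alone is a simplifying assumption.

```python
MAX_LISTENERS_PER_INSTANCE = 50  # limit quoted in the lecture


def add_listener(listeners, frontend_port, backends):
    """Register a listener on `frontend_port`.

    `listeners` maps a frontend port to a list of (ecs_id, backend_port)
    pairs. Raises ValueError if the port is taken or the limit is reached.
    """
    if frontend_port in listeners:
        raise ValueError(f"port {frontend_port} already has a listener")
    if len(listeners) >= MAX_LISTENERS_PER_INSTANCE:
        raise ValueError("listener limit reached for this instance")
    listeners[frontend_port] = list(backends)


listeners = {}
# Two listeners forward to the same ECS instance on different backend ports.
add_listener(listeners, 80, [("ecs-a", 8080)])
add_listener(listeners, 443, [("ecs-a", 8443)])
```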
Limitations for the backend servers: no cross-region deployment. So all of the backend servers behind your Load Balancer need to be in the same region as the Load Balancer. So if your Load Balancer is in Hong Kong, then your ECS servers also need to be in Hong Kong. They can't be in Singapore or Beijing. However, there's no limitation on the operating system. As long as your backend servers respond to requests in the same way, it does not matter what they're running. You could have half of them running Linux with Nginx and the other half running Windows with IIS. That would be totally okay as long as they respond to requests in the same way.
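The same-region rule can be expressed as a simple check. This Python sketch uses a hypothetical helper and made-up ECS IDs; the region strings are only illustrative labels.

```python
def find_cross_region_backends(slb_region, backend_regions):
    """Return IDs of backends that are not in the load balancer's region.

    `backend_regions` maps an ECS instance ID to its region name.
    An empty result means every backend satisfies the same-region rule.
    """
    return [ecs_id for ecs_id, region in backend_regions.items()
            if region != slb_region]


offenders = find_cross_region_backends(
    "hong-kong",
    {"ecs-1": "hong-kong", "ecs-2": "singapore", "ecs-3": "hong-kong"},
)
print(offenders)  # ['ecs-2']
```

The check deliberately says nothing about operating systems, since the lecture notes that Linux and Windows backends can be mixed freely.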
Backend server groups. Because the IP address presented by the SLB creates a single interface through which all the backend servers are accessed, multiple ECS instances located in the same region can act as a highly available server pool, and there are two different types of server groups that are supported, depending on what type of listener you choose. If you choose a layer-4 listener for TCP or UDP, then we have what is called a master-slave server group. This is the backend server pool. This is essentially just to provide HA, not really to provide true Load Balancing. All requests are distributed to the master when it's working normally. If it goes down and fails its health check, then requests go to the slave. Things are a little bit more sophisticated for the layer-7 listener, where we have what we call virtual server groups. So you can add multiple ECS instances into a virtual server group, and you can even configure domain name or URL based forwarding rules to decide which of those ECS instances inside the group will handle which requests.
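The two group behaviors above can be sketched in Python as follows. Both functions are hypothetical illustrations. In particular, the first-match rule order and the URL prefix matching are assumptions, since the lecture does not describe how SLB prioritizes forwarding rules.

```python
def pick_master_or_slave(master, slave, healthy):
    """Layer 4: send everything to the master unless its health check failed."""
    return master if healthy.get(master, False) else slave


def pick_target(rules, default_target, host, path):
    """Layer 7: return the target of the first matching forwarding rule.

    Each rule is (domain, url_prefix, target); None means "match anything".
    """
    for domain, url_prefix, target in rules:
        domain_ok = domain is None or domain == host
        path_ok = url_prefix is None or path.startswith(url_prefix)
        if domain_ok and path_ok:
            return target
    return default_target


rules = [
    ("api.example.com", None, "api-servers"),  # domain name based rule
    (None, "/static/", "static-servers"),      # URL based rule
]
print(pick_target(rules, "web-servers", "api.example.com", "/v1/users"))
# api-servers
print(pick_target(rules, "web-servers", "www.example.com", "/static/a.png"))
# static-servers
```

The master-slave function shows why that group type gives high availability rather than load balancing: only one server receives traffic at any time.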
Health checks. There are actually several different types of health checks that the listener can perform, depending on the protocol of the listener. For HTTP or HTTPS, the health check is fairly simple. The Server Load Balancer will send an HTTP HEAD request to each backend ECS server and it will expect to get back an HTTP status code like 200 OK. If it doesn't get the code it was expecting, then the server fails the health check. For TCP, things are a little bit more complex.
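The HTTP health check just described can be approximated with Python's standard library. This is a sketch of the idea only; the function name, default path, and timeout are assumptions and not SLB's real configuration values.

```python
import http.client


def http_health_check(host, port, path="/", expected_status=200, timeout=2.0):
    """Return True if a HEAD request gets back the expected status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status == expected_status
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply all count as failure.
        return False
    finally:
        conn.close()
```

Calling `http_health_check("127.0.0.1", 8080)` against a local web server returns `True` only when the server answers the HEAD request with status 200.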
For a layer-4 listener, during the health check, what happens is the Load Balancer will send a SYN to the backend ECS, it will expect back a SYN plus an acknowledgement, and then it will send an acknowledgement and a reset. If it gets through that whole process, one, two, three, four steps, if it gets through all four of those steps, SYN, SYN plus ACK, ACK, reset, then the instance is considered healthy. If it doesn't get SYN plus ACK back from the backend server at step two, then the instance fails the health check. UDP is a little bit unique.
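The TCP handshake check above can be imitated from user space with a plain socket. In this Python sketch the operating system performs the SYN, SYN plus ACK, ACK exchange inside `connect()`. Setting `SO_LINGER` with a zero timeout is a common way to make `close()` send a reset, and the `"ii"` struct layout assumes a POSIX system. The function name and timeout are hypothetical.

```python
import socket
import struct


def tcp_health_check(host, port, timeout=2.0):
    """Return True if the TCP three-way handshake with the backend completes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect() sends SYN, waits for SYN+ACK, and replies with ACK.
        sock.connect((host, port))
    except OSError:
        # No SYN+ACK came back (refused or timed out): the check fails.
        return False
    else:
        try:
            # Linger on with zero timeout: close() then sends RST, not FIN.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                            struct.pack("ii", 1, 0))
        except OSError:
            pass  # fall back to a normal close if the option is unsupported
        return True
    finally:
        sock.close()
```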
TCP and HTTP are both protocols that have a concept of a connection, right? So in those two protocols you have an expected challenge response that should take place to show that the backend ECS is there and working. With UDP that's not the case, because UDP is a connectionless protocol. So what happens is the Server Load Balancer sends a UDP probe to the backend ECS server, as you can see in the diagram here, and it expects to get nothing back. If it doesn't receive a response within a certain duration, then it considers the backend ECS to be healthy. If it receives an ICMP unreachable error within its timeout window, then it considers the backend ECS to be failed or unhealthy. So that's a little counter-intuitive.
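The UDP logic above, where silence means healthy, can be sketched in Python like this. A connected UDP socket lets the operating system surface an ICMP unreachable error as an exception. How that error is reported varies by platform, and treating an ordinary reply datagram as healthy is an assumption of this sketch. The function name, probe payload, and timeout are hypothetical.

```python
import socket


def udp_health_check(host, port, timeout=1.0, probe=b"health-check"):
    """Return True if no ICMP unreachable error arrives before the timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        # connect() on a UDP socket lets the OS report ICMP errors to us.
        sock.connect((host, port))
        sock.send(probe)
        sock.recv(1024)
    except socket.timeout:
        # Nothing came back within the window: considered healthy.
        return True
    except OSError:
        # Typically ConnectionRefusedError caused by ICMP port unreachable.
        return False
    finally:
        sock.close()
    # A normal datagram came back; this sketch assumes that is also healthy.
    return True
```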
For HTTP and for TCP you expect a response. That's how you judge health. For the UDP health check, you expect to get nothing back. If you get an ICMP unreachable error back in response to the health check, that means the health check failed. But in any case, this is all implemented for you inside the listener, so you don't really need to worry about how this works exactly. Just know that if the Load Balancer detects that an instance is unhealthy, it can automatically stop sending new traffic to that instance, which will help ensure the availability of your service. Okay, that's all for this section. In the next section let's take a look at Server Load Balancer's additional settings.
Alibaba Cloud, founded in 2009, is a global leader in cloud computing and artificial intelligence, providing services to thousands of enterprises, developers, and government organizations in more than 200 countries and regions. Committed to the success of its customers, Alibaba Cloud provides reliable and secure cloud computing and data processing capabilities as a part of its online solutions.