Elastic Load Balancer (ELB) is one of the key architecture components for many applications inside the AWS cloud. In addition to auto scaling, it enables and simplifies one of the most important tasks of our application’s architecture: scaling up and down with high availability.
Elastic Load Balancing automatically distributes incoming application traffic across multiple applications, microservices, and containers hosted on Amazon EC2 instances.
One of the many advantages of using ELB is the fact that it is elastic, which means that it will automatically scale to meet your incoming traffic. If you are a system administrator or a DevOps engineer running your load balancer by yourself, then you need to worry about scaling the load balancer and keeping it highly available. With ELB, you can create your load balancer and enable dynamic scaling with just a few clicks.
Since it was first released in 2009, Elastic Load Balancer has added numerous improvements and features. The Application Load Balancer (ALB) is a logical step forward in developing load balancing possibilities inside the AWS cloud. With this addition, the original load balancer has been renamed Classic Load Balancer, and it’s still available for use inside the AWS cloud. In this post, we’ll check out the features of Application Load Balancer compared to the original, show you how to monitor ALB, and finally, we’ll take a look at pricing.
Elastic Load Balancing
When you start developing your application, your architecture may look like this:
Even though it isn’t recommended, the single instance that runs the application often hosts the database for that application as well. In the beginning, this setup will work just fine. However, this infrastructure has a few challenges. For example, the one instance where your app is located can fail, and if that happens, your app will also be down. Also, if you experience sudden spikes in traffic, a single instance may not be able to handle the load. To strengthen your infrastructure and meet these challenges (unpredicted traffic spikes, high availability, etc.), you need to introduce Elastic Load Balancer into the equation.
As shown in the image above, Elastic Load Balancer will take the incoming traffic from users and spread that traffic over three instances.
If any of these three instances fails, ELB will automatically detect that and shift the traffic to the remaining healthy instances. When combined with Auto Scaling, the backend instances will also scale automatically when traffic increases, just as ELB itself does.
Even if you don’t set up high availability yourself, each Elastic Load Balancer provides it by utilizing multiple Availability Zones. Elastic Load Balancer is located inside its own VPC (Elastic Load Balancer VPC). Inside that VPC, the ELB is placed in several subnets, each in a different Availability Zone, thus providing high availability.
After a user request arrives at the ELB endpoint, it is accepted inside the ELB VPC by one of the load balancer nodes. Then, it is securely passed into your VPC.
Application Load Balancer vs. Classic Load Balancer
In addition to Application Load Balancer, there is the original Classic Load Balancer, which distributes traffic mainly at the connection level (Layer 4 of the OSI model).
The Classic Load Balancer is a connection-based balancer where requests are forwarded by the load balancer without “looking into” any of these requests. They just get forwarded to the backend section.
ALB works at Layer 7 of the OSI model and allows traffic distribution toward backend instances based on the information inside the HTTP request headers. With Application Load Balancer, the connection is terminated at the ALB, and there are connection pools toward the backend instances.
The ALB can open multiple connections toward the backend instances and reuse them to forward requests. It can also modify the request headers. Most importantly (and in contrast with a Classic Load Balancer that simply forwards connections), the forwarded request carries an X-Forwarded-For header containing the client IP address.
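As a minimal sketch of how a backend application might read that header: the function names below are hypothetical, and the sketch assumes the load balancer appends the address it received the request from to the end of any existing header value. Entries further to the left are supplied by the client or by earlier proxies and can be spoofed.

```python
def forwarded_chain(header_value):
    """Split an X-Forwarded-For header value into a list of addresses."""
    if not header_value:
        return []
    # The header is a comma-separated list; surrounding whitespace is optional.
    return [part.strip() for part in header_value.split(",") if part.strip()]


def address_seen_by_load_balancer(header_value):
    """Return the last (right-most) address in the header, or None if empty.

    Assumption: the application sits directly behind the load balancer,
    which appended the address it saw as the final entry.
    """
    chain = forwarded_chain(header_value)
    return chain[-1] if chain else None
```

When the client sends no X-Forwarded-For header of its own, the chain has a single entry, and that entry is the client IP address.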
|Feature||Classic Load Balancer||Application Load Balancer|
|Protocols||HTTP, HTTPS, TCP, SSL||HTTP, HTTPS|
|Sticky sessions (cookies)||YES (you can provide your own application cookie)||Load balancer generated|
|Back-end server authentication||YES||NO|
|Back-end server encryption||YES||YES|
|Idle connection timeout||YES||YES|
|Cross-zone load balancing||YES||Always enabled|
|Health checks||YES||YES (Improved)|
|CloudWatch metrics||YES||YES (Improved)|
|Access logs||YES||YES (Improved)|
|Route to multiple ports on a single instance||NO||YES|
|Load balancer deletion protection||NO||YES|
Application Load Balancer enables content-based routing and allows requests to be routed to different applications behind a single load balancer. The Classic Load Balancer doesn’t do that: a single Classic Load Balancer can host only a single application. ALB isn’t an improved Classic Load Balancer. It’s built on a completely new platform. As with Classic Load Balancer, ALB is a fully-managed, scalable, and highly available load balancing platform.
Thanks to the path-based routing feature, you can add up to 10 different applications behind a single ALB. In addition, ALB provides native support for microservices and container-based architectures. With Classic Load Balancer, an instance could only be registered on one port at a time, while ALB allows an instance to be registered on multiple ports. To support the new functionality added in ALB, a few new resource types were introduced, including target groups, targets, and rules.
ALB and Classic Load Balancer have listeners that define the protocol and port on which the load balancer listens for incoming connections. Each load balancer has to have at least one listener and supports up to 10 listeners. Routing rules (content-based and path-based routing) are defined on listeners.
Target groups are logical groupings of targets behind a load balancer. They can exist independently of the load balancer and may be added to it if needed. A target represents a logical load balancing target, and these can be EC2 instances, microservices, or container-based applications. A single target can be registered with multiple target groups.
Rules provide a link between listeners and target groups and consist of conditions and actions. Each rule represents a condition and the action that we want to take when it is met. Currently, only one action is supported: forwarding requests to a specified target group. If no rule matches, the request will follow the default rule, which forwards the request to the default target group.
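The relationship between rules, conditions, and the default rule can be illustrated with a small simulation. This is only a sketch of the idea, not how ALB is implemented: the `Rule` class, the target group names, and the use of Python's `fnmatchcase` for wildcard matching are all assumptions made for the example (`fnmatchcase` also treats `[...]` as special, which may differ from ALB path patterns).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    priority: int       # rules are evaluated from the lowest priority value up
    path_pattern: str   # condition: a path pattern such as "/api/*"
    target_group: str   # action: forward to this (hypothetical) target group


def choose_target_group(path, rules, default_target_group):
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if fnmatchcase(path, rule.path_pattern):  # case-sensitive match
            return rule.target_group
    # No rule matched: fall through to the default rule.
    return default_target_group


# Hypothetical rules for one listener.
RULES = [
    Rule(priority=20, path_pattern="/img/*", target_group="images"),
    Rule(priority=10, path_pattern="/api/*", target_group="api"),
]
```

With these rules, a request for `/api/users` would be forwarded to the `api` target group, while `/index.html` matches no rule and goes to whatever default target group is passed in.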
Monitoring an ALB
Application Load Balancer is a fully managed service, which means that you can’t SSH into it to see what’s happening. Monitoring ALB consists of two parts: CloudWatch metrics and alarms, and access logs.
CloudWatch metrics are available at the load balancer and target group levels.
There are several load balancer CloudWatch metrics that you should monitor. HealthyHostCount shows the number of healthy targets in each Availability Zone. TargetResponseTime (the counterpart of the Classic Load Balancer’s Latency metric) measures the elapsed time, in seconds, from the moment the request is forwarded to the backend until the backend responds. Because ALB doesn’t use surge queues like the Classic Load Balancer, it’s important to pay attention to the RejectedConnectionCount metric. This is the number of connections that were rejected because the load balancer had reached its maximum number of connections.
Access logs are stored in Amazon S3. Access logs for ALB are published every five minutes. You will have to pay the S3 storage costs, but you won’t pay for the data transfer to S3. Access logs are “eventually consistent,” which means that the files can be delivered out of order.
AWS does not guarantee that every request will be written to the access logs, so some records may be missing. As Amazon says:
“Elastic Load Balancing logs requests on a best-effort basis.”
Access logs contain the request type (HTTP, HTTP/2, etc.), the timestamp recorded by the ALB in UTC, the ELB identifier, the IP address and port of the client that sent the request, and the IP address and port of the target instance that the request was routed to.
Access logs also contain request_processing_time, target_processing_time, and response_processing_time. The Application Load Balancer needs a certain amount of time to receive the response from the target and start sending it to the client; this is called response_processing_time.
Inside the access log, you can also see two status codes: target_status_code, which is produced by the instances behind the load balancer, and elb_status_code, which is produced by the load balancer itself.
There are two different response codes because the load balancer may not receive a response from the backend instances at all (for example, when the targets time out), and then it has to send its own HTTP response, with its own status code, back to the client. You also have data about traffic: received_bytes and sent_bytes represent the amount of data received by the load balancer from the client and the amount of data it sent back.
The access log also contains the original HTTP request, user_agent, ssl_cipher, and ssl_protocol, as well as target_group_arn, which identifies the target group to which the request was routed.
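The fields described above can be pulled out of a log entry with a few lines of code. The following is a minimal sketch that assumes entries are space-delimited, that the request and user agent are wrapped in double quotes, and that the fields appear in the order listed in `FIELDS`. The sample line is made up for illustration, and any fields that follow target_group_arn are ignored.

```python
import re

# Field order assumed for the first 17 fields of an ALB access log entry.
FIELDS = [
    "type", "timestamp", "elb", "client", "target",
    "request_processing_time", "target_processing_time",
    "response_processing_time", "elb_status_code", "target_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol", "target_group_arn",
]

# A token is either a double-quoted string or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def parse_access_log_line(line):
    """Return a dict mapping field names to the raw string values."""
    values = [
        m.group(1) if m.group(1) is not None else m.group(2)
        for m in TOKEN.finditer(line)
    ]
    if len(values) < len(FIELDS):
        raise ValueError("expected at least %d fields, got %d"
                         % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, values))  # extra trailing fields are dropped


# Illustrative entry with made-up values; "-" marks a field that is not set.
SAMPLE_LINE = (
    'http 2016-08-10T22:08:42.945958Z app/my-alb/50dc6c495c0c9188 '
    '192.0.2.10:2817 10.0.0.1:80 0.000 0.001 0.000 200 200 34 366 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.46.0" - - '
    'arn:aws:elasticloadbalancing:us-east-2:123456789012:'
    'targetgroup/my-targets/73e2d6bc24d8a067'
)
```

Calling `parse_access_log_line(SAMPLE_LINE)` gives a dictionary in which, for example, `elb_status_code` and `target_status_code` can be compared to spot requests where the load balancer answered on behalf of a failed target.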
The Application Load Balancer pricing model is a bit complicated. You pay for each hour that your Application Load Balancer is running, and you also pay for the number of Load Balancer Capacity Units (LCU) used.
The number of capacity units is measured along three dimensions. The first dimension is new connections: for every 25 new connections per second, you consume 1 LCU. The second dimension is active connections: for every 3,000 active connections per minute, you consume 1 LCU. The third dimension is bandwidth: for every 2.22 Mbps of traffic, you consume 1 LCU.
It’s worth pointing out that you will be charged only for the dimension you use the most. In other words, if your application receives a lot of new requests, the dominant dimension will be new connections and you will be charged for ALB according to that dimension. If you have a lot of large file data transfers, the dominant dimension is bandwidth. If you use WebSockets, the dominant dimension will be active connections.
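The arithmetic behind the dominant dimension can be sketched as follows. The function name is hypothetical, the per-LCU figures are the ones quoted above, and no price per LCU is assumed, so the result is a number of LCUs rather than a cost.

```python
def lcus_consumed(new_connections_per_second,
                  active_connections_per_minute,
                  bandwidth_mbps):
    """Return (dominant_dimension, lcus) for the given traffic figures."""
    dimensions = {
        "new_connections": new_connections_per_second / 25,
        "active_connections": active_connections_per_minute / 3000,
        "bandwidth": bandwidth_mbps / 2.22,
    }
    # Only the dimension with the highest usage is charged.
    dominant = max(dimensions, key=dimensions.get)
    return dominant, dimensions[dominant]
```

For example, 100 new connections per second, 3,000 active connections per minute, and 2.22 Mbps of traffic work out to 4, 1, and 1 LCU respectively, so the charge is based on the 4 LCUs of the new connections dimension.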
Multiple applications on one ALB can save us a lot of money. In this way, we will reduce the hourly costs while maintaining the same amount of data received.
In addition to cost savings, Application Load Balancer offers more features and flexibility compared to the Classic Load Balancer. There are some exceptions, however. For example, if you’ll be using TCP/SSL or EC2-Classic, then you should use the Classic Load Balancer.
For all other use cases, Amazon recommends using the Application Load Balancer.
To learn more about Elastic Load Balancer, take a look at Cloud Academy Elastic Load Balancer Courses, Labs and Quizzes.