AWS Elastic Load Balancer — Overview | by Dilshan Fernando | June 2022


Hi world!

Today I’m going to talk about AWS Elastic Load Balancer. But, before getting to the main topic, let’s get a brief idea about scalability and high availability.

Scalability and high availability

First, let’s talk about scalability. Scalability means handling a greater load on an application by adapting its resources. There are two types of scalability.

  • Vertical scalability
  • Horizontal scalability (also called elasticity)

Now let’s take an in-depth look at these types of scalability.

Vertical scalability — means increasing the size of resources. For example, if you are running a t2.micro instance and notice that the application needs more resources, you can upgrade the t2.micro to a larger instance such as a t2.large.

This type is commonly used for non-distributed systems such as databases; RDS and ElastiCache are services that can scale vertically. But this type has a limit, which is the limit of the hardware. For example, if you are using the t2 family of EC2 instances, the largest size you can reach is t2.2xlarge.

Horizontal scalability — means adding more instances/resources to your application. This type requires a distributed system and is widely used for web applications. With AWS EC2 it is very easy to scale out/in.

But scalability != high availability. High availability means running the application in at least two Availability Zones. The more Availability Zones you use, the higher the availability.

So, let’s jump into our main topic.

An Elastic Load Balancer (ELB) is a managed server responsible for distributing incoming traffic to the application across several servers (e.g., EC2 instances).

Above I have attached a diagram that shows a high-level overview of ELB. As you can see, first, USER-01 requests content from the Elastic Load Balancer via an HTTP(S) request. The Elastic Load Balancer forwards the request to an available EC2 instance (EC2-01). Next, another user (USER-02) requests the same or different content through the Elastic Load Balancer, which forwards that HTTP(S) request to the EC2-02 instance. The same happens for USER-03, whose request is forwarded to EC2-03. This is how a load balancer balances incoming web traffic by distributing requests across multiple servers.
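As a rough sketch (not AWS code — the instance names are just the ones from the diagram above), the round-robin distribution works like this:

```python
from itertools import cycle

# Instance names taken from the diagram; this is only a round-robin
# sketch, not how AWS implements ELB internally.
instances = ["EC2-01", "EC2-02", "EC2-03"]
rotation = cycle(instances)

def route(user: str) -> str:
    """Forward the user's request to the next instance in rotation."""
    return next(rotation)

# Each of the three users lands on a different instance.
assignments = {user: route(user) for user in ["USER-01", "USER-02", "USER-03"]}
```

With three users and three instances, each request lands on a different backend, which is exactly the behavior shown in the diagram.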

So why do we actually need this load balancer?

AWS ensures ELB is available all the time, and AWS handles its updates, upgrades, and maintenance. With ELB, we can easily integrate other AWS services and ensure high availability across multiple Availability Zones.

An elastic load balancer will provide a single endpoint (DNS) to access our applications.

ELB always ensures that all connected instances/resources are working correctly using its health check function. We will discuss this ELB health check below.

ELB can also provide session stickiness with cookies.

So what is a health check in Elastic Load Balancer?

This is a simple concept in ELB. Basically, ELB forwards public HTTP(S) requests to multiple instances or resources. Thus, ELB must ensure that these connected resources are available and functioning properly before sending them traffic, so it checks them regularly. This check is called the health check. Usually the health check is performed on a port and a route (e.g., /health). If the response is 200 (OK), the instance is healthy; otherwise the instance/resource is considered unhealthy.
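A minimal sketch of that decision in Python (the status codes and target names are illustrative; a real ELB also applies healthy/unhealthy thresholds before changing a target's state):

```python
# A target passes the health check only when the probe on the configured
# port and route (e.g. /health) answers 200 (OK).
def is_healthy(status_code: int) -> bool:
    return status_code == 200

# Simulated probe results for two targets (values are illustrative).
probes = {"EC2-01": 200, "EC2-02": 503}
healthy_targets = [t for t, code in probes.items() if is_healthy(code)]
```

Only targets in `healthy_targets` would keep receiving traffic; the 503 target would be taken out of rotation.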

Let’s talk about the types of load balancers on AWS.

There are 3 types of load balancers, all managed by AWS, and AWS recommends using the newer generations.

  1. Classic Load Balancer (old generation) — introduced in 2009, this LB supports HTTP, HTTPS, TCP (layer 4) and SSL (secure TCP). It performs TCP- or HTTP-based health checks and supports only one SSL certificate, so you must use multiple CLBs to serve multiple hostnames with multiple SSL certificates.

  2. Application Load Balancer (newer generation) — introduced in 2016, this LB supports HTTP, HTTPS and WebSockets. It is also known as a Layer 7 (HTTP) load balancer. ALB is a great fit if you have a microservices or container-based architecture, because it has smart routing: we can configure target groups according to the path (URL), hostname, and query string parameters (refer to the diagram attached below). Additionally, ALB can route to several target types: EC2 instances, ECS tasks, Lambda functions, and IP addresses. It supports multiple listeners with multiple SSL certificates, using Server Name Indication (SNI) to make this work.

  3. Network Load Balancer (Layer 4) — introduced in 2017, this LB supports TCP, TLS (secure TCP) and UDP. This high-performance load balancer can handle millions of requests per second with low latency, around 100 ms (vs ~400 ms for ALB). NLB has a static IP address per AZ, supports Elastic IP assignment, and supports multiple listeners with multiple SSL certificates, using Server Name Indication (SNI) to make this work.
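The ALB's smart routing described in point 2 can be sketched roughly like this (the paths and target group names are made up for illustration; a real ALB evaluates listener rules by priority):

```python
# Listener rules: path prefix -> target group. All names are invented
# for illustration.
rules = [
    ("/users", "users-target-group"),
    ("/search", "search-target-group"),
]

def route_request(path: str) -> str:
    """Match the request path against the rules, in order."""
    for prefix, target_group in rules:
        if path.startswith(prefix):
            return target_group
    return "default-target-group"  # the listener's default action
```

A request to `/users/42` goes to one target group, `/search?q=...` to another, and anything unmatched falls through to the default action — the same idea behind path-based routing on a real ALB listener.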

OK, let’s talk about sticky sessions (session stickiness).

The load balancer’s job is to distribute the incoming traffic load across the application’s instances. But sometimes a client needs to stick to the same instance for every HTTP request. For this scenario, AWS provides a feature called “sticky sessions”. This concept is also referred to as “session affinity”.

As you can see in the accompanying diagram, clients stick to a specific instance. Note that this can work against the load-balancing itself by creating an imbalance across instances. Importantly, this feature only works for Classic Load Balancers and Application Load Balancers. Additionally, the cookie used for stickiness has an expiration date that you control.
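A toy model of cookie-based stickiness, assuming a made-up cookie format (a real ALB issues its own `AWSALB` cookie with an opaque encoding):

```python
from itertools import cycle
from typing import Optional, Tuple

instances = ["EC2-01", "EC2-02"]
rotation = cycle(instances)
sessions = {}  # cookie value -> pinned instance

def route(cookie: Optional[str]) -> Tuple[str, str]:
    """Return (cookie, instance); a known cookie sticks to its instance."""
    if cookie in sessions:
        return cookie, sessions[cookie]
    new_cookie = f"session-{len(sessions)}"  # made-up cookie format
    sessions[new_cookie] = next(rotation)
    return new_cookie, sessions[new_cookie]
```

A first request with no cookie gets pinned to an instance; every later request presenting that cookie lands on the same instance, while a new client is rotated to the next one.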

Connection draining

Connection draining is the time given to complete “in-flight requests” while an instance is de-registering or unhealthy; during this time, the load balancer stops sending new requests to that EC2 instance. The drain time can be set between 1 and 3600 seconds (default: 300 seconds). The name of this feature differs depending on the load balancer:

  1. Connection draining — for CLB
  2. Deregistration delay — for ALB and NLB
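A toy model of the idea, with hypothetical target names: a draining target is skipped for new requests while it finishes the in-flight ones.

```python
# Targets in "draining" state finish in-flight requests but are skipped
# when new requests are routed. Target names are illustrative only.
targets = {"EC2-01": "healthy", "EC2-02": "draining"}

def routable_targets() -> list:
    """Only healthy targets may receive new requests."""
    return [name for name, state in targets.items() if state == "healthy"]
```

Once the drain period elapses (or the target becomes healthy again), it rejoins the routable set.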

What is an Auto Scaling Group?

Application traffic is not constant in the real world. If we can allocate more resources when traffic is high and remove them when traffic is low, we get much better cost-efficiency and user-facing latency. To do this, AWS provides a feature called the Auto Scaling Group. The purpose of an Auto Scaling Group (ASG) is to:

  1. Scale out (add EC2 instances) to cope with increased load
  2. Scale in (remove EC2 instances) to match decreased load
  3. Make sure we have a minimum and maximum number of machines running
  4. Automatically register new instances in a load balancer
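The list above can be sketched as a single capacity decision. The thresholds here are made up for illustration; real ASGs use scaling policies such as target tracking, but the min/max clamping works the same way:

```python
def desired_capacity(current: int, load_per_instance: float,
                     min_size: int = 2, max_size: int = 10) -> int:
    """Pick the next instance count, clamped to the ASG's min/max bounds."""
    if load_per_instance > 0.8:        # heavy load: scale out by one
        desired = current + 1
    elif load_per_instance < 0.2:      # light load: scale in by one
        desired = current - 1
    else:
        desired = current
    return max(min_size, min(max_size, desired))
```

Whatever the load does, the ASG never runs fewer than `min_size` or more than `max_size` instances, which is exactly guarantee 3 in the list.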

So, that’s all about Elastic Load Balancer.

I hope you learned something new!

Have a nice day! See you soon in the following incredible story.
