A load balancer is a type of server that sits between external web traffic and backend servers. It receives incoming requests and distributes traffic across two or more servers.
Every server has a maximum resource capacity or workload that it can handle. When a server reaches its maximum capacity, it can be scaled in two ways:
- Vertical scaling: adding more resources, such as CPU or memory, to the existing server.
- Horizontal scaling: adding more servers to share the workload.
Load balancing is used with horizontal scaling when you have two or more servers.
Here are some key benefits of using a load balancer:
- Improved performance and lower latency for users
- Scalability as traffic grows
- Reliability and high availability through redundancy and failover
A well-designed load balancer is efficient, highly available, and secure.
Engineering teams often use a reverse proxy such as HAProxy or Nginx, both of which are purpose-built for managing and distributing web traffic.
Cloud providers also offer out-of-the-box load balancing services, such as Amazon's Elastic Load Balancing (ELB).
A single load balancer is often placed between external traffic and application servers, but this is not the only possible placement.
In a microservice architecture, you would use a load balancer in front of each service, allowing each part of the system to be scaled independently.
Load balancing can't replace good system design.
Adding more servers won’t fix issues related to inefficient calculations, slow database queries, or unreliable third-party APIs. These must be addressed independently.
While often used together, load balancing should not be confused with rate limiting.
Load balancers improve the scalability, reliability, and performance of applications. Multiple servers are grouped together to distribute the load, ensure high availability, and minimize latency.
On the other hand, rate limiters prevent abuse from a single user or organization by intentionally throttling or dropping requests that exceed a specific threshold.
Rate limiting is a common mechanism for protecting applications against DDoS attacks, which try to consume server resources and make applications unavailable.
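To make the distinction concrete, here is a minimal sketch of a fixed-window rate limiter. The window size, threshold, and `allow` function are illustrative choices, not part of any particular product:

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60
MAX_REQUESTS = 100  # hypothetical per-client threshold

# client id -> [request count, window start time]
windows = defaultdict(lambda: [0, 0.0])

def allow(client_id: str) -> bool:
    now = time.time()
    count, start = windows[client_id]
    if now - start >= WINDOW_SECONDS:
        windows[client_id] = [1, now]  # start a fresh window
        return True
    if count < MAX_REQUESTS:
        windows[client_id][0] += 1
        return True
    return False  # over the threshold: throttle or drop the request
```

Note how this logic is about protecting capacity from a single caller, whereas a load balancer's logic is about spreading all callers across servers.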
Understanding these key concepts and interview questions about load balancing will help you prepare for your system design interviews.
A load balancer is a reverse proxy that receives web traffic and distributes it across multiple backend servers.
The main benefits of a load balancer are improved performance, scalability, and reliability. A load balancer can also help to minimize latency and reduce downtime by providing redundancy and failover capabilities.
Read more: Strategies for improving availability in system design interviews.
Load balancers typically route traffic using the following mechanisms:
- Round robin: requests are assigned to servers cyclically, in order.
- Least connections: each request goes to the server with the fewest active connections.
- IP hashing: a hash of the client's IP address pins that client's requests to a specific server.
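Round robin and IP hashing are covered in more detail below. As a rough sketch of least connections, all the balancer needs is a counter of in-flight requests per server (the server names here are hypothetical):

```python
# Hypothetical pool with a counter of in-flight requests per server.
active_connections = {"app1": 0, "app2": 0, "app3": 0}

def pick_server() -> str:
    # Route to whichever server currently has the fewest active connections.
    return min(active_connections, key=active_connections.get)

def handle_request(request) -> None:
    server = pick_server()
    active_connections[server] += 1
    try:
        ...  # forward `request` to `server` here
    finally:
        active_connections[server] -= 1
```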
Typically, load balancers come in two types:
Layer 4 (L4) Load Balancer
Layer 4 is the transport layer of the OSI model. A Layer 4 (L4) load balancer is also known as a network load balancer.
Here, routing decisions are made based on network- and transport-level information, such as source and destination IP addresses and TCP/UDP ports.
L4 load balancers perform Network Address Translation (NAT) on request packets, but they don't inspect the packets' contents.
By spreading traffic at the network level, this load balancer maximizes the utilization and availability of IP addresses, switches, and routers.
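As a minimal sketch of L4-style routing: the balancer hashes the connection's 4-tuple, which is visible in the packet headers, and never touches the payload. The backend IPs below are hypothetical:

```python
import hashlib

BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backend IPs

def pick_backend(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    # An L4 balancer sees only the network/transport headers (the
    # connection 4-tuple), so hashing the tuple is enough to keep a
    # TCP connection pinned to one backend.
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return BACKENDS[digest % len(BACKENDS)]

print(pick_backend("203.0.113.7", 52114, "198.51.100.1", 443))
```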
Layer 7 (L7) Load Balancer
A Layer 7 (L7) load balancer is also known as an application load balancer (ALB). It routes traffic based on the content of the request, making routing decisions at the application layer (HTTP/HTTPS).
Read more: When should you use HTTP or HTTPS?
An ALB can use HTTP headers, cookies, the uniform resource identifier (URI), the SSL session ID, and HTML form data to determine which server should handle a request. This makes it possible to direct requests to different servers based on factors such as user location or device type.
The benefit of using an Application Load Balancer is that it can provide a more efficient distribution of requests across multiple servers.
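For example, a content-based routing rule might look like the sketch below. The pool names and rules are purely illustrative; the point is that only an L7 balancer can see the URI and headers it inspects:

```python
# Hypothetical server pools and routing rules.
POOLS = {
    "api": ["api1", "api2"],
    "mobile": ["m1", "m2"],
    "web": ["web1", "web2"],
}

def route(path: str, headers: dict) -> str:
    # Inspect the request content (URI and headers), which is only
    # visible at Layer 7, and pick a pool accordingly.
    if path.startswith("/api/"):
        return "api"
    if "Mobile" in headers.get("User-Agent", ""):
        return "mobile"
    return "web"

pool = POOLS[route("/api/orders", {"User-Agent": "Mozilla/5.0"})]
```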
Sticky session load balancing uses IP hashing to map requests from a specific client IP to the same server. Subsequent requests from the same user are always sent to the same server for the duration of the user's session.
Why is this helpful? All data related to a user's session is stored on a single server instead of being spread across every server. This keeps session state consistent while reducing latency.
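The core of IP hashing fits in a few lines. This sketch hashes only the client IP, so the same client always lands on the same server (server names are hypothetical):

```python
import hashlib

SERVERS = ["app1", "app2", "app3"]  # hypothetical server pool

def server_for(client_ip: str) -> str:
    # Hash only the client IP so every request from this client
    # maps to the same server while the pool stays unchanged.
    digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return SERVERS[digest % len(SERVERS)]

assert server_for("203.0.113.7") == server_for("203.0.113.7")
```

One design caveat: adding or removing a server changes the modulus and remaps most clients, which is why production systems often reach for consistent hashing instead.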
A reverse proxy server is an intermediary server that sits between client requests and backend servers. It forwards client requests to one or more backend servers and then returns the response to the client.
Here are some benefits of using a reverse proxy:
- Security: backend servers are hidden from clients, shrinking the attack surface.
- Load balancing: requests can be distributed across multiple backend servers.
- Caching and compression: responses can be cached or compressed before being returned to clients.
- SSL termination: encryption and decryption can be offloaded from the backend servers.
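Here is a bare-bones sketch of the forwarding behavior using only Python's standard library. The backend address and port are placeholders, and a real proxy would add error handling, header forwarding, and connection pooling:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

BACKEND = "http://127.0.0.1:8000"  # hypothetical backend address

class Proxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Forward the client's request to the backend, then relay
        # the backend's response to the client.
        with urlopen(BACKEND + self.path) as upstream:
            body = upstream.read()
        self.send_response(upstream.status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), Proxy).serve_forever()
```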
Round-robin load balancing is a popular algorithm for distributing client requests across multiple backend servers. Requests are cyclically assigned to servers, so every server takes its turn in receiving and processing a request.
Here are some variations of the round-robin algorithm:
- Weighted round robin: each server is assigned a weight, and servers with higher weights receive proportionally more requests.
- Dynamic round robin: server weights are adjusted on the fly based on each server's current load.
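Plain and weighted round robin are both short enough to sketch directly. The server names and weights below are illustrative:

```python
import itertools

SERVERS = ["app1", "app2", "app3"]  # hypothetical pool

# Plain round robin: cycle through the pool in order.
rr = itertools.cycle(SERVERS)
print(next(rr), next(rr), next(rr), next(rr))  # app1 app2 app3 app1

# Weighted round robin: repeat each server according to its weight,
# so higher-capacity servers receive proportionally more requests.
WEIGHTS = {"app1": 3, "app2": 1, "app3": 1}
wrr = itertools.cycle([s for s, w in WEIGHTS.items() for _ in range(w)])
```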
When using SSL (Secure Sockets Layer) with load balancing, the load balancer must be configured to handle the decryption and re-encryption of data before it is sent to a backend server. This ensures secure communication between the two endpoints.
When a request comes in, the load balancer must first decrypt it, determine which server should receive it, and then re-encrypt it before sending it on. This process adds latency in exchange for data security.
If multiple SSL certificates are used for different domains or subdomains, the load balancer must be able to select the correct certificate for each request. This could be time-consuming, depending on how many certificates need to be managed.
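Certificate selection is typically driven by the hostname the client sends during the TLS handshake (SNI). A minimal sketch using Python's ssl module might look like this; the hostnames and certificate files are placeholders and would need to exist on disk:

```python
import ssl

# Hypothetical hostname -> (certificate file, key file) mapping.
CERTS = {
    "example.com": ("example_com.crt", "example_com.key"),
    "api.example.com": ("api_example_com.crt", "api_example_com.key"),
}

def make_context(certfile: str, keyfile: str) -> ssl.SSLContext:
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile, keyfile)
    return ctx

contexts = {host: make_context(c, k) for host, (c, k) in CERTS.items()}
default_ctx = contexts["example.com"]

def pick_certificate(ssl_socket, server_name, initial_context):
    # server_name is the hostname from the client's TLS handshake (SNI);
    # swap in the matching certificate before the handshake completes.
    ssl_socket.context = contexts.get(server_name, default_ctx)

default_ctx.set_servername_callback(pick_certificate)
```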
SSL load balancing doesn't improve performance; it adds complexity and latency. That trade-off makes it an important factor to account for when designing a system's architecture.