Load Balancing

Load balancing is one of the essential concepts for building architectures that can handle heavy traffic and remain resilient to failures. It is a set of techniques for distributing a workload across multiple machines in a group.

The use cases for load balancing are increasingly numerous, especially with the advent of cloud computing and decentralized data architectures. These techniques are used in: supercomputers, high-traffic HTTP services, databases requiring permanent access, big data, neural network training, etc.

The collection of processing servers (called real servers) and load balancers is called a “virtual server” (or server farm).

Load balancing uses algorithms to determine which real server will handle each request. These algorithms can operate at different layers of the OSI model:

  • Layer 2 of the OSI model: the data link layer. The balancing algorithm relies solely on MAC addresses. Some solutions using this method can rewrite frames on the fly; it is mostly found in dedicated network equipment.
  • Layer 4 of the OSI model: the transport layer. The balancing algorithm relies on the IP address and port, without inspecting the payload. Example software: HAProxy.
  • Layer 7 of the OSI model: the application layer. The balancing algorithm can use application data (HTTP headers, URLs, cookies, etc.). Example software: Nginx.
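Whatever the layer, the simplest and most common selection algorithm is round-robin: each request goes to the next server in the pool, wrapping around at the end. A minimal sketch (the server addresses are made up for illustration):

```python
from itertools import count

class RoundRobinBalancer:
    """Cycle through the real servers in order, one request at a time."""

    def __init__(self, servers):
        self.servers = servers
        self._counter = count()  # monotonically increasing request counter

    def pick(self):
        # The counter modulo the pool size wraps around the server list.
        return self.servers[next(self._counter) % len(self.servers)]

balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picks = [balancer.pick() for _ in range(4)]
# picks == ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.1"]
```

Real balancers such as HAProxy implement this (and many refinements) in their core; the sketch only shows the selection logic itself.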

Sticky Sessions

Some applications require “sticky sessions”, meaning users must always be served by the same processing server to keep their session files accessible.

There are two ways to achieve this. Either the balancer uses a source-hash algorithm, deriving the target server from a client constant (typically its IP address); or it applies a more traditional algorithm and hands the client a token (such as a cookie) identifying the server that holds the session, so subsequent requests are routed back to the correct server.
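The first method (source hashing) can be sketched in a few lines: hashing a client constant always yields the same value, so the same client always maps to the same real server. The server names here are hypothetical:

```python
import hashlib

def sticky_pick(client_ip, servers):
    """Source-hash balancing: the same client constant (here its IP
    address) always maps to the same real server, keeping its session
    files reachable."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["app-1", "app-2", "app-3"]
first = sticky_pick("203.0.113.7", servers)
# Every later request from 203.0.113.7 lands on that same server.
assert sticky_pick("203.0.113.7", servers) == first
```

The trade-off is that adding or removing a server changes the modulus and reshuffles most clients, which is why production balancers often use consistent hashing instead.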

Some services implement clusters capable of sharing data, so all servers in the pool can access session files and the session issue is pushed lower in the service architecture.

High Availability

The principle of high availability is to implement a “failover”: when a real server is offline and the load balancer fails to contact it, the request is reassigned to another real server.
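A minimal sketch of this failover logic, assuming a `send` callable that raises `ConnectionError` when a server is unreachable (the server names and transport are invented for the example):

```python
def dispatch(request, servers, send):
    """Try the real servers in order; if one cannot be contacted,
    fail over to the next one in the pool."""
    last_error = None
    for server in servers:
        try:
            return send(server, request)
        except ConnectionError as exc:
            last_error = exc  # server offline: try the next real server
    raise last_error  # every server failed

# Simulated transport: the first server is down.
def send(server, request):
    if server == "down-1":
        raise ConnectionError(f"{server} unreachable")
    return f"{server} handled {request}"

result = dispatch("GET /", ["down-1", "up-2"], send)
# result == "up-2 handled GET /"
```

In practice balancers also run periodic health checks so that known-dead servers are removed from the pool before a request ever reaches them.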

Weighting

Weighting consists of assigning each real server a weight to give it more or fewer tasks. The purpose is to have different server models while ensuring each is utilized according to its capacity. This technique is compatible with all load balancing algorithms.
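One simple way to apply weights is a weighted random pick, where each server is chosen with probability proportional to its weight. A sketch with invented server names:

```python
import random

def weighted_pick(weights, rng=random):
    """Pick a server with probability proportional to its weight,
    so a larger machine receives proportionally more tasks."""
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers])[0]

# "big-box" has three times the capacity of "small-box".
weights = {"big-box": 3, "small-box": 1}
counts = {s: 0 for s in weights}
rng = random.Random(42)  # seeded for reproducibility
for _ in range(10_000):
    counts[weighted_pick(weights, rng)] += 1
# big-box receives roughly three times as many picks as small-box
```

Deterministic variants (weighted round-robin) exist as well and are what HAProxy and Nginx use by default; the random version above is just the shortest way to show the idea.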

Some Load Balancing Projects

  • Traefik is a load balancer specifically designed for modern Docker-based architectures. The principle is to make load balancing automatic by simply declaring each container’s purpose.
  • Kong is a project based on Nginx, giving it excellent performance. This project is built for centralized management of a set of APIs (e.g., for a microservices architecture).
  • ProxySQL is a project aimed at managing load balancing and data replication between multiple SQL servers.