A load-balancer distributes incoming traffic across multiple servers. It is useful when the time to route a request is small compared to the time to process that request. By distributing the workload fairly, a load-balancer lets a group of servers respond to more requests than a single server could.

Why should I care?

There comes a point when customer traffic is more than a single server can handle. At that point, the application becomes unresponsive to some or all users.

How does it work?

When the time to route a request is small compared to the time to process that request, a load-balancer spreads the workload fairly across multiple servers. Because each routing decision is cheap, the load-balancer itself does not become the bottleneck, and each server only has to process its share of the traffic. This lets the server cluster respond to more requests than a single server, or a server cluster without a load-balancer, could.
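As a rough illustration of fair distribution, here is a minimal sketch that assumes a round-robin policy, which is only one of several possible strategies and is not named in the text above. The class name `RoundRobinBalancer`, the `simulate` helper, and the server names are hypothetical, chosen for this example only.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend servers in a fixed, rotating order (assumed policy)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def pick(self):
        # The routing decision is cheap: take the next server in the rotation.
        return next(self._rotation)


def simulate(balancer, request_count):
    """Count how many requests each server would receive."""
    counts = {}
    for _ in range(request_count):
        server = balancer.pick()
        counts[server] = counts.get(server, 0) + 1
    return counts


# Hypothetical server names, for illustration only.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
counts = simulate(balancer, 9)
print(counts)  # {'app-1': 3, 'app-2': 3, 'app-3': 3}
```

With nine requests and three servers, each server receives three requests. Real load-balancers may also weigh servers by capacity or current load, which this sketch does not attempt.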

For images, movies, PDFs, and other files that are served without processing (aka "static" content), a CDN is often used to handle increased traffic. Application load-balancers are used when business logic needs to be executed to produce the response (aka "dynamic" content).
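The static versus dynamic split can be sketched as a simple classification by file extension. This is only an illustration under stated assumptions: the extension list, the `route` function, and the `"cdn"` / `"load-balancer"` labels are hypothetical, and real deployments usually make this split through DNS or hostname configuration rather than application code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed set of extensions treated as static content; adjust as needed.
STATIC_EXTENSIONS = {".png", ".jpg", ".gif", ".mp4", ".pdf", ".css", ".js"}


def route(url):
    """Return which tier would serve the URL: the CDN or the load-balancer."""
    path = urlparse(url).path  # drops the query string and fragment
    suffix = PurePosixPath(path).suffix.lower()
    # Static files can be served as-is; everything else needs business logic.
    return "cdn" if suffix in STATIC_EXTENSIONS else "load-balancer"


print(route("https://example.com/docs/manual.pdf?download=1"))  # cdn
print(route("https://example.com/checkout"))                    # load-balancer
```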

Load-balancers are close to CDNs in purpose and often include similar features, such as role-based access control (RBAC).