Load Balancing FAQs and Key Concepts
Server Load Balancers
A load balancer is a device that distributes load among several
machines. As discussed earlier, it has the effect of making several
machines appear as one. There are several components of SLB
devices, which are discussed in detail.
Virtual IP (VIP) is the load-balancing instance where the world points
its browsers to get to a site. A VIP has an IP address, which must be publicly
available to be useable. Usually a TCP or UDP port number is associated
with the VIP, such as TCP port 80 for web traffic. A VIP will have at
least one real server assigned to it, to which it will dispense traffic.
Usually there are multiple real servers, and the VIP will spread traffic among
them using metrics and methods, as described in the "Active-Active
A server is a device running a service that shares the load among other
services. A server typically refers to an HTTP server, although other or even
multiple services would also be relevant. A server has an IP address and
usually a TCP/UDP port associated with it and does not have to be publicly
addressable (depending on the network topology).
While vendors often use the term "group" to indicate several different
concepts, we will refer to it loosely as a group of servers being load
balanced. The term "farm" or "server farm" would also be
applicable to this concept.
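The relationship between a VIP, its real servers, and the group can be sketched as a small data model. This is only an illustration: the class names, field names, and addresses below are assumptions made for the example, not any vendor's configuration format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RealServer:
    """A real server: an IP address and port that need not be public."""
    ip: str
    port: int


@dataclass
class VirtualIP:
    """A VIP: the public address and port that clients connect to."""
    ip: str
    port: int
    group: list[RealServer] = field(default_factory=list)

    def add_server(self, server: RealServer) -> None:
        # Avoid listing the same real server twice in one group.
        if server not in self.group:
            self.group.append(server)


# Illustrative addresses only (documentation and private ranges).
vip = VirtualIP("192.0.2.10", 80)
vip.add_server(RealServer("10.0.0.11", 80))
vip.add_server(RealServer("10.0.0.12", 80))
```

Here the VIP carries the publicly reachable address, while the servers in its group sit on private addresses behind it.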
A user-access level refers to the amount of control a particular user
has when logged into a load balancer. Not only do different vendors refer to
their access levels differently, but most employ very different access-level
methods. The most popular is the Cisco style of user and enable (superuser)
accounts. Another popular method is the Unix method of user-level access.
A read-only access level is one in which no changes
can be made. A read-only user can view settings, configurations, and so on, but
can never make any changes. An account like this might be used to check the
performance stats of a device. Read-only access is also usually the first level
a user logs into before changing to a higher access-level mode.
A superuser is the access level that grants the user
full autonomy over the system. The superuser can add accounts, delete files,
and configure any parameter on the system.
Many products offer additional user levels that
qualify somewhere between the access level of a superuser and a read-only user.
Such an account might allow a user to change SLB parameters, but not system
parameters. Another level might allow configuration of Ethernet port settings,
but nothing else. Vendors typically have unique methods for user-access levels.
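One way to picture these access levels is as sets of permitted actions. The role and action names below are hypothetical, chosen only to mirror the levels described above; real products define their own.

```python
# Hypothetical roles mapped to the actions each may perform.
PERMISSIONS = {
    "read_only": {"view"},
    "slb_admin": {"view", "change_slb"},
    "port_admin": {"view", "change_ports"},
    "superuser": {
        "view", "change_slb", "change_ports",
        "change_system", "manage_accounts",
    },
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the given role may perform the given action."""
    # Unknown roles get no permissions at all.
    return action in PERMISSIONS.get(role, set())


print(is_allowed("read_only", "change_slb"))   # a read-only user cannot change settings
print(is_allowed("slb_admin", "change_slb"))   # an SLB-level account can
```

Sets are used rather than a strict ranking because, as noted above, intermediate levels do not always nest neatly inside one another.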
Redundancy as a concept is simple: if one device should fail, another
will take its place and function, with little or no impact on operations as a
whole. Just about every load-balancing product on the market has this
capability, and certainly all of those featured in this book do.
There are several ways to achieve this functionality. Typically, two
devices are implemented. A protocol is used by one device to check on its
partner's health. In some scenarios, both devices are active and accept
traffic, while in others, only one device is used while the other waits in case
of failure.
In redundancy, there is often an active-standby relationship. One unit,
known as the active unit, takes on some or all of the functions, while another,
the standby, waits to take on these functions. This is also often called the
In certain scenarios, both units can be masters of
some functions and slaves of others, in order to distribute the load. In other
cases, both are masters of all functions, sharing between the two. This is
known as active-active redundancy.
The active-standby redundancy scenario is the easiest to understand and
implement. One device takes the traffic while the other waits in case of
failure (see Figure 2-1). If the
active unit were to fail, the standby device would have some way of detecting
that failure and would take over the traffic (see Figure 2-2).
There are several variations of the active-active scenario. In all
cases, however, both units accept traffic. In the event of one of the devices
failing, the other takes over the failed unit's functions.
In one variation, VIPs are distributed between the two load balancers
to share the incoming traffic. VIP 1 goes to Load Balancer A and VIP 2 to Load
Balancer B (see Figure 2-3).
In another variation, both VIPs answer on both load balancers, with a
protocol circumventing the restriction that two load balancers may not hold
the same IP address (see Figure 2-4).
As in all active-active scenarios, if one load
balancer should fail, the VIP(s) will continue to answer on the remaining one.
The other unit takes over all functions (see Figure 2-5).
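The first active-active variation, with its fail-over behavior, can be sketched as a simple assignment function. The function name and the unit and VIP labels are assumptions for illustration; actual devices negotiate ownership through their redundancy protocol.

```python
def assign_vips(preferred: dict[str, str], healthy: set[str]) -> dict[str, str]:
    """Map each VIP to the unit that should answer for it.

    preferred: VIP name -> unit that normally owns it
    healthy:   units currently known to be up
    """
    if not healthy:
        raise RuntimeError("no load balancer available")
    assignment = {}
    for vip, unit in preferred.items():
        if unit in healthy:
            assignment[vip] = unit
        else:
            # The owner failed: a surviving unit takes over this VIP.
            # sorted() just makes the choice deterministic.
            assignment[vip] = sorted(healthy)[0]
    return assignment


preferred = {"VIP 1": "A", "VIP 2": "B"}
print(assign_vips(preferred, {"A", "B"}))  # normal operation: VIPs are split
print(assign_vips(preferred, {"B"}))       # A has failed: B answers for both
```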
Perhaps the most common
redundancy protocol is the Virtual Router Redundancy Protocol (VRRP). It is an
open standard, and devices claiming VRRP support conform to the specifications
laid out in RFC 2338. Each unit in a
pair sends out packets to see if the other will respond. If the sending unit
does not get a response from its partner, then the unit assumes that its
partner is disabled and initiates taking over its functions, if any.
While it's not necessary to know the inner workings of
the VRRP protocol, some details may come in handy. VRRP runs directly over IP as
protocol number 112 (not over TCP or UDP) and sends packets to the multicast
address 224.0.0.18. These details are useful when
dealing with some types of IP-filtering or firewalling devices.
VRRP requires that the two units are able to communicate with each
other. Should the two units become isolated from one another, each will assume
the other unit is dead and take on "master" status. This circumstance
can cause serious network problems because of IP-address conflicts and other
network issues that occur when two units think they are both the active units
in an active-standby situation.
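The isolation problem can be shown with a deliberately simplified model. This is not the VRRP state machine: the class, the method names, and the three-second dead interval are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    role: str          # "master" or "standby"
    last_heard: float  # time the partner was last heard from

    def on_advertisement(self, now: float) -> None:
        # The partner is alive: remember when we last heard from it.
        self.last_heard = now

    def tick(self, now: float, dead_interval: float = 3.0) -> None:
        # A standby that hears nothing for too long assumes the
        # partner is dead and promotes itself.
        if self.role == "standby" and now - self.last_heard > dead_interval:
            self.role = "master"


a = Unit("A", "master", last_heard=0.0)
b = Unit("B", "standby", last_heard=0.0)

b.on_advertisement(1.0)   # link is up: B hears A
b.tick(2.0)               # B stays standby

# The link between the units is cut, but A itself is still running.
b.tick(5.0)               # B hears nothing and takes over
print(a.role, b.role)     # both units now consider themselves master
```

Nothing in this model lets B distinguish a dead partner from a broken link, which is exactly why isolation leads to two active units.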
There are several proprietary versions of VRRP, each usually ending in
"RP." Two examples are Extreme Networks' Extreme Standby Router
Protocol (ESRP) and Cisco's Hot Standby Router Protocol (HSRP). While these
protocols vary slightly from the standard, they all behave in essentially the
same manner.
Another method for detecting unit failure between a pair of devices is
provided by the fail-over cable. This method uses a proprietary
"heartbeat" checking protocol running over a serial line between a
pair of load balancers.
If this fail-over cable is disconnected, it causes
both units to believe they are the only units available, and each takes on
"master" status. This, as with the VRRP scenario, can cause serious
network problems.
Spanning-Tree Protocol (STP) is a protocol for Layer 2
redundancy that avoids bridging loops. STP sets a priority for a given port,
and when multiple paths exist for traffic, only the highest-priority port is
left active, with the rest placed in a blocking state.
One of the issues that a fail-over scenario presents (the
"little" in little or no impact on network operations, as stated
earlier) is that if a device fails over, all of the active TCP connections are
reset, and TCP sequence number information is lost, which results in a network
error displayed on your browser. Also, if you are employing some form of persistence,
that information will be reset as well (a bad scenario for a web-store type
application). Some vendors have employed a feature known as "stateful
fail-over," which keeps session and persistence information on both the
active and standby unit. If the active unit fails, then the standby unit will
have all of the information, and service will be completely uninterrupted. If
done correctly, the end user will notice nothing.
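The idea behind stateful fail-over can be sketched as a table that is mirrored to the partner whenever it changes. This is an in-process illustration under assumed names; real devices synchronize state over a network or dedicated link, using vendor-specific mechanisms.

```python
class Unit:
    """One load balancer in a pair, holding a session/persistence table."""

    def __init__(self, name: str):
        self.name = name
        self.table: dict[str, str] = {}   # client -> real server
        self.peer: "Unit | None" = None

    def record(self, client: str, server: str) -> None:
        self.table[client] = server
        # Stateful fail-over: mirror every entry to the partner as it is made.
        if self.peer is not None:
            self.peer.table[client] = server


active, standby = Unit("lb-a"), Unit("lb-b")
active.peer = standby
active.record("198.51.100.7", "web1")

# If lb-a fails now, lb-b already knows where this client was going.
print(standby.table)
```

Without the mirroring step, the standby would start with an empty table and every existing session would be lost on fail-over.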
Also referred to as "sticky," persistence is the act of
keeping a specific user's traffic going to the same server that was initially
hit when the site was contacted. While the SLB device may have several machines
to choose from, it will always keep a particular user's traffic going to the
same server. This is especially important in web-store type applications,
where a user fills a shopping cart, and that information may only be stored on
one particular machine. There are several ways to implement persistence, each
with its own advantages and drawbacks.
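As one illustration, persistence can be keyed on the client's source IP address. That choice is an assumption made for this sketch (it is only one of the possible methods), and the class and server names are hypothetical.

```python
import itertools


class StickyBalancer:
    """Send each client to the server it was first given."""

    def __init__(self, servers):
        self._next = itertools.cycle(servers)   # used only for new clients
        self._table = {}                        # client IP -> server

    def pick(self, client_ip: str) -> str:
        # A client seen before goes back to the same server;
        # a new client is given the next server in turn.
        if client_ip not in self._table:
            self._table[client_ip] = next(self._next)
        return self._table[client_ip]


lb = StickyBalancer(["web1", "web2"])
print(lb.pick("198.51.100.7"))   # first visit: assigned a server
print(lb.pick("198.51.100.8"))   # a different client gets the next server
print(lb.pick("198.51.100.7"))   # returning client: same server as before
```

A known drawback of keying on source IP is that many users behind one proxy or NAT address all land on the same server.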
One of the tasks of an SLB device is to recognize when a server or
service is down and take that server out of rotation. Also known as health
checking, this can be performed a number of ways. It can be something as simple
as a ping check, a port check (to see if port 80 is answering), or even a
content check, in which the web server is queried for a specific response. An
SLB device will continuously run these service checks, usually at
Depending on your specific needs, there are several methods of
distributing traffic among a group of servers using a given metric. These are
the mathematical algorithms programmed into the SLB device. They can run on
top of and in conjunction with any of the persistence methods. They are assigned
to individual VIPs.
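Service checks and a distribution method can be sketched together. Round robin is used here purely as an example of a simple method, and the class and function names are assumptions. The port check relies on Python's standard `socket.create_connection`.

```python
import itertools
import socket


def port_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class RoundRobin:
    """Hand out servers in turn, skipping any that are out of rotation."""

    def __init__(self, servers):
        self.servers = list(servers)            # (host, port) tuples
        self.in_rotation = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def run_checks(self, check=port_check) -> None:
        # Re-evaluate every server; those that fail leave rotation,
        # and those that recover come back in.
        self.in_rotation = {s for s in self.servers if check(*s)}

    def pick(self):
        if not self.in_rotation:
            raise RuntimeError("no servers in rotation")
        # Advance the cycle until a server in rotation comes up.
        for server in self._cycle:
            if server in self.in_rotation:
                return server
```

In use, `run_checks()` would be called periodically and `pick()` once per new connection. Passing a different `check` function (for example, one that fetches a page and inspects the response) would turn the port check into a content check.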
The networking infrastructure is composed of the networking components
that give connectivity to the Internet, Extranet, or Intranet for your web
servers. It connects them to the users of your services. This is usually done
one of two ways: in a location controlled by the site or in a location
maintained by a colocation/hosting provider that specializes in hosting other
companies' server infrastructures. Provider infrastructure also includes the
facility that your site is housed in, whether it is your facility or the
provider's.
Whether your site is housed internally or at a colocation provider,
your equipment is usually housed in a type of space called a data center. Data
center is a fairly general term, but it usually refers to an area with high
security, environmental controls (usually air conditioning), nonwater-based
fire suppression (such as Halon or FM200), and UPS power backup systems with
generators on standby, among other things. Money is probably the determining
factor in the level and quality of the data center environment, from Fort Knox
conditions to (literally) someone's basement.
In a leased-line scenario, a site is housed internally with one or more
leased-line connections from one or more providers. It can be as simple as a
DSL line or as complicated as multiple OC3s running full BGP sessions to
multiple providers. The advantage of this is that you have full control and
access over your equipment. In Figure 2-6, we see a common leased-line
scenario, where one location is connected to the Internet via two DS-3 (45
Mbps) lines from two separate providers. The site would probably run BGP on
both lines, which is a protocol that allows redundancy in case a line from one
provider goes down.
Colocation is when you take your servers and equipment to a provider's
location and house all your equipment there. Usually in racks or secure cages,
your equipment sits on the colocation provider's property and is subject
to its security, power, bandwidth, and environmental controls. The colocation
provider typically provides the bandwidth through its own
connectivity/backbone through a "network drop," usually an Ethernet
connection (or multiple connections for redundancy). The advantage to
colocation is that a colocation provider's bandwidth is usually more scalable
than what you would have at your own facility. When you want more bandwidth from
a colocation provider, you just take it, or upgrade your Ethernet lines,
which don't take long to procure (a couple of days, depending on your
provider). If you have leased lines to your own facility, it can take anywhere from
30 days to 6 months to get telco companies to add more bandwidth, such as
T-1 (1.5 Mbps) or DS-3 (45 Mbps). Higher capacity lines usually take
Colocation is the typical route taken nowadays, mostly
because of cost and scalability concerns. It's just easier and cheaper
in most situations to let another company worry about the data center and
connectivity. The provider's network connectivity is usually very complex, involving
peering points, leased lines to other providers, sometimes even its own
backbone. Usually, all a hosted site need concern itself with is the network
drop from the provider.
(Note: content derived from: "Server Load Balancing", Tony Bourke,
O’Reilly & Associates, Inc., 2001, pp. 15-23.)