A lot of people have recently asked how to configure load-balancing in pure Terminal Server environments. This is a topic that Ron and I covered in our Terminal Server book, so I've pulled out that section for this article.
Microsoft Windows Network Load Balancing (“NLB”) is the “free” out-of-the-box software load balancing solution available for Windows 2003-based Terminal Servers. NLB is available with all editions of Windows Server 2003, although your Terminal Servers must be running at least the Enterprise edition of Windows to use the Session Directory. (We'll cover the Session Directory in an upcoming article.)
Network Load Balancing works by assigning a single virtual IP address to multiple servers, any of which can respond. You then assign a DNS name to the virtual IP address. RDP clients connect to this DNS name, and the system responds by automatically connecting the user to the least-busy server.
Under the hood, Network Load Balancing enables all of the configured nodes on a single subnet to detect incoming network traffic for the cluster's virtual IP address. (When using Windows NLB, all servers must be on the same subnet.) On each Terminal Server in the cluster, the Network Load Balancing driver acts as a filter residing between the network adapter driver and the TCP/IP stack, so that each host accepts only its portion of the incoming network traffic.
Windows Network Load Balancing works at the network level by distributing client requests between hosts. Windows NLB is limited to a maximum of 32 hosts in any one cluster.
Also, as its name implies, Windows Network Load Balancing is only able to determine which server is the least-busy based on network load. If one server has failed but is still responding to the network, the NLB system will continue to send users to it.
Advantage of Load Balancing with Windows NLB
- It’s the “free” solution that’s built-in to Windows.
Disadvantages of Load Balancing with Windows NLB
- Load calculations are only based on network load.
- You can’t natively load-balance more than 32 servers.
- All servers must be located on the same subnet.
What if you need to load balance more than 32 Terminal Servers?
One major limitation of Windows Network Load Balancing is that you can only use it to load balance 32 servers. If you need more than 32 servers in your cluster, you must implement one of the following options:
- Move to a third-party hardware (F5, etc.) or software (Citrix, WTS Gateway Pro, etc.) load-balancing solution as described later in this chapter.
- Combine multiple groups of NLB clusters with round robin DNS servers.
Let’s take a closer look at this second option. In this case, your DNS servers should be configured with entries for both of the clusters’ virtual IP addresses in a round robin entry so that clients connect to either one in a one to one ratio. Make sure that each cluster has the same number of servers, or adjust your round robin ratio accordingly.
At this point you may be thinking that a DNS round robin solution could suffice for simple load balancing. Before you go down that path, remember that there are reasons why it’s called DNS round robin and not DNS load balancing.
A server failure in an NLB cluster will be detected by the other servers (through the cluster’s heartbeat packets), and new RDP connections will be distributed only among the remaining Terminal Servers. A DNS round robin scheme, however, will continue to send connections to the server that has failed until a change is manually made to the DNS entry.
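To make the difference concrete, here is a minimal Python sketch that contrasts the two behaviors. It is only an illustration: the server names are made up, and the simple rotation stands in for NLB's real distribution algorithm, which does not literally rotate through hosts.

```python
from itertools import cycle

def dns_round_robin(servers, connections):
    """Hand out servers in strict rotation, with no knowledge of their health."""
    rotation = cycle(servers)
    return [next(rotation) for _ in range(connections)]

def heartbeat_aware(servers, alive, connections):
    """Distribute only among servers that still answer heartbeats."""
    healthy = [s for s in servers if s in alive]
    if not healthy:
        raise ValueError("no healthy servers left in the cluster")
    rotation = cycle(healthy)
    return [next(rotation) for _ in range(connections)]

servers = ["ts1", "ts2", "ts3"]   # hypothetical Terminal Servers
alive = {"ts1", "ts3"}            # ts2 has failed

rr = dns_round_robin(servers, 6)
nlb = heartbeat_aware(servers, alive, 6)

print(rr)   # ['ts1', 'ts2', 'ts3', 'ts1', 'ts2', 'ts3']
print(nlb)  # ['ts1', 'ts3', 'ts1', 'ts3', 'ts1', 'ts3']
```

With round robin, one connection in three still lands on the dead server. The heartbeat-aware version simply stops offering it.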
Configuring Windows Network Load Balancing
This article is not meant to be an exhaustive study of Windows Network Load Balancing. However, we’ll cover some of the Terminal Server-specific items that you probably won’t find in other papers covering NLB.
There are only a few requirements that all servers must meet to use Windows NLB:
- Have at least one network interface configured for Load Balancing.
- Use TCP/IP.
- Be on the same subnet.
- Share a common (virtual) IP address.
In an ideal world, each of your Terminal Servers within the cluster would have two network cards. The first would be used for the “front-end” RDP traffic between clients and server. The second would be used for “back-end” services and data access.
All versions of Windows Server 2003 come with Network Load Balancing installed. To use it, all you have to do is enable it on the network card that you intend to use for RDP connections (Control Panel | Network Connections | Right-click on your network card | Properties | Check the box next to the “Network Load Balancing” option).
Once you enable NLB, you must configure it (Network adapter properties | Highlight “Network Load Balancing” | Click the “Properties” button). There are several configuration options to understand when using NLB in a Terminal Server environment.
The Properties button leads you to a window with three tabs—Cluster Parameters, Host Parameters, and Port Rules.
On the Cluster Parameters tab, you’ll first enter the virtual IP address, subnet mask, and DNS name that your cluster will use. These should be the same on all Terminal Servers in the cluster.
Then you’ll select a cluster operation mode. Windows NLB has the ability to work in two different modes: “unicast” and “multicast.”
Regardless of the mode you choose, NLB creates a new virtual MAC address assigned to the network card that has NLB enabled, and all hosts in the cluster share this virtual MAC. Then, all incoming packets are received by all servers in the cluster, and each server’s NLB drivers are responsible for filtering which packets are for that server and which are not.
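The "everyone receives, one accepts" idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: the SHA-256 hash is a stand-in chosen for illustration and is not the algorithm the NLB driver uses, and the host IDs and client address are invented.

```python
import hashlib

def owner_host(client_ip, client_port, host_ids):
    """Map a connection to exactly one host using a hash every host computes identically."""
    key = f"{client_ip}:{client_port}".encode()
    digest = hashlib.sha256(key).digest()
    ordered = sorted(host_ids)  # all hosts must agree on the ordering
    return ordered[int.from_bytes(digest[:4], "big") % len(ordered)]

def accepts(my_host_id, client_ip, client_port, host_ids):
    """Each host runs this on every packet it sees and drops what is not its own."""
    return owner_host(client_ip, client_port, host_ids) == my_host_id

hosts = [1, 2, 3]                      # hypothetical host IDs
packet = ("192.168.10.25", 50123)      # hypothetical client address and port

decisions = {h: accepts(h, *packet, hosts) for h in hosts}
print(decisions)  # exactly one host reports True
```

Because every host runs the same calculation on the same input, the servers never have to negotiate over who takes a given packet.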
When in unicast mode, NLB replaces the network card’s original MAC address. When in multicast mode, NLB adds the new virtual MAC to the network card, but also keeps the card’s original MAC address.
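If you ever need to recognize the cluster's virtual MAC in a switch table or ARP cache, the sketch below shows how it is conventionally derived from the cluster IP address: a 02-BF prefix in unicast mode and 03-BF in multicast mode, followed by the four octets of the cluster IP. The function name and sample address are our own, and it is worth confirming the result against what your servers actually report.

```python
import ipaddress

def nlb_cluster_mac(cluster_ip, mode="unicast"):
    """Build the conventional NLB virtual MAC for a cluster IP (illustrative helper)."""
    prefixes = {"unicast": 0x02, "multicast": 0x03}
    if mode not in prefixes:
        raise ValueError("mode must be 'unicast' or 'multicast'")
    octets = ipaddress.IPv4Address(cluster_ip).packed  # the four IP octets as bytes
    mac_bytes = bytes([prefixes[mode], 0xBF]) + octets
    return "-".join(f"{b:02X}" for b in mac_bytes)

print(nlb_cluster_mac("10.0.0.100", "unicast"))    # 02-BF-0A-00-00-64
print(nlb_cluster_mac("10.0.0.100", "multicast"))  # 03-BF-0A-00-00-64
```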
Both unicast and multicast modes have benefits and drawbacks. One benefit of unicast mode is that it works out of the box with all routers and switches (since each network card only has one MAC address). The disadvantage is that since all hosts in the cluster share the same MAC and IP address, they cannot communicate with each other via their NLB network cards. A second network card is required for communication between the servers.
Multicast mode does not have the problem that unicast operation does since the servers can communicate with each other via the original addresses of their NLB network cards. However, the fact that each server’s NLB network card operating in multicast mode has two MAC addresses (the original one and the virtual one for the cluster) causes some problems on its own. Most routers reject the ARP replies sent by hosts in the cluster, since the router sees the response to the ARP request that contains a unicast IP address with a multicast MAC address. The router considers this to be invalid and rejects the update to the ARP table. In this case you’ll need to manually configure the ARP entries on the router. (Don’t worry if you’re lost at this point. Just be aware that if you’re using multicast mode, you’ll need to get one of your network infrastructure people involved.)
The bottom line is that you don’t want to use unicast in a Terminal Server environment unless you have two network cards. (That way, you can still connect to a specific Terminal Server if you need to via another adapter and another IP address.) If your servers have only a single network card, then you’ll want to use the multicast mode.
On the Host Parameters tab, the “Host Priority” is a unique number assigned to each server in the cluster. This number (an integer) identifies the node in the cluster and determines the order in which traffic is delivered to the servers by default. Priorities are ordered from lowest to highest, with the lowest-numbered server handling all traffic not otherwise handled by the set of load balancing rules.
The Port Rules tab allows you to configure how load-balancing works within the cluster. By default, a rule is created that equally balances all TCP/IP traffic across all servers. To use NLB for a Terminal Server cluster, you’ll need to change some settings.
First, add a new rule (Port Rules tab | Add button) that will specify how RDP traffic is to be load-balanced. Configure the port range for 3389 to 3389 to ensure that this new rule only applies to RDP traffic. Select the “TCP” option in the protocols area and “Multiple Host” as your filtering mode.
The “Affinity” determines if a specific client’s requests will continue to be routed to a specific server (such as the first server they were connected to) based on the client’s IP address. If you’re using the Session Directory then a specification here is not required or can be set to “none.” If you are not using the Session Directory, set this rule to “single affinity” so that a client will always be serviced by the same server and users can reconnect to their disconnected sessions.
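One way to picture the two affinity settings is by what goes into the decision: with single affinity only the client's IP address counts, while with no affinity the source port counts too. The Python sketch below illustrates that idea. The hash and host names are stand-ins of our own, not the driver's actual algorithm.

```python
import hashlib

def pick_host(client_ip, client_port, hosts, affinity="single"):
    """Choose a host; the affinity setting controls which fields feed the hash."""
    if affinity == "single":
        key = client_ip                      # same client IP always maps to the same host
    elif affinity == "none":
        key = f"{client_ip}:{client_port}"   # each new connection may map elsewhere
    else:
        raise ValueError("affinity must be 'single' or 'none'")
    digest = hashlib.sha256(key.encode()).digest()
    return hosts[int.from_bytes(digest[:4], "big") % len(hosts)]

hosts = ["ts1", "ts2", "ts3"]  # hypothetical Terminal Servers

# A client reconnecting from different source ports under single affinity:
sticky = {pick_host("10.1.1.7", port, hosts, "single") for port in range(50000, 50020)}
print(sticky)  # a set containing one server name
```

This is why single affinity lets a user find a disconnected session again without the Session Directory: as long as the client's IP address and the cluster membership stay the same, the reconnect lands on the same server.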
Finally, the “Load weight” setting determines the share of users/load this server should handle. The cluster algorithm will divide the server’s load weight setting by the total of all the servers’ settings to calculate a load index value for each server, allowing you to route more connections to larger servers.
A simple example is a two-server cluster, the first server having a quad processor configuration and the second having a dual processor configuration. Through load testing, you have determined that the quad can handle exactly twice the number of users as the dual. The dual can be configured with a load weight of 50 while the quad can be configured with a load weight of 100. In this configuration, the quad would receive twice as much traffic as the dual. The default load weight setting is “Equal” and assumes all servers in the cluster can handle an equal amount of load.
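The arithmetic behind that example is easy to check. Here is a small Python sketch of the calculation as described above (each server's weight divided by the total of all weights); the function and server names are ours, for illustration only.

```python
def load_shares(weights):
    """Return each server's fraction of new connections: weight / sum of all weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one server needs a positive load weight")
    return {host: weight / total for host, weight in weights.items()}

shares = load_shares({"dual": 50, "quad": 100})
for host, share in shares.items():
    print(f"{host}: {share:.1%}")
# dual: 33.3%
# quad: 66.7%
```

Note that only the ratio matters. Weights of 1 and 2 would produce the same split as 50 and 100.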
Baseline NLB Configuration
As we discussed earlier, NLB clustering is extremely complex. Nevertheless, you should be able to create a basic configuration for lab testing fairly simply. The following settings will work for almost every environment and allow you to easily configure RDP load balancing:
|Cluster Parameters Tab|
|Cluster IP Address||Common IP shared between all servers|
|Subnet Mask||Common subnet|
|DNS name of cluster||Shared DNS name (should refer to the common cluster IP)|
|Host Parameters Tab|
|Priority/Host ID||Start at 1 and work up as you add servers. Each must be unique.|
|Dedicated IP||IP address of NIC that will accept load balanced requests|
|Subnet Mask||Subnet mask of NIC configured for Load Balancing|
|Port Rules Tab|
|Cluster IP Address||If only using one, leave the default at “All”|
|Port Range||3389 to 3389 (or whatever port you're using for RDP)|
|Protocols||Default of “Both” will work; so will “TCP”|
|Filtering||Multiple Hosts, Affinity set to None. (If you’re not using Session Directory you can set this to “single.”)|
Leave the remaining settings at their default values. (You can also use these settings for load balancing your web servers. Just change the port rule from 3389 to 80.)
Once your cluster is up and running:
- Check that each server’s dedicated IP address is unique and that the cluster IP address is identical for each server in the cluster.
- Verify that any load-balanced applications are installed and configured on all cluster servers. Remember that Windows NLB is not aware of higher-level applications and does not start or stop applications or services on each server.
- Make sure that the dedicated IP address is always listed first (before the cluster IP address) in the Internet Protocol (TCP/IP) Properties dialog box so that responses to connections originating from a host will return to the same host.
- Make sure that both the dedicated IP address and the cluster IP address are static IP addresses. They cannot be DHCP addresses.
- Do not enable Network Load Balancing on a computer that is part of a “real” Microsoft cluster services cluster. Microsoft does not support this configuration.
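Several of these checks are mechanical enough to script against an inventory of your servers. The Python sketch below is one possible approach, assuming you keep the settings in a simple list of dictionaries; the field names and sample addresses are hypothetical, and it does not query the servers themselves.

```python
import ipaddress

def check_cluster(hosts, cluster_ip, netmask):
    """Return a list of problems found in a hand-maintained NLB inventory."""
    problems = []
    if len(hosts) > 32:
        problems.append("more than 32 hosts in one cluster")
    dedicated = [h["dedicated_ip"] for h in hosts]
    if len(set(dedicated)) != len(dedicated):
        problems.append("dedicated IP addresses are not unique")
    priorities = [h["priority"] for h in hosts]
    if len(set(priorities)) != len(priorities):
        problems.append("host priorities are not unique")
    if any(h["cluster_ip"] != cluster_ip for h in hosts):
        problems.append("cluster IP address differs between hosts")
    # All dedicated addresses must sit on the cluster's subnet.
    network = ipaddress.ip_network(f"{cluster_ip}/{netmask}", strict=False)
    for h in hosts:
        if ipaddress.ip_address(h["dedicated_ip"]) not in network:
            problems.append(f"{h['dedicated_ip']} is not on subnet {network}")
    return problems

inventory = [  # hypothetical servers
    {"priority": 1, "dedicated_ip": "10.0.0.11", "cluster_ip": "10.0.0.100"},
    {"priority": 2, "dedicated_ip": "10.0.0.12", "cluster_ip": "10.0.0.100"},
    {"priority": 2, "dedicated_ip": "10.0.1.13", "cluster_ip": "10.0.0.100"},
]

for problem in check_cluster(inventory, "10.0.0.100", "255.255.255.0"):
    print(problem)
# host priorities are not unique
# 10.0.1.13 is not on subnet 10.0.0.0/24
```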
Limitations of Windows Network Load Balancing
Even though it’s “free,” Network Load Balancing has some weaknesses. In addition to the disadvantages listed previously, some people want load-balancing tools to check the health of individual servers or create load indexes based on CPU utilization or the number of active sessions. For this functionality, you’ll need to turn to third-party tools. There are hardware- and software-based solutions for load balancing Windows 2003 Terminal Servers.