Load balancing in a Citrix Presentation Server environment has not fundamentally changed since the days of MetaFrame XP. Today's PS 4.5 environments still have zone data collectors, load evaluators, and load indexes.
As a very, very quick refresher, remember that the load evaluator is a thread in the IMA Service on a Presentation Server that calculates the load index for that server. The load index is an integer value from 0 to 10,000 that objectively represents how busy a server is. A value of 0 equates to no load, while a value of 10,000 represents a full load and will prohibit new user connections from being load-balanced to that server.
Each server continuously sends its current load index to its zone data collector, meaning that the data collector knows the relative load for multiple servers. Then when a connection request comes in, a quick check with the zone data collector will reveal which server is most appropriate for that user to connect to.
This load balancing architecture generally works fine but has one fatal flaw: new incoming requests are always routed to the least-busy server. While this doesn’t seem like a problem at first, consider an environment that has maybe 20 servers each with 80 users. If a new server is brought online (or if a server unexpectedly reboots), that new server will have 0 sessions while the other servers all have 80 sessions. So guess which server the next 80 users will be routed to?
Again, this is not a problem in theory, but in reality, the logon process is “expensive” in terms of CPU and processing power, and in many cases even the fastest servers can only actually process a handful of logons at the same time.
So if this 20-server farm gets 20 new logons at more-or-less the same time, they will all be routed to that new server. Since it’s unlikely that the new server can actually handle 20 logons at once, some of these logons will fail. The users will try to log on again, and they will again be routed to that new server (which at this point has maybe 10 sessions compared to the 80 on the other servers).
Citrix of course is aware of this problem. When the current generation of load balancing technology was released in MetaFrame XP, Citrix attempted to mitigate this with something called the “load bias.” The load bias was a temporary and artificial increase of a server’s load index that happened whenever a server received a new incoming connection.
The logic is this: If we go back to our example of 20 servers with 80 users each, those twenty servers might be humming along just fine, each with a load index of around 8000. (Remember the load index is an integer value from 0 to 10,000 that indicates how relatively busy a server is.) When the new server comes online, it will have a load index of 0. When a new user connects to that new server, the load evaluator on that new server will calculate its new load index and submit that to the zone data collector. The problem is that the whole process of establishing a session and updating the data collector takes time. A minute? Maybe two? Meanwhile, the data collector keeps on sending new user requests to that server (since it sees a load index of 0 versus 8000 on the other 20 servers), and that new server is quickly overrun, creating the black hole effect.
To address this, the zone data collector will temporarily increase a server’s load (via that “load bias”) whenever a new connection request comes in. This temporary increase will be voided whenever the user establishes their session and the server updates the data collector with its real load index.
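The bookkeeping described above can be sketched in a few lines of Python. This is only a toy model, not Citrix's implementation: the `DataCollector` class, its method names, and the flat per-connection bias are all assumptions made for the example.

```python
class DataCollector:
    """Toy model of a zone data collector (hypothetical, for illustration only)."""

    def __init__(self, bias):
        self.bias = bias      # temporary points added per routed connection
        self.reported = {}    # last real load index each server sent in
        self.pending = {}     # temporary bias currently applied per server

    def report_load(self, server, load_index):
        # A real update from the server replaces the old value and voids the bias.
        self.reported[server] = load_index
        self.pending[server] = 0

    def effective_load(self, server):
        # The load the collector routes on, capped at the 10,000 maximum.
        return min(10000, self.reported[server] + self.pending[server])

    def route(self):
        # Send the new connection to the server that currently looks least busy...
        target = min(self.reported, key=self.effective_load)
        # ...and bump its load right away, before the server reports back.
        self.pending[target] += self.bias
        return target


dc = DataCollector(bias=200)
dc.report_load("A", 5000)
dc.report_load("B", 5000)
first = dc.route()          # ties go to the first server registered
second = dc.route()         # A now looks like 5200, so B is chosen
dc.report_load("A", 5100)   # A's real load arrives and replaces the bias
```

The point of the sketch is the ordering: the bias is applied the moment a connection is routed, and it disappears only when the server's own load index arrives.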
The Load Bias in MetaFrame XP
In MetaFrame XP, the load bias was 200 points.
This 200-point temporary increase worked great in situations where you had a bunch of servers that were all more-or-less loaded equally, since it prevented a burst of new users from all going to the same server. But in our new-server-versus-20-full-servers scenario it’s worthless, since the 200-point load bias wouldn’t really have an effect when one server had a load index of 0 versus all the other servers with load indexes of 8,000.
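To put numbers on that, here is a rough simulation of a burst of 20 logons that arrives before any server has reported a fresh load index. The function name `route_burst` and the assumption that the bias stacks once per pending connection are made up for this sketch and are not taken from Citrix documentation.

```python
def route_burst(loads, logons, bias=200):
    """Route a burst of logons to whichever server looks least busy.

    loads -- dict of server name -> last reported load index
    bias  -- temporary points added for each connection routed
    Returns a dict of server name -> number of logons it received.
    """
    effective = dict(loads)                  # collector's view, including bias
    received = {name: 0 for name in loads}
    for _ in range(logons):
        target = min(effective, key=effective.get)
        received[target] += 1
        effective[target] += bias            # stays until the server reports back
    return received


# Twenty evenly loaded servers: the bias spreads the burst out.
even = {f"srv{i:02d}": 8000 for i in range(20)}
spread = route_burst(even, logons=20)

# The same farm plus one freshly booted server at load 0.
with_new = dict(even, new=0)
swamped = route_burst(with_new, logons=20)
```

In the even farm every server gets one logon. In the second case all 20 land on the new server, which under these assumptions would need 40 pending connections (8000 / 200) before it even looked as busy as its neighbors.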
Fortunately even in the MetaFrame XP days you could change this load bias via the registry (HKLM\SOFTWARE\Citrix\IMA\LMS\LoadBias). “Great!,” people thought, “I can just set the load bias to like 1000 and be all set!” Unfortunately if your load evaluators contained the “user count” rule, that registry key was ignored and Citrix used a different formula to calculate the load bias!!?! (This is crazy because pretty much everyone uses that rule.)
The Load Bias in MetaFrame Presentation Server 3.0
In MetaFrame Presentation Server 3.0, Citrix added a new registry value called “ForceRegBias” in the HKEY_LOCAL_MACHINE\SOFTWARE\Citrix\IMA\LMS key that would force the load evaluator to use the load bias you entered into the registry regardless of what rules you used.
While this helped, it was still far from ideal.
The Load Bias in Citrix Presentation Server 4.0
Nothing changed with the release of Presentation Server 4, but Hotfix Rollup Pack 1 (HRP01) for PS 4 introduced another load bias modification that Citrix called “Slow-Start Load Balancing.”
The idea behind slow-start load balancing was that Citrix would officially give logons a really high load bias. How high? It depended on what the current load index of the server was. The data collector would create a load bias that was half the remaining load for logons! The formula is:
Temporary slow-start load bias = Current Load + 1/2 of (10000 - Current Load)
In other words, a server that had a load index of 4000 would have a temporary slow start load index of 7000. The load bias in this case was half of the remaining headroom (or half of 6000, since the server had a load index of 4000 out of a max of 10,000). The load bias (3000) is added to the original load (4000) to come up with the 7000 temporary load index.
Temporary load index = 4000 + 1/2 of (10000 – 4000) = 4000 + 1/2 of 6000 = 7000
The process continued for as many simultaneous logons as there were. If that server with a load index of 4000 got a single logon, the load index increased by half the headroom to 7000. If a second simultaneous logon occurred, that server would see its load index increase again by half of the remaining headroom, taking it up to 8500. A third simultaneous logon would increase it again by half the remaining headroom, bringing it up to 9250, and so on.
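The progression is easy to reproduce. The following is a minimal sketch of the formula as stated above; the function names are hypothetical, and the use of integer division is an assumption, since the article does not say how fractional values are rounded.

```python
MAX_LOAD = 10000  # a load index of 10,000 means "full"


def slow_start_load(current_load):
    """Temporary load index after one logon: current load plus half the headroom."""
    return current_load + (MAX_LOAD - current_load) // 2


def simulate_logons(start_load, count):
    """Return the temporary load index after each of `count` simultaneous logons."""
    loads = []
    load = start_load
    for _ in range(count):
        load = slow_start_load(load)
        loads.append(load)
    return loads


print(simulate_logons(4000, 3))  # [7000, 8500, 9250]
```

Because each step only consumes half of what is left, the temporary load index gets closer and closer to 10,000 with every additional simultaneous logon.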
The key here is that these are just temporary increases. Once the users are fully logged on, the server will recalculate its “real” load index and send that value up to the data collector. The only reason these really high temporary load indexes exist is to prevent one server from getting a ton of new logons and essentially rendering an entire farm unusable.
By the way, you can disable this slow start stuff by setting the key HKEY_LOCAL_MACHINE\Software\Citrix\IMA\LMS\UseILB to 0.
The Load Bias in Citrix Presentation Server 4.5
Citrix Presentation Server 4.5 allows you to further tune the load bias calculation performed by this slow-start behavior. This tuning can be done via a new rule called “load throttling” that you’ll see when configuring a load evaluator in a PS 4.5 farm.
If you enable the load throttling rule, you’ll see a drop-down box called “impact of logons on load.”
Behind-the-scenes, this rule affects the mathematical formula that’s used when calculating this slow-start load bias. Remember how after HRP01 on PS 4, the slow-start load bias will always be one-half of the remaining headroom? Well, these load throttling numbers allow you to change that. Now instead of new connections consuming one-half of the remaining load, you can configure them to be one-third or one-fourth or one-fifth (or even 100%) of the remaining load.
Mathematically, these load throttling values allow you to change the denominator in the slow-start load balancing algorithm. Recall from above that the formula is:
Temporary Slow-start load = Current Load + 1/2 of (10000 - Current Load)
Changing the values of “Logon Impact” allows you to change the bottom part of that “1/2” multiplier into some other value, from 1/1 to 1/5.
The values are as such:
Since these values affect the bottom halves of the fractions, the higher impact choices are lower numbers, meaning they lead to higher slow-start load biases. (By the way, it should now be clear mathematically why the default value is 2, since the default behavior is that the load bias is one-half of the remaining headroom.)
But if you feel that adding one-half the value of the remaining load is too much, you could choose “Medium-High,” which would switch the multiplier to one-third, meaning that each logon would only add one-third of the remaining headroom instead of one-half.
Similarly, a value of “extreme” means that your multiplier would be 1/1, or just “1.” This in effect means that when a user logs on, the temporary load index will always be 10,000, since the algorithm will add the entire remaining value to whatever the current load is. A value of “extreme” means that only one person can log on at a time. (Your server can still support dozens or even hundreds of users. The “extreme” value just means they will connect one-by-one.)
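To compare the settings side by side, the slow-start formula can be written with the denominator as a parameter. This is a sketch under stated assumptions: the function and dictionary names are hypothetical, only the three denominators the text spells out are included (the "default" key is just a label for the default value of 2), and integer division is assumed.

```python
MAX_LOAD = 10000  # a load index of 10,000 means "full"

# Only the denominators described in the text; other choices are omitted.
LOGON_IMPACT = {"extreme": 1, "default": 2, "medium-high": 3}


def throttled_load(current_load, denominator):
    """Temporary load index after one logon, adding 1/denominator of the headroom."""
    return current_load + (MAX_LOAD - current_load) // denominator


# Temporary load index for a server sitting at 4000, per setting.
results = {name: throttled_load(4000, d) for name, d in LOGON_IMPACT.items()}
print(results)  # {'extreme': 10000, 'default': 7000, 'medium-high': 6000}
```

For a server at 4000, one logon pushes the temporary index to 10,000 under “extreme,” to 7000 under the default, and to 6000 under “Medium-High.”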