In the previous section, we discussed server sizing. Server sizing and performance tuning are closely related. The main difference is that server sizing is about choosing the proper hardware, and performance tuning is about making software configuration changes that affect how efficiently the hardware is used. In this section, we'll look at several techniques and resources that you can use to tune your servers, their applications, and the network.
If your server sizing tests showed that you can get 40 users on a MetaFrame server, there might be some configuration tricks that you can use to bump that up to 42, 44 or even 50 users.
It seems like almost every day another performance tweak or registry hack is discovered that can help MetaFrame XP servers run faster. In this book, we've focused on the techniques that you can use to get the best performance out of your MetaFrame XP environment from a design standpoint. If you want to tune your servers even further, there is one place that lists all the registry keys and the settings that can help you. That place is the website www.TweakCitrix.com. This site is run as a "side project" by Rick Dehlinger, a Senior Systems Engineer for Citrix. TweakCitrix.com is home to the famous "MetaFrame Installation and Tuning Tips" document. This document, compiled by Rick, contains hundreds of tuning tips and tricks sent in by Citrix administrators worldwide. TweakCitrix.com also features several message boards containing up-to-the-minute postings about performance tuning MetaFrame XP systems.
Rather than waste the paper required to list those tuning tips (which are constantly being updated anyway), you should go straight to the source at www.TweakCitrix.com.
Another phenomenal way to tune your MetaFrame XP servers is to purchase a utility called "TScale" from KevSoft (www.kevsoft.com). TScale is a small program that runs in the background on a MetaFrame XP server. It constantly monitors how the server uses its virtual memory. Then, in the middle of the night (every night), a batch process runs that applies optimizations based on what it observed that day.
It only takes a few minutes to install and configure TScale, and after a few days you will notice a performance increase of 30% to 40%, which means that you can support 30% to 40% more users. A 30-day evaluation copy of TScale is available, and you should definitely try it out. (Author's Note: I have no vested interest in TScale or KevSoft. I just really like their products, and I think that they work well.)
Lastly, as was mentioned elsewhere in this book, remember that you can always visit http://thethin.net for a lively source (about 100 new messages per day) of thin client topics, including performance tuning techniques.
In addition to the techniques that you can use to tune your server, you can also specify the CPU priority of published applications if you're using Feature Release 1 or 2 and you have MetaFrame XPa or XPe (CMC | Published Application Properties | Application Limits Tab). There are five priority levels that you can set: high, above normal, normal, below normal, and low. Configuring these settings directly affects the application's processes, similar to changing a process's priority with the Task Manager (Task Manager | Processes tab | Right-click on process | Set Priority).
When you configure the CPU priority of a published application, every instance of that application is launched with the configured priority, even if the application is load-balanced across more than one MetaFrame XP server.
Be careful when setting the CPU priority of a published application. Just about all of the processes on a MetaFrame XP server are, by default, set to "normal." If you configure an application to be above that, then you may run into trouble because that application may take processing time away from other critical operating system components.
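Under the hood, these five settings correspond to the standard Windows process priority classes. The sketch below uses the well-known Windows base priority numbers; the exact mapping to Citrix's setting names is our assumption for illustration:

```python
# Approximate mapping of the five CPU priority settings to the base
# priorities of the standard Windows process priority classes.
# Nearly every process on a MetaFrame XP server runs at "normal"
# (base 8), so an "above normal" application (base 10) tends to be
# scheduled ahead of everything else on the box.
PRIORITY_CLASSES = {
    "high": 13,
    "above normal": 10,
    "normal": 8,        # the default for almost everything on the server
    "below normal": 6,
    "low": 4,           # the Windows "idle" priority class
}

def preempts(app_priority, other_priority):
    """True if the first process tends to get CPU time before the second."""
    return PRIORITY_CLASSES[app_priority] > PRIORITY_CLASSES[other_priority]
```

Because nearly everything defaults to "normal," `preempts("above normal", "normal")` is true against every other process on the server, which is exactly the danger described above.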
Realistically, configuring application CPU priority should be done as a sort of "last resort," when other server planning and sizing methods fail to produce adequate performance results.
Advantages to Setting Application CPU Priority
- Allows more important applications to preempt less important ones.
Disadvantages of Setting Application CPU Priority
- Don't expect too much.
- Not a substitute for real server sizing planning.
- Can be very dangerous if hastily planned.
- Requires Feature Release 1 or 2.
- Does not work with MetaFrame XPs.
- Does not work with Terminal Server 4.0.
Tuning the Network
The last MetaFrame XP performance tuning element to look at is the network. Before we explore the details of how you can tune your network, it's important to review some basics of network performance.
Factors Affecting Network Performance
When thinking about network performance, you need to understand the difference between "latency" and "bandwidth." Both are used to describe the speed of a network. Bandwidth describes how much data can pass through the network in a given period of time, such as 10 megabits per second, or 256 kilobits per second. Latency describes the length of time, usually expressed in milliseconds (there are 1000 milliseconds in one second), that it takes for data to get from point A to point B. Bandwidth and latency are independent of each other.
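To make the distinction concrete, here is a minimal sketch (our own model, not anything Citrix-specific) of how long one chunk of data takes to arrive: latency contributes a fixed delay, while bandwidth determines how long the bits take to squeeze through the pipe.

```python
def transfer_time_ms(payload_bits, bandwidth_bps, latency_ms):
    """One-way delivery time: fixed propagation delay (latency) plus
    the time needed to push the bits through the link (bandwidth)."""
    serialization_ms = payload_bits / bandwidth_bps * 1000
    return latency_ms + serialization_ms

# 1 KB of data over a wide-but-slow link vs. a narrow-but-quick one:
wide_slow = transfer_time_ms(8_000, 10_000_000, 200)  # 10Mbps, 200ms latency
narrow_fast = transfer_time_ms(8_000, 256_000, 10)    # 256kbps, 10ms latency
```

For a small payload like this, the narrow-but-quick link wins (about 41ms versus about 201ms), which is exactly why the two numbers have to be considered separately.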
The fact that bandwidth and latency are different from each other is an important concept to understand in MetaFrame environments. (This is so important that we're going to resurrect the "highway" analogy.)
Imagine that each bit of data is an automobile, and the network is a highway. In order for the data to get from point A to point B, an automobile would have to travel from one end of the highway to the other. In high-bandwidth environments, the highway has many lanes, and hundreds of automobiles can be on it at the same time. In low bandwidth environments, the highway is a narrow country road, and only a few automobiles can fit on it at the same time. The width of the highway is like the bandwidth of the network.
Since latency affects how long it takes data to traverse the network, the speed of the automobiles on the highway represents the latency. Even on narrow highways (low bandwidth), there might be people who drive really fast and travel the road quickly (low latency). Conversely, even if you have a large number of automobiles on a wide highway (high bandwidth), the drivers may choose to drive slowly (high latency).
We said that bandwidth and latency are not really related, because the width of the highway does not directly affect how fast you drive on it. However, as you've probably guessed by now, it's possible that bandwidth can affect latency.
Thinking again of our highway example, imagine a low-bandwidth environment that also has low latency (a narrow highway where the people drive really fast). If there are only a few automobiles on the road, they can go fast without problems. However, imagine that the highway begins to fill up with more and more autos. Even though the people want to drive fast, they can't, because the highway is too crowded. The effect is that it will take longer for automobiles to get from one end of the highway to the other. In a sense, the latency has increased from low to high simply because there are too many vehicles on the highway.
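This congestion effect can be sketched with a textbook queueing approximation (our illustration, not from Citrix): as a link's utilization climbs toward 100%, the waiting time grows without bound, even though the link's rated bandwidth never changed.

```python
def delay_multiplier(utilization):
    """Simple M/M/1 queueing approximation: how many times longer data
    waits, relative to an empty link, at a given utilization (0.0-1.0)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return 1 / (1 - utilization)

# A half-full highway doubles everyone's trip time;
# a 90%-full highway makes the trip roughly ten times longer.
```

The model is crude, but it captures why a "fast" network suddenly feels slow once it gets crowded.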
There are several solutions to the overcrowded highway problem. You could:
- Widen the highway.
- Remove some of the vehicles.
- Force people to drive smaller vehicles.
- Install traffic signals and lane control devices to manage the traffic.
As you'll see, the four potential solutions to the overcrowded highway problem are also the four potential solutions to overcrowded networks in MetaFrame XP environments.
Back in the real world, a network connection's bandwidth and latency each affect Citrix ICA session traffic in different ways:
- Bandwidth affects how much data the session can carry. Higher resolution sessions require more bandwidth than lower resolution sessions. Sessions with sound, printing, and client drive mapping all require more bandwidth than sessions without. If a particular session only has 15Kbps of bandwidth available, that session can still have decent performance so long as the resolution, color depth, and other virtual channel options are tuned appropriately.
- Latency is usually more critical in thin client environments. Since latency affects the amount of time it takes for communication to pass between the client and the server, environments with high latency can seem like they have a "delay" from the user's perspective.
For example, imagine a user working in Microsoft Word via an ICA session. When the user presses a key on the ICA client device, the key code is sent across the network to the MetaFrame server. The server processes the keystroke and prepares to display the proper character on the screen. Because this is a Terminal Server, the screen information is redirected back across the network, where it is displayed on the local ICA client device. In order for a character to appear on the screen, data must travel from the client to the server and then from the server back to the client again.
In this situation, if the latency of the network is 10ms, the network delay will only add 20ms (because the data crosses the network twice) to the time between the key press and the character appearing on the user's screen. Since 20ms is only 0.02 seconds, the delay will not be noticeable. However, if the latency was 200ms, the total delay to the user would be 400ms, or almost one-half of a second. This length of delay would be noticeable to the user, and would probably be unacceptable.
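The arithmetic in this example is trivial but worth encoding: each keystroke pays the one-way latency twice.

```python
def keystroke_delay_ms(one_way_latency_ms):
    """Perceived typing delay in an ICA session: the keystroke travels
    up to the server, and the screen update travels back down."""
    return 2 * one_way_latency_ms

print(keystroke_delay_ms(10))   # 20 -- 0.02 seconds, imperceptible
print(keystroke_delay_ms(200))  # 400 -- a noticeable lag on every keystroke
```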
An easy way to get an approximation of the latency in your environment is to perform a TCP/IP ping. You can ping the server from the client or the client from the server; it doesn't matter which way. For example, if your server is called "server01," you would execute the following command from an ICA client workstation:

ping server01

(Be sure that you execute the ping command locally on the workstation, not via an ICA session on the server.) The results will look something like this:
Pinging server01 [10.1.1.42] with 32 bytes of data:
Reply from 10.1.1.42: bytes=32 time=378ms TTL=118
Reply from 10.1.1.42: bytes=32 time=370ms TTL=118
Reply from 10.1.1.42: bytes=32 time=360ms TTL=118
Reply from 10.1.1.42: bytes=32 time=351ms TTL=118
Ping statistics for 10.1.1.42:
Packets: Sent = 4, Received = 4, Lost = 0
Approximate round trip times in milli-seconds:
Minimum = 351ms, Maximum = 378ms, Average = 364ms
Notice that the time= section of each line shows you the approximate latency. This value represents how long the "pinger" waited for a response from the "pingee," meaning that the time shown covers the entire round trip.
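If you want to collect these numbers without eyeballing the output, the time= values are easy to scrape. A minimal sketch, assuming the Windows-style ping output shown above:

```python
import re

# The "Reply from..." lines, as produced by the Windows ping command.
ping_output = """\
Reply from 10.1.1.42: bytes=32 time=378ms TTL=118
Reply from 10.1.1.42: bytes=32 time=370ms TTL=118
Reply from 10.1.1.42: bytes=32 time=360ms TTL=118
Reply from 10.1.1.42: bytes=32 time=351ms TTL=118
"""

# Pull every round-trip time (in milliseconds) out of the text.
round_trips = [int(ms) for ms in re.findall(r"time=(\d+)ms", ping_output)]

print(min(round_trips), max(round_trips), sum(round_trips) // len(round_trips))
# 351 378 364  (matching the statistics that ping prints itself)
```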
The above scenario with the 364ms latency could have occurred in a dial-up environment with 28kbps of bandwidth or in a frame-relay environment with 512kbps. In either situation, the performance would not be as good as in an environment with lower latency.
Resolving Network Performance Issues
Once you've determined whether your network performance issues are bandwidth-related or latency-related, you can begin to address them.
If your network suffers from a lack of bandwidth:
- See what type of traffic you can remove from the network. This is like removing extra automobiles in our highway example.
- Make the ICA sessions as "small" as possible. For example, your current ICA sessions might be 24-bit color, with audio and port mapping enabled. Monitoring the network might show that your ICA sessions are consuming 40kbps. By switching to 16-bit color and disabling audio and port mapping, you might be able to get your ICA sessions down to 20kbps. This is like convincing everyone to drive smaller cars in our highway example.
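The payoff of shrinking sessions is easy to quantify. A rough sketch, using the 512kbps frame-relay link mentioned earlier (real ICA traffic is bursty, so treat this as a ceiling, not a promise):

```python
def max_concurrent_sessions(link_kbps, per_session_kbps):
    """Upper bound on the simultaneous ICA sessions a link can carry,
    assuming each session consumes a steady per-session bandwidth."""
    return link_kbps // per_session_kbps

# Trimming each session from 40kbps to 20kbps roughly doubles capacity:
print(max_concurrent_sessions(512, 40))  # 12
print(max_concurrent_sessions(512, 20))  # 25
```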
- Install a hardware device to monitor and control application and ICA bandwidth. This is like adding a traffic cop and traffic signals in our highway example.
Two of the most popular bandwidth management devices are Packeteer's PacketShaper (www.packeteer.com) and Sitara's QoSWorks (www.sitaranetworks.com). Both are physical hardware devices that sit between your network and the WAN router, as shown in Figure 6.4.
Figure 6.4: Third party bandwidth-management device usage
These devices allow you to analyze and capture current traffic usage (which is when you'll discover that 75% of your WAN traffic is web surfing).
You can then give Citrix ICA traffic priority over other types of traffic. You can even configure these devices to give different priorities to different IP addresses. You can also guarantee certain amounts of bandwidth to ICA (or any protocol).
These third-party devices are similar to Cisco Quality of Service (QoS) devices, except that the Sitara and Packeteer devices are "Layer 7" routers and can differentiate ICA traffic from other types of traffic.
If you determine that your network performance issues are due to high latency, there are also some steps that you can take to help improve performance:
- Enable Citrix "SpeedScreen" technology. This technology uses different kinds of caching and character generation methods to make user sessions appear faster in high-latency environments. Full details on SpeedScreen are covered in Chapter 10.
If you have high latency, don't forget (as shown in our highway example) that freeing up some bandwidth might also have the positive effect of lowering your overall latency.