Fileserving in Windows environments is usually of critical importance. After all, if you can't reach your files or have to wait five minutes every time you browse a share, the heat starts to build up in the IT department.
File serving is more than just saving a file to your home directory. I wrote a two-part article on MSTerminalServices.org on file serving and Terminal Server environments. I suggest you read that article (Part 1 and Part 2) first to get a feel for the proper context of this article.
One of the main reasons I wrote that article is that fileserving can easily become a bottleneck if not configured properly, especially in Terminal Server environments.
To solve these performance problems, you sometimes have to tune the fileserver (lanmanserver) and the “fileserver-client” (lanmanworkstation). However, this isn’t for the faint of heart and can cause huge problems if you do it wrong. Unfortunately, documentation on these tuning parameters is rather scarce.
So in this article, I’ll try to explain what the important parameters are, what they do, and how they relate to each other. Once you know this, you'll be able to tune your fileserving environments yourself.
Before we jump into this, please note that there are also a great many optimizations that you can do in the "Terminal Server to Terminal Server Client" hemisphere. Although the basic fileserving principles also apply in that area, this article is not meant to help you perform those optimizations. Also, there is a lot of additional tweaking you can do in other parts of the (Terminal Server) registry. I've purposely left these optimizations out because I wanted this article to focus on the performance of the fileserving components only.
This article was written assuming you’re running Windows 2000 (SP4+) or Windows Server 2003, Service Pack 1.
Before we get down and dirty, we need to take a look at the core components that the Windows file serving environment is made of. File serving in Windows is a classic example of a Client-Server mechanism. All you have to do to become a file server is check the box “file and printer sharing for Microsoft networks” in the network connection properties box. On the other end, all you have to do to “use” this file server is check the box “client for Microsoft Networks”.
Both the server and the client components are run as a service. Not surprisingly, this is the "Server" service for the server component and the "Workstation" service for the client components.
Settings for these services are stored in the Windows registry. For the Server service this location is: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver. The corresponding location for the Workstation service is HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanworkstation.
By default, the lanmanserver registry key on a freshly installed Windows Server 2003 Service Pack 1 machine looks like this:
The first sub key we encounter is the “AutotunedParameters” key. If you look at it you'll see that it's empty. Don’t worry, it’s supposed to be empty. This registry key exists because, by default, the Server service is auto-tuning. This means that every time the system boots, the server takes a look at the hardware configuration and incorporates any changes into the configuration of the Server service. The hardware changes it monitors are the amount of memory and the number of processors. There’s even a formula for it:
MB = Megabytes RAM on the server
SMBServerPerfSetting = .5 if "Minimize Memory Used"
SMBServerPerfSetting = 1 if "Balance"
SMBServerPerfSetting = 2 if "Maximize Throughput for File Sharing"
OSVersion = 2 if running NTServer with > 16MB RAM
#Processors = the number of processors in the system
In the formula you’ll notice that it refers to the SMBServerPerfSetting. This brings us to the only GUI “tool” native to Windows that you can use to “tune” the Server service. When you select the properties of a connection and then select the properties of “file and printer sharing for Microsoft networks”, you should end up with a window like this:
This is where you can optimize the Server service for a specific role. Consequently, if you do the numbers you’ll see that the higher you set the SMBServerPerfSetting, the higher the outcome of the formula is. But what is this number? Good question.
This number represents the value Windows will use for the MaxWorkItems, an important value in tuning the Server service. However, MaxWorkItems is just one of the parameters you can set to tune your fileserver. Let’s take a look at the (registry) values.
Before we discuss the relevant parameters for tuning the Server service, note that you should create them in the parameters sub key of the lanmanserver registry key. Let’s take a look at the most important parameters:
As said, MaxWorkItems isn’t the only parameter for tuning the Server service, but it is one of the most important. What does it mean? MaxWorkItems specifies the maximum number of work items (receive buffers for file requests) that the Server service is permitted to allocate at one time. If this limit is reached, you get really bad performance out of your file server, or even no performance at all (new connections to the file server are denied).
Possible values: 1-65535
InitWorkItems configures the number of work items allocated to a processor during startup. (The "initial" work items.) If this number is too low, it can significantly reduce performance or even deny new connections to the file server.
Possible values: 1-512
MaxMpxCt permits a fileserver to provide a suggested maximum number of simultaneous outstanding client requests to itself. During negotiation of the Server Message Block dialect on the initial connection, this value is passed to the client's redirector, where the limit on outstanding requests is enforced. A higher value can increase server performance, but requires more server work items (MaxWorkItems).
Possible values: 1-65535
MaxWorkItems and MaxMpxCt Relationship
The value for MaxWorkItems must be at least four times as large as that for MaxMpxCt. For example, if MaxMpxCt has a value of 4096, then MaxWorkItems needs to have a value of at least 16384.
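To make the four-to-one rule concrete, here is a minimal sketch of a sanity check you could run on a pair of planned values before touching the registry. The function name is made up for illustration; only the ratio itself comes from the rule above.

```python
def check_work_item_ratio(max_work_items, max_mpx_ct):
    """Check the rule that MaxWorkItems must be at least 4 x MaxMpxCt.

    Returns (ok, required), where required is the smallest MaxWorkItems
    value that satisfies the rule for the given MaxMpxCt.
    """
    required = 4 * max_mpx_ct
    return max_work_items >= required, required


# The example from the text: MaxMpxCt of 4096 needs MaxWorkItems >= 16384.
print(check_work_item_ratio(16384, 4096))
# A pair that breaks the rule.
print(check_work_item_ratio(8192, 4096))
```

If the check fails, either raise MaxWorkItems to the returned minimum or lower MaxMpxCt.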
MaxRawWorkItems determines the maximum number of raw receive buffers that a server can allocate. If this limit is reached, server performance may be degraded.
Possible values: 1-512
MaxFreeConnections controls the maximum number of free connection blocks that are maintained for each endpoint.
Possible values: 2–4096
MinFreeConnections specifies the minimum number of free connection blocks to be maintained for each endpoint. This setting can sometimes dramatically improve performance.
Possible values: 0–256
SizReqBuf specifies the size of a WorkItem (see MaxWorkItems) that the Server service uses. Small WorkItems use less memory, but large WorkItems can improve performance.
When running applications that use a lot of copy or move functions to a remote server (profiles anyone?), the speed at which this function completes is determined by network speed (of course) and by the SMB size. By increasing this WorkItems size, you will allow the server to complete its file copies faster. This will increase the performance of the application making the copy/move calls.
For computers running Windows Server 2003 and with 512 MB or more of physical memory, the default size of the request buffers is 16,644 bytes; for servers with less physical memory, the default size is 4,356 bytes. If this entry is present in the registry, its value overrides the default value.
Possible values: 1-65535
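If you prefer to stage your changes in a .reg file rather than editing the registry by hand, the sketch below shows one way to generate such a file. It assumes the parameters are stored as REG_DWORD values, and the numbers in the example are placeholders for illustration, not recommendations. The helper name is hypothetical.

```python
SERVER_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"
    r"\lanmanserver\parameters"
)


def render_reg_file(key_path, dword_values):
    """Return .reg file text that sets each name to a REG_DWORD value."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dword_values.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name}={value} does not fit in a DWORD")
        # .reg files write DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


# Placeholder values only; they respect MaxWorkItems >= 4 x MaxMpxCt.
example = render_reg_file(
    SERVER_PARAMETERS_KEY,
    {"MaxWorkItems": 8192, "MaxMpxCt": 2048},
)
print(example)
```

Review the generated text before importing it, and test it on a non-production server first.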
Unlike lanmanserver, the lanmanworkstation key has no “AutotunedParameters” sub key. However, there is a "parameters" sub key in which we can do some tuning. It is not uncommon (especially in Terminal Server environments) to have to tune the Workstation service to alleviate performance problems. This is due to the nature of Terminal Servers. My article on MSTerminalServices.org discusses this in detail, but in a nutshell it’s like this: the Workstation service was (and is) designed for a single workstation (like your desktop). However, a Terminal Server can easily host 50 desktop sessions, and unless you manually intervene, this server is most likely still configured just as your desktop would be. It’s pretty obvious that this could lead to some performance problems.
Although there aren’t as many important parameters as in lanmanserver, there are still a few parameters of the Workstation service you should definitely know about.
MaxCmds specifies the maximum number of network control blocks that the redirector can reserve. The value of this entry coincides with the number of execution threads that can be outstanding simultaneously. Increase this value to improve network throughput, especially if you are running applications that perform more than 15 operations simultaneously.
MaxCmds actually serves the same purpose as MaxMpxCt on the Fileserver. Not surprisingly, these two parameters have a special relationship. It’s like this: whenever an SMB session is set up (i.e. a shared file is accessed), the SMB session is negotiated. During this negotiation the Fileserver passes the value of MaxMpxCt down to the client (a Terminal Server, for example). The client then compares this value to its own MaxCmds value. The lower of the two values is then used as the maximum number of outstanding client requests to the File server.
Possible values: 1-65535
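The negotiation between MaxMpxCt and MaxCmds boils down to taking the lower of the two values. The sketch below just restates that rule in code; the function name and the sample numbers are illustrative assumptions, not values read from a real system.

```python
def negotiated_outstanding_requests(server_max_mpx_ct, client_max_cmds):
    """Return the effective limit on outstanding requests for one SMB session.

    The server suggests MaxMpxCt, the client compares it with its own
    MaxCmds, and the lower of the two wins.
    """
    return min(server_max_mpx_ct, client_max_cmds)


# Raising only the server side does not help if the client stays low.
print(negotiated_outstanding_requests(server_max_mpx_ct=2048, client_max_cmds=50))
# Raising only the client side does not help if the server stays low.
print(negotiated_outstanding_requests(server_max_mpx_ct=50, client_max_cmds=2048))
```

The practical consequence is that you have to tune both ends: the File Server's MaxMpxCt and the Terminal Server's MaxCmds.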
MaxThreads specifies how many threads are allowed to run at once. (Each thread allows one outstanding operation.) By increasing this you can increase the amount of simultaneous work. Each extra execution thread will take 1 Kbyte of additional NonPaged pool memory.
Possible values: 1-255
MaxCollectionCount specifies the amount of data that must be present in the buffer of the redirector to trigger a write operation. If the amount of data in the buffer meets or exceeds this value, then it is written immediately. Otherwise, it is retained in the buffer until either more data is added or the value of the CollectionTime entry expires.
Possible values: 1-65535
Problems stemming from poor fileserving performance can sometimes be a bit tricky to pinpoint. One way to make sure is by using good ol’ perfmon. The problem with interpreting perfmon counters is that you can never know what the "right" value is unless you have baselined your environment properly. So what to monitor and how to interpret those values is entirely up to you. However, there are some counters you can monitor that I can give some basic tips on. Configure perfmon to monitor the following counters:
You can measure the disk queue length on the Terminal Server as well, but you should start at the file server. If the queue length is more than one for a sustained period of time, then your disks are hyperventilating. Give them some air: up your I/O throughput. Look on the software side: are you paging a lot (that'll kill your I/O throughput right there), or is your system disk heavily fragmented? Or on the hardware side: buy faster disks (15K SCSI) or upgrade your RAID controller.
This is something you should only measure on your Terminal Server(s). You should monitor the "current commands" in the Redirector object. If the value is higher than 20 during sustained periods of time then you could have a bottleneck.
Server Work Queues
The Server Work Queues object should be monitored on the File server. You should monitor the "Available WorkItems" counter. Sustained values smaller than ten mean that the File server is running out of work items. When it does, performance really starts to plummet. Make sure this doesn't happen by upping the MinFreeWorkItems value.
In this object there's a counter called "Work Item Shortages". This value represents the number of times no work items were available or couldn't be allocated to service a file request. Obviously if you see any other value than zero, you need to start worrying. Upping the InitWorkItems or MaxWorkItems could help out here.
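The rules of thumb above can be written down as a small checker that you feed with samples exported from perfmon. This is only a sketch: the function names are made up, and "sustained" is assumed here to mean a run of at least three consecutive samples, which you should adapt to your own sampling interval and baseline.

```python
def sustained(samples, predicate, min_run):
    """True if predicate holds for at least min_run consecutive samples."""
    run = 0
    for value in samples:
        run = run + 1 if predicate(value) else 0
        if run >= min_run:
            return True
    return False


def review_counters(disk_queue, current_commands, available_work_items,
                    work_item_shortages, min_run=3):
    """Return a list of findings based on the thresholds from the text."""
    findings = []
    if sustained(disk_queue, lambda v: v > 1, min_run):
        findings.append("disk queue length above 1")
    if sustained(current_commands, lambda v: v > 20, min_run):
        findings.append("Redirector current commands above 20")
    if sustained(available_work_items, lambda v: v < 10, min_run):
        findings.append("available work items below 10")
    # Any non-zero shortage count is worth investigating.
    if any(v != 0 for v in work_item_shortages):
        findings.append("work item shortages reported")
    return findings


# Made-up sample data for illustration.
findings = review_counters(
    disk_queue=[0, 2, 3, 2, 1],
    current_commands=[5, 25, 8, 30, 4],
    available_work_items=[40, 9, 8, 7, 30],
    work_item_shortages=[0, 0, 0, 0, 0],
)
print(findings)
```

Short spikes (like the two isolated peaks in current commands above) are deliberately ignored; only runs that last at least min_run samples are reported.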
Again, there's so much more you can monitor but interpreting the results depends heavily on your environment. Just browsing the performance monitor objects I mentioned and playing around with it will give you a lot more information.
So what do I set these registry values to? Unfortunately it’s not that simple. For starters, it depends on your specific environment. Also, an unfortunate side effect of almost every one of these registry values is that when they are increased, they consume more kernel memory. Seeing as (the lack of) kernel memory is often a bottleneck in scaling up in Terminal Server environments, you should be very careful in adjusting or creating the registry settings we discussed. If you are not careful, you could end up having more performance problems than you started out with. You need to know why.
Tuning LanManServer and LanManWorkstation in the registry requires the use of more NonPaged Pool memory. This can be a real issue on the File Server (LanManServer). Let me briefly explain where NonPaged Pool memory fits into the whole “2GB-Kernel-Memory-Bottleneck-Of-32-Bit-Windows”.
When you have a 32 bit operating system, this means that you have a 32 bit address space. That translates to 4GB of addressable memory space (2 to the power of 32). This 4GB is evenly shared between the user mode and kernel mode. User mode is the memory space that applications run in and kernel mode is used by the system for everything else. This 2 GB kernel mode memory is divided into several areas, amongst which is the NonPaged Pool. Because there’s only 2GB to share, the NonPaged pool gets configured with a maximum size at boot time. By default this is 256 MB. This 256 MB is the area in which you should perform your (LanManServer) tuning.
Why should you worry about this 256 MB? Well, because if the NonPaged pool is depleted then your system usually becomes unresponsive until some NonPaged pool becomes available again. So how does this apply to LanManServer tuning? Well, if you tune LanManServer in such a way that it can allocate more memory than the NonPaged pool has available, and you indeed use up ALL of that allocated memory, then you have effectively pushed Windows beyond its limits.
So what should you do? A safe way of doing it is to tune LanManServer in such a way that it can never deplete the NonPaged pool. The amount of memory LanManServer allocates in the NonPaged Pool is primarily determined by two parameters: MaxWorkItems and SizReqBuf. So if you set MaxWorkItems to 8192 and SizReqBuf to 16644 (default) (which in reality is 20480 due to tracking overhead) the amount of memory LanManServer will allocate is (8192 x 20480 bytes) 160 MB. This fits nicely into the 256 MB NonPaged Pool area.
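The arithmetic is easy to reproduce. The sketch below derives the per-WorkItem overhead from the two figures given above (20480 minus 16644) and assumes the default 256 MB NonPaged Pool; the constant and function names are mine.

```python
# Per-WorkItem tracking overhead implied by the figures above: 3836 bytes.
WORK_ITEM_OVERHEAD = 20480 - 16644
# Default NonPaged Pool maximum mentioned in the text.
NONPAGED_POOL_BYTES = 256 * 1024 * 1024


def work_item_pool_bytes(max_work_items, siz_req_buf):
    """Worst-case NonPaged Pool bytes if every WorkItem is allocated."""
    return max_work_items * (siz_req_buf + WORK_ITEM_OVERHEAD)


used = work_item_pool_bytes(8192, 16644)
print(used / (1024 * 1024), "MB of", NONPAGED_POOL_BYTES / (1024 * 1024), "MB")
print("fits:", used <= NONPAGED_POOL_BYTES)
```

Keep in mind that LanManServer is not the only consumer of the NonPaged Pool, so fitting under 256 MB on paper still leaves the rest of the kernel to account for.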
So it basically boils down to this: If you have more than 512 MB of memory in your Terminal Server (which is every Terminal Server on earth and adjacent planets) then SizReqBuf starts out at 16644. This allows you to push the MaxWorkItems value to 8192. If you try higher numbers to create more of these similar sized WorkItems AND your File Server tries to use them, you run the chance of running out of NonPaged Pool.
There is, however, a decent chance that having 8192 WorkItems does not cut it for you. This is when the bits start to hit the fan. If you’re in that rather sad place, you really have only three options, with the third being the safest choice:
- Try making the size of the WorkItems smaller (through the SizReqBuf parameter) so you can safely set higher MaxWorkItems values. For example: if you lower SizReqBuf to 8322 (plus an overhead of 3836 makes 12158 bytes), this would allow you to have roughly 13800 WorkItems (160 MB / 12158 bytes).
- You could even try to up the MaxWorkItems and SizReqBuf values further, with the risk that you run out of NonPaged Pool. Now, you should also know that you can tune the Kernel Mode memory in such a way that more memory is allocated to the NonPaged Pool. The downside, of course, is that this memory is taken away from other parts of the Kernel Mode memory. I wouldn’t go there if I were you (unless you’re up there with the likes of Mark Russinovich).
- Make sure that fewer Work Items are demanded from the File Server. This is a topic on its own, but quick suggestions are: limit folder redirection (especially Application Data) and/or distribute File Services (for example, put home directories on one Fileserver and redirected folders on another).
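For the first option, you can turn the calculation around and ask how many WorkItems fit in a given memory budget for a given SizReqBuf. The sketch below assumes the same 3836 bytes of overhead per WorkItem used in the example, a 160 MB budget, and the 65535 upper bound listed for MaxWorkItems; the function name is hypothetical.

```python
MAX_WORK_ITEMS_LIMIT = 65535  # upper bound of the documented value range


def max_work_items_for_budget(budget_bytes, siz_req_buf, overhead=3836):
    """Largest MaxWorkItems whose worst-case allocation fits in the budget."""
    per_item = siz_req_buf + overhead
    return min(budget_bytes // per_item, MAX_WORK_ITEMS_LIMIT)


budget = 160 * 1024 * 1024  # the 160 MB used in the text

print(max_work_items_for_budget(budget, 16644))  # default SizReqBuf
print(max_work_items_for_budget(budget, 8322))   # halved SizReqBuf
```

Halving SizReqBuf does not double the number of WorkItems, because the per-item overhead stays the same. Remember to recheck the four-to-one rule against MaxMpxCt after changing MaxWorkItems.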
I have provided two .adm templates, one for lanmanserver and one for lanmanworkstation. I've separated these purposely because the lanmanserver adm template should be applied to your File Server and the lanmanworkstation adm template should be applied to your Terminal Servers.
Thincomputing.net Lanmanserver Tuning.zip
This template (download) contains all of the Lanmanserver parameters discussed in this article. When you import the ADM template and enable the policy, it will set the following parameters to the maximum recommended, safe values:
These optimizations should be applied to your FILESERVER, not your Terminal Server. I've included the possibility to 'undo' the optimizations made by the template. You can do this by selecting -Undo Lanmanserver Optimizations- and REBOOTING.
Thincomputing.net Lanmanworkstation Tuning.zip
This template (download) contains all of the discussed Lanmanworkstation parameters in this article. When you import the ADM template and enable the policy, it will set the following parameters to the maximum recommended, safe values:
These optimizations should be applied to your TERMINAL SERVER, not your File server. I've included the possibility to 'undo' the optimizations made by the template. You can do this by selecting -Undo Lanmanworkstation Optimizations- and REBOOTING.
Although some settings have been improved in Windows 2000 and even more in Windows Server 2003, I must say that I’m a bit disappointed that file serving problems like I discussed in the article are still quite common in Terminal Server environments. These problems have been around just as long as Terminal Server has, and one would think these problems would at least be a lot less common, but maybe that’s just my point of view.
Microsoft has finally published an excellent article that discusses these issues in very good detail. It isn’t just about Terminal Server environments, but it is still the best article Microsoft has ever written on the subject. Bookmark KB317249.
I hope that this document has provided you with enough knowledge to combat file serving performance problems.
There is, however, a good chance that these problems with the file serving components of Windows will relatively soon be a thing of the past, or at least be a lot less common. Windows Vista and Longhorn server will incorporate many changes, amongst which are major revisions in the file serving components. For example, Vista comes with a major revision of the SMB protocol identified as SMB 2.0. The current protocol (SMB 1.0) was built to support file-serving solutions a couple of decades ago and was based on the assumptions existing then.
These are some of the key enhancements in SMB 2.0:
- SMB 2.0 supports an arbitrary, extensible way of compounding operations to reduce round trips. This makes the protocol less chatty as compared to SMB 1.0. Chattiness of SMB 1.0 has often been a major pain point.
- SMB 2.0 supports much larger buffer sizes compared to SMB 1.0.
- SMB 2.0 greatly grows the restrictive constants in the protocol, so we never need to worry about the protocol itself being the limiting factor for scalability. This includes increasing the number of concurrent open file handles on the server, and the number of shares that a server can share out, among other things.
- SMB 2.0 supports durable handles that can withstand short network glitches.
All these enhancements in SMB 2.0 will result in better performance and security over LAN and WAN.
Sounds good huh? I’ll believe it when I see it, but the file serving future looks bright!