
Terminal Server Freeze on W2003, in the Performance Tuning / Server Sizing forum on BrianMadden.com


Not Ranked
Points 145
David Johnson Posted: Mon, Jan 28 2008 7:29 AM
We are running WTS on Windows Server 2003 and are seeing regular freezes (about 10 per day per terminal server) in which the entire terminal server locks up for 10-20 seconds: all WTS sessions are frozen, and Windows Server itself is locked up to the extent that perfmon stops recording counters. Once the freeze is over, the server and the user sessions continue normally.

There seem to be a number of known issues around this area like this one:

http://support.microsoft.com/kb/317357/EN-US/

For many of these types of issue the workaround seems to be "Enhance system disk-write performance, and turn on write-back caching."

Problem is the hardware we are using does not support write-back caching.

Does anyone out there recognise this problem? If we cannot enable write-back caching, is our hardware never going to run WTS properly?

thanks,
Dave

Not Ranked
Points 145
I've found out more about what's happening, but I'm at a loss to explain it. Maybe someone out there knows.

What we see during the glitch is that the System process flushes the registry to disk for all active users on the server. This takes 5-10 seconds of elapsed time, during which the server is frozen.

It's as if Terminal Server decided to flush the registry for all users at the same time. Is this what it does, and if so, why? I was expecting the lazy hive writer to pick up each update from the LOG file every 5 seconds, not to have the LOG file and DATA files updated straight after each other.


17411218 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411220 16:09:06 System 4 WriteFile C:\$LogFile
17411221 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411222 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411223 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411224 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411225 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411226 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411227 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411230 16:09:06 System 4 WriteFile C:\$LogFile
17411232 16:09:06 System 4 WriteFile C:\$LogFile
17411233 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG
17411235 16:09:06 System 4 WriteFile C:\$LogFile
17411237 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411238 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411239 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411240 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411243 16:09:06 System 4 WriteFile C:\$LogFile
17411245 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411248 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411256 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411259 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411262 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411267 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411272 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411278 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411281 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT
17411287 16:09:06 System 4 WriteFile C:\Documents and Settings\LocalService\ntuser.dat.LOG
17411289 16:09:06 System 4 WriteFile C:\$LogFile
17411290 16:09:06 System 4 WriteFile C:\Documents and Settings\LocalService\ntuser.dat.LOG
17411291 16:09:06 System 4 WriteFile C:\Documents and Settings\LocalService\ntuser.dat.LOG
17411293 16:09:06 System 4 WriteFile C:\Documents and Settings\LocalService\ntuser.dat.LOG
17411294 16:09:06 System 4 WriteFile C:\Documents and Settings\LocalService\ntuser.dat.LOG
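A trace like the one above is easier to read once the writes are grouped by second and by file. The following is a minimal Python sketch, assuming each line has the six whitespace-separated fields seen in the dump (sequence number, time, process, PID, operation, path) and that the process name contains no spaces. The function name `summarize_writes` is made up for illustration, and the sample lines are copied from the dump.

```python
from collections import Counter

def summarize_writes(lines):
    """Count WriteFile operations per (time, path) in Filemon-style lines."""
    counts = Counter()
    for line in lines:
        # Split into at most 6 fields so that paths containing spaces stay whole.
        parts = line.strip().split(None, 5)
        if len(parts) < 6 or parts[4] != "WriteFile":
            continue  # skip blank lines and other operations
        _seq, time, _proc, _pid, _op, path = parts
        counts[(time, path)] += 1
    return counts

# A few lines taken from the trace above.
sample = [
    r"17411218 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\ntuser.dat.LOG",
    r"17411220 16:09:06 System 4 WriteFile C:\$LogFile",
    r"17411237 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT",
    r"17411238 16:09:06 System 4 WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT",
]

summary = summarize_writes(sample)
for (time, path), n in summary.most_common():
    print(time, n, path)
```

Run over a full capture, a summary like this would show which profile hives are written in the same second as a freeze, and how many writes each one receives.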
Not Ranked
Points 30

I've seen the same problem with one of our customers. I found out it was related to the background policy update procedure in Windows. When I executed gpupdate, the Terminal Servers froze immediately. The workaround was to disable background updating of policies. Unfortunately, I didn't find a more constructive solution.

Hope this helps.

Top 150 Contributor
Points 1,151

Hi,

Are you running UPHClean? Also, have you tried disabling your antivirus software to see if it makes a difference?

Not Ranked
Points 30

Mike Smith:

Hi,

Are you running UPHClean? Also, have you tried disabling your antivirus software to see if it makes a difference?

Yeah, this client is running UPHClean. I think we've tried almost everything, including disabling anti-virus.

The freezes happened quite regularly, but not at a fixed interval. The same is true of background group policy updates, which run at a set interval plus a random offset. Combining this knowledge with the output from filemon, which showed writes to the registry files at about the same time as the freeze, led me to believe that the group policy update process had something to do with it. I was able to verify this by running gpupdate manually and checking whether the Terminal Server froze. It did.
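The "set interval with a random offset" pattern can be checked against a list of observed freeze times. Below is a rough Python sketch, assuming the Windows defaults for background refresh on member servers (every 90 minutes plus a random offset of 0 to 30 minutes); a policy may override both values. The function names and the freeze timestamps are hypothetical and only illustrate the check.

```python
from datetime import datetime, timedelta

def refresh_window(last_refresh, interval_min=90, max_offset_min=30):
    """Return the (earliest, latest) time the next background refresh could start."""
    earliest = last_refresh + timedelta(minutes=interval_min)
    latest = earliest + timedelta(minutes=max_offset_min)
    return earliest, latest

def consistent_with_refresh(freezes, **window_args):
    """For each consecutive pair of freezes, report whether the gap fits the window."""
    results = []
    for prev, cur in zip(freezes, freezes[1:]):
        earliest, latest = refresh_window(prev, **window_args)
        results.append(earliest <= cur <= latest)
    return results

# Hypothetical freeze times, not measurements from this thread.
freezes = [
    datetime(2008, 1, 28, 9, 0),
    datetime(2008, 1, 28, 10, 45),  # 105 minutes later: inside the 90-120 minute window
    datetime(2008, 1, 28, 11, 0),   # 15 minutes later: outside the window
]
print(consistent_with_refresh(freezes))
```

A run of gaps that fall inside the window would support the group policy explanation; gaps that fall well outside it would point to another trigger, or to several triggers overlapping.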

Disabling background group policy processing on the Terminal Servers was a workaround our client could live with, because they do not change policies very often and are migrating to Windows 2008 very soon.

Top 10 Contributor
Points 24,510

Hi,

It sounds like you need to do some SMB tuning. The easy way is to use the Total Control Templates (TCT) from Login Consultants (you need to register in order to download):

http://www.loginconsultants.com/index.php?option=com_docman&task=cat_view&gid=20&Itemid=149

For background information, have a look at these articles:

http://www.brianmadden.com/blogs/guestbloggers/archive/2007/02/19/updated-lanmanserver-and-lanmanworkstation-tuning.aspx

http://www.loginconsultants.com/index.php?option=com_content&task=view&id=121&Itemid=107

And review this post for some good information too:

http://www.brianmadden.com/forums/t/18432.aspx

Alan Osborne

President (MCSE, CCNA, VCP, CCA)

VCIT Consulting - Citrix/Terminal Services Remote Desktop Solutions for SMB

VCIT website My Blog

Not Ranked
Points 30

Hi Alan,

Thanks for replying. As it was some time ago that I encountered the problem at hand, unfortunately I can't remember the specifics. What I do know is that at first we thought it had something to do with SMB performance. I've read the article on brianmadden.com before and I think it's very useful. So we applied the theory it describes by measuring the applicable performance counters, such as server work items. After analysing the results we were led to believe the problem had nothing to do with SMB performance, because the counters did not show any values that pointed to problems in the SMB area.

Therefore we searched further and arrived at the apparent relation with the group policy update process.

Why do you think the group policy update process relates to a problem with SMB performance, i.e. why do you think SMB tuning will help solve the specific problem mentioned with gpupdate? I'm curious, because I fail to see the relation (in this case) and it could be important for solving this problem.

Kind regards,

Lex de Visser

Top 10 Contributor
Points 24,510

Actually, it wasn't me that suggested a GP issue - someone else.

Have you looked at Disk Queue and Processor Queue length counters in perfmon yet? What about paging counters?

A RAID controller with WB caching is ALWAYS preferred. If you identify the disk sub-system as the bottleneck, you might want to consider getting a new disk controller (if possible).
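Since the original post notes that perfmon stops recording counters during a freeze, gaps between sample timestamps in a counter log are one way to pin down when each freeze happened before looking at queue lengths around it. A minimal Python sketch, assuming the sample times have already been read from the log into `datetime` objects; the function name, the 1-second sample interval, and the tolerance factor are all illustrative assumptions.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, sample_interval_s=1.0, tolerance=3.0):
    """Return (before, after) pairs where consecutive samples are further apart
    than tolerance * sample_interval_s seconds."""
    limit = timedelta(seconds=sample_interval_s * tolerance)
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > limit]

# Hypothetical sample times: one sample per second, with a 12-second hole.
base = datetime(2008, 1, 28, 16, 9, 0)
samples = [base + timedelta(seconds=s) for s in (0, 1, 2, 14, 15)]

for before, after in find_gaps(samples):
    print("no samples between", before.time(), "and", after.time())
```

The freeze windows found this way can then be lined up against the disk queue, processor queue, and paging counters recorded just before and after each gap.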


Top 10 Contributor
Points 24,510

I just noticed that the log dump you provided is strange. All of the "WriteFile C:\Documents and Settings\NetworkService\NTUSER.DAT" operations are for the NetworkService profile. I wonder if some service that runs under that account is causing the problem. Another possibility is that the user hive for that account is corrupt or AV software is interfering.

 

