Hi,
We are encountering some strange problems with a XenApp 5.0 farm that runs on Windows Server 2008 SP2. The farm consists of 2 XenApp 5.0 servers plus a single server that hosts the Web Interface. Underneath the OS we run VMware ESX 3.5 Update 4. Usually everything runs, but the performance is very bad. The XenApp servers have 2 vCPUs and 4 GB of memory. We use roaming profiles with folder redirection for some folders.
We publish the desktop to our users, so we installed all applications on the 2 servers and use NTFS permissions to control application access.
The problems:
- Performance is not OK. When users type in Word (Office 2007) or other applications, the characters sometimes appear with noticeable latency; it does not happen all the time.
- One server becomes unresponsive. Applications that are already started run very slowly. New applications started from the Start Menu won't appear. After some minutes the server resumes and all the applications appear. It looks like the server is sleeping for a while. When that happens the System log shows the following errors:
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the RasMan service.
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the NlaSvc service.
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the CryptSvc service.
10010 - The server {AAC1009F-AB33-48F9-9A21-7F5B88426A2E} did not register with DCOM within the required timeout.
Kind regards,
Onno van den Berg
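For anyone who wants to check whether their own servers show the same pattern, here is a minimal sketch of how the 7011 entries quoted above could be pulled out of a System log that has been exported to plain text. The function name, the sample text, and the idea of working from a text export are assumptions for illustration; only the message wording comes from the post above.

```python
import re
from collections import Counter

# Matches the event 7011 message text quoted in the post above.
# "mill?iseconds" accepts both the correct spelling and "miliseconds".
# Assumption: the service name is a single token without spaces.
TIMEOUT_RE = re.compile(
    r"A timeout \((\d+) mill?iseconds\) was reached while waiting for a "
    r"transaction response from the (\S+) service"
)


def services_timing_out(log_text):
    """Count how often each service appears in 7011-style timeout messages."""
    return Counter(match.group(2) for match in TIMEOUT_RE.finditer(log_text))


# Sample text built from the events quoted in the post (hypothetical export format).
sample_log = """
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the RasMan service.
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the NlaSvc service.
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the CryptSvc service.
10010 - The server {AAC1009F-AB33-48F9-9A21-7F5B88426A2E} did not register with DCOM within the required timeout.
"""

if __name__ == "__main__":
    for service, count in services_timing_out(sample_log).items():
        print(service, count)
```

The 10010 DCOM entry is deliberately ignored here, since it has a different message format. Counting per service makes it easier to compare servers and to see whether NlaSvc is among the services timing out.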
I'm in the same exact boat as the others so I'm grabbing a paddle and hoping to get my hands on the private patch. I just need a case number to reference so I can try to get the same patch that has worked for others.
Hi Toby, if you are after KB976674 I would hold off on that. It made my life hell and Microsoft basically confirmed that they buggered that one up. Getting a crash dump for this issue has been a real challenge given the state the servers get themselves into, but I finally managed to get one yesterday and uploaded it to MS last night.
I'm not sure about you but for me the frequency of this issue increases with load so the best I can manage at the moment is to spread the user load as best as possible and be on hand to bounce the box when it does happen.
I'll post any updates.
Does anyone else have any updates or progress? How is the "for testing purposes only" hotfix going?
Hi Danny, The private ("for testing purposes only") hotfix that I received has been working great for quite a while now. From what I've heard from my MS Support contact, it looks like the fix may be publicly released this or next week. I have an email out to him to further check on the status. I'll post more once I hear back.
Jamie
Jamie, thanks for the reply. I'm really looking to get my hands on the private fix sooner rather than later, as our users are starting to complain more about the issue.
Do you have a case number I could reference when I call in? I would like to think Microsoft would release a public fix this week or next, but in typical Microsoft fashion it won't be until sometime in January.
All-
I received a notification that the hotfix will be ready next week. I should be receiving an update on 12/08.
I now have a v2 release of the 976674 Private Hotfix which I'll be deploying on one of my servers tonight. I'll let y'all know how it goes.
If this testing goes well, I've been told it will be released publicly within 7 days. Fingers crossed!
Good Luck !!!
I do not know if I have mentioned this online, but I can pretty much tell when my server is going to freeze: it is when I have a misbehaving application whose process I cannot kill.
Most of the time it is an Outlook process that is left open. I will take over the user's session and Outlook will not be running, yet I cannot end the Outlook process.
I have seen this with Excel as well, and QB Enterprise really had a hard time when I removed the QB database manager from the startup. This, I believe, is a good test. If you have QB Enterprise, take out the database manager, and try to open a file, it will hang, and if anyone then tries to log off, the server will freeze up.
Has anyone else noticed similar behaviors?
Apparently, we are having more luck than most, as KB976674 appears to be the answer to our problems. We have been running the "for testing purposes only" version of it in production for weeks without issue. Also, we have been running the official release on less than 5% of our production servers with 50% more users for 2 days without issue (historically, the problem was reproducible in less than 12 hours without the hotfix). It sounds like our issue is as described by jeauxk - "a deadlock between the Cache and SMB module". This hotfix resolved the following symptoms:
(1) NLASvc service timeouts resulting in cascading service failures.
(2) Terminal Server otherwise unresponsive.
(3) New logins to Terminal Server stuck at the Welcome screen.
(4) Unable to kill processes from Terminal Server sessions.
(5) Terminal Server unable to access SMB shares, yet the Terminal Server's own SMB shares (\\TSserver\c$) are accessible from other systems.
From the ongoing posts it would appear that there is at least one other root cause that results in symptoms similar to ours (NLASvc service timeouts) yet is not resolved by KB976674. It also sounds like there is a "for testing purposes only" hotfix in circulation, with an official version hopefully soon to be released, to address this particular root cause.
If anyone has the time or energy, let's see if we can't come up with some specifics to differentiate between the two issues (with similar symptoms).
So far so good. I had no issues yesterday with 976674 v2 and have installed it on another 3 servers. Today is going well too. No server hangs yet, and we are giving the servers a bit of a hammering to stress test them. All going well, I'll install it across all remaining servers tonight. It's looking promising for a public release shortly.
Our issue specifically can be described as RDBSS (the Redirected Drive Buffering Subsystem) having a deadlock with the memory manager. This is preceded by 7011 events in the System log (starting with NLASvc) and causes the server to become completely unresponsive (to the point where we couldn't even force a blue screen using the attached keyboard). RPC is still working, as we can still connect to shares, remotely view the event log, etc.
I have a couple of other support calls in as well, the most significant one being with Citrix, where the IMAAdvanceService terminates CSRSS (which results in the server blue screening), but that's a different animal.
Server 2008 Terminal Services, in my opinion, was clearly never ready to play this game.
For me KB976674 v2 also seems to do the trick. I installed it on our test server on Sunday, put it on one production server on Monday and on another three production servers on Tuesday. All ran without problems. As of today all production servers run with KB976674 v2.
Regards,
Stephan
Danny or Stephan,
Would you post the file versions of rdbss.sys and ntoskrnl.exe after having installed KB976674 v2? I assume I have KB976674 v1, which appears to be working for me and will be deployed to all our production servers this weekend. After installing KB976674 (v1?), mine are both version 6.0.6002.22263.
Hi David
Both files are version 6.0.6002.22276. Creation date for ntoskrnl.exe is Nov. 27, rdbss.sys is Nov. 26.
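To compare the two builds mentioned above without eyeballing the numbers, here is a small sketch that parses the dotted version strings and compares them numerically. The mapping of 6.0.6002.22263 to "v1" and 6.0.6002.22276 to "v2" is taken only from the posts in this thread (and David was not certain his copy was v1), so treat it as an assumption; the function names are made up for illustration.

```python
def parse_version(text):
    """Turn a dotted file version such as '6.0.6002.22263' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Builds reported in this thread for rdbss.sys / ntoskrnl.exe.
# Assumption: the first corresponds to KB976674 v1 and the second to v2.
V1_BUILD = parse_version("6.0.6002.22263")
V2_BUILD = parse_version("6.0.6002.22276")


def hotfix_level(file_version):
    """Classify a file version relative to the two builds reported in the thread."""
    version = parse_version(file_version)
    if version >= V2_BUILD:
        return "at or above the reported v2 build"
    if version >= V1_BUILD:
        return "at or above the reported v1 build, below v2"
    return "below the reported v1 build"


if __name__ == "__main__":
    for reported in ("6.0.6002.22263", "6.0.6002.22276"):
        print(reported, "->", hotfix_level(reported))
```

Comparing tuples of integers avoids the trap of comparing the strings directly, which would sort "6.0.6002.9999" above "6.0.6002.22263".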
Could you send the fix or an MS incident number that I could reference to get a copy of the hotfix? I seem to be having the same issue.
Getting KB976674 v1 from MS was no problem, but when I asked for v2 I got the following answer:
##########################################
Problem:
-------------------------------------------------------------------
Request for hotfix :
976674 The computer stops responding when you access some shared files from a computer that is running Windows Server 2008 or Windows Vista
Solution:
Hotfix still in beta testing and there is no available version for sending.
Release date is scheduled to be end of January 2010 if no further problems are being encountered during testing.
So I have to sit and wait?
Michael
Send me a personal message and I will send you the files.