XenApp 5.0 Farm running on Windows Server 2008 becomes unresponsive, in the Citrix XenApp / Presentation Server forum on BrianMadden.com

Onno van den Berg posted on Fri, Jun 19 2009 6:06 AM

Hi,

We are encountering some strange problems with a XenApp 5.0 farm that runs on Windows Server 2008 SP2. The farm consists of 2 XenApp 5.0 servers plus a single server that hosts the Web Interface. Underneath the OS we run VMware ESX 3.5 Update 4. Usually everything runs OK, but the performance is very bad. The XenApp servers have 2 vCPUs and 4 GB of memory. We use roaming profiles with folder redirection for some folders.

We publish the desktop to our users, so we installed all applications on the 2 servers and use NTFS permissions for application access.

The problems:
- Performance is not OK. When users type in Word (Office 2007) or other applications, the characters appear with some latency. It does not always happen, only sometimes.

- One server becomes unresponsive. Applications that are already started run very slowly. New applications started from the Start Menu won't appear. After some minutes the server carries on and all applications appear. It looks like the server is sleeping for a while. When that happens the System log shows the following errors:

7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the RasMan service.
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the NlaSvc service.
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the CryptSvc service.
10010 - The server {AAC1009F-AB33-48F9-9A21-7F5B88426A2E} did not register with DCOM within the required timeout.
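
A minimal sketch, assuming the System log has been exported as plain text with one event per line in the "ID - message" shape quoted above. It tallies which services hit the 7011 timeout and which CLSIDs hit the 10010 DCOM timeout, which can help show whether the stalls always involve the same services. The sample list and the summarize helper are illustrative only and are not part of any Citrix or Microsoft tool.

```python
import re
from collections import Counter

# Sample lines in the shape quoted in the post; a real log export would replace this list.
SAMPLE_EVENTS = [
    "7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the RasMan service.",
    "7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the NlaSvc service.",
    "7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the CryptSvc service.",
    "10010 - The server {AAC1009F-AB33-48F9-9A21-7F5B88426A2E} did not register with DCOM within the required timeout.",
]

# Accepts both "milliseconds" and the "miliseconds" spelling that appears in hand-typed copies.
TIMEOUT_RE = re.compile(r"^7011 - A timeout \((\d+) mill?iseconds\) .* from the (\S+) service\.$")
DCOM_RE = re.compile(r"^10010 - The server (\{[0-9A-Fa-f-]{36}\}) did not register with DCOM")


def summarize(lines):
    """Return (Counter of services that timed out, Counter of DCOM CLSIDs that timed out)."""
    services, clsids = Counter(), Counter()
    for line in lines:
        line = line.strip()
        match = TIMEOUT_RE.match(line)
        if match:
            services[match.group(2)] += 1
            continue
        match = DCOM_RE.match(line)
        if match:
            # GUIDs are case-insensitive, so normalize before counting.
            clsids[match.group(1).upper()] += 1
    return services, clsids


if __name__ == "__main__":
    services, clsids = summarize(SAMPLE_EVENTS)
    for name, count in services.most_common():
        print(f"service timeout: {name} x{count}")
    for guid, count in clsids.most_common():
        print(f"DCOM timeout: {guid} x{count}")
```

Run against a full day of exported events, the same counts make it easier to compare a hung server with a healthy one.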

Kind regards,

Onno van den Berg


All Replies

Mowens replied on Fri, Jun 19 2009 8:09 AM

How does task manager look?

Alan Osborne replied on Fri, Jun 19 2009 2:13 PM

Hi,

There are too many tweaks and best practices for running Citrix VMs to go over here, so instead I'll point you to some good references on the topic:

http://virtualfuture.info/2009/03/citrix-xenapp-on-vmware-esx-1-or-2-vcpu/

http://knmi.wordpress.com/best-practices-for-deploying-citrix-on-esx/

http://virtualfuture.info/2008/07/citrix-on-vi3x-recommendations/

http://virtualfuture.info/2008/10/more-xenapp-45-on-vmware-recommendations/

http://viops.vmware.com/home/docs/DOC-1226;jsessionid=F48BFEB860FA403B1BEEFBE828963B5F

Critical points:

- Don't P2V. Also, if you migrated from VMware Server to ESX, make sure the default setting of having debugging enabled has been turned OFF. Look under Options -> General -> Debugging and Statistics - it should be set to "Run normally". Leaving debugging on can cause brutal performance problems.

- Performance issues can almost always be traced to excessive hardware interrupts within the VM, which cause tons of context switches. You can confirm this using Performance Monitor. You should eliminate as much unneeded virtual hardware within the VM as you can. Disconnect or eliminate virtual floppy and CD-ROM drives, serial ports, etc. Also, go into the BIOS settings of each XenApp VM and disable anything not needed in there as well. Things like Legacy Diskette A:, local bus IDE adapter channels, I/O device config for serial/parallel/floppy controllers, etc

- If your ESX hosts support nested page tables TURN THAT FEATURE ON IN THE VMs - this one change alone will make a huge difference to performance. The latest quad core Intel and AMD processors provide nested page table support. You need to tweak the setting in the VMX file directly - http://communities.vmware.com/docs/DOC-9150

- If you have lots of cores, use dual vCPU VMs; otherwise use single vCPU VMs. Also, make sure you have the correct HAL! You will see major performance problems if you don't.

- Disable hyperthreading on the ESX hosts

- Don't use processor affinity

- Disable sound if you don't need it

- Despite advice elsewhere, use the memory balloon driver. I haven't had issues with it and it helps to conserve RAM

- Despite advice elsewhere, do not disable page sharing

- Make sure you have the latest VMware Tools version installed, and disable the Shared Folders option during installation

- Disable client drive mapping if you don't need it

- Configure real-time virus scanners to scan only modified files or disable real-time scanning altogether

- Use the Login Consultants TCT templates to properly tune kernel memory, SMB, etc

- Within the W2K3 OS, disable kernel paging (http://support.microsoft.com/?kbid=184419). Good article on other tweaks - http://www.redbooks.ibm.com/redpapers/pdfs/redp3943.pdf

Lots more info if you Google for xenapp virtual machine best practices

Alan Osborne

President (MCSE, CCNA, VCP, CCA)

VCIT Consulting - Citrix/Terminal Services Remote Desktop Solutions for SMB


Not Ranked
Points 965

Thanks for the replies. Those URLs are known to us, and we tweaked the environment accordingly. But I'm unable to find any relevant information about XenApp 5.0 running on Windows 2008 in combination with VMware. Right now we are trying to find the bottleneck that causes the performance issues. We installed a new XenApp server in the farm with a separate published desktop. Users connected to that server do not have roaming profiles and folder redirection enabled. Furthermore, we have some clues that the SAN could be the bottleneck in the whole environment. Or could it be the combination of Windows 2008, XenApp 5.0 and Office 2007 that causes the performance issues?

We already resolved some issues with the SAN infrastructure. We saw high peaks on the DAVG and GAVG counters while monitoring the ESX hosts using esxtop. So we changed the storage paths, and the high peaks on those counters are gone. Now we see some peaks on DAVG/RD and GAVG/RD. Anyway, it's very hard to find the problem owner.
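
For reading esxtop latency numbers like the ones mentioned above, here is a rough sketch in Python. It relies on esxtop's relationship GAVG = DAVG + KAVG (guest-observed latency is device latency plus time spent in the VMkernel). The 25 ms threshold, the Sample type and the sample values are assumptions chosen for illustration, not measurements from this farm.

```python
from dataclasses import dataclass

# Assumed rule-of-thumb threshold in milliseconds; tune it for your own storage.
LATENCY_THRESHOLD_MS = 25.0


@dataclass
class Sample:
    device: str      # hypothetical device or LUN label
    davg_ms: float   # latency attributed to the storage device/array
    gavg_ms: float   # latency as seen by the guest OS


def classify(sample, threshold_ms=LATENCY_THRESHOLD_MS):
    """Say where the latency of one sample mostly comes from."""
    # KAVG (time in the VMkernel) is the part of GAVG not explained by DAVG.
    kavg_ms = max(sample.gavg_ms - sample.davg_ms, 0.0)
    if sample.gavg_ms < threshold_ms:
        return "ok"
    if sample.davg_ms >= kavg_ms:
        return "storage"   # array, fabric or path side
    return "host"          # queuing inside the ESX host


if __name__ == "__main__":
    # Made-up samples, only to show the three outcomes.
    for s in [Sample("lun0", 40.0, 42.0), Sample("lun1", 3.0, 45.0), Sample("lun2", 2.0, 3.0)]:
        print(s.device, classify(s))
```

A peak that classifies as "storage" points at the SAN or its paths, while "host" points at queuing on the ESX side, which narrows down who owns the problem.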

 

 

 

Not Ranked
Points 965

Things are very strange even with 5 users connected to one of the XenApp servers with roaming profiles and folder redirection enabled. They encounter problems like latency within Word: one user typed 3 lines and Word showed just one word, and after a couple of seconds the rest came through. Other complaints are about maximizing a window, which causes latency even though most of the thin clients are on the local LAN. Sometimes they even ask the IT department to disconnect their session because nothing happens anymore and their session hangs. CPU utilization at that time is between 25% and 35%.

I tried to log on to the server using a local user account, because of the locked-down group policy that is active on the OU where the XenApp server resides. Nothing happens: I just get a black background and that's it. Other users are still active, and I'm able to log on using a different domain-based user account.

On the test server without roaming profiles and folder redirection we also encounter some slight performance problems. For example, when a user scrolls through an Excel document, the scroll bar keeps moving after the user has already stopped scrolling. And on that server there are only 3 users active.

We also have a single Windows 2003 server active with XenApp 4.5 and Office 2003, which runs fine with 15 users on it. In both cases the event log doesn't show weird error messages.

Top 10 Contributor
Points 24,600

You're going to have to run some perfmon counters for at least a day to identify which subsystem is the culprit. In other words, narrow down the problem and post back your findings. If you search these forums for the words performance and perfmon, you'll find plenty of help on what to monitor and how to interpret the results.

These articles will get you started:

http://www.brianmadden.com/blogs/terminal_services_for_microsoft_windows_server_2003_advanced_technical_design_guide/pages/monitoring-your-terminal-servers.aspx

http://www.msterminalservices.org/articles/Windows-Performance-Monitor-Baseline-Terminal-Server-Part1.html
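
Once a day of perfmon data exists, a small script can give a first-pass summary. The sketch below assumes the log has been converted to CSV (for example with relog), so that the first column holds the timestamp and each further column holds one counter, with blank cells where a sample is missing. The server name XA01 and the two counters in the sample are hypothetical placeholders.

```python
import csv
import io

# Hypothetical three-sample excerpt of a perfmon CSV export.
SAMPLE_CSV = r'''"(PDH-CSV 4.0)","\\XA01\Processor(_Total)\% Processor Time","\\XA01\PhysicalDisk(_Total)\Avg. Disk Queue Length"
"06/22/2009 09:00:00.000","10.0","0.5"
"06/22/2009 09:00:15.000","30.0","4.0"
"06/22/2009 09:00:30.000"," ","1.5"
'''


def summarize_counters(csv_text):
    """Return {counter name: {"avg": ..., "max": ...}} for every counter column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    values = {name: [] for name in header[1:]}  # skip the timestamp column
    for row in reader:
        for name, cell in zip(header[1:], row[1:]):
            try:
                values[name].append(float(cell))
            except ValueError:
                continue  # blank or non-numeric sample
    return {
        name: {"avg": sum(v) / len(v), "max": max(v)}
        for name, v in values.items()
        if v
    }


if __name__ == "__main__":
    for name, stats in summarize_counters(SAMPLE_CSV).items():
        print(f"{name}: avg={stats['avg']:.2f} max={stats['max']:.2f}")
```

Averages hide short stalls, so the maximum per counter is reported as well. Which counters to log is covered by the articles linked above.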

Alan Osborne

President (MCSE, CCNA, VCP, CCA)

VCIT Consulting - Citrix/Terminal Services Remote Desktop Solutions for SMB


Not Ranked
Points 965

Thanks for the tips. I found a post on the Citrix site, http://forums.citrix.com/thread.jspa?threadID=97619&tstart=15, where they are encountering the same problems as we are.

Almost all of the posts we found about tweaking and tuning the OS are for Windows Server 2003. I'm very interested in the experience of other admins with Windows Server 2008 as a Citrix server. We don't see any strange behaviour when we monitor the machines with perfmon.

rudyt replied on Thu, Jun 25 2009 10:28 AM

Hi,

We have exactly the same problem with Terminal Server 2008. The events that you mentioned also appear in our event log. The server becomes unresponsive, and if you try to log in, you'll see a screen that stays on "Welcome".

I wonder if you've found a solution for this?


7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the NlaSvc service.
7011 - A timeout (30000 milliseconds) was reached while waiting for a transaction response from the CryptSvc service.
10010 - The server {AAC1009F-AB33-48F9-9A21-7F5B88426A2E} did not register with DCOM within the required timeout.

more information:

We've noticed it often starts happening when users are logging out at the end of the day.

We publish a desktop.

Not Ranked
Points 965

Hi,

No, we haven't found a solution yet. We are still working on it: we're trying to alter some registry settings on the XenApp servers, and hopefully they resolve most of the issues. Are you also running the TS server on VMware?

If we find a solution I'll post it on this forum.

Kind regards,

Onno 

sk0tto replied on Thu, Jul 30 2009 4:30 PM

I'm having the same EXACT issue on two Hyper-V VMs.  They are both running Terminal Services on 32-bit Windows Server 2008 Enterprise.

{AAC1009F-AB33-48F9-9A21-7F5B88426A2E} points to tstheme.exe in HKEY_CLASSES_ROOT\CLSID.  The strange thing to me is that this particular GUID is in all lower case in regedit, and all the others are in caps.  The security on this reg key seems strange, also, but that may be for another discussion.

I believe that tstheme.exe  is part of the Desktop Experience feature in Server 2008.  Has anybody attempted removing this feature from TS to see if it resolves the issue?
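
A small sketch of how the CLSID lookup described above could be scripted, using Python's standard winreg module. Registry key names and GUIDs are compared case-insensitively by Windows, so the lower-case spelling seen in regedit still resolves to the same class. The sketch assumes the class registers a LocalServer32 subkey, as out-of-process COM servers usually do; that subkey and the helper names are assumptions, not something confirmed in this thread.

```python
import sys

# Lower case, as the poster saw it in regedit.
TSTHEME_CLSID = "{aac1009f-ab33-48f9-9a21-7f5b88426a2e}"


def same_guid(a, b):
    """Compare two GUID strings, ignoring case and surrounding braces."""
    return a.strip("{}").lower() == b.strip("{}").lower()


def clsid_subkey(guid):
    """Build the path under HKEY_CLASSES_ROOT for the class's LocalServer32 entry (assumed to exist)."""
    return "CLSID\\{" + guid.strip("{}").upper() + "}\\LocalServer32"


def lookup_local_server(guid):
    """Windows only: return the default value of the LocalServer32 subkey."""
    import winreg  # imported here so the helpers above also work on other platforms

    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, clsid_subkey(guid)) as key:
        value, _value_type = winreg.QueryValueEx(key, "")  # "" selects the default value
        return value


if __name__ == "__main__":
    print(same_guid("{AAC1009F-AB33-48F9-9A21-7F5B88426A2E}", TSTHEME_CLSID))
    if sys.platform == "win32":
        try:
            print(lookup_local_server(TSTHEME_CLSID))
        except OSError as exc:
            print("lookup failed:", exc)
```

If the lookup fails with an access error rather than a missing key, that would be consistent with the odd permissions mentioned above, but it proves nothing on its own.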

Not Ranked
Points 370

I have been having the same exact problem, only we are not using VMWare at all.  We are running XenApp 5 on all physical servers.  We have 12 - DL360 G5's with Windows Server 2008 x64; 10GB RAM, 2 – quad-core Xeon CPUs.  We have basic applications installed & published (Office 2007, Internet Explorer 7, Windows Explorer, Notepad, Stedman’s Medical Dictionary, & RightFax).  We also had the exact same application set published on our previous Presentation Server 4.0 farm on Windows Server 2003 x86 servers; however, we never experienced these issues on that environment.

I have seen the exact same errors in the event logs that most of the folks on this thread have posted & I’ve researched them down many rabbit holes that provided me with no resolution.  We are using seamless windows & I'm not sure how performance looks when a server becomes unresponsive, but only a few users report problems when it occurs.  The only way I can correct this is to manually power the server down hard & then power it back up.  This is not limited to any 1 server either.  It has happened randomly to every server in my farm over the course of the past couple of months since we rolled it into production.

What happens is the following:

1) If XA-Server-1 hangs, I can't see any users when drilling down to XA-Server-1 in the AMC.  It just sits on "There are no items to display", then it eventually times out & generates an error message that it couldn't return the results.

2) If I click the Servers node in the AMC & sort by server, I can scroll down & see that the majority of users on XA-Server-1 are still showing Active under the "State" column & those users don't report any issues.  However, there are a few users (usually 2-5) that show Disconnected under the "State" column.

3) For whatever reason, the application session they are running gets disconnected from that particular server.  When they try to re-launch the application (let's say Outlook), the farm recognizes that this user has a disconnected session of Outlook on XA-Server-1 so it tries to reconnect that user's session on that server.  But because the server is in a hung (unresponsive) state, the user cannot get the app to launch.  The user just sees their XenApp client trying to reconnect on their desktop & it never does.

Peculiarities I've noticed with each server when they go into this unresponsive state are:

1) I can Remote Desktop to the server but when attempting to login it just processes on the Welcome screen, as others in this thread have reported.

2) SMB still works because I can connect to shares on the hung server via UNC paths.

3) I can connect to the server remotely using Computer Management & see everything; however, I cannot seem to start/stop services.  I’ve tried to stop the IMA service before & it timed out without stopping.

Things I’ve tried to resolve the issues are:

1) Scheduled nightly reboots of all servers.

2) Disabled Session Reliability.

3) Applied Hotfix XAE500W2K8X64004 to all servers (http://support.citrix.com/article/CTX120139 - Servers might become unresponsive when enumerating applications).

 

I’m not really sure what else to do.  Any direction from anyone who might be able to offer assistance would be appreciated.

BJ Bodden

Not Ranked
Points 70

Hello,


I have the exact same issue with a 2008 Terminal Server: random timeouts of services (too many to list) and a total hang of the server.  The only way to resolve it was to power the VM off and power it back on.

I raised a technical support call with Microsoft yesterday and got the following fix after a remote control session with them.

Disable the following 2 services, which relate to the HP universal printer driver. We don't use it now but did previously, and it was quickly removed as it is a horrible driver.

Disable and Stop - Driver HPZ12 and PML Driver HPZ12

Once you have done this modify the permissions on the following registry key:-

Add Authenticated Users under HKCR\CLSID and give Full Control to the key (apparently this should be assigned by default but was not in our environment; we didn't remove this key though)

Reboot server.

I have only just implemented this fix, so I cannot confirm whether it 100% resolves the issue, but I will report back.  Also, if you are using the HP universal print driver I would strongly advise removing it, but if you have to use it, contact HP, as Microsoft advised there is a fix for these 2 services I have disabled.


Thanks

Not Ranked
Points 965

Well, after many hours of testing and reconfiguring we found a solution for part of our problems. At first we ran the whole XenApp environment on VMware ESX 3.5 U4, running on 3 physical servers connected to an HDS SAN. In all circumstances we had performance issues, and services froze so that the whole server became unresponsive. After some minutes the server would come back, but users had many issues with it. The farm was configured with 2 XenApp servers, 1 server hosting the Web Interface and a separate SQL server for the database. With 5 users on the farm the servers became unresponsive and people complained about the lack of performance. So we added a new virtual XenApp server to the farm that used local profiles only. We did that because we also had problems with the user profiles, and this way we could test whether roaming profiles with folder redirection were behind the freezing and the lack of performance. In the meantime we also tried all kinds of registry tweaks, but none of them resolved the issues.

So at that time we had the following problems:

  • Unresponsive XenApp servers
  • Lack of performance (very slow)
  • User sessions suddenly froze or were very slow; sometimes it was only one user at a time

Solutions

The unresponsiveness and slowness were caused by I/O problems in combination with buggy software. After we upgraded VMware ESX to 3.5 Update 4 and installed the new VMware Tools and all Microsoft fixes and service packs, the unresponsiveness was almost gone. Only the performance was still very bad. Right now we run Windows 2008 on a physical server, and that solved the performance issues completely. The only thing we still have to do is install a new server and switch over to roaming profiles again. So the first 2 problems were almost gone, but we still had users complaining because their session would become slow until it stopped working. After 2 weeks we found the problem in the thin client (HP T5135). We gave the users a slim PC (PC/notebook with Windows XP and ICA client) and they had no more problems.

@markc_1982

We also use the HP universal driver, which has a bug. We used a workaround with an empty file so the service wouldn't keep restarting itself, and it also resolved the CPU hogs.

Not Ranked
Points 370

Thanks for the suggestion, markc_1982.  I've looked through all of my services & installed print drivers on my servers but I don't see either a Driver HPZ12 or a PML Driver HPZ12.

I worked the Google on the Internet machine to see if I was overlooking something about these drivers/services but from what I can tell they are supposedly regular old services that register in the service control manager.  However, I don't see them on my Citrix servers at all.

Did you install any additional HP print drivers or are these 2 drivers in particular supposed to be installed by default?

I've opened a case with our solutions advisors, who are Citrix partners, so once they start digging in I'll update the thread with what is found.

BJ Bodden

Top 500 Contributor
Points 1,045

Hello Guys,

I am having the same issues here. I have 4 TS Servers, only 2 have this issue. One is a Hyper-V Host and the other is a Hyper-V Guest.

I notice this in the System Event Logs 

The server {AAC1009F-AB33-48F9-9A21-7F5B88426A2E} did not register with DCOM within the required timeout.

Which relates to TSTheme.exe

I am looking forward to seeing whether Markc's resolution works out, but in the meantime I have been looking at a few hotfixes suggested by MS

 

http://support.microsoft.com/kb/956438 This article describes my situation perfectly.

Although these files listed in this KB are older than the ones I have on my system.

I was able to get them to send me the latest files using KB968992, which contains the same files but was issued for a different reason.

Let me know if anyone has found a solid solution for this issue.

 

Thank you

 

 
