
Mixed results virtualising XenApp on ESX 3.5 - Does this sound about right?, in the Virtualization + Server-Based Computing forum on BrianMadden.com


Not Ranked
Points 40
huggybear Posted: Thu, May 7 2009 6:39 AM

So I'm currently undertaking a POC to test virtualisation of XenApp on ESX 3.5. Before starting I read as many articles as I could on optimal ESX and XenApp settings, single vs dual CPU and so on. The general consensus appeared to be that scaling out is better than scaling up, e.g. more single-CPU VMs rather than fewer dual- or quad-CPU VMs. However, contrary to these opinions, the Project VRC tests found that dual-CPU VMs produced the best results in their tests.  My objective is to attain a 4:1 consolidation ratio against standard BL20p blades.  My test platform is an HP DL580 G5 with 16 cores and 128GB RAM.

There is a lot of text coming up :) so to be clear, the purpose of this thread is to gather some thoughts on how I might improve the number of concurrent sessions, as all the tests I did seem to suggest 14/15 sessions was the limit.

A bit about my test approach before I discuss the mixed results. I am using EdgeSight for Load Testing to simulate ICA sessions.  I tested with 3 scripts running different types of apps, on various VM configs.  Unfortunately I was limited to 4GB RAM per VM but this didn't affect my results.  The 2 main VM configs were single CPU vs dual CPU.  Single CPU would give me 16 VMs and dual CPU 8 VMs given the number of cores available on the hardware.  I also tried various ESX configs, never overcommitting resources, and found the following:

  • Advanced CPU HT Sharing - optimal setting is "none" so cores are not shared.  When set to "any", performance degradation was huge.  This setting is key.
  • CPU affinity - this didn't seem to make a difference, presumably because of the setting above.  I'm still trying to work out whether there is any real difference between these 2 options.
  • Virtualised MMU - tried both to force and disable.  Neither setting seemed to improve or degrade performance.  I suspect this is because I had more than enough RAM on my ESX box and never hit the VM RAM limits.
  • Other general settings as per various forums: disabled all devices, didn't install balloon or file sharing driver, standard Citrix optimal settings.

I created 3 scripts to test various apps and scenarios and this is how it panned out.  Each test launched users at 1 per minute and the same test was applied against a blade BL20p (4 cores).

  1. Script 1 launches Excel, Word and Adobe Acrobat.  It enters random data in Excel, creates charts etc, then copies to Word in order to use OLE. Word then opens a random set of docs.  Acrobat also opens random PDF docs and uses the autoscroll feature, which is a CPU intensive operation.
  2. Script 2 launches a vendor application which is relatively CPU intensive but reasonably light on memory (60MB per user).  The script then simulates general routine user activity in the app.
  3. Script 3 launched an in house application which was memory intensive (1GB per user).

In case 2 my bare metal optimal load was 8 users.  The app was CPU intensive both at launch and during the script.  At 10 users CPU usage was averaging 90 - 100%.  The same test was applied to a 2 CPU VM, where the same CPU usage pattern appeared at 4 concurrent users, with 5 being too many.  Context switches were constant in both cases.  Conclusion: half the number of users per VM, but with 8 VMs the objective is achieved.

In case 3 the application took an extremely long time to launch since it was loading everything into memory.  On the BL20p, when the 4th user launched the app, user 1 was still loading it.  This degraded performance.  The app was also not multithreaded, so each process would attempt to hold 100% of a core.  Once everything was loaded, performance was good. However, this being a critical app, there is no way we can risk long load times impacting normal usage.  On the dual CPU VMs the speed of the CPUs meant that, at the same load rate, applications were loading much quicker.  Conclusion: because of the quicker load times and the low number of users and processes, context switches weren't an issue, and the speed of ESX or the underlying hardware made virtualisation viable.

Now for case 1.  This is where I hoped to get the best results but where the opposite happened.  My bare metal kit allowed 40 - 45 users before performance degraded.  My first test on single CPU VMs allowed me to get between 8 and 10 users.  My second test on dual CPU VMs only allowed for 15 to 17 users.  At this point context switches escalated and performance degraded.  Conclusion: single CPU VMs definitely seem to offer better performance when launching more sessions and processes, however 10 seems to be the limit.
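
As a rough back-of-envelope check of these figures against the 4:1 target, the arithmetic can be sketched as below. This is only an illustration: the function names are made up for the example, and it assumes every VM on the host reaches the same per-VM user count at the same time, which the tests above did not actually verify.

```python
# Back-of-envelope consolidation check using the figures quoted in this post.
# Assumption (not verified by the tests): all VMs on the host can carry the
# same per-VM user count simultaneously, with no CPU overcommit.

HOST_CORES = 16  # DL580 G5 test host


def host_users(users_per_vm, vcpus_per_vm, host_cores=HOST_CORES):
    """Total users on the host when each vCPU maps to one physical core."""
    vm_count = host_cores // vcpus_per_vm
    return vm_count * users_per_vm


def consolidation_ratio(users_per_vm, vcpus_per_vm, bare_metal_users):
    """How many bare metal blades one host replaces, measured by user count."""
    return host_users(users_per_vm, vcpus_per_vm) / bare_metal_users


# Case 2: 8 users on the blade, 4 users per dual CPU VM
print(consolidation_ratio(4, 2, 8))    # 4.0

# Case 1: low end of each VM range against the low end of bare metal (40)
print(consolidation_ratio(8, 1, 40))   # 3.2 with 16 single CPU VMs
print(consolidation_ratio(15, 2, 40))  # 3.0 with 8 dual CPU VMs
```

On these numbers case 2 lands on 4:1, while case 1 comes out at roughly 3:1 for either VM layout.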

After all of that I do have some questions:

  1. Is there anything I have missed that would allow me more users on my single or dual CPU VMs for test 1?  I have read articles with people suggesting 20, 30 even 40 users per VM.  HOW??
  2. Has anybody else done similar testing and seen the same results, only to find out they'd missed something?  I'd love to hear others' experiences.

cheers

huggybear

Top 100 Contributor
Points 1,635

In converting from PS4.0 to XenApp4.5 we have done some 'real world' testing.  Scripted application testing is fine but it's not real life.  Whatever the app, even a CPU intensive one, you won't have all users logging in and working constantly.  As you've seen in your testing, you will have to configure your servers in line with the application performance.  Running a standard Published Desktop (MS Office, Acrobat, MSIE) with specialized apps launched via PNA off said desktop, we can get almost 30 people on comfortably.  Somewhere north of 30 the servers experience performance issues and other anomalies that prevent additional logins, management, task manager, etc., so I cap them at 26.  Dual CPU and 4GB RAM is what we've found to be the optimum config.  Enterprise version, to increase the RAM, didn't do enough to make it worthwhile.  6-8GB RAM allowed more users (40?), but now you get to a place where we'd rather have 4-5 servers supporting 26 people each, with room to breathe and redundancy, vs. 2-3 servers w/maybe 40 people each where we can't afford to lose a server.  Then you might want to add CPUs, but the payoff wasn't there.  Similar w/64bit server, but other 64bit issues arose, so we abandoned 64bit for this environment.

Not sure if this helps but willing to keep the conversation going.

Not Ranked
Points 40

Hi there,

Thanks for the response. I would have liked to test with real world users, but unfortunately that wasn't an option in the early stages, although I always knew that we would typically get more users on a system than during a fully loaded simulation test.  I'd be interested to know whether you have any specific ESX settings configured for your Citrix virtual environment, e.g. CPU affinity, MMU, any custom settings etc.  Also, you mentioned dual CPU configs seem to be the best CPU approach.  I was wondering whether you tested single CPU configs, and how you reached this conclusion?  This is obviously quite a key decision in terms of how many VMs you can host, particularly when you have 16 cores to play with, so it is a key area for me, and certainly the tests I did seem to suggest single CPUs would scale better.

I totally agree on the RAM question. 4GB is probably more than enough in most situations, although Microsoft does offer an incentive to buy Enterprise edition in that you get 4 VM licences with it.

Anyhow, once again thanks for the response, and if you have any more info relating to the optimal CPU config that would be great.

Top 100 Contributor
Points 1,635

The setup & config of our VM environment is handled by our Architect.  I'll check re: those specific settings.

We did test single cpu configs.  Single cpu allowed 12-14 users before we started having issues (no more logins allowed, task manager unresponsive).  Adding a 2nd cpu more than doubled the allowable users (24-26 easily before problems started).  That seemed to be the best bang for the buck. 

We ruled out 64bit due to other issues we had w/64bit.  We did try 2003 Enterprise to increase RAM (8GB).  We got an increased user count, maybe another 35% (32-34 users).  Didn't try increasing CPUs, as other posts, including VMware's, seemed to suggest that wasn't a worthwhile option.

My numbers are a little conservative, as I am able to push the user counts a little higher, but I cap them using Load Evaluator so as not to take chances when the user counts hit the high end of my testing (i.e. 2 CPUs did get 30 users, but started having noticeable performance issues, and somewhere around that number I will get other issues ultimately requiring a reboot).

Whatever your cost basis and user count is you might consider scaling differently.  If I was supporting 5,000 users I might try to scale my users per server higher, but then I have to weigh how many users I lose if a server fails (it does happen) and if my remaining servers can support the load.
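
To make that trade-off concrete, here is a minimal sketch of the failure check. The function names and the total of 100 users are illustrative assumptions only (not our real figures); the caps of 26 and 40 per server are the ones mentioned earlier in the thread.

```python
import math


def servers_needed(total_users, cap_per_server, spare_servers=1):
    """Servers required so the load still fits after losing `spare_servers`."""
    return math.ceil(total_users / cap_per_server) + spare_servers


def survives_one_failure(total_users, cap_per_server, server_count):
    """True if the remaining servers can absorb everyone when one server fails."""
    return total_users <= cap_per_server * (server_count - 1)


TOTAL_USERS = 100  # hypothetical user population, for illustration only

# Capped at 26 users per server: 4 servers carry the load, 5 tolerate a failure
print(servers_needed(TOTAL_USERS, 26))           # 5
print(survives_one_failure(TOTAL_USERS, 26, 4))  # False
print(survives_one_failure(TOTAL_USERS, 26, 5))  # True

# Capped at 40 users per server: fewer servers, but each failure drops 40 sessions
print(servers_needed(TOTAL_USERS, 40))           # 4
print(survives_one_failure(TOTAL_USERS, 40, 3))  # False
```

Either way you need one server more than the bare minimum, but with the lower cap a single failure knocks out 26 sessions instead of 40.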
