slowness but where??, in the Citrix XenApp / Presentation Server forum on BrianMadden.com

Answered (Not Verified) This post has 0 verified answers | 38 Replies | 6 Followers

Top 50 Contributor
Points 4,555
Jackson Lai posted on Tue, Jan 6 2009 3:43 PM

I have a two-server CPS4 environment that publishes a Web-driven .NET application from our partner office.  The CPS4 servers run on Windows 2000 Server and are located in another state.  The application is accessed over a site-to-site VPN between two VPN appliances, and the connection runs over a T-1.  Lots of people have suffered from freezing and lagging from time to time, and I want to know whether it's my CPS4 servers' fault or whether there simply isn't enough bandwidth to support so many users over VPN on a T-1.

Any ideas?

It's Me!
  • | Post Points: 80

All Replies

Top 10 Contributor
Points 48,501

I've just gone through something similar.  Although the servers weren't reporting particularly heavy CPU or memory usage, they got busy enough that at some point they could no longer service ICA requests, and connections started to stall or disconnect.  Ultimately, I threw some more hardware into the mix and the issue went away, but it was $#%@! near impossible to identify the actual point of contention.  If you can, try adding a third server, just to see if spreading the load across more servers helps.  How many users do you typically have on a server when this problem shows up?

Dan

Why is it called "Common Sense"? It doesn't seem all that common!

  • | Post Points: 5
Top 500 Contributor
Points 880

If there's one thing I've learned about troubleshooting slowness, it's that more often than not there is more than one thing causing it.

From client PC to backend database, all should be considered.

 

  • | Post Points: 20
Top 50 Contributor
Points 4,555

yes all things should be considered.  I have for instance one small office site, constantly getting sluggishness and the "network connection to the application interrupted" message.  Could not figure out the problem to this day.   It seemed at one point to be an issue with their ISP or their bandwidth or their shoddy network infrastructure (picture spaghetti cabling all over the floor).  Sometimes I get isolated cases of this in larger places.  8 people would have no trouble at all and one person experiences the interruption errors and slowness.

In regards to the hardware, pretty much everything sits on the Primary CPS4 box including the datastore.  At any given point in time there are no more than 16-20 users spread out between the two servers.  It's not exactly even load balancing all the time so I can't say how many per server.  Sometimes more sometimes less.

It's Me!
  • | Post Points: 20
Top 150 Contributor
Points 1,151

Have you disabled your AV yet? AV's are notorious for causing these sorts of issues.

Also, break out perfmon and add some counters like Avg. Disk Queue Length, and maybe throw in some network counters as well. Look for errors, resends, etc. I've seen cases in the past where something was beating up the disk when the "pauses" happened.

 

--Mike

  • | Post Points: 5
Top 100 Contributor
Points 1,837

By now you should have gotten some troubleshooting done. What can you say it's NOT? WAN saturation should be easy to check via the VPN appliances.

Without knowing the exact physical configuration of your farm, I would say overall your architecture could be better.  Furthermore, I would bet my lunch money that the issue is disk and WAN related (a T1 should be able to handle the user count you mention with the kind of application you describe, until they print something; with so few users, you should probably stick with static local profiles).

It sounds as though you don't have WI implemented, so I won't talk to that. I would consider getting a couple of 1U servers to build a dedicated Data Store and Zone Data Collector (which can double as your XML Broker). Build the Data Store on SQL 2K5 Express with RAID1. If you don't have a budget for that, look at re-using a couple of older servers of sufficient specs and put them on 3rd party warranty to fulfill this role.

Something I've been keeping an eye on is solid state drives. Like BBWC, they can help latency and put rocket fuel on I/O. Great for placing the page file and/or user profiles on. These have been around a while but have always been expensive. Either you bought software to create a RAMDrive, or a memory board that plugged into the PCI bus. Nowadays they come in a SATA form factor (2.5") and, like all RAM, have become very reasonable and have decent MTBF. Check it out.

Samuel A. Rodriguez
Sr. Systems Administrator

  • Post Points: 20
Top 50 Contributor
Points 4,555

I'll check on the Antivirus software on the client end.  I have a feeling they might be running something so maybe that could be it. 

The funny thing is one client complained about the same lagginess with another third party Web-based software out there, not mine.  Is it just coincidence or is there something related to the network at play here?  I don't know.

Sam, I do have Web Interface 4 implemented along with Citrix Access Gateway.  There is also some printer related traffic that flows through the same connection to other clients, not all the time but at times.  However tracking that down is somewhat harder to do because I don't know exactly when that occurs.   I'm sorry when it comes to creating additional things like another Data Store, etc I'm a little light on the experience.  If you could perhaps help with a little more details on how to go about this and what exactly I need to do?  As I've said there are only two servers at this point, the primary one contains the data store and zone information.  Each server has two drives paired in a Raid 1 configuration.  Using the PerfMon I have seen spikes, mostly in the first server, particularly when someone signs on or off.  Profiles are stored on the server itself.  However the spikes don't seem to affect the users, as I've not heard someone complaining about lag when it happens.  Furthermore, we have users who have a direct VPN connection to the backend site and they too have complained about the lag from time to time.  Keep in mind they are not utilizing any SBC software like Citrix, just a straight VPN.

 

It's Me!
  • | Post Points: 5
Not Ranked
Points 70
Suggested by rtotrainer

Windows 2000 doesn't use virtual memory as efficiently, out of the box, as it might.

First check your performance.  In Perfmon check these counters:

Cache:  Copy Read Hits %

Memory:  Free System Page Table Entries, Pool Paged Bytes, Pool Paged Resident Bytes

Terminal Services:  Total Sessions

The really important counter here is the Cache one.  If Virtual Memory is tuned right it will be pegged at 100%.  It may occasionally drop to 99, but you definitely don't want to see it dropping lower or often.
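If you would rather check a log than watch the graph, here is a rough Python sketch of that rule of thumb. It assumes the counter was exported to CSV with the timestamp in the first column and the Copy Read Hits % value in the second (the layout typeperf produces for a single counter). The function name, the server name CPS1 and the sample values are made up for illustration.

```python
import csv
import io

def low_cache_hit_samples(csv_text, threshold=99.0):
    """Return (timestamp, value) pairs where the counter fell below threshold."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    low = []
    for row in reader:
        if len(row) < 2:
            continue
        try:
            value = float(row[1])
        except ValueError:
            continue  # blank or non-numeric sample
        if value < threshold:
            low.append((row[0], value))
    return low

# Hypothetical log excerpt; CPS1 is an invented server name.
sample = '''"(PDH-CSV 4.0)","\\\\CPS1\\Cache\\Copy Read Hits %"
"01/06/2009 15:43:00.000","100.000000"
"01/06/2009 15:43:15.000","99.500000"
"01/06/2009 15:43:30.000","92.250000"
'''

for stamp, value in low_cache_hit_samples(sample):
    print(stamp, value)
```

An empty result over a busy period suggests the cache is behaving; a long list means the drops are frequent rather than occasional.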

If it is not 100%, the fix is to change a pair of registry settings:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PagedPoolSize

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\SystemPages

Both should be DWORD values, and you'll want to set them to ffffffff.

If you want to really understand why all this works read this:  http://www.rtosoft.com/Enter.asp?ID=157

COFFEE.EXE missing.

Insert CUP and press ENTER to retry.

  • Post Points: 20
Top 50 Contributor
Points 4,555

Thanks but I've checked the Cache counter and yes it's at 100%.

It's Me!
  • | Post Points: 5
Top 50 Contributor
Points 4,555

any more suggestions?  I've been checking the latency aspect by using tools like pathping to try to determine if there's any sort of saturation or overloading but so far I haven't seen anything conclusive.  Everything seems to look normal.

It's Me!
  • | Post Points: 35
Not Ranked
Points 70

Can you isolate your problem to any one server?  Can you isolate the problem to specific users?

I was chasing my own tail on a similar issue last week when it dawned on me that the complaints came from the same four users over and over.

When I checked their laptops, I found >ahem< questionable software installed as well as out and out spyware and at least one indication of possible viral activity.  I alerted desktop support who have cleaned them up and I haven't had a complaint since.

COFFEE.EXE missing.

Insert CUP and press ENTER to retry.

  • | Post Points: 5
Top 10 Contributor
Points 24,510

Hi,

If the perfmon counters you've been running on your servers don't show any bottlenecks in disk I/O (particularly disk queue length), processor (processor queue length, context switches, CPU utilization, etc.), or network I/O, then the WAN connections are likely the culprit. You should also make sure that SMB isn't a problem; the TCT templates from Login Consultants include tweaks for that. SMB delays typically appear as a delay of several seconds for certain operations (File -> Open, for instance). Have a user open Notepad and type something when things are slow, as typing in Notepad will be unaffected by SMB issues.

Have you tried restricting bandwidth of various virtual channels yet via Citrix policy? In particular, printing, client side drive mapping, and clipboard can be pigs - try setting limits of 128 kbps for each. Also, turn off visual effects, menu animations, and window contents on drag, use aggressive image acceleration, and disable unused VCs (like audio, COM ports, TWAIN). You might also want to consider setting an overall session bandwidth limit. You can monitor in real time for spiky traffic at the VC level by using the SMC console.
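To put the 128 kbps figure in perspective, here is a back-of-the-envelope sketch in Python. It assumes all sessions share one T-1 (1.544 Mbps line rate) and, as a worst case, that every user bursts one capped channel at the same time. The function names are made up for illustration, and the user counts are the 16-20 mentioned earlier in the thread.

```python
T1_KBPS = 1544  # T-1 line rate in kbps (usable payload is slightly less)

def worst_case_demand_kbps(users, channels, limit_kbps):
    """Total demand if every user saturates every capped channel at once."""
    return users * channels * limit_kbps

def fits_on_link(users, channels, limit_kbps, link_kbps=T1_KBPS):
    """True if the worst-case demand stays within the link rate."""
    return worst_case_demand_kbps(users, channels, limit_kbps) <= link_kbps

for users in (16, 20):
    demand = worst_case_demand_kbps(users, 1, 128)
    print(users, "users:", demand, "kbps, fits on T-1:", fits_on_link(users, 1, 128))

# Largest number of users that can burst one 128 kbps channel simultaneously
print("max simultaneous bursts:", T1_KBPS // 128)
```

The per-channel cap keeps one print job from hogging the line, but it does not by itself guarantee the link stays unsaturated if many users burst together, which is where an overall session limit comes in.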

If possible, have a user who is experiencing latency issues call you when things get slow, apply an aggressive bandwidth Citrix policy to their account, and have them log back in for comparison. You can also ask the user to switch connection speed profiles in the WI interface to Medium Low or Low for comparison's sake (this enables SLR and high compression, among other things).

Your ISP should also be able to provide you with a usage report for your WAN connection so you can identify which protocols are hogging the bandwidth, since latency can be caused by saturation of the WAN link. Your firewall might even have QoS capability to give preferential treatment to Citrix traffic. Also, if you can use ACLs on the firewall for the VPN tunnel to exclude "bad" traffic, that can help.

Troubleshooting these problems can be complex and time consuming, you need to be methodical and have lots of patience. If you would like assistance with this, I would be happy to consult. You can reach me via my website Contact page:

http://www.vcit.ca/about/contact.html

Alan Osborne

President (MCSE, CCNA, VCP, CCA)

VCIT Consulting - Citrix/Terminal Services Remote Desktop Solutions for SMB

VCIT website My Blog

  • | Post Points: 20
Top 50 Contributor
Points 4,555

Yes Alan, I've tried most of that.  The user does need at least printer mapping but not so much drive mapping.  I've also enabled SLR and other things to help speed the process.  The funny thing was that the other day the user said she timed out several times in the morning while trying to log into the environment, which didn't make sense at all.  I've never heard of someone timing out while logging into CAG.

Also I checked the T-1 usage on the part of the ISP and although I do see some spikes it wasn't sustained at all, at least to be called saturation.  The highest spike was under 1 Meg and it was a blip on the graph.  I'm running out of ideas...really don't know how to help this user at this point.  I even used something like pathping to ping her IP but it didn't really show any lost packets or latency.  Of course, pinging is different than sustained streams of Citrix traffic.  I'd hire you to consult but we've no funding and I'm just a lone gunman here.

It's Me!
  • | Post Points: 5
Top 50 Contributor
Points 4,555

Also in regards to disk queue length and time, yes sometimes I do see spikes but again it's just momentary, a second or so and then nothing more.  Nothing sustained on the servers that I can see to cause complete bottlenecking.  If it was completely overwhelmed I would think every user would be in trouble but clearly this is not the case.  You have several groups of people who are purring away in their sessions (sometimes a little slow but not enough to drive them crazy) and then you have the isolated users who get symptoms of latency, network connection to the application interrupted errors, etc.
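To make the "momentary vs. sustained" distinction concrete, here is a small Python sketch. It assumes the counter was sampled at a fixed interval and treats a spike as sustained only if it stays above a threshold for a minimum number of consecutive samples. The function name, the threshold of 2 and the sample values are illustrative assumptions, not measurements from these servers.

```python
def sustained_runs(samples, threshold, min_len):
    """Return (start, end) index pairs for runs of at least min_len
    consecutive samples above threshold."""
    runs = []
    start = None
    for i, value in enumerate(samples):
        if value > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    # Close a run that is still open at the end of the data
    if start is not None and len(samples) - start >= min_len:
        runs.append((start, len(samples) - 1))
    return runs

# Hypothetical disk queue length samples, one per second
queue = [0, 5, 0, 0, 3, 4, 6, 5, 0, 7]
print(sustained_runs(queue, threshold=2, min_len=3))
```

Single-sample blips like the ones described above drop out; only the stretch that stays high for several samples is reported.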

Keep in mind they are basically accessing a .NET app running in IE which is over a point to point VPN to a backend data center somewhere.  I don't really have much processing on the server except loading the profiles, managing datastore, session information etc.  Nothing is being saved really on the servers.  Everything is done over the IE published app.  I can see perhaps getting a point to point T1 to the backend data center could be a definite plus but I'm wondering if it'd help these few isolated users. 

It's Me!
  • | Post Points: 20
Top 10 Contributor
Points 24,510

An apparently uncongested, low latency link where packets are constantly fragmented can act just like a saturated, high latency link.

To figure out whether the max MTU is a problem, remote into the user's workstation and run a PING with a 1500-byte packet (28 bytes go to the IP and ICMP headers, 20 and 8 bytes respectively, hence the 1472 parameter):

PING -l 1472 -f backend_server_ip

If the result is "Packet needs to be fragmented but DF set", then decrease the packet size by 10 bytes and repeat until you determine the max MTU that the link will support. Remember to add 28 to the result for the real MTU.
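The step-down procedure can be sketched in Python as follows. This is only an illustration of the arithmetic: `probe` is a hypothetical callback that should return True when a ping of that payload size with the DF bit set gets through, and here it is replaced by a simulated link rather than a real PING call.

```python
IP_ICMP_OVERHEAD = 28  # 20-byte IP header + 8-byte ICMP header

def find_path_mtu(probe, start_payload=1472, step=10, min_payload=0):
    """Lower the ping payload by `step` until probe() succeeds.

    Returns the payload that passed plus the 28 header bytes, or None if
    nothing passed. With step=10 the result can be up to 9 bytes below
    the true MTU.
    """
    payload = start_payload
    while payload >= min_payload:
        if probe(payload):
            return payload + IP_ICMP_OVERHEAD
        payload -= step
    return None

# Simulated link with a 1400-byte MTU (an assumed value for the example)
def simulated_probe(payload):
    return payload + IP_ICMP_OVERHEAD <= 1400

print(find_path_mtu(simulated_probe))
```

Once the coarse search lands on a passing size, you can step back up a byte at a time by hand to pin down the exact value.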

If the MTU is lower than 1500 bytes, try to figure out where the MTU drops (repeat the PING tests to intermediate hops to find the culprit) and correct the problem. You can also lower the MTU on the client side if necessary (http://www.dslreports.com/drtcp).

A possible MTU issue was the reason why I previously suggested changing the connection speed profile because "Medium Low" and "Low" use a lower MTU.

Below is a VBS script that you can modify that runs an automated PING test (use scheduled tasks to run it periodically) that emails you when latency is high or packets are dropped:


Option Explicit

' Placeholders: replace these with real values before scheduling the script.
Const TARGET_DEFAULT = "target_IP"
Const SMTP_SERVER = "smtp_server_ip"
Const SENDER = "sender@domain.com"
Const RECIPIENT = "recipient@domain.com"
Const MAX_RESPONSE_MS = 100   ' alert when the round trip exceeds this

Dim strComputer, objWMIService, colItems, objItem

' Take the target from the command line if one is supplied.
If WScript.Arguments.Count = 1 Then
   strComputer = WScript.Arguments.Item(0)
Else
   strComputer = TARGET_DEFAULT
End If

Set objWMIService = GetObject("winmgmts:\\.\root\cimv2")
Set colItems = objWMIService.ExecQuery _
    ("Select * from Win32_PingStatus " & _
        "Where Address = '" & strComputer & "'")

For Each objItem In colItems
   ' StatusCode is Null or non-zero when the ping fails;
   ' ResponseTime is only meaningful when the ping succeeds.
   If IsNull(objItem.StatusCode) Then
      SendAlert "PING failed to host " & strComputer
   ElseIf objItem.StatusCode <> 0 Then
      SendAlert "PING failed to host " & strComputer
   ElseIf objItem.ResponseTime > MAX_RESPONSE_MS Then
      SendAlert "PING response time is " & objItem.ResponseTime & _
          "ms to host " & strComputer
   End If
Next

' Sends a short alert through the SMTP server using CDO.
Sub SendAlert(strSubject)
   Const cdoSchema = "http://schemas.microsoft.com/cdo/configuration/"
   Dim objEmail
   Set objEmail = CreateObject("CDO.Message")
   objEmail.From = SENDER
   objEmail.To = RECIPIENT
   objEmail.Subject = strSubject
   objEmail.TextBody = strSubject & " at " & Now
   With objEmail.Configuration.Fields
      .Item(cdoSchema & "sendusing") = 2        ' 2 = send over the network via SMTP
      .Item(cdoSchema & "smtpserver") = SMTP_SERVER
      .Item(cdoSchema & "smtpserverport") = 25
      .Update
   End With
   objEmail.Send
   Set objEmail = Nothing
End Sub

Alan Osborne

President (MCSE, CCNA, VCP, CCA)

VCIT Consulting - Citrix/Terminal Services Remote Desktop Solutions for SMB

VCIT website My Blog

  • | Post Points: 20
Top 50 Contributor
Points 4,555

Great thanks Alan I'll try some of this out.  Where exactly do I change the Connection Speed profile from Medium to Low?

It's Me!
  • | Post Points: 20