Do you have design principles for VDI? How does VDI fit into your broader strategy? - Part 1 of 2

In October I wrote a post highlighting that the CapEx Virtual Desktop Infrastructure (VDI) focus is drowning out OpEx considerations and asked the question why bother with VDI?

This generated some robust comments, which essentially argued the following: the business benefits outweigh the costs; in a mixed PC/Server Based Computing (SBC) environment those costs already exist; a single architecture for internal and external users has its own benefits; and even Shawn Bass chimed in with "OpEx costs are BS 99.9% of the time, so it's a futile discussion." :)

I have no doubt that there are many benefits, but I firmly believe many remain confused and therefore present timid arguments to a CIO. In this two-part post I want to extend that discussion and share some of my observations.

I think people must ask themselves four key questions:

  1. Is there an exciting project that users or execs want/need to succeed?
  2. What is your VDI experience priority?
  3. What are your design principles?
  4. How does VDI fit into your forward-looking infrastructure and app strategy?

I’ll discuss the first three questions in part one and cover question four in part two.

Is there an exciting project that users or execs want/need to succeed?

Paraphrasing innovation guru Clayton Christensen, innovation falls into three primary categories:

  A. Sustaining the current status quo by adding incremental capability – e.g., VDI for developers, disaster recovery, etc.
  B. Driving efficiency out of status-quo models – e.g., shared VDI models, thin clients, etc.
  C. Enabling new capability – e.g., Bring Your Own Device (BYOD) use cases, more work-from-home opportunities, real business agility use cases such as M&A

It’s been my experience that type C innovation will generate the most passion and interest, and stands the greatest chance of generating enthusiastic funding for your VDI project, especially in the early stages. If you want evidence of that, just look at how much the VDI industry has marketed the benefits of Windows on iPads even though it’s an occasional use case.

Yet most people try to build a VDI business case on type A and type B innovation. I tend to yawn when people regurgitate the low-hanging fruit use cases. If that's your CIO's primary focus, I have limited hope for your success, because your users will not care and will simply hate you for taking away the freedom they think they have with their computer at work. This is usually due to their attachment to a physical object with which they associate an obscenely large amount of value and entitlement.

There have to be real, tangible benefits that your business values and your users care about if you want to expand beyond slow-moving, IT-centric niche projects that are too often mired in bureaucracy and politics.

My implementation journey started many years ago, before VDI really existed. It began with reimagining a new type of trading floor environment: one that was super-dense and quiet, with compute power on demand, for the riskiest user group. At first this was thought to be impossible; then cost objections were raised until the value of changing the workplace architecture was internalized, understood and accepted broadly. Additional use cases grew from there as lessons learned were reinvested to solve new problems and new technology investments were leveraged to improve existing systems. I've seen many good examples across industry verticals that started with solving an ambitious problem that galvanized the organization to act.

So my advice is to start with type C innovation use cases and go from there; otherwise you'll probably end up having many of the arguments about OpEx and CapEx that I highlighted in my previous post.

Next, you're likely to get into the architecture debate between SBC (based on Remote Desktop Services (RDS)) and VDI (a desktop OS), and bang your head against the wall. I've personally been through this, and one of the important lessons I learned was that it really helps to have a few key design principles. I'll share some that were key when thinking about a trading floor and that I think can also be applied more broadly.

What is your VDI experience priority?

This is a fundamental question to ask. It will help you understand how important your users' experience is to your project. With respect to SBC vs. VDI, it's not a case of right or wrong; it's a question of the type of experience you want to create. If your mindset is that greater density and lower cost are the priority, then I think the image below represents the user experience you will likely create.

Fig. 1 - What’s your user experience?

There's nothing wrong with this if your business problem is solved by cheap transportation that does the job for the majority. Perhaps a harsh analogy, and yes, there are many things you can do to optimize SBC environments. In my trading floor use case, this was not the goal: experience trumped cost. It's also why I believe that SBC has, for the most part, complemented rather than replaced the PC, despite offering lots of cost benefits.

What are your design principles?

In the vast majority of enterprise environments, even with users of moderate sophistication, you don't share your desktop/laptop. So why do you expect to with VDI? It's an IT-first vs. user-first approach that too often results in a Fig. 1 experience.

In my case, it was a desktop replacement project to build out a new trading floor. We needed a solution that replicated the existing experience as closely as possible. So our first design principle was:

It’s a desktop

The next challenge was determining how to build a reliable trading floor, and at what cost vs. risk. Lots of options presented themselves, but ultimately it was critical to understand that we didn't want to pull so many dollars from the project that we constantly set off the mousetrap.

Fig. 2 - How many dollars can you steal without setting off the mousetrap?

Clearly SBC was cheaper, but we had experienced Citrix "Black Hole" problems in our traditional XenApp environment. Below are scenarios we had experienced, summarized from an old troubleshooting session I attended:

  • Terminal Services hangs on a server, or critical threads in IMA hang, but IMA on that server still responds to DC heartbeats – the server appears "healthy" but cannot accept any user connections.
  • The XML service hangs on the XML broker server – the Web Interface server is unable to detect the hang condition and continues to send user connections to the XML broker.
  • An application error occurs on problem servers when users connect – errors prevent the application from launching successfully, yet users continue to be directed to this "least loaded" server.
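The common thread in these scenarios is that a component-level heartbeat passes while the end-to-end user path fails, so the broker keeps routing users into the hole. As a minimal sketch of that distinction (illustrative names only, not a Citrix API), a monitor would treat a server as routable only when a synthetic end-to-end probe succeeds, not merely when the heartbeat does:

```python
# Sketch: distinguish a "black hole" server (heartbeat OK, but user
# connections fail) from a healthy or plainly dead one.
# All names here are illustrative -- not a real Citrix/XenApp API.

def classify_server(heartbeat_ok: bool, session_probe_ok: bool) -> str:
    """Classify a farm server from two independent signals.

    heartbeat_ok     -- the component-level (IMA/DC) heartbeat succeeded
    session_probe_ok -- a synthetic end-to-end logon/launch probe succeeded
    """
    if heartbeat_ok and session_probe_ok:
        return "healthy"
    if heartbeat_ok and not session_probe_ok:
        # The dangerous case: the broker still routes users here
        # because the heartbeat says the server is fine.
        return "black hole"
    return "down"

def routable_servers(probe_results):
    """Only servers that pass the end-to-end probe should receive users."""
    return [name for name, (hb, probe) in probe_results.items()
            if classify_server(hb, probe) == "healthy"]
```

The point of the sketch is simply that "responds to heartbeats" and "can actually host a user session" are different signals, and only the second should drive load balancing.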

When I discussed these challenges with Citrix at the time, a tool called the Health Check Agent (HCA) was proposed as a mitigation strategy. While a perfectly valid suggestion, it did not make sense for my use case. I had to weigh the risk of what it currently took to log on to my desktop – the desktop logon shell – against adding all the additional features of a XenApp farm that was designed for a different purpose, plus the potential impact of other farm components. A look forward at Fig. 3 provides some insight.

Fig. 3 - A slide from a 2010 presentation

Clearly, in addition to the "Black Hole" problems, there were other shared components, like the data store, that could impact a large number of users. This represented a greater risk than a traditional desktop crashing by itself on the existing trading floor. It's just not something that made sense for my use case. Out of this, the second design principle was born:

Risk boundary of a desktop

The impact of this was that, since VDI did not exist at the time, we ended up designing a custom farm solution where all components were self-contained in a single image. Yes, we had lots of farms! I used to joke that we should trademark it as the ranch architecture, or the desktop grid/farm. :)

What also became clear as we moved to this model was that we had to think about how to determine where to place users on VDI racks. Additionally, we had to keep intact many of the business rules we had for redundancy on the trading floor in the new datacenter model. This was one of many things we had to consider. Having a 1-1 model really simplified this transition because we could keep our existing management practices, choose to evolve over time, and instead invest in innovation where gaps prevented us from moving quickly.
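To make the placement problem concrete, a typical trading-floor redundancy rule is anti-affinity: members of the same desk or team should not all depend on the same rack. A greedy sketch of that rule (entirely illustrative names; not how any particular broker implements it) might look like this:

```python
# Sketch of a rack-placement rule that preserves a trading-floor style
# redundancy constraint in the datacenter: no two members of the same
# team land on the same VDI rack. Names are illustrative only.

from collections import defaultdict

def place_users(users, racks):
    """Greedy anti-affinity placement.

    users -- list of (user, team) pairs
    racks -- list of rack names
    Returns {user: rack}; raises if a team needs more racks than exist.
    """
    per_rack_teams = defaultdict(set)   # rack -> teams already on it
    per_rack_load = defaultdict(int)    # rack -> number of users placed
    placement = {}
    for user, team in users:
        # Prefer the least-loaded rack that doesn't already host this team.
        candidates = [r for r in racks if team not in per_rack_teams[r]]
        if not candidates:
            raise ValueError(f"team {team!r} needs more racks than available")
        rack = min(candidates, key=lambda r: per_rack_load[r])
        placement[user] = rack
        per_rack_teams[rack].add(team)
        per_rack_load[rack] += 1
    return placement
```

A real deployment would layer on capacity limits, site redundancy and failover rules, but the shape of the problem – encoding existing business rules as placement constraints – is the same.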

Despite all this, there was still a lot of discussion about the possibility of driving down costs for a subset of the user population using regular XenApp. When considering these arguments, Fig. 1 and the ‘it’s a desktop’ design principle were once again an inspiration.

On a trading floor, compute power was highly valued, and a lot of money had been spent to ensure people were provided with high-end machines with a finite amount of compute. Whatever the efficiency gains, moving to a shared architecture meant sharing that compute with others, even if there was more to share overall. What we learned early on in user experience testing was that users prefer predictability: if things feel fast one day and slow the next, users get more frustrated than with something that is more consistent but usable. As we thought this through, it became clear that, due to unpredictable application patterns, CPU and memory spikes could hamper predictability beyond our comfort level, even if we kept the number of users sharing these resources to a minimum. This also suited our farm-per-user approach, and the third design principle was written.

Predictable user experience

This led to the decision to use this architecture for all users no matter how we categorized them. We believed that efficiency through greater density for a subset of users who really didn’t need as much compute as a trader would be solved over time by Moore’s Law, at a risk factor that we could accept. With this belief, we stuck to our ‘risk boundary of a desktop’ design principle, and also were able to apply the same management tools, new and existing, consistently across the globe.
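The predictability argument can be illustrated with a toy noisy-neighbor model (numbers and functions are illustrative only, not a sizing tool): on a shared host, one user's spike slows everyone else's day, while with a dedicated slice a well-behaved user's experience stays constant regardless of what neighbors do.

```python
# Toy model of the predictability argument: on a shared host, one user's
# CPU spike slows everyone; with a dedicated (1-1) slice it affects only
# the spiking user. Numbers are illustrative, not a sizing tool.

def shared_response_times(demands, capacity):
    """All users share one host: everyone slows when total demand exceeds it."""
    total = sum(demands)
    slowdown = max(1.0, total / capacity)
    return [d * slowdown for d in demands]

def dedicated_response_times(demands, per_user_capacity):
    """Each user gets a fixed slice: only their own demand matters."""
    return [d * max(1.0, d / per_user_capacity) for d in demands]
```

With four users on a shared host of capacity 8, a quiet day (`[1, 1, 1, 1]`) leaves the first user's response time unchanged, but a neighbor's spike (`[1, 1, 1, 8]`) slows that same user down. With a dedicated slice of 2 per user, the first user's time is identical on both days – slower in the worst case is tolerable, unpredictable is not.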

You may consider my use case extreme. I’ve had the benefit of talking to hundreds of customers across verticals and industries over the last few years. I think Fig. 4 can help you think about where you want to be in terms of risk and then you can choose the right architecture for your high-value, type C innovation use cases. There is also nothing wrong with having several models for different use cases and risk profiles as you evolve. That is why I continue to believe desktop virtualization has to be thought about more broadly than just VDI for the majority of customers.

Fig. 4 - What’s your risk profile?

Forget about stateless VDI for now

So what does this have to do with CapEx vs. OpEx, as discussed in my previous post? Simply put, the focus has to start with solving a problem and implementing quickly; otherwise your project grinds to a halt. To achieve this goal, it is clear to me that:

Persistent VDI is the easier first step

For most customers, the prerequisites for implementing stateless virtual desktops (user personalization, application virtualization, user-installed apps solutions, etc.) and completely changing the desktop management processes make it difficult to get their VDI projects off the ground. As a result, many customers start with persistent VDI or a mixed environment and then migrate over time to a stateless VDI environment.

However, why even concern yourself with stateless just for VDI? How much will it matter over time, with so many other conflicting priorities? If you have use cases that existing single-image VDI solutions help you achieve, sure, go for it. In reality, I suspect that this is a minority use case that could just as easily be addressed with SBC solutions.

The desktop is very hard to standardize across the board, even in enterprises of 1,000 users. Sure, in an ideal solution the user should not have to care about the moving parts (PC, SBC, VDI, etc.); they should only care about the human-computer interface and their information. Has anybody achieved this at scale with an experience that matches the desktop? That's a serious question, because I'd really like to learn more from somebody who's been successful…

If you're a bit of a wannabe physics geek like me, you'll understand from thermodynamics that entropy teaches us the natural state of the universe is disorder. In other words, no matter what you do, users will keep changing things, and you'll always be chasing configuration drift, trying to capture, define and manage that state.
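In practice, "chasing configuration drift" just means repeatedly diffing the desktop's current state against a captured baseline – and watching the diff never stay empty. A minimal sketch (keys and values purely illustrative):

```python
# Sketch: configuration drift as a diff between a captured baseline
# state and the current state of a desktop. Keys are illustrative.

def drift(baseline: dict, current: dict) -> dict:
    """Return what changed: added, removed, and modified settings."""
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    modified = {k: (baseline[k], current[k])
                for k in baseline.keys() & current.keys()
                if baseline[k] != current[k]}
    return {"added": added, "removed": removed, "modified": modified}
```

A stateless model has to capture, externalize and reapply everything in that diff on every logon; a persistent model simply lets the state live where it lands, which is exactly why it is the easier first step.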

Instead, why not just focus on making the 1-1 model work better over time? It's great to see storage solutions like Atlantis (full disclosure: I am an advisor) evolving to solve many of the CapEx challenges for this model. During Citrix Synergy Barcelona in October, I was pleased to see that the 1-1 model is now a recognized workload for service providers, and even Reverse Seamless can now be used with it to solve for those problem apps! Large players like Cisco and Dell are building out offerings that will also better enable 1-1 VDI. Hey, there's even VDI-in-a-Box now…

I also have no doubt that as more service providers offer VDI solutions, they’ll be able to deliver greater economies of scale for VDI leveraging the emerging technology stack.

But OpEx is still where the rubber meets the road, regardless of your ability to really measure it. If you are not a 100 percent VDI shop, simply making your management model consistent across the physical desktop and VDI/SBC, for internal and external users, requires a lot of work. This helps you move towards a better-managed infrastructure. VDI adds, as it did for me, many new considerations about business logic that you need to address. Also, let's not forget that in this 1-1 world, strategies like application virtualization, user personalization, etc., are all still relevant. They're just not prerequisites for solving the problems that really matter to your business, as many believe.

When you think about all these things as part of your broader infrastructure strategy, it begins to reveal additional perspectives. This is something that I'll discuss in more detail in part two, as the smartest people I know are not getting preoccupied with shared VDI any time soon. They know 1-1 VDI provides them plenty of opportunity to enable capability and drive efficiency as part of a broader data center strategy, while other desktop models, like regular PCs and SBC, solve for specific use cases.


Join the conversation



Great, so now I've got to go back to the drawing board.. ;) Great read. Much appreciated.


This is good stuff Harry, but it all kind of makes me think that I inhabit some parallel VDI universe in a way.

I think I have been remarkably lucky to score some wonderful VDI projects, large scale ones, that have been in production now for years and with mostly happy users.

Plus I point blank refuse to entertain RDS/TS or have anything to do with Microsoft's vision of the HVD, so we never had to deal with these choices between RDS/TS and VDI and were always virtual desktop men.

I read about the struggles others have with VDI, but we seem to suffer from none of them and I think this is down to the model we deploy and the need it meets rather than the skill of our engineers.

Despite what our website may say, my company only builds VDI/DaaS platforms for cyber-defence these days and deployments make everybody happy, we stick to our niche these days and we are very much looking forward to 2013.

Here is a funny thing for you though, I completely agree with everything you say (I have not been agreeing with anyone lately) except for your premise that it's easier to deploy persistent VDI initially and then move onto non-persistent later, although I appreciate you are telling it like it is.

I always thought this was an intellectual inconvenience rather than a technical or execution issue, I get the sense that everybody still does persistent desktops because they THINK it is easier to sell and deploy rather than because it actually is.

I think this is primarily because they do not want to go through the intellectual inconvenience of disassembling the desktop in the customers head and then technically, fat PC to persistent HVD is an easy sell.

It just does not make sense to me that you deploy persistent now with a view to moving to non-persistent later, I do feel that if you are a VDI man deploying persistent desktops to your users then you are kind of old fashioned and in a comfort zone.

Seems to me that there are just too many benefits to non-persistency in VDI to ignore it because persistency is easier to get your head around.

Intellectual laziness and fear of complexity are the primary culprits here I suspect.

You cannot tell me that you do not relish the thought of chopping up a traditional desktop estate into stateless heaven Harry, surely it seems a cop-out to you if you need to deploy persistent just to get the deployment rolling?

All in all this is a great article, I enjoyed reading it and it's about time somebody wrote some good stuff on here containing insight from somebody who has been living and breathing this for years.

The Madden family are getting boring, I think because they do not do real work anymore and just talk to people all day, I look forward to part two of this and also to Shawn Bass writing more stuff.  

You two are the only writers here I feel engaged enough by to comment on :)


@guisebule I think Jack Madden has a lot of high quality articles and deserves more credit.

@harrylabana I'll reserve judgement until I read part 2, but this was a good read. As I've said before the HDX Connect thing from Citrix should be made an option for all.


I disagree that entropy teaches us that the natural state of the universe is disorder - entropy is about dispersion, rather than disorder. I think you need to consider a wider picture.

I recall conversations where we agree that choosing a single solution rarely works. In that context "making the 1-1 model work better over time" in itself rages against a dispersion of options.

There is no one model for enterprise desktops that should be a fundamentalist route. VDI does and can work. Choosing stateless (no matter how whizzy Atlantis' stateless VDI is (and it is whizzy)) and *only* stateless, hamstrings more organisations than it catapults.

I keenly await part II