In late 2012 I wrote two articles: "Why even consider VDI? the VDI 'Capex vs. Opex' see-saw" and "Do you have design principles for VDI? How does VDI fit into your broader infrastructure strategy?" In the first article, I highlighted that the focus on CapEx for Virtual Desktop Infrastructure (VDI) is drowning out OpEx considerations and asked the question, "Why bother with VDI?" I built upon this in the second article and asked four key questions that I think guide people to success. I ended up discussing three of those questions and put off answering the fourth. This turned out to be a good thing because of technology evolution and some inspiration at Citrix Synergy in Los Angeles in May 2013.
One of the most interesting sessions at Synergy is the no-holds-barred techie session called Geek Speak Live. Ten minutes of the Geek Speak video from Synergy (the "Desktop virtualization deployments: the 'Reality Show'" segment, from 9:35 to 19:00) got me thinking there must be a better way to achieve non-persistent VDI for a broader user base, one that leapfrogs current physical desktop management practices.
As I thought about this problem, I wanted to figure out if there was a way to combine existing technologies to achieve non-persistent VDI with the following goals:
1. Work with existing infrastructure as much as possible
2. Leverage commodity components as much as possible
3. Deliver a great end-user experience
4. Be simple to set up and maintain
5. Be scalable
My hypothesis was that a combination of technology from Atlantis Computing, CloudVolumes, and Immidio might be able to achieve the above goals. At Synergy I asked contacts from each of these companies if they would be willing to take part in an experiment to test my hypothesis. All of them said yes, so a big thanks for indulging me! In particular, thanks to Toby Coleridge from Atlantis Computing for getting the lab ready for me during a recent visit to the West Coast.
Below is a ten-minute video showing the three technologies working together, followed by a more complete discussion of my thinking behind the idea.
I started with a problem statement for a fictitious company I am going to call ACME. ACME is distributed with 200,000 users across the globe with satellite offices, 5000 applications and lots of personalization and policy per user. They want to implement non-persistent VDI desktops using a desktop operating system. Citrix XenApp is not an option because their design principles clearly state that a desktop operating system is a requirement. See my second article for more on this.
In the Americas, they have a main office in New York and five satellite offices across North America. They have the same distribution for Europe and Asia. So in total there are eighteen offices: three headquarters offices in New York, London, and Tokyo, plus fifteen regional offices.
ACME has achieved the goal of centralizing IT and agreed upon a common golden image for the business that includes Office, IE, Security and Systems management agents. This image needs to be maintained monthly for security and application updates, and it needs to have the flexibility to be updated for emergency maintenance.
Globally there are 5000 applications. Analysis has found that only approximately 20 percent of these applications are actually needed globally. The remaining 4000 apps vary greatly by office and by user within each office. User-installed apps are not a primary requirement but would be nice to have for some users.
All 200,000 users require some degree of personalization for their desktop environment and applications. IT also requires policy to be distributed to these users globally to ensure consistent service delivery.
Current engineering options
The current option would be to build a persistent (1-to-1) VDI model by distributing a golden image globally to all sites, either on NetApp filers or on local storage, and then making applications, personalization and policy available locally. ACME would make as many applications available as possible via Microsoft App-V and use third-party User Environment Management (UEM) solutions to deal with sophisticated personalization and policy requirements, replacing policy server infrastructure with a database replication infrastructure. Alternatively, they could consider various OS layering approaches.
Key challenges with the current options
- Cost and performance of storage, i.e., the cost of delivering enough IOPS from the NetApp filer.
- Management overhead of 200,000 local disks for users (if local disks are used as an approach to address IOPS).
- Compatibility of App-V with all apps and the packaging process costs.
- Cost of replicating all 5000 apps globally when only 20 percent are common.
- Building a scalable distribution infrastructure for layering solutions, while also expecting runtime provisioning of applications.
- UEM personalization and policy configuration complexity, and the need to build and manage a distributed SQL Server infrastructure, both of which ACME wants to avoid.
Engineering has looked at incumbent desktop management solutions from both Citrix and VMware and has determined that the respective solutions do not provide native capabilities to solve for the use case at hand. As a result, the engineering recommendation is that the least risky path is to stick with simply implementing PERSISTENT desktops. This would allow ACME to stay with existing management processes and ACME would have to justify VDI on business benefits.
Senior management knows this won’t scale globally, as many users are part of lower margin businesses that are looking for cost effective solutions that enable flexibility and agility using a desktop OS. Not all users need to have a highly reliable desktop such as a trader desktop in a financial services firm. So sharing some infrastructure components is acceptable as long as it is a desktop OS.
New thinking that enables a non-persistent solution that reduces the OpEx and CapEx cost of a desktop is required. Management already understands that OpEx for a physical desktop is greater than CapEx for a physical desktop and that’s what they have attacked in the past. Now they want to attack OpEx and CapEx for a VDI desktop and not be held back by legacy process and technology.
Atlantis + CloudVolumes + Immidio = a potential solution
Since ACME has agreed to a golden image, it should be relatively easy to distribute and maintain a high performance golden image with the Atlantis diskless VDI solution. You can see it here in action. I’m going to borrow a good description of it from Brian Madden’s article:
Last September Atlantis Computing released a product called "ILIO Diskless VDI." I wrote about it on BrianMadden.com then, and Gabe and I gave it the "Best of VMworld 2012" award in the desktop virtualization category. If you're not familiar with ILIO Diskless VDI, the basic idea is that you run an Atlantis storage appliance VM on each of your VDI hosts and it uses RAM to store all the deduped, compressed disk blocks of all the desktop VMs on that host. So essentially you're using a RAM disk for the C: drive of each VDI instance, meaning they run fast. Really fast. Screaming crazy fast. And since Atlantis is only storing each disk block once (regardless of how many VMs on that host use that block), they can run dozens or hundreds of VMs off of a very small amount of memory.
As demonstrated in the video, ILIO was set up with local storage only. This fits well with the goals of ACME to leverage commodity infrastructure.
Further, ACME engineering also realizes that the standard industry view of 15-20 IOPS per desktop is wrong. At a minimum these days, a virtual desktop needs the same as an entry-level PC with a SATA drive: 90 IOPS. I briefly discussed this requirement with Atlantis, who informed me that the ILIO Diskless VDI software appliance delivers between 300 and 1,500 IOPS per desktop VM. Each instance has a 50,000 IOPS capacity built into it using its RAM-as-primary-storage architecture. This would explain the “screaming fast” user experience sentiment from Brian, as this is more IO than a physical PC even with SSD.
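To put those figures side by side, here is a rough back-of-the-envelope sketch in Python. The numbers are the ones quoted above; dividing instance capacity by a per-desktop rate is my own simplification, and it assumes every desktop draws that rate at the same moment, which real workloads rarely do. The function name is purely illustrative.

```python
# Figures quoted in the text. The division below is an illustrative
# assumption (all desktops drawing the same rate simultaneously).
INSTANCE_IOPS_CAPACITY = 50_000
SATA_PC_IOPS = 90
PER_DESKTOP_IOPS_LOW = 300
PER_DESKTOP_IOPS_HIGH = 1_500


def desktops_at_sustained_rate(capacity_iops: int, per_desktop_iops: int) -> int:
    """Whole desktops that could draw per_desktop_iops at the same time."""
    return capacity_iops // per_desktop_iops


if __name__ == "__main__":
    for rate in (SATA_PC_IOPS, PER_DESKTOP_IOPS_LOW, PER_DESKTOP_IOPS_HIGH):
        count = desktops_at_sustained_rate(INSTANCE_IOPS_CAPACITY, rate)
        print(f"{rate} IOPS per desktop -> {count} desktops per instance")
```

Treat the result as an upper bound under a worst-case assumption, not as a sizing guide; actual density depends on the workload mix and on the RAM and CPU available on each host.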
CloudVolumes describe themselves as a virtual workload management company, and make a point of highlighting that they use your existing infrastructure. In other words, they allow you to manage application workloads such as SQL Server and full XenApp workloads and, most relevant to this post, can do things like instantly provision 50 apps into a running desktop VM.
They achieve this by installing applications natively into storage and then capturing them as VMDK/VHD stacks outside of the OS, which can then be distributed. You may think this is just like application packaging with App-V or ThinApp but it’s not quite that. They natively store the bits as they are written during the install, in a different location, and then take note of things like services which are started and roles which are enabled into the OS. These are then 'put' onto the AppStack volume, and when complete (which can span reboots, and several apps or dependencies being installed one after the other) you tell the agent through a dialog in the provisioning VM you are done, and that VMDK/VHD is then locked as a read-only volume which can now be assigned to others.
When this read-only volume is attached to a server or desktop VM running their agent, its contents are immediately virtualized into the running OS, registry, files etc. Unlike ThinApp or App-V, it’s immediately available and seen by other applications on the system as if it was natively resident (no need to stream)—without having to do any special registry changes to see the contents of the opaque object/package within ThinApp/App-V.
If this volume is loaded onto a host with Atlantis ILIO, it is also loaded and de-duped in RAM. This means that for all the common apps for your VMs, there would be a single instance of the application in RAM, available super fast and centrally manageable across all instances. You would then also be able to load in additional per-user AppStacks. ACME could realize significant savings from this using commodity infrastructure. In fact, existing App-V or ThinApp packages could be placed into a CloudVolumes AppStack located on Atlantis ILIO so that they can now be delivered through RAM instead of streamed across the network into every VM.
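To make the "single instance in RAM" idea concrete, here is a toy Python sketch of block-level de-duplication. I have no visibility into how Atlantis implements this internally, so the class name, the fixed block size, and the use of SHA-256 as the block fingerprint are all my assumptions for illustration only.

```python
import hashlib


class DedupBlockStore:
    """Toy content-addressed store: each distinct block is kept once.

    Hypothetical illustration only, not the Atlantis ILIO implementation.
    """

    def __init__(self, block_size: int = 4096) -> None:
        self.block_size = block_size
        self.blocks = {}   # digest -> block contents (stored once)
        self.volumes = {}  # volume name -> ordered list of digests

    def write_volume(self, name: str, data: bytes) -> None:
        digests = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Only the first copy of a block consumes storage.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.volumes[name] = digests

    def read_volume(self, name: str) -> bytes:
        return b"".join(self.blocks[d] for d in self.volumes[name])

    def logical_bytes(self) -> int:
        """Bytes the volumes would occupy without de-duplication."""
        return sum(len(self.blocks[d]) for ds in self.volumes.values() for d in ds)

    def stored_bytes(self) -> int:
        """Bytes actually held after de-duplication."""
        return sum(len(b) for b in self.blocks.values())


if __name__ == "__main__":
    store = DedupBlockStore(block_size=4)
    common_apps = b"OFFICEAPPDATA123"          # identical on every desktop
    store.write_volume("vm1", common_apps)
    store.write_volume("vm2", common_apps)
    store.write_volume("vm3", common_apps + b"USER")  # one extra unique block
    print(store.logical_bytes(), store.stored_bytes())
```

In this toy run, three volumes holding 52 logical bytes need only 20 bytes of actual storage, because the common blocks are held once no matter how many desktops reference them.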
These AppStacks can also be assigned to Active Directory groups, so that any user in the group will receive the AppStack. As an example, the Accounting group could be assigned Intuit QuickBooks and Domain Users could be assigned Microsoft Office. Accountants would automatically receive both Intuit QuickBooks and Microsoft Office upon logging into their virtual desktops due to the group associations.
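The group-based assignment logic can be sketched in a few lines of Python. This is not the CloudVolumes API; the table, the function name, and the way group membership is passed in are hypothetical, and a real deployment would read membership from Active Directory rather than from a hard-coded set.

```python
# Hypothetical assignment table: directory group -> AppStacks.
GROUP_APPSTACKS = {
    "Domain Users": ["Microsoft Office"],
    "Accounting": ["Intuit QuickBooks"],
}


def appstacks_for_user(user_groups, assignments=GROUP_APPSTACKS):
    """Return the AppStacks a user receives, in table order, without duplicates."""
    resolved = []
    for group, stacks in assignments.items():
        if group in user_groups:
            for stack in stacks:
                if stack not in resolved:
                    resolved.append(stack)
    return resolved


if __name__ == "__main__":
    accountant_groups = {"Domain Users", "Accounting"}
    print(appstacks_for_user(accountant_groups))
    print(appstacks_for_user({"Domain Users"}))
```

An accountant who belongs to both groups resolves to Microsoft Office plus Intuit QuickBooks, while a user who is only in Domain Users resolves to Microsoft Office alone.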
There is also the option to have a writable volume. This could be a volume used for User Installed Apps and storing the user’s profile. Moreover, if different users install applications these will automatically be de-duped by Atlantis ILIO.
Immidio is in the User Environment Management (UEM) market. What caught my attention was a thoughtful blog post from their CEO. One of the key points highlighted in the post is that their solution is infrastructure-less and robust, having come from the foundation of the well-known Flex Profile Kit. Having competed with Immidio in the past, and since I am no longer in the UEM business, I decided they probably wouldn’t mind giving me a good demo of their latest Flex+ offering. ☺ I was surprised to see how much functionality is included in their latest Flex+ release using no infrastructure and how simple it was to configure.
It’s key to note that people invest in UEM solutions for policy and personalization. The latter is what so many people equate UEM with, saying things like Citrix UPM + PVS, persistent disk, or UE-V can do this. This is not the case, and much more granular capability is required in real life. It doesn’t have to be complicated, which is the default answer of VDI incumbents without the capability or understanding of how to run end user operations at scale in a diverse enterprise. This is true no matter whose solution you use, but that discussion is outside of the scope of this article.
Just like applications, personalization settings for 200,000 individual ACME users would need to be delivered globally. Each personalization profile is estimated to be around 200 KB in size (real life, folks). Today adding profile servers into your infrastructure can solve this, and what’s cool about Immidio is that they can simply leverage that existing infrastructure, making it very simple.
However, if you install Immidio into a writable CloudVolume, you all of a sudden get de-duped, blazing fast personalization and policy delivered from RAM!
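For a sense of scale, the profile figures above can be totalled with a quick Python calculation. It assumes every one of the 200,000 users sits at the estimated 200 KB, uses decimal units, and ignores any de-duplication savings; the function name is just for illustration.

```python
USERS = 200_000   # ACME user count from the problem statement
PROFILE_KB = 200  # estimated size of one personalization profile


def total_profile_gb(users: int, profile_kb: int) -> float:
    """Total profile data in decimal gigabytes (1 GB = 1,000,000 KB)."""
    return users * profile_kb / 1_000_000


if __name__ == "__main__":
    print(f"{total_profile_gb(USERS, PROFILE_KB):.1f} GB of profile data globally")
```

That works out to roughly 40 GB across the whole company before any de-duplication, which is small enough that holding each site's share in RAM looks plausible.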
Results and Conclusion
As can clearly be seen from the demo video above, the setup was not complicated. We were able to get up and running in four hours, plus about two hours of conference calls and coordination. The setup demonstrated that we could build a single-rack compute unit (POD) using commodity storage and native install experiences, all while avoiding complex additional infrastructure. All three solutions worked together with high performance pretty much out of the box. Using a POD as a standard compute unit for your infrastructure, you could scale the model out horizontally to address the broader distributed ACME use case. When thinking about this I did come up with some ideas to further enhance the model, which I’ve shared with the teams as a possible future follow-up.
However, for many, simply building a POD architecture this way could solve for OpEx and CapEx for many VDI use cases. This includes enterprises, service providers and system integrators.
As I mentioned at the start of this article, I hadn’t answered question four from one of my previous articles. That question was:
How does VDI fit into your forward-looking infrastructure and app strategy?
As evidenced by the Geek Speak conversation, it’s clear that people desire to achieve non-persistent VDI. However, they struggle with how to achieve it, especially at scale. The de facto answers today are: go 1-to-1 / persistent VDI, treat management as a separate problem, or just go with XenApp if you can standardize. This is a far cry from the analogies of a few years ago, illustrated below, where the current desktop was characterized as old, once useful but since saddled with layers of complexity added on year over year. The promise, of course, was that VDI was going to solve the management problem, which has not happened in a meaningful way for enterprise users. What progress has been made is painfully slow…
Justifying broader adoption of VDI on business enablement alone only gets you so much adoption. The numbers are there to prove it, which is why we have seen attempts like VDI-in-a-Box appeal to a broader SMB user base. However, I submit that even for those use cases, the needs are going to be more granular than image management, and a better set of solutions is needed to attack OpEx across the board. As new technology matures, hopefully more people will have greater confidence to start first with a non-persistent model and not be intimidated by the necessary work of assessment that has to be done upfront to achieve it.
I look back at years gone by and reflect how things have evolved. We’ve always tried to become better managed over time as part of the sustaining innovation we do naturally to become more efficient. This evolved for many of us in IT as a shift in focus to labor arbitrage (i.e., outsourcing). Next came virtualization for consolidation and now we’re in the midst of mass automation to further drive down costs of commodity work and enable greater agility. These foundational pieces will be required to take advantage of cloud architectures and solutions.
The smart people I know have focused on these efforts and have not let themselves become distracted by enabling non-persistent VDI use cases. They built and delivered business value quickly first, while putting down the foundations for a forward-looking data center strategy that enables the consumption of new architectures.
These thought leaders are now starting to look at ways to deliver end user computing as part of their broader data center strategy. This means taking advantage of virtualization and automation across the stack. They understand that the desktop is a stack of components, where delivery can be re-imagined. If they don’t, they’ll remain stuck in today rather than enabling new capability. It’s certainly a journey; however, I refuse to believe that “stuck with the limits of today” is the answer, even if you start there.
When I first started in the VDI business, people said VDI was impossible, the status quo was good enough, etc. Over the years I've learned to take that with a grain of salt. First understand what you want, then break down the problems into consumable bits and keep searching for solutions or combinations of technologies. While you do that, continue to focus on time to value for your business users by taking practical action while you evolve your technology strategy to take advantage of innovative new capabilities to leapfrog legacy thinking. As Steve Jobs reminds us, we all have the ability to shape things into whatever we can dream up.