by Brian Madden
Today's topic is something that BrianMadden.com user "AppDetective" has been talking about in almost every one of the 243 comments he's posted. Essentially he's saying that VDI today is limited to persistent, or "1-to-1" disk images. (This is where each user has his or her own disk image, and changes are saved from session to session.) This contrasts with what most of the big vendors are pushing, which is the "shared" or "non-persistent" model (where changes are not retained when a user logs off and each new logon boots to a clean image from a master template). Several readers have asked for a full-fledged analysis and discussion of this topic, so here it is!
Some Background: How did we get here?
In the early days of VDI, everything was done with the persistent 1-to-1 images. This is mainly because it's easy. You can practically P2V your existing desktop computers to create the disk images for the VMs in your datacenter that will drive your VDI environment, giving you the essential benefits of server-based computing (Management, Access, Performance, and Security) for not a lot of work. Nice!
The problem, of course, is that datacenter storage is orders of magnitude more expensive than desktop hard drives, which means this solution is orders of magnitude more expensive than old-fashioned local desktops. (This is not to suggest that VDI with 1-to-1 images is never useful; it's just that it costs a lot, so its use is limited to the folks who truly need it for one of the other advantages.)
Of course over time, various techniques for reducing the overall storage footprint of VDI have been created, the two most prominent being thin provisioning and data deduplication.
Thin provisioning versus data deduplication
Thin provisioning is the concept of a single "master" disk image being used as the starting point for additional derivative images. So with thin provisioning, many VMs can essentially "share" the single master image by mounting it in a read-only way. Each VM's "writes" are written to its own additional disk image file, often called the "delta" image because it contains only the changes that that particular VM made from the master image. When a VM boots up, the disk image that it sees is actually a combination of two physical files--the read-only master and the individual delta file.
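To make the master-plus-delta idea concrete, here's a minimal sketch in Python. It isn't how any particular hypervisor or SAN implements it; the class name `ThinProvisionedDisk` and the tiny three-block "disk" are invented purely for illustration.

```python
class ThinProvisionedDisk:
    """A VM's view of a disk: a shared read-only master plus a private delta."""

    def __init__(self, master_blocks):
        # The master is shared by every VM and is never modified here.
        self._master = master_blocks
        # The delta maps block number -> data written by this VM only.
        self.delta = {}

    def read(self, block_no):
        # Prefer this VM's own change; fall back to the master image.
        if block_no in self.delta:
            return self.delta[block_no]
        return self._master[block_no]

    def write(self, block_no, data):
        # Writes never touch the master; they land in the delta.
        self.delta[block_no] = data


# A tuple is immutable, which stands in for "mounted read-only".
master = (b"boot", b"os", b"apps")
vm_a = ThinProvisionedDisk(master)
vm_b = ThinProvisionedDisk(master)

vm_a.write(2, b"apps+userfiles")
print(vm_a.read(2))  # vm_a sees its own change
print(vm_b.read(2))  # vm_b still sees the master's block
```

Notice that creating `vm_b` copied nothing: it's just a pointer to the master and an empty delta.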
There are a few advantages to thin provisioning. First and foremost is that the actual provisioning process happens very fast, since creating an additional instance of the image is essentially nothing more than telling a VM to mount an existing master image and to save its changes to a new image. Thin provisioning is also a great drive space saver, since literally thousands of VMs can share the same master image with their own small delta files.
The challenge, though, is that these "small delta files," if left unchecked, can grow into "large delta files." I mean think about it. Imagine what your laptop looked like on Day Zero--maybe 20GB consumed. And now you probably have 200GB consumed, meaning if your image had been thin provisioned, your delta file would be somewhere north of 180GB! Now multiply that by however many VDI users you have!
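Here's that arithmetic as a quick back-of-the-envelope sketch. The 20GB and 200GB figures come from the laptop example; the 1,000-user pool is a number I made up, and the function assumes each delta is simply "space consumed minus the master," which is a simplification.

```python
def storage_gb(users, master_gb, consumed_gb_per_user):
    """Return (full_copies_gb, thin_provisioned_gb) for a pool of users."""
    # Full 1-to-1 copies: every user stores everything they have consumed.
    full = users * consumed_gb_per_user
    # Thin provisioning: one shared master plus one delta per user.
    delta_per_user = max(consumed_gb_per_user - master_gb, 0)
    thin = master_gb + users * delta_per_user
    return full, thin


USERS = 1000  # hypothetical pool size, not a figure from any real deployment

day_zero = storage_gb(USERS, master_gb=20, consumed_gb_per_user=20)
later = storage_gb(USERS, master_gb=20, consumed_gb_per_user=200)

print(day_zero)  # (20000, 20)
print(later)     # (200000, 180020)

# Fraction of space thin provisioning still saves once the deltas have grown.
savings = 1 - later[1] / later[0]
print(f"{savings:.0%}")  # 10%
```

On Day Zero the thin-provisioned pool needs about a thousandth of the space. Once the deltas have grown, the savings under these assumptions are down to roughly 10%.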
So after some period of time (maybe even only a month or two), the delta files are taking up so much space in your SAN that it probably almost doesn't even matter that they were thin provisioned in the first place! This is where the concept of "data deduplication" comes in. Data deduplication (or just "dedupe") is exactly what it sounds like: it's the concept of removing duplicate sections of data from a physical disk system. I think every SAN vendor offers some kind of dedupe capability, as do several software vendors whose solutions work no matter what kind of hardware you have.
Most dedupe solutions are out-of-band processes, which means that the data is actually written to the physical disk, and then some process runs (maybe each night) that scans everything and looks for duplicate chunks of data. When dupes are found, the file table is updated and the dupes are removed, thus freeing up space for more data.
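A toy version of that after-the-fact scan might look like the sketch below. It's only an illustration: `store` and `block_table` are made-up stand-ins for the physical disk and the file table, and a real product would typically verify the actual bytes on a hash match and deal with far more than five blocks.

```python
import hashlib


def dedupe(block_table, store):
    """Post-process pass: point duplicate blocks at one shared copy.

    block_table: list of keys into `store`, one per logical block.
    store: dict mapping key -> bytes actually held on "disk".
    Returns the number of physical blocks freed.
    """
    seen = {}  # content hash -> key of the copy we keep
    for i, key in enumerate(block_table):
        digest = hashlib.sha256(store[key]).hexdigest()
        kept = seen.setdefault(digest, key)
        if kept != key:
            # Update the table so this logical block references the kept copy.
            block_table[i] = kept
    # Anything the table no longer references can be reclaimed.
    unreferenced = set(store) - set(block_table)
    for key in unreferenced:
        del store[key]
    return len(unreferenced)


# The data has already been written in full, duplicates and all.
store = {0: b"windows", 1: b"office", 2: b"windows", 3: b"userdata", 4: b"office"}
block_table = [0, 1, 2, 3, 4]

freed = dedupe(block_table, store)
print(freed)        # 2
print(block_table)  # [0, 1, 0, 3, 1]
```

The logical view of the disk is unchanged (every entry in `block_table` still resolves to the same data), but two physical blocks are now free.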
So as you can see, even though thin provisioning and data deduplication can both help shrink the overall footprint of your data, the two concepts are actually quite different. But what's all this have to do with VDI?
Non-persistent shared disk images
As I wrote in the opener to this article, all the big VDI vendors are talking about the concept of many users sharing a single disk image. But in their case, they're talking about sharing non-persistent disk images, i.e. each time the user logs on, they get a full brand-new instance of the master read-only image, and their delta image files are not saved when they log off. (If this were a physical PC, it would be like re-imaging the PC every single time it was booted.)
What's interesting here is that thin provisioning doesn't enable the image to be non-persistent per se, because as we've seen it's possible to let the thinly provisioned delta image files survive from session-to-session. But thin provisioning could also be used for this non-persistent mode, which is what the big vendors are talking about.
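Put another way, the same master-plus-delta mechanism supports both modes, and the only difference is what happens to the delta at logoff. Here's a minimal sketch of that; the `Desktop` class, the `persistent` flag, and the file paths are all hypothetical names for illustration, not any vendor's API.

```python
class Desktop:
    """One user's desktop: a shared master plus a delta that may or may not survive logoff."""

    def __init__(self, master, persistent):
        self.master = master          # shared, treated as read-only
        self.persistent = persistent  # True = 1-to-1 style, False = non-persistent
        self.delta = {}               # this user's changes

    def write(self, path, data):
        self.delta[path] = data

    def read(self, path):
        # The user's own change wins; otherwise fall back to the master.
        return self.delta.get(path, self.master.get(path))

    def logoff(self):
        # Non-persistent mode discards the delta, so the next logon starts clean.
        if not self.persistent:
            self.delta = {}


master = {"C:/Windows/system.ini": "stock"}

task_worker = Desktop(master, persistent=False)
task_worker.write("C:/Apps/tool.exe", "user-installed")
task_worker.logoff()
print(task_worker.read("C:/Apps/tool.exe"))  # None: the change is gone

power_user = Desktop(master, persistent=True)
power_user.write("C:/Apps/tool.exe", "user-installed")
power_user.logoff()
print(power_user.read("C:/Apps/tool.exe"))   # user-installed
```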
Of course using non-persistent disk images for VDI is simple for task workers, since all the users probably have the same apps and any customizations would be simple things that could be captured in each user's roaming profile. (In this sense the VDI shared image ends up being a lot like a Terminal Server image that is of course shared by all the users of that Terminal Server.)
But when it comes to "real" users (or "knowledge workers" or whatever they're called now), the whole shared image thing is harder. A lot harder. The specifics of why are beyond the scope of this article, but some quick thoughts are:
- How are you delivering your applications into the image? If you use an app virtualization tool, how do you handle the non-compatible apps?
- How do you handle the user settings and personality configurations that are outside the normal roaming profile locations?
- How do you handle user-installed apps?
Where are we today?
The point that AppDetective (and others of course) make is that this whole single-image / shared-image / layering concept, while great, is just not real today. There are just too many complexities and unknowns to do this at any large scale. So until it's real, the only way to do VDI for hard-core users is to give each one his or her own personal disk image. (Again, simpler task worker scenarios can still use the disk sharing method.)
So let's talk about this. Will we ever get to the whole sharing and layering thing? (Actually that's a great topic for a future article.)
Are your own VDI deployments using shared or personal images? Is anyone doing "hard core" workers with shared images today?
And would anyone mind if I quickly point out, once again, that if VDI image sharing is only useful today for task workers, why don't people just use Terminal Server instead? ;)