Pursuant to yesterday's blog post, this is the first of my brain dumps--a "Brian" dump, if you will--from the time I've spent with the vendors in the desktop and application virtualization space. Even though Atlantis Computing was not the first company I met with, they're definitely the first company I'd like to write about. Atlantis Computing thinks they've solved the "file-based" versus "block-based" master / delta disk image problem that plagues large-scale VDI deployments today. They're doing so with a virtual appliance that presents a block-level boot device to a VM, a device that is completely "virtual" and composed on-demand at run time. Instead of being a static VMDK or VHD file (or instead of being a read-only master that's merged with a read-write linked clone), the Atlantis virtual disk is composed of dozens (or even hundreds) of "flocks", their term for the smallest slice of File IO and Block IO. Using Atlantis Computing could solve the "layering" problem of Windows and replace the need for Provisioning Server, View Composer, roaming profiles, and user-installed apps.
That is a huge statement, but I'll make an even bigger one: Atlantis Computing's product has the potential to be the coolest and most important add-on product that I've ever seen in my thirteen years in IT. (Remember how excited I got about Ardence back in 2006? This is like that times ten.) Atlantis Computing has the potential to change our world.
Seriously, when I was in their office a few weeks ago, I just kept saying, "Wow. Wow. Wow. YES! This is what we need! Wow. Wow. Must breathe. Wow."
The problem that Atlantis solves is fairly complex, and it might not even be apparent to those who are new to the VDI space. So let's do a "choose your own adventure"-style split here. If you know what the file-based versus block-based disk image problem is, then skip to the last section of this article called "How does Atlantis Computing's product work?" And if you have no idea what I'm talking about, continue on to the next section.
Background info on the VDI "shared master disk image" concept
The key to understanding the "block-based versus file-based VDI master image problem" is to take a few steps back and look at why Windows Terminal Server became popular in the mid 1990s. While there were several great reasons to use Terminal Server, one of the big advantages was the fact that multiple users shared a single installation of Windows. This meant that when you applied a hotfix or updated an application on the Terminal Server, that change was (by definition) automatically available to any user who ran a session on that server.
But all users sharing the same installed instance of Windows also introduced some challenges. Namely, if we wanted users to be able to save their own settings and configurations from session to session, we had to figure out some way to save them that did not involve writing to the Terminal Server's disk in a way that would affect other users. That was initially kind of a pain, but over the past decade or so we figured out all the tricks and gotchas about roaming profiles and special folder redirection and network-based home drives, essentially solving that problem by the early part of this decade.
Unfortunately our success was short-lived, because the introduction of VDI a few years ago introduced a whole new wave of conversations and complexities. Even though VDI was "just" server-based computing to the old-school TS folks, the majority of the early VDI proponents were re-learning what the TS folks had learned a decade earlier.
Like Terminal Server, VDI offered many advantages over the traditional computing model of a local Windows OS installed on every single device. And like Terminal Server, VDI offered access, performance, and security benefits that were compelling to many customers.
But VDI and Terminal Server had one major difference, namely, while Terminal Server required all users to share a single instance of Windows (since one server hosted many users), VDI had no such requirement. In other words, companies could fully embrace VDI yet still have to deal with hundreds or thousands of separate disk images. (One VDI Windows workstation OS image for each user!)
There were ways to manage all of those Windows disk images that new VDI converts found in their datacenters. Inventory and software distribution products like SMS and Altiris had been on the market for years. But for a company that just spent piles of money building out their datacenter and virtualization architecture to support VDI, the thought of running SMS for their datacenter desktops was generally not something they wanted to deal with.
In order to get some real payback from their VDI solution, companies had to move away from a strict one-to-one relationship between individual users and disk images. Not only is it a huge pain to manage and patch and update all those disk images, it's also quite expensive since all those VDI disk images now live on expensive datacenter SAN storage. (You figure what, 20GB per user times how many users at how many dollars per gigabyte?)
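To put rough numbers on that parenthetical, here's a back-of-the-envelope sketch. The 20GB per user comes from the text above. The user count and the price per gigabyte are placeholders I made up for illustration, so plug in your own.

```python
def san_cost(users, gb_per_user=20, dollars_per_gb=10.0):
    """Return (total GB, total dollars) when every user gets a full disk image.

    dollars_per_gb is an assumed placeholder, not a quoted SAN price.
    """
    total_gb = users * gb_per_user
    return total_gb, total_gb * dollars_per_gb


# Hypothetical deployment: 1,000 users at an assumed $10 per GB of SAN storage.
total_gb, total_cost = san_cost(users=1000)
print(f"{total_gb:,} GB of SAN storage, roughly ${total_cost:,.0f}")
```

Whatever price you use, the cost scales linearly with the number of users, which is exactly why a shared master image is so attractive.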
But what if there was a way for multiple VDI users to share a single disk image? If you could create a master image file that acted as a template for all your users, then you would (in theory) only need to manage one single copy of Windows, regardless of whether you had one, one hundred, or one thousand users.
In the old desktop computing days, you'd use a product like Ghost to make a snapshot of a master disk image which you'd then deploy to all your desktops. In these new VDI days, you don't need Ghost to do that since your virtualization platform has disk image snapshotting built right in.
This is a nice idea, but unfortunately it doesn't actually let us manage hundreds or thousands of VMs with a single image.
Why not? Because the image snapshotting is only good for the initial rollout. After you've snapshotted or cloned your initial template image, you'd still have to manage all the copies with the old school products like SMS.
What is the "file-based" versus "block-based" problem?
VDI vendors like Citrix and VMware understand this, and the last thing they want to do is build a solution that virtualizes all your desktops while requiring you to still manage a gazillion disk images.
The way they avoid this is that instead of making a copy copy copy copy copy of the master image for each "cloned" VM, they just keep one single instance of the master image file which they mark as read-only. Then when they create a new clone for a new VM, they just create a tiny little read-write disk image file that's specific to that VM only. When the VM boots, the virtualization software links the read-only master image to the read-write differential file. The VM thinks it has a normal writeable disk, but in reality all the writes are being channeled into the read-write differential file. (VMware calls this capability "linked clones," because a cloned disk image is linked back to a specific master read-only image.)
The more a user writes and updates the "disk" within the VM, the bigger the differential file becomes. Since the master file is treated as read-only during this whole process, multiple VMs can share the same master file at the same time while each individual VM has its own differential file. The big advantage of this is in disk space savings in the SAN, since you only have one copy of the "core" Windows master read-only disk image that's shared by many users.
At first you might think this solves the problem where you have to manually update all of the disk images using something like SMS every time you want to roll out a patch. Unfortunately this is not the case. All of the virtualization systems virtualize hard drives at the hardware level. In other words, they virtualize the physical disk blocks at a level that's below the file system. This physical disk block level is where the linked clones are managed.
So what's this mean? If you were to change the read-only master disk image (for example, by converting it back to a read-write copy image and applying a patch), the patch would update, add, or replace certain files. These files would be written by the NTFS driver in Windows, which would then convert those changes into physical block-level changes on the disk.
This would work no problem. However, since your linked clone differential files are based on virtualization at the block level, the files that changed in the master disk image would mess up all the blocks in that image, and the linked clone differential file wouldn't know how to hook back into it. In other words, in a linked clone environment with a master read-only disk image and individual read-write linked clones for each VM, any time you update the master image, you break all the clones and they have to be thrown out.
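If it helps to see the idea in code, here's a minimal sketch of a read-only master plus per-VM differential. This is a toy model I'm making up for illustration (the class name and the block layout are assumptions), not how VMware or anyone else actually implements linked clones.

```python
class LinkedClone:
    """Toy copy-on-write disk: a shared read-only master plus a per-VM delta."""

    def __init__(self, master_blocks):
        self.master = master_blocks   # shared by every clone, never written to
        self.delta = {}               # block number -> data written by this VM

    def read(self, block_no):
        # Prefer this VM's own version of the block; otherwise use the master's.
        return self.delta.get(block_no, self.master[block_no])

    def write(self, block_no, data):
        # Writes never touch the master; they only grow the differential.
        self.delta[block_no] = data


master = ("boot", "kernel", "apps", "free")   # a tuple, so it really is read-only
vm1 = LinkedClone(master)
vm2 = LinkedClone(master)

vm1.write(3, "vm1 user data")
print(vm1.read(3))                       # vm1 sees its own write
print(vm2.read(3))                       # vm2 still sees the master's block
print(len(vm1.delta), len(vm2.delta))    # only the VM that wrote has a delta
```

Each VM believes it owns a normal writable disk, but the only per-VM storage is the handful of blocks it has actually changed.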
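Here's a small sketch of why that happens. It's a deliberately simplified model (block 0 stands in for the file table, and the file names and block numbers are invented), not real NTFS or VMDK behavior, but it shows how a differential that was valid against the old master turns into garbage against the patched one.

```python
def read_block(master, delta, block_no):
    # Block-level overlay: the clone's copy of a block always wins.
    return delta.get(block_no, master[block_no])


def list_files(master, delta):
    # Block 0 plays the role of the file table: file name -> block number.
    table = read_block(master, delta, 0)
    return {name: read_block(master, delta, blk) for name, blk in table.items()}


# Original master: block 0 is the file table, block 3 is free space.
master_v1 = {0: {"ntoskrnl": 1, "app": 2}, 1: "kernel v1", 2: "app v1", 3: None}

# The user saves a file. The guest's file system picks free block 3 and
# rewrites the file table, so the differential now holds blocks 0 and 3.
delta = {0: {"ntoskrnl": 1, "app": 2, "notes": 3}, 3: "my notes"}

# The admin patches the master. The new kernel lands in block 3, the file
# table is rewritten to point at it, and block 1 is freed.
master_v2 = {0: {"ntoskrnl": 3, "app": 2}, 1: None, 2: "app v1", 3: "kernel v2"}

print(list_files(master_v1, delta))   # consistent view against the old master
print(list_files(master_v2, delta))   # stale file table points the kernel at a freed block
```

The differential knows nothing about files. It only knows "my version of block 0" and "my version of block 3," and those block numbers mean something different once the master underneath has changed.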
How can you get around this? How can you update the master image while preserving the differential linked clone disk images?
The only other option has traditionally been to *not* use block-level linked clones. (How funny is that? The only way to not have the linked clone update problem is to not use linked clones. ;) This can be done by creating file-level differential disk images instead of block-level linked clones. In other words, don't let your virtualization software handle the disk linking--do it from within Windows.
Sound impossible? Not to anyone who's ever administered a Terminal Server.
Imagine that you had a single Windows disk image that you wanted multiple users to use and to be able to customize. One way to do this would be to create two disk images. The first would be the C: drive and would be the core Windows image that was shared by all users. But you could also give each user his or her own personal disk image which would be available to them as the D: drive. Then you could use Group Policy or the registry to configure Windows so that all the personal stuff is written to the D: drive instead. (Essentially you'd point the "Documents and Settings" folder to the D: drive.) Then when a user's VM booted up, it would mount the C: drive for the system stuff and the D: for the settings and data. You'd even be able to update or patch the shared C: drive without affecting the user's profile on the D: drive.
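As a rough sketch of that redirection idea, assuming a single made-up rule that sends everything under "Documents and Settings" to D: (real Windows does this through Group Policy and the registry, not through a function like this):

```python
from pathlib import PureWindowsPath

# Hypothetical redirection rule: everything under the profile root goes to D:.
PROFILE_ROOT = PureWindowsPath(r"C:\Documents and Settings")
PERSONAL_DRIVE = PureWindowsPath("D:\\")


def redirect(path_str):
    """Return the path where a write should actually land."""
    path = PureWindowsPath(path_str)
    try:
        relative = path.relative_to(PROFILE_ROOT)
    except ValueError:
        return path  # not under the profile root, so it stays on the shared C:
    return PERSONAL_DRIVE / "Documents and Settings" / relative


print(redirect(r"C:\Documents and Settings\brian\ntuser.dat"))   # lands on D:
print(redirect(r"C:\Windows\system32\some.dll"))                 # stays on C:
```

Because the personal disk holds files rather than raw blocks, it stays valid no matter how the shared C: image is patched. Anything the rule doesn't cover still tries to write to the shared image.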
After hearing that, it's easy to think "Great! I'll just use this file-based disk image differential stuff for my VDI deployment and everything will be ok!"
Unfortunately it's still not that simple. This file-based redirection stuff is fine for user profile settings and data, but it kind of breaks down for installing applications. (Even if you get your users to install new apps to the D: drive, many apps still write stuff to the C: drive and the system registry.)
Also, some settings (like the computer name and domain identifiers) are not actually stored in files, so even if you wanted to use a 100% file-based differential system, you can't. These systems still require some block-based differential stuff to physically allow multiple users to access the same master Windows disk image file.
Block-based advantages:
- Easier to do, as it's built-in to all hypervisor platforms.
- Below the file system, so you have 100% compatibility for all guest OS scenarios.
Block-based disadvantages:
- It's below the file system. Any changes to the master image "invalidate" the linked clone differential files.
File-based advantages:
- You can update or modify the master image, and the differential images are still valid since they contain files.
File-based disadvantages:
- You need to have an agent or configure each OS to redirect the content that will change to a different disk image.
- Users can only partially personalize their environment. For example, they can't install apps or do anything that would write to the C: drive.
- If the guest OS is Windows, you still need to do some basic block-level cloning. (Computer name and maybe domain membership at a minimum.)
How Citrix and VMware handle disk image cloning today
Both Citrix and VMware have different methods of dealing with the master / clone disk image relationships.
Citrix's solution is to use their Provisioning Server product which is bundled with the various XenDesktop products. (I wrote an in-depth paper describing how this technology works several years ago when it was called "Ardence," before Citrix acquired them. That paper, while old, is still relevant today and provides a good overview of how Provisioning Server works.)
Citrix Provisioning Server is a block-based system. Multiple VMs share the same read-only master image, and each individual VM has its own VHD file where the differential data is stored. Like all block-based systems, changing, updating, or patching the master image means you have to throw away your differential files. To that end, Citrix actually encourages you to never save the differential files. They'd like you to always re-create them on-demand when a VM boots up. Provisioning Server uses a SQL database to keep track of the unique identifiers for each VM (name, domain info, etc.), so as each VM is booted, a new differential file with the proper information is created on demand.
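To make the non-persistent idea concrete, here's a toy model of "throw the differential away and rebuild it at boot." To be clear, this is not Provisioning Server's API or schema. The dictionary standing in for the SQL database, the VM IDs, and the field names are all invented for illustration.

```python
# Stand-in for the SQL database of per-VM identifiers (values are invented).
IDENTITY_DB = {
    "vm-001": {"computer_name": "DESKTOP-001", "domain": "CORP"},
    "vm-002": {"computer_name": "DESKTOP-002", "domain": "CORP"},
}


def boot(vm_id, master_version):
    """Build a brand-new differential for this boot; nothing carries over."""
    return {
        "master_version": master_version,       # always matches the current master
        "identity": dict(IDENTITY_DB[vm_id]),   # re-injected on every boot
        "writes": {},                           # user changes start out empty
    }


session = boot("vm-001", master_version=1)
session["writes"]["C:\\Program Files\\SomeApp"] = "user-installed app"

# The admin patches the master; the next boot simply starts over on version 2.
next_session = boot("vm-001", master_version=2)
print(next_session["identity"]["computer_name"], next_session["writes"])
```

The differential can never be stale because it never outlives a session. The price is that anything written into it during the session is gone on the next boot.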
In general, Citrix's non-persistent model works well, mainly because most Citrix customers are legacy Terminal Server customers who are well-versed in the concept of a read-only master with all user settings pushed out to non-C: drive locations (network shares, etc.). The only real problem with this non-persistent model is that there is no support for saving files or changes that were made outside of the area that administrators have configured to be redirected into the differential file. In other words, users can't install their own applications or change really deep Windows settings. (Of course an easy fix is to use Provisioning Server in "persistent" image mode, where the differential files are saved from session-to-session. This allows users to install their own applications, but it also means that if the administrator changes the master read-only image, all the differential files are invalidated. So if you want to use Provisioning Server with persistent images, you're back to using something like SMS or Altiris to maintain your images.)
VMware's approach to disk image management, with the View Composer component of their View 3 product, is actually quite similar to Citrix's, although they try to make it out to be very different. VMware also uses a read-only master file (a VMDK file in their case) along with differential VMDK files that contain the "personality" of a VM. (In their case, they're storing this personality in a differential disk image file which can be recreated from a database in the event that the master image changes. So it's similar in effect to what Citrix does.) Like Citrix, VMware also offers a persistent mode, although this is achieved by simply redirecting the user's "Documents and Settings" folder to a completely separate VMDK file that's able to persist even when the main Windows disk image is updated and the differential VMDK file is invalidated. (If you're interested in more details about how VMware View Composer works, Roland van der Kruk wrote an in-depth piece for us last month.)
How does Atlantis Computing's product work?
It's clear that neither the block-based nor the file-based method of creating master and clone disk images is perfect, as each has its pros and cons. On top of that, neither addresses the user-installed apps issue and both still require the use of user profiles and other things that were never designed for today's world.
Atlantis Computing solves this problem with a forthcoming pair of software products--ILIO for Virtual Desktops and ILIO for Virtual Servers. (For the purposes of this conversation, both of these products are essentially the same. They're just licensed differently depending on your use case.) The company was founded by Chetan Venkatesh, and they're planning on shipping these ILIO products in April.
As I wrote in the intro to this article, Atlantis Computing's ILIO ("In-Line Image Optimization") product is essentially a virtual appliance that plugs in to an existing virtualization environment and presents a block-level bootable device to a VM. That device is composed on-demand and in real time, and never actually exists as a traditional disk image.
Ultimately, ILIO does three things:
- It's the creation and provisioning of what looks like a disk image that a VM can boot from.
- It's a single point of change. (This dynamically built disk image can contain patches, apps, settings, profile info, kernel changes...)
- It's journaling, with point-in-time snapshots, rollbacks, and recovery.
Since the ILIO virtual appliance is what actually emulates the disk image that a VM boots from, ILIO is able to watch all disk IO reads and writes. From there, it can intelligently figure out what type of data needs to be saved as block IO and what type of data needs to be saved as file IO. ILIO breaks up everything into little chunks which Atlantis calls "flocks." (File IO plus Block IO. Get it? :) The creation of these flocks is mostly done by the ILIO virtual appliance, although Atlantis does stick a software agent inside the VM which is used to put markers into IO blocks so the ILIO virtual appliance has some intelligence about how it should slice up the IO.
That said, Atlantis' ILIO product is actually made up of three different components:
- A namespace server which is the ILIO database that holds all the flocks.
- The ILIO virtual appliance which is a Windows disk image router on the I/O channel. This is what exposes the mount point to the VM which can be CIFS, HBA, or NFS.
- The service virtual appliance which is a web-based admin console and a SQL or Oracle database for managing which flocks are applied where. (So there are two databases. One is the management database which holds the configuration information, and the other is ILIO's own database of flocks.)
In practical terms, you fire up the management console and create a new baseline image. From there you can boot up this image in a VM and install Windows just like you normally would with any virtualization product. Once your baseline install is done, you shut down the VM and go back to the ILIO management console. Then underneath your baseline image, you build out an entire tree structure that represents definitions for various virtual images. So maybe you have "baseline" at the top, and then "regional offices" below that, and then "groups" below that, or whatever. Each of these leaf nodes is a fully functional desktop image that can be booted at any time. But again, when you boot one of these desktops, a disk image file never actually exists. Instead it's all created in real time based on the database of flocks.
You can then configure these leaf nodes to present pools of VM images, persist disk images, persist for individual users... whatever you want. At any time, you can activate and boot up any leaf-level of the tree and make changes to that image, and those changes are inherited by all lower nodes. (And again, the changes are not copied or replicated, because that's not what's happening with ILIO. Those changes would live at the higher level node and be incorporated when VMs connecting to lower level nodes are booted.)
In other words, you have your leaf nodes for your baseline, your regional settings, your specific branch office settings, specific departments, etc. You can boot up the baseline node and apply a Windows hotfix which will then be integrated into all of its child nodes. Or you can boot up your branch office node and copy in some files that will be incorporated into all of its child nodes. This can be done at any level, and the ILIO appliance automatically creates all the flocks it needs to handle the changes at different levels.
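Here's how I picture that inheritance, as a minimal sketch. I haven't seen ILIO's internals, so treat the class, the node names, and the idea of a simple dictionary of changes per node as my own assumptions. The point is only that a change recorded once at a higher node shows up when a lower node's image is composed at boot.

```python
class Node:
    """One node in a hypothetical image tree; it holds only its own changes."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.changes = {}   # item -> content recorded at this level only

    def compose(self):
        # Build the image from the root down; lower levels override higher ones.
        image = self.parent.compose() if self.parent else {}
        image.update(self.changes)
        return image


baseline = Node("baseline")
region = Node("regional office", parent=baseline)
group = Node("sales group", parent=region)

baseline.changes["windows"] = "base install"
region.changes["printers"] = "regional print servers"
group.changes["crm"] = "sales app"

# Apply a hotfix once, at the baseline. Nothing is copied to the children.
baseline.changes["hotfix"] = "security patch"

print(group.compose())   # composed view includes the baseline hotfix
print(group.changes)     # the child itself still stores only its own change
```

Nothing is replicated down the tree. The composed image exists only at the moment something asks for it.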
I haven't had a chance to actually test Atlantis' ILIO product yet, although we did spend some time working through running demo environments. What I really like about Atlantis is that they're not trying to create another VDI platform. They're not creating another connection broker or protocol. Instead, they're looking at a big problem that's plaguing big VDI deployments and they're fixing it. And they're fixing it in a way that can be used with Citrix, VMware, Microsoft, Quest, Ericom, or anyone else's VDI system.
Why Atlantis ILIO will be huge
Atlantis is solving one of the problems that I thought only Microsoft could have solved. I guess on one hand, it's too bad that we had to wait for a third-party vendor to fix this problem. Then again, I can't imagine that Atlantis will remain a stand-alone company too long. With Atlantis, you don't need Citrix Provisioning Server or VMware View Composer, so I'm not sure that either of those two companies would pick them up. But maybe this would be a good fit for Microsoft?
From what I've seen, Atlantis' ILIO looks to be the only way to truly have one disk image (or one set of disk image parts) across your entire environment. (Just think about what ILIO for Virtual Servers could do for your Citrix servers too?)
Another interesting aspect of ILIO is in the performance area. The ILIO virtual appliance caches flocks in memory, reducing the IO bandwidth needed to go back to disk. Like Provisioning Server, this could potentially allow you to run your whole environment off a lower cost NAS with a big front end server instead of a giant SAN.
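A quick sketch of why an in-memory cache cuts down on back-end IO. Atlantis didn't tell me what caching policy they use, so the least-recently-used eviction here is purely my assumption, as are the flock IDs and the tiny capacity.

```python
from collections import OrderedDict


class FlockCache:
    """Toy in-memory cache in front of slower storage (LRU policy is assumed)."""

    def __init__(self, backing_store, capacity=2):
        self.backing_store = backing_store
        self.capacity = capacity
        self.cache = OrderedDict()
        self.disk_reads = 0

    def get(self, flock_id):
        if flock_id in self.cache:
            self.cache.move_to_end(flock_id)   # mark as most recently used
            return self.cache[flock_id]
        self.disk_reads += 1                   # cache miss: go back to disk
        data = self.backing_store[flock_id]
        self.cache[flock_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict the least recently used
        return data


store = {"f1": "kernel chunk", "f2": "explorer chunk", "f3": "office chunk"}
cache = FlockCache(store, capacity=2)
for flock_id in ["f1", "f2", "f1", "f1", "f2"]:
    cache.get(flock_id)
print(cache.disk_reads, "disk reads for 5 requests")
```

When hundreds of VMs boot from the same handful of chunks, most requests never have to touch the storage behind the appliance.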
Thinking ahead a few years, imagine a future with WAN delivery of flocks. Individual flocks could be moved around as needed. As bare-metal Type 1 client hypervisors take hold, the ILIO virtual appliance could run locally on the client, and you could have a bootable flock-based image that's offline.
In fact, everything could ultimately be flock-based. This is the real layer cake approach to Windows.