Hardware startup Nutanix unveiled their first product last night, a highly scalable server/storage combination that just might be the perfect VDI host. And just like that, Nutanix is my current favorite startup vendor. (I can't remember who I last had a crush on, but some of the past vendors have been Ardence, Atlantis, and Unidesk.)
The DNA of Nutanix is based on the "big data" trend of combining computing and storage capabilities into single entities. Nutanix sells server nodes with local storage built-in, but their magic is in the software that combines all the storage of all the nodes into a single giant storage pool, with any data from any node available from any server. They have a master-less architecture with no concurrency locking, and they can support advanced VMware features like vMotion. Oh, and they repeatedly assured me that when loading up VDI VMs on these things, IOPS will not be a bottleneck (even for unique persistent 1-to-1 disk images).
To understand the what, why, and how, let's take a step back.
Storage trend: combining compute and storage into one system
In the world of "big data" (which is just a fancy yet generic term that describes when there's so much data that traditional methods for dealing with it don't work), there's a trend to combine storage and compute.
In the traditional computing model, your computation happens on one system (the server), and your data is stored on another system (the SAN). But as things get faster and bigger, this split architecture becomes a bottleneck. One way to handle this (and a growing trend) is to move some of the computing capability onto the storage system. I mean after all, if a SAN has Intel processors, why can't those guys use some of their excess capacity to help out the computing layer?
This concept began with simple things. Maybe instead of a SQL query running on a database server, the application layer could pass the query directly to the storage tier which could run it locally?
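To make the idea concrete, here is a minimal sketch of why running the filter next to the data helps. Everything in it is hypothetical (the `StorageNode` class, the `Row` layout, and the 1 KB payload size are made up for illustration and are not taken from any vendor's product); it simply counts how many payload bytes cross the wire in each model.

```python
from dataclasses import dataclass


@dataclass
class Row:
    user_id: int
    region: str
    payload: bytes


class StorageNode:
    """A toy storage node that holds rows locally (hypothetical, for illustration)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def read_all(self):
        # Traditional model: ship every row to the compute tier.
        return list(self.rows)

    def run_filter(self, predicate):
        # Pushdown model: evaluate the predicate where the data lives.
        return [r for r in self.rows if predicate(r)]


def bytes_on_wire(rows):
    """Count payload bytes that would have to leave the storage node."""
    return sum(len(r.payload) for r in rows)


# 1,000 rows of 1 KB each; every tenth row belongs to the "EU" region.
rows = [Row(i, "EU" if i % 10 == 0 else "US", b"x" * 1024) for i in range(1000)]
node = StorageNode(rows)


def wanted(row):
    return row.region == "EU"


# Traditional: pull everything, then filter on the server.
traditional = node.read_all()
traditional_result = [r for r in traditional if wanted(r)]
traditional_bytes = bytes_on_wire(traditional)

# Pushdown: the storage node filters and returns only the matches.
pushed_result = node.run_filter(wanted)
pushdown_bytes = bytes_on_wire(pushed_result)

print(traditional_bytes, pushdown_bytes)
```

Both paths return the same rows, but in this toy setup the pushdown path moves a tenth of the data, because only the matching rows ever leave the storage node.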
There have been hundreds of companies that have built solutions like this. A team of engineers who helped Oracle accelerate some of their highly scalable databases thought, "Hey, why not ask some of the storage cells in the EMC to do some basic work?" And in 2005, that idea turned into Aster Data Systems, a massively parallel processing database that was about pushing the app logic down into the database.
Meanwhile Google was realizing the same thing. The team that built the Google File System originally designed it in two tiers, but eventually they realized, "Why not smash these together?"
And you can find examples of this everywhere now. Amazon Dynamo, Hadoop, the systems that run Facebook, Google, Bing, MySpace, Akamai… they all do this. But those systems were all built for very specific purposes. Is it possible to bring that architecture to the masses? That's exactly what Nutanix hopes to do. The three cofounders, including their VP of Engineering, all come from Aster Data. Before that, one was the lead designer of the Google File System, while another architected massively scalable databases for Oracle.
Nutanix's goal is to leverage the trend of consolidating compute and storage but for a general-purpose platform of regular VMs. Their goal is to keep the system completely transparent to VMs. Nutanix is just providing block level access that's really no different (from the VM's standpoint) than any existing storage system.
This is different from today's "consolidated" systems from Cisco, HP, Dell, and VCE. Those systems are essentially just bundles of traditional storage and computing. What Nutanix is doing is true convergence.
The Nutanix hardware
Nutanix has 2U enclosures with four server nodes. Each node is a dual socket with 48-192GB of RAM, a Fusion-io controller with SSD, plus 5TB of SATA storage.
All of the nodes are completely seamless. The fact there are four per 2U appliance is just a form factor. Each node runs VMware ESXi and acts as your VM host, and then a controller VM running on each node acts as the iSCSI interface to the storage and basically turns the whole thing into a distributed SAN. There's a 10gig Ethernet connection for the storage traffic which is separate from the regular network traffic.
The controller VM decides where in the system to place the data. There's always one copy local plus another copy somewhere else in the cluster. Nutanix calls this "Cluster RAID," and it's fully compatible with VMware HA and vMotion. There's a distributed cache using the Fusion-io with SSD, as well as a persistent SATA tier.
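The "one copy local, one copy elsewhere" rule can be sketched in a few lines. This is only an illustration of the placement rule as described above, not Nutanix's actual algorithm: the function names and node names are hypothetical, and picking the remote node at random is my assumption (a real system would presumably weigh capacity, load, and which enclosure a node sits in).

```python
import random


def place_replicas(writer_node, cluster_nodes, rng=random):
    """Return (local, remote): one copy on the writing node, one on another node.

    Choosing the remote node at random is an assumption made for this sketch.
    """
    others = [n for n in cluster_nodes if n != writer_node]
    if not others:
        raise ValueError("need at least two nodes to keep a second copy")
    return writer_node, rng.choice(others)


def survives_failure(placement, failed_node):
    """True if at least one copy lives on a node that is still up."""
    return any(node != failed_node for node in placement)


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical four-node enclosure
placement = place_replicas("node-a", nodes, random.Random(0))

# Losing any single node still leaves one readable copy.
print(placement, all(survives_failure(placement, n) for n in nodes))
```

Keeping the first copy on the node where the VM runs means reads can be served locally, while the second copy is what lets VMware HA restart the VM on another host after a node failure.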
Then the distributed MapReduce system does all the maintenance for them. Everything is completely transparent, and the whole system is lock-free and everything can be concurrent. There's no single master and no shared cache. They have true scale-out with their storage metadata (which lives on every node), and the system continues to scale as you add more nodes.
Nutanix views themselves as a serious enterprise alternative to NetApp and EMC. They're not just about taking two nodes and turning them into a SAN. This is not for small or medium business. This is serious enterprise class storage. (And thankfully they haven't done any hardware engineering. They sell the hardware for the sake of testing and consistency, but the hardware itself is nothing special.)
I want to reiterate that Nutanix absolutely feels that IOPS will not be the bottleneck for VDI environments running on their systems. I asked whether they support single-instance block-level storage (a.k.a. "inline dedupe"), and their answer was "no." Although after they said no, they spent a considerable amount of time discussing key issues such as variable block length and strong versus weak hashes, so it was obvious they were familiar with the concept.
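For readers who haven't seen inline dedupe up close, here is a minimal sketch of the general idea. To be clear, this is not something Nutanix ships (they said they don't do it); it's a generic illustration. The `DedupeStore` class and the 4 KB fixed block size are my assumptions. It uses SHA-256 as the "strong" hash, which is why it can skip a byte-for-byte comparison; a weaker, cheaper hash would need that extra check on every match. Variable block length (content-defined chunking) is not shown.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch


class DedupeStore:
    """Toy single-instance block store (hypothetical, for illustration only)."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes, each unique block stored once

    def write(self, data):
        """Split data into fixed-size blocks and return the list of digests."""
        recipe = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if this digest has not been seen before.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its list of digests."""
        return b"".join(self.blocks[d] for d in recipe)


store = DedupeStore()

# Two "desktop images" that share three of their four blocks.
base_image = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
other_image = base_image[:-BLOCK_SIZE] + b"C" * BLOCK_SIZE

desktop_1 = store.write(base_image)
desktop_2 = store.write(other_image)

logical_blocks = len(desktop_1) + len(desktop_2)
print(logical_blocks, len(store.blocks))
```

In this toy run, eight logical blocks are written but only three unique blocks are stored, which is the effect that makes dedupe attractive for hundreds of near-identical VDI disk images.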
At this point I'm looking forward to getting some experience with one of these in the real world. If they can build modular-yet-scalable VDI environments with essentially unlimited IOPS, then this is a huge thing. My only real fear is that they'll be swallowed up by some huge company and more-or-less destroyed.
So that's my crush for now. What do you think?