Is local storage THE solution for your VDI storage problem?
In deploying a Virtual Desktop Infrastructure (VDI), an important consideration is the cost of storage. Storage can account for more than 40% of the total capital expenditure (CAPEX) of a VDI deployment. These high costs typically stem from the high demands VDI places on storage performance. On top of that, a bad storage design causes a bad user experience, making the adoption of VDI difficult. One way of reducing storage costs is using storage inside the server (host) itself, in the form of Solid State Drives (SSD), PCIe flash interfaces or RAM. Quite a few people in the industry think that using local storage for VDI is THE answer to the big impact storage has on VDI. A frequent response to lowering VDI cost is: “Just use some locally installed SSDs, flash based PCIe interfaces or a ‘VDI appliance’ and my VDI storage problem is solved,” or “We will just use 2 x 600GB SSDs and we’re ready to rumble with 150 Virtual Machines (VMs) on one host.” Well, there are huge benefits to using local storage for VDI, but it isn’t that simple.
The goal of this blog post series is to help with design principles, sizing guidelines and best practices for using local storage for VDI, for various VDI solutions such as Citrix XenDesktop and VMware View. The series also describes the impact of each layer in the complete VDI storage chain, from both a performance/IOPS and a size/GB perspective. Once you understand what is going on, you will see that local storage solutions (RAM, SSD and PCIe flash based) will solve the IOPS problem explained here and here. But the use of local storage for VDI can introduce serious challenges around capacity and functionality, and will have an impact on the business case.
My advice: understand the challenges, design wisely and achieve the best performing VDI solution with the lowest impact on CAPEX.
What is the issue?!
In past years, IOPS was the big factor in VDI storage infrastructure. Some people learned that the hard way. Do you understand how storage design has a big impact on your VDI? With the right analysis, best practices, calculations and the right number of disks/spindles, the projected performance and user experience can be achieved. And because the focus was mainly on IOPS, those calculations automatically resulted in more than enough storage capacity (GB): the spindles needed for performance delivered the GBs as a side effect. With the availability of fast flash based storage solutions, IOPS isn’t the issue anymore. But using local flash based solutions introduces other important challenges and questions:
- Performance: Does my VDI solution need 100,000 IOPS per host? Does the flash based solution fit in my blade servers? Can I boot from the PCIe interface? How many interfaces are needed, and at what size?
- Capacity: What is the capacity (GB) of the flash solution? How much capacity do I need? What is the capacity impact of using stateful virtual desktops? Does adding a 1.2TB PCIe flash interface, which can double the price of a host, mean you will double the number of users on that host? Teaser: it can just as well decrease the number of users on that host;
- Control: What does the VDI VM stack look like from an image management, hypervisor, user and application perspective? Can I determine and control the growth of the capacity needed?
- Availability: Do I need highly available VMs across multiple virtualization hosts? Is disaster recovery or backup needed for my desktop VMs?
- Endurance: What is the impact of SLC, MLC and TLC flash on endurance, support and use cases?
- Cost: What does the solution cost (e.g. one 1.2TB MLC PCIe card easily costs EUR 20,000) and what is the impact on the VDI business case?
- Optimization: Can I optimize the space being used by leveraging both local spindles and flash based solutions? Can I use technologies such as application virtualization, single image management or layering to reduce the storage capacity impact?
The answers will differ when you compare different vendors and their virtualization solutions, from both a VDI and a hypervisor perspective. Understanding these questions and finding the answers is key to getting local storage for VDI done right!
Four categories and their impact on storage
Four main categories have an impact on the storage capacity, and on designing, building and supporting the VDI storage solution. These categories are:
- Hypervisor / virtualization host
- (Single) image management
- System
- User and applications
Hypervisor layer – virtualization host
The default behavior of the VMware vSphere hypervisor is to create a virtual swap file for each VM. The size of this swap file equals the RAM assigned to the VDI VM. Depending on the amount of memory overcommitment, the swap file can be heavily used. It is possible to adjust the memory reservation to reduce or eliminate the swap file. A simple example: running 150 VMs with 2GB of RAM each will consume 300GB of the local storage solution. Quite a pity when you just installed two 600GB SSD drives in RAID 1. Be aware of both the functional impact of memory management and its total storage capacity impact.
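As a sketch of the math, the swap overhead per host can be estimated as follows. It assumes the default vSphere behavior (swap file size equals configured RAM minus the memory reservation); the VM count and RAM size are the example numbers from the text:

```python
# Sketch: estimate local capacity consumed by vSphere per-VM swap (.vswp) files.
# Swap file size = configured RAM minus the memory reservation; the default
# reservation is 0, so the swap file equals the full RAM size.

def swap_capacity_gb(vm_count: int, ram_gb: float, reservation_gb: float = 0.0) -> float:
    """Total capacity (GB) consumed by swap files on the host datastore."""
    per_vm = max(ram_gb - reservation_gb, 0.0)
    return vm_count * per_vm

# The example from the text: 150 VMs x 2GB RAM, no reservation -> 300GB
print(swap_capacity_gb(150, 2.0))                      # 300.0
# Fully reserving the RAM removes the swap files entirely -> 0GB
print(swap_capacity_gb(150, 2.0, reservation_gb=2.0))  # 0.0
```

Note the trade-off: a full memory reservation removes the swap files but also removes the flexibility of memory overcommitment.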
(Single) image management
A first level of reducing read IO on the storage array can be accomplished quite simply. In a VDI scenario, many different virtual machines typically point to the same location to read the VDI master image. This master image contains the Windows installation and applications that are the starting point for users when they first connect to the VDI infrastructure. Image management from VDI vendors such as Citrix, Quest, Microsoft and VMware brings huge advantages to the table. With a single image, multiple virtual desktops can share one ‘read only’ image, saving hundreds of GBs of storage capacity. Typically, with VMware View Linked Clones, Citrix Machine Creation Services (MCS) or Provisioning Services (PVS), a single master image serves many individual virtual machines, delivering image management to the VMs.
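A back-of-the-envelope sketch of why a shared read-only image saves so much capacity; the 40GB master image and 2GB per-VM delta disk are assumed example values, not vendor guidance:

```python
# Sketch: full clones copy the whole master image per VM, while linked
# clones (View Linked Clones, Citrix MCS) share one read-only master and
# give each VM only a small writable delta disk.

def full_clones_gb(vm_count: int, master_gb: float) -> float:
    # Every VM carries a complete copy of the master image.
    return vm_count * master_gb

def linked_clones_gb(vm_count: int, master_gb: float, delta_gb: float) -> float:
    # One shared read-only master plus a writable delta per VM.
    return master_gb + vm_count * delta_gb

# Assumed example: 150 VMs, 40GB master, 2GB delta per VM
print(full_clones_gb(150, 40))       # 6000 GB
print(linked_clones_gb(150, 40, 2))  # 340 GB
```

Keep in mind that the delta disks grow over time, which is exactly the control problem the next sections discuss.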
But the Windows operating system, applications and user activity within the system need storage capacity to write changes as well. The different image management solutions, the differentiation within the layers, the way blocks are stored on various storage locations and the impact on storage design, sizing and support will be covered in part 2 of this blog post series.
System layer
The system layer is where all ‘read and write’ activity needed for the system to function is generated by the Windows guest operating system. The following components are part of the system layer:
- Log files:
  - Windows event logs
  - various application logs
- Registry changes
- Page file
- Monitoring tools
A big advantage of the system layer is that controlling the size of these files is relatively easy. Determining and enforcing a maximum capacity for log files, the page file etc. is pretty straightforward. In part 2 of this blog post series the capacity guidelines and best practices will be explained.
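As an illustration of such a capped budget, a minimal sketch that multiplies assumed per-VM limits by the VM count; all the cap values are made-up placeholders, not the best practices promised for part 2:

```python
# Sketch: capping the system layer per VM and projecting the host total.
# All cap values below are illustrative assumptions, not best practices.

SYSTEM_LAYER_CAPS_GB = {
    "page_file": 2.0,         # fixed-size page file
    "event_logs": 0.25,       # capped Windows event logs
    "other_logs": 0.25,       # various application logs
    "monitoring_tools": 0.5,  # agents and their local data
}

def system_layer_total_gb(vm_count: int, caps: dict = SYSTEM_LAYER_CAPS_GB) -> float:
    per_vm = sum(caps.values())  # 3.0GB per VM with the caps above
    return vm_count * per_vm

print(system_layer_total_gb(150))  # 450.0 GB for 150 VMs
```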
User and Applications Layer
When a user logs into the system, a local Windows user profile is created. There are different profile scenarios: local, roaming, mandatory, hybrid etc. The scenario used has an impact on both functionality and the storage capacity needed. Redirecting certain folders to shared locations can lower the impact on local storage capacity. Redirecting ‘application data’ can be dangerous, but that’s a different subject, outside the scope of this post. In essence, profile management as part of User Environment Management is an established approach to create hybrid user profiles that balance management and user freedom. In all user profile scenarios there is a need for storage for:
- User profile
- Application data
- Settings (registry changes)
- Temporary internet files
- Printer spooler
- User cache for virtual applications
Especially in analyzing, designing and maintaining the user layer, huge challenges will arise. Do you know what happens when a user prints a 30MB Microsoft PowerPoint file?! The impact, the variables and the best practices of the user and application layer will be explained in detail in part 2 of this blog series. The impact of the user cache for virtual applications and the role of the new App-V 5.0 functionality called ‘shared content store’ are explained below.
What is the Impact of application virtualization?
Application virtualization is a key component in the VDI solution stack. It has many advantages and also some downsides. In this context, one of the main advantages is reducing the number of different images while adding dynamic delivery of applications to users of the same shared, golden image. Between application virtualization solutions, e.g. VMware ThinApp and Microsoft App-V, the local cache size, delivery and maintenance differ and can have a substantial impact on storage capacity at both the VM and the user level. Sharing and maintaining common application binaries in a central repository has long been one of the benefits of VMware ThinApp, and wasn’t that great in App-V 4.x until the release of Microsoft App-V 5.0.
App-V shared content store
Microsoft Application Virtualization 5.0 is available and ready for customers to use. One of the main questions is: ‘What are the benefits of App-V 5.0 for current and new customers?’ There are multiple, sometimes radical, changes and additions to the App-V solution. The one with the biggest positive impact in the context of remote desktop services (RDS), on both the session host (Terminal Server) and the virtualization host (VDI), is the App-V 5.0 shared content store. Its three main benefits are:
- Optimal disk space usage by leveraging a shared, central content store;
- Transparent management and delivery of applications to clients, whether using ESD solutions such as SCCM, App-V standalone mode or the App-V Management Server;
- The admin can easily preload packages while updating from a central point remains possible.
Any universal naming convention (UNC) share, path or drive can be used for the shared content store. Network speed and latency to the store need to have local area network (LAN) characteristics: fast and reliable. In earlier versions of App-V, symlinks were needed to update the content of the read-only App-V cache, resulting in various management and update challenges. With the Microsoft App-V 5.0 shared content store, things have changed!
App-V 5.0 makes it possible to turn off local application storage, dramatically reducing disk requirements for RDS while leaving the application provisioning and update process unchanged. The App-V 5.0 content is stored centrally on a file share and is never persisted in the client VM. In part 2 of this blog post series, the amount of storage saved will be explained and calculated in more detail. The overall conclusion is that the App-V 5.0 shared content store saves hundreds of GBs on a remote desktop virtualization host, which has a huge impact on the CAPEX of VDI storage.
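A rough sketch of those savings, assuming a hypothetical 8GB application package set cached in every VM in classic mode versus nothing persisted locally with the shared content store (the detailed calculation is promised for part 2):

```python
# Sketch: local App-V cache footprint with and without the shared content
# store. The 8GB package-set size is a hypothetical example value.

def classic_cache_total_gb(vm_count: int, package_set_gb: float) -> float:
    # Classic mode: every VM persists the full package set in its local cache.
    return vm_count * package_set_gb

def shared_content_store_total_gb(vm_count: int, package_set_gb: float) -> float:
    # Shared content store mode: content streams from the central file share
    # and is never persisted in the client VM.
    return 0.0

saved = classic_cache_total_gb(150, 8) - shared_content_store_total_gb(150, 8)
print(saved)  # 1200GB of local capacity saved across a 150-VM host
```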
Closing - part 1
The goal of this first part of the blog post series is to make you aware that storage design for VDI is changing. Considering local storage solutions is a good idea. The discussion around storage sizing is shifting from IOPS to GB, and it’s important to find the right balance and sweet spot between capacity (IOPS/GB), size, user experience and cost. Running 150 VMs on a 640GB local flash based storage solution is possible; it’s all about the context!
Just a simple example: when the VDI solution relies on default vSphere memory management (creating a swap file per VM) while the user layer is also generating a couple of GBs of data per desktop, a 640GB flash based storage solution is way too small for 150 VMs. It’s key to understand the whole “VDI + Storage = Deep Impact” discussion and its impact on sizing guidelines and the business case. It’s a mistake to say: ‘Just use some local SSD, flash or RAM solutions in the virtualization host and the best and most cost effective VDI solution is achieved.’
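That closing example can be checked with simple arithmetic. The per-VM figures below (2GB swap file, 2GB of user layer growth, 1GB system layer) and the 40GB shared master image are illustrative assumptions, not measurements:

```python
# Sketch: does a 150-VM workload fit on a 640GB local flash device?
# Per-VM figures and the master image size are assumed example values.

def capacity_needed_gb(vm_count: int, per_vm_gb: dict, master_gb: float) -> float:
    # One shared master image plus the per-VM write footprint.
    return master_gb + vm_count * sum(per_vm_gb.values())

PER_VM_GB = {"vswp_swap_file": 2.0, "user_layer": 2.0, "system_layer": 1.0}

needed = capacity_needed_gb(150, PER_VM_GB, master_gb=40)
print(needed)         # 790.0GB needed
print(needed <= 640)  # False: it does not fit on a 640GB device
```

With these assumptions the host is 150GB short, which is exactly why per-layer control (reservations, caps, shared content store) decides whether the 640GB device works.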
In this first part of the blog post series, the four different layers and their impact on storage were briefly explained. The sizing guidelines, best practices and design parameters of these layers will be described in detail in part 2. <To be continued>