Is it time for enterprise IT to use DevOps for smart automation of virtual desktop delivery?

Over the last decade of my working life at Login Consultants, I have come across many large-scale RDSH and VDI environments. From my point of view, a lot of these environments suffer from similar structural issues.

Over and over again, I see RDSH and VDI projects that do not live up to the sky-high expectations of the IT organization. Projects run over schedule and over budget, and the resulting environments are way more complex and time-consuming to manage than expected.

It is peculiar that this still happens so often even though the technologies have significantly matured in the past 10 years. Even more so since plenty of impressive innovations, white papers, blog posts and blueprints have become available from vendors, consulting companies and community leaders on how to do it right… Right?!

So what is it, in my opinion, that makes these larger enterprise-scale hosted desktop infrastructure projects falter?

The reality in today’s enterprise datacenters is that many of the Windows systems, including the hosted desktop environments, are still largely deployed and maintained manually. Truth is that this is happening in way more ‘high-end’ enterprise datacenters than we would like to admit.

Furthermore, with so many imaging, cloning, streaming and layering technologies available nowadays, it almost seems to have become the standard to simply create or update a desktop or server image manually and then just replicate it. My point is that manual administration is promoted as the ultimate ‘easy’ button while, in the long term, it complicates management in many ways.

So what are my issues with manual change?

  • Manual changes cannot be repeated 100% identically, period! Fact is, we humans are not as consistent as we would like to be. Doesn’t matter if we are performing seemingly simple tasks on multiple targets or executing a long checklist on a single target; we are simply not good at it. The only consistency in manual work is inconsistency.
  • To apply manual changes properly, one needs to understand the context and the underlying technology, especially in a VDI/RDSH context. As a result, the chances that you ‘screw up’ increase considerably when you perform a manual update without the required expertise.
  • Manual changes are typically not properly documented. It is therefore very likely that over time, you can’t remember exactly how you applied your changes or even what those changes were all about. Documentation and actual configuration drift further apart and your systems start behaving in an increasingly unpredictable manner. At this stage, reproducing your systems for testing purposes becomes next to impossible. The devil is in those details!
  • It is tricky and sometimes very difficult to (fully) revert manual changes.
  • You would be amazed how much time goes into seemingly simple tasks like importing some configuration settings or updating Windows with the latest security fixes in an operating system image. And when you do this manually, I can assure you that it will never stop. It only grows with each image, persona, department and datacenter you add to your environment.
  • Manual changes (even small ones) drive work outside business hours, since you have to be present while you perform them. Often, users cannot be disturbed during office hours, so there you go… This can become quite costly when overtime becomes the norm.
  • Most organizations have invested heavily in automation solutions, but the reality is that operating those automation solutions often requires a lot of manual input for (re)configuration.

I have to admit: it feels weird writing about manual administration anno 2014. But it is still a reality today, especially in larger and more complicated organizations. This is in stark contrast to how modern infrastructures and applications are managed and operated by cloud providers.

I strongly feel that extensive automation of the deployment and maintenance of these environments is key. Why not take all the time and effort you would normally put into documenting and manually executing changes and invest it in automating them instead?

    Automation workflows are generally not open for interpretation; they either work or they don’t. Computers are, unlike people, extremely good at performing the same task over and over again or executing a huge list of instructions without skipping any or executing them ‘in their own way’.

So why aren’t all these environments automating the crap out of their installations? And why is it that, even when IT departments do automate, the way they do it is often not very efficient?

    Let me explain what I mean by ‘not very efficient’. Automation of the installation of an infrastructure component or application package based on MSI is not overly complicated. However, automation scripts and tooling are often only configured for a specific environment like ‘production’, ‘department A’ or ‘Customer 9’.

Scripts and packages often lack the essential parameters that would let them be reused effectively in different environments or ‘flow’ as part of the change process from one environment to the next.
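
To make that concrete, here is a minimal, purely hypothetical sketch (the settings file name, the MSI properties and the environment names are invented for illustration, not taken from any real product): the script carries no environment-specific values itself and reads them from a per-environment settings file, so the same package can flow unchanged from test to production.

```python
import json
import subprocess
import sys

def load_environment(env_name, settings_file="environments.json"):
    """Load the settings for one environment (e.g. 'TAT', 'UAT', 'PROD')."""
    with open(settings_file) as f:
        return json.load(f)[env_name]

def install_app(env):
    """Install the same MSI everywhere; only the parameters differ per environment."""
    cmd = [
        "msiexec", "/i", env["msi_path"], "/qn",
        f"LICENSESERVER={env['license_server']}",
        f"INSTALLDIR={env['install_dir']}",
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Usage: python install_app.py UAT
    env = load_environment(sys.argv[1])
    install_app(env)
```

Whether you express this in a script, a collection variable or a dedicated automation tool matters less than the principle: the values live with the environment, the logic lives with the package.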

Another problem with most automation efforts we see is that the products and methodologies being used require a considerable amount of effort and back-end infrastructure themselves. This is a typical chicken-and-egg problem. For example, if you use System Center Configuration Manager you need a SQL Server before you can actually install SCCM, and only then can you start automating. Once you have finally built a functional infrastructure, it is difficult (if not impossible) to extract the configuration data and reproduce your work somewhere else.

Let’s continue with the SCCM example. Sure, you can export a package. But all the parameters we discussed before are locked up in the package or hardcoded in some scripts. So if you create an application package with specific settings and variables in your test environment and import it into the next SCCM environment, they will not be automatically transformed. You now have an automation solution, but you are basically still configuring your work manually.

    Yep, this is the reality I see too many times today.

    So how can we fix all this?

Maybe we can learn from another movement that is happening in an area where similar problems exist: software development. This movement is called DevOps! The concept of DevOps is to bring two teams together – the development and the operations team. The use of very short development cycles enables constant validation that the developed code is usable and can be operated. In general this is done by completely automating the whole process from developer to end-user. From an organizational standpoint it is also about bringing the responsibility for the whole chain together. Instead of one team being responsible for steps a, b and c and the next team for steps d and e, both teams are responsible for the end result.

If we want to improve our quality and agility, what are the main concepts that we could borrow from DevOps? To me, these concepts are:

    • Cross silo automation (and teams)
    • Reusable automation
    • Incremental updates
    • Automated testing
    • Transparency

    They may not all be easily implemented and most people I talk to are quite skeptical about the feasibility but I have seen this work and I have also seen the results! With fairly straightforward tools and ideas in the hands of the right people you can achieve something great.

    Let’s dive deeper into these concepts:

    Cross-silo automation

By ‘silo’ in this context I mean that organizations are still too often divided into departmental silos, often called towers. For example, one department is responsible for Active Directory, one for virtual machines, one for the OS, one for Terminal Server and so forth. What I often see is that each silo has a certain maturity of automation, but they are isolated. This isolation leads to many issues during the process of application provisioning. Imagine a standard change for an application rollout in a hosted desktop environment. One team is responsible for the installation of the application, another team for the Active Directory group, another team for the publishing. Now imagine that the change should happen during the weekend. These kinds of changes often fail because if only one team makes a small mistake the whole change fails – sounds familiar? With cross-silo automation, all aspects of this change are in one package and handled by one team. This dramatically increases the success rate of such changes.
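
As a rough illustration of what ‘all aspects of the change in one package’ could look like, here is a hypothetical sketch – the step functions are placeholders, not calls to any real product API: the application installation, the Active Directory group and the publishing step travel together and are executed as one unit by one team.

```python
# A hypothetical sketch, not a real product API: the application install, the
# AD group and the publishing step are bundled into one package and executed
# as a single change by a single team.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ApplicationPackage:
    name: str
    steps: List[Callable[[], None]] = field(default_factory=list)

    def deploy(self) -> None:
        # Execute every aspect of the change in order; if one step fails,
        # the whole change fails visibly instead of half-succeeding.
        for step in self.steps:
            step()


def install_msi() -> None:
    print("installing application binaries")


def create_ad_group() -> None:
    print("creating/verifying the AD security group")


def publish_app() -> None:
    print("publishing the application to users")


package = ApplicationPackage(
    name="Accounting App 4.2",
    steps=[install_msi, create_ad_group, publish_app],
)
package.deploy()
```

If one step fails, the whole change fails visibly, instead of three teams each finishing ‘their’ part and nobody owning the broken end result.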

    Reusable automation

But having this package alone, without cross-environment automation, gives you another problem. In this context an environment is a stage such as a test or production environment*. How do you know whether the package really works, and what effects it might have in your production environment? As described above, I have seen examples where you export a package from one SCCM environment and import it into the next. But because the variables of this package are not visible, you might forget to define the collection variables and the installation of the package fails. This leads to frustration, and these people switch back to manual changes.

You need a smart automation solution that can handle multiple environments. Optimally, that solution allows you to easily create multiple separated environments that are not linked to each other in any way, to prevent accidental changes to the production environment while you are working in your test environment. The export/import of a package should be extremely easy and it should be possible to automate it. The reality is that most automation products are great at automating a change in a single environment, but truly supporting the flow of a change through multiple environments during the staging process is lacking.

* I believe four stages work best: development, technical acceptance test, user acceptance test and production.
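
A minimal sketch of what such cross-environment reuse could look like, assuming the stages from the footnote above (all names and variables are illustrative): the package declares which variables it needs, each environment supplies its own values, and promotion fails fast if a value is missing instead of failing halfway through an installation.

```python
# Hedged sketch of "reusable automation": the package carries its variable
# *names*, each environment supplies its own values, and promoting the package
# to the next stage validates that nothing is left undefined.
PACKAGE = {
    "name": "Accounting App 4.2",
    "required_variables": ["LICENSE_SERVER", "INSTALL_DIR", "PUBLISH_GROUP"],
}

ENVIRONMENTS = {
    "TAT":  {"LICENSE_SERVER": "lic-tat.corp.local",  "INSTALL_DIR": r"D:\Apps", "PUBLISH_GROUP": "APP_Accounting_TAT"},
    "UAT":  {"LICENSE_SERVER": "lic-uat.corp.local",  "INSTALL_DIR": r"D:\Apps", "PUBLISH_GROUP": "APP_Accounting_UAT"},
    "PROD": {"LICENSE_SERVER": "lic-prod.corp.local", "INSTALL_DIR": r"D:\Apps", "PUBLISH_GROUP": "APP_Accounting"},
}

def promote(package, target_env):
    """Fail fast if the target environment does not define every required variable."""
    values = ENVIRONMENTS[target_env]
    missing = [v for v in package["required_variables"] if v not in values]
    if missing:
        raise ValueError(f"{target_env} is missing variables: {missing}")
    return {**package, "values": values, "environment": target_env}

deployable = promote(PACKAGE, "UAT")
print(deployable["environment"], deployable["values"]["PUBLISH_GROUP"])
```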

    Incremental Updating

In current times of app stores, cloud infrastructures and continuously updating applications, the business in general no longer accepts release changes only once every 3 months. So why don’t we continuously apply changes to terminal servers or VDI master images? Is it because we are afraid they might break the environment and fixing it simply takes too long? Or is it because testing our releases and the involvement of the business in user acceptance testing takes a lot of time? Imagine you have automated the entire process of building and deploying changes and you have broken down your work into manageable and measurable chunks, so you can make them available to any user in any environment automatically!

Of course you still need the base engineering and automation of the changes. But the next steps can be more dynamic. What about this real-life example: a new application is automatically installed in a technical acceptance test environment (TAT) where its deployment and configuration are tested. Next, the application is automatically transferred to a UAT environment while the business owner of that application receives an email stating that he can log on to a portal and test the application. If the business confirms, the application is scheduled to be deployed to production, fully automated.
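
A simplified sketch of that flow, with placeholder functions standing in for whatever deployment, testing and notification tooling you actually use (nothing here is a real product API):

```python
# Sketch of the staged flow described above: deploy to TAT, run the technical
# tests, hand over to UAT with a notification, and only schedule production
# deployment after business approval.

def deploy(app, environment):
    print(f"deploying {app} to {environment}")

def run_technical_tests(app):
    print(f"running automated tests for {app} in TAT")
    return True  # pretend all tests passed

def notify_business_owner(app):
    print(f"emailing the owner of {app}: please test in the UAT portal")

def wait_for_approval(app):
    # Placeholder for the business owner's confirmation in the portal.
    return input(f"Did the business approve {app}? (y/n) ").lower() == "y"

def release(app):
    deploy(app, "TAT")
    if not run_technical_tests(app):
        raise RuntimeError("technical acceptance failed, stopping the release")
    deploy(app, "UAT")
    notify_business_owner(app)
    if wait_for_approval(app):
        deploy(app, "PROD")  # fully automated once approved
    else:
        print("release stopped at UAT")

release("Accounting App 4.2")
```

The point is not the code but the shape of the process: the only manual step left is the business owner’s approval; everything around it is repeatable.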

With this approach you can move much faster and employ much shorter release cycles for changes like new applications. What about weekly? Sounds great, right?

I have customers who are taking this route. They have even enabled the business to change minor things during the release process, for example a registry key. This leads to a dramatic reduction in time and material for the whole release process and allows the business to become much more agile.

Change should be routine!

    Automated testing

    A lot of tests in the TAT can be automated.* For starters, setting up your test environment and testing the availability of basic functionality after deploying one or more changes. Are all services still running? Can I still log in to the environment? Is that new application appearing on the website? Over time, automated testing will greatly reduce the amount of manual testing but it will also help you catch and fix problems early.

When you are thinking of setting up automated testing, make sure that the testing framework is easily extensible, because setting up tests like this is a lot like implementing a good monitoring solution. First make sure you monitor the big picture and don’t trip over all the details. Start by automating the build of your test environment and monitoring your ability to log on to your environment, to make sure your applications are running and the performance is up to par. Then, as time progresses, you can start to dive into more detailed monitoring.

    * This is the point where people normally start listing the things that cannot be automated.
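
As a small, hedged example of the kind of smoke tests meant here (the host names, service names and URL are invented, and the checks assume the script runs on Windows with rights to query the remote host): start with the big picture, such as whether a key service is running on a session host and whether the published application is reachable, and only add detail over time.

```python
# Minimal smoke-test sketch: check that a service is running on a session host
# and that the published application responds, then report pass/fail.
import subprocess
import urllib.request

def service_running(name, host="localhost"):
    """Query a Windows service on a (possibly remote) host via 'sc query'."""
    result = subprocess.run(["sc", f"\\\\{host}", "query", name],
                            capture_output=True, text=True)
    return "RUNNING" in result.stdout

def app_published(url):
    """Check that the new application shows up on the access website."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.status == 200

def smoke_test():
    checks = {
        "Spooler service on session host": service_running("Spooler", "rdsh01"),
        "Accounting app reachable": app_published("https://apps.example.corp/accounting"),
    }
    for name, ok in checks.items():
        print(f"{'PASS' if ok else 'FAIL'}: {name}")
    return all(checks.values())

if __name__ == "__main__":
    raise SystemExit(0 if smoke_test() else 1)
```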

    Transparency

Now for what may very well be the biggest productivity killer in enterprise IT – a lack of transparency!

    Reality is that business and end-users often have no clue what is happening on the IT side. What is the status of my request? Why can’t I have application ABC? Who’s working on my changes and what is taking them so long? How long should I wait until they are available in production?

    I have had major incident calls with enterprise customers where the back-end system teams were blamed for releases not being ready in time while it was totally unclear what was expected from IT. Same story towards the business. Why didn’t they hear about our issues until the very last minute?

Imagine you have automated your delivery process and have turned ‘change’ into something you routinely do. What if you had, for example, a number of staging environments like development, TAT, UAT and production, and you could easily see where a specific change currently resides and even predict when the change will become available?

Communication and transparency make all the difference. Involve the business in your process and let them help you decide what has priority. Make your progress measurable and visible. After a couple of weeks you’ll be able to improve your forecast of which changes will make it to the next release.
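
Even a very simple, home-grown status board goes a long way here. The sketch below is hypothetical (the stage names follow the DTAP model; the lead times per stage are values you would measure yourself): it shows where each change currently resides and derives a rough forecast of when it will reach production.

```python
# Hypothetical status board: where does each change sit in the DTAP pipeline,
# and when can the business expect it in production?
from datetime import date, timedelta

STAGES = ["DEV", "TAT", "UAT", "PROD"]
AVERAGE_DAYS_PER_STAGE = {"DEV": 2, "TAT": 3, "UAT": 5}  # measured, not guessed

changes = [
    {"id": "CHG-1041", "title": "Accounting App 4.2", "stage": "UAT"},
    {"id": "CHG-1042", "title": "New PDF printer driver", "stage": "TAT"},
]

def forecast(change):
    """Estimate the production date from the stages still ahead of this change."""
    remaining = STAGES[STAGES.index(change["stage"]):-1]
    days = sum(AVERAGE_DAYS_PER_STAGE[s] for s in remaining)
    return date.today() + timedelta(days=days)

for change in changes:
    print(f"{change['id']} {change['title']:<25} {change['stage']:<5} "
          f"expected in production: {forecast(change)}")
```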

    Conclusion

    I believe enterprise IT is missing out and I believe enterprise IT should take on an approach of continuous deployment and far better transparency. These are core principles of DevOps and we enterprise administrators can learn a lot from it. My story is about smart automation and bringing teams and responsibilities together to foster a predictable stream of change that is much more closely aligned to immediate business needs.

    What do you think? Should enterprise IT take the next step in optimizing their delivery process? Is DevOps or a derivative the answer to these age old issues?

    Join the conversation


In the typical enterprise datacenter, there is simply no environment where more changes occur than in VDI or RDSH environments. Yet we keep doing things the clumsy way we do, and wonder why businesses are looking at cloud and DaaS as an alternative.


Agile and DevOps changed development in so many ways that they make the bureaucratic ITIL and Prince 2 practices we typically see in enterprise IT today look incredibly old-fashioned.


DevOps is no magic button that can easily be translated into VDI or RDSH management, but without doubt it is inspiring. It will require a different approach to how you organize IT department responsibilities (teamwork!), what management tooling you use and how you implement change processes.


    Amen to that, Matthias!


Very well written article. All of the above points resonate 100% with me.


One of the other inhibiting factors for automation and DevOps in large multi-nationals is that the silos or towers are not only defined by a specific area of expertise. IT organizations these days have become essentially more complex due to outsourcing, FUIT, 'departmentalization' of IT teams, etc., which makes the 'cross silo automation' and 'transparency' aspects of the DevOps proposal much more complicated!


    I am therefore of the opinion that a lot of large Enterprise IT shops must radically consolidate their organizational structure and also strengthen their due diligence for outsourcing/service providers contracts.


After all, the only successful examples of large-scale RDSH/VDI DevOps (VdiOps??! ;) ) teams that I know of are based on teams with extremely small head counts of extremely smart people managing extremely large infrastructures.


    ... and yes, extremely smart people also make mistakes sometimes, which is why the 'Incremental Updating' paradigm is so important. Because it is better to make small mistakes more often than to only make one extremely big mistake once!


    I don't think it's about being extremely smart (it obviously helps) but I think it's about having the right mindset. I have seen such teams emerge in huge multinationals but they generally died after the project ended and the team was dissolved.


    I believe we need to rethink how we measure progress in these kinds of projects and what we are actually trying to achieve. Is it about a perfectly running SQL cluster and a perfectly running hypervisor or is it about that application the end user requested? If we shift our focus towards the application that really matters to the end-user wouldn’t this automatically mean that we align our team-membership accordingly?


As we see that the publishing of applications is no longer only computer based, but is moving towards users and their own demands, it is in my opinion no longer a question of whether we need DevOps but of how to implement it in the most secure way, with the least amount of risk that manual configuration is still needed. A DevOps approach that can also handle DTAP environments and ensure that the next step within DTAP will be identical has my strong recommendation.


In the recent past, I had to convince a project manager in an enterprise environment that his Citrix farm (with 500+ servers) should never need manual configuration. I don't want to call this way of thinking old-fashioned, but it shows that what Matthias describes is still a common way of thinking.


@Jeroen - I think with DevOps for infrastructure we also change project approaches - no long design phases but fast deployment with continuous improvement involving all teams. And right, this is hard work - no magic :-)


@Christoph - I like VDIOps :-) - you hit the point: small teams that manage everything for their environment. I also suggest a kind of building-block architecture then. Better to build 4 farms than one huge one. If everything is automated that is no problem, and it comes with drastically reduced risk.


@Dennis - yes - you need the right tools and the right mindset - but also skilled people - and I do see them a lot in large enterprises - they are just often frustrated because of segmentation.


@Mark - yeah - and it will take time until people go new ways - I believe that only the excellent performance that Scrum/DevOps-based development can show will convince them. There will also be challenges towards the customer. By the way, for the rest of the audience: DTAP means Development, Test, Acceptance, Production and is a technical change process that we like to implement in projects.


    This is by the way the most interesting post on BM I have had the pleasure of reading in the past 2 or 3 years!


I really don't understand why this particular topic has not truly been discussed outside of the Dutch/German community?!


    Login has for me always been on the forefront in this space ever since I've briefly met a couple of you guys 10 years ago. ;)


    I just feel that everybody in the comments section implicitly understands what the main goals are for improving RDSH/VDI/AppVirt in general!


    Strangely enough though, none of these goals or underlying  core principles have changed very much in the last 10 years. Sure, our tools are much more refined and probably a bit more efficient these days than they were in 2004. But as Mark said in his comment, a lot of people working in influential roles in IT have not changed their way of thinking at all over the last 10-15 years!


    Which is why so many large IT shops have not been able to take advantage of the technological improvements, because they have been held back by their superiors who are very reluctant to change the way they do their work.


    I believe that it also takes more of a generational shift in order to achieve DevOps/NetOps/VdiOps/Scrum :D


    That doesn't mean that the younger generation is meaning to threaten the "IT establishment", but the existing IT structures need to be challenged! Either by influential non-IT staff, or by very energetic, driven and innovative IT teams!


    So, Dennis is absolutely correct in saying that the mindset of each individual IT person has a LOT to do with being able to successfully embark on a *-Ops journey! :D


    This is one of the best, most thought-provoking articles I have read here for a very long time.


Accepting that DevOps is really about developing and releasing application code to production rather than application packages, and that while releasing 10 or more code updates per day is desirable in some circumstances, we would get lynched for attempting to deliver 10 updates to an application package in a month – the guiding principles behind DevOps can still go a long way towards improving quality and reducing operating expense.


Five years ago, the right tools to apply DevOps techniques to managing VDI or RDSH environments simply did not exist. However, today, with the more sophisticated layering tools that are available, not only is it possible, but in many respects it is easier to take a DevOps approach to VDI and RDSH management than to slavishly follow past approaches.


    It will be interesting to see what nod to DevOps Citrix offers in Workplace Services when it ships.


    Guys,


thanks for the positive feedback, I really appreciate that. As Christoph stated, the concepts we use have been around for a long time – I can remember that we presented them ages ago at a BriForum – but nowadays more and more software is easier to automate and more independent of other software, for example through concepts like side-by-side installation or application virtualization. This opens the door much wider for these kinds of concepts. Often, when I demonstrated this concept to enterprises, they said “this is nice but we cannot use it, because it will involve too many towers”. But DevOps proved that it works – on a different level – and I think with the pillars I described above you can transfer that to infrastructure as well. As Simon stated, we can really do plenty of changes, which we have proven in real enterprises with this concept, especially if you involve the business. If they are allowed to make minor changes on their own, but fully managed, the DTAP process can be extremely fast.

About layering: honestly, I am not that big a fan of layering technologies, but more of layering concepts, because if the layering technology breaks you are in big trouble. It is indeed more about the concepts I mentioned – for example cross-silo automation. With our “Automation Machine” I can build, with one fingertip, a full AD with terminal servers, groups, PA and so forth – really ready to let users in – and I can move this environment to every customer because of the reusable automation. The next important part of the concept is “application centric”: everything you need to get an application up and running lives with the application package, which easily allows incremental updating. So I can take it out of one (test) environment and move it to the next – with this application the AD group, the publishing and so forth are also moved to the new environment and will be automatically created there.


I am also curious what the other vendors of deployment tools will give us. But from what I can see, they will not embed this concept in the next few years. I believe the problem for them is that they don't see this lack of functionality every day at the customer like we do.


    Really interesting read. Thanks.


This is an area where I have also been mystified as to why this is not already practiced more widely in End User Computing. As someone who came to EUC from running operations for very large datacenters - Windows Server and Unix server ops between 2005 and 2009 - this methodology is already very widely practiced there, and it actually inspired DevOps and arguably led to the cloud era.


Even the big vendors recognized this problem many years ago via IBM Tivoli, Microsoft acquiring Opalis (now System Center Orchestrator) and VMware acquiring DynamicOps - all of them heard their large customers saying that they wanted to remove the 'fat fingers' approach, where even smart people make mistakes. These approaches made the cloud era happen - without automation on the server side there is no EC2, no Azure, none of the forward charges made on the server side in the last decade. The problem is that EUC has lagged massively, both in understanding the methodologies that exist and the tools that generally already exist in their organizations - and from personal experience this is because a wall seems to exist between server-side and client-side guys, even when you are a 'client guy' working on servers.


If practiced with rigour around ITIL practice - such as well-adopted Change Management, Release Management and Continual Service Improvement - it can lead to huge reductions in firefighting, streamlined costs, and further service improvement and innovation at a faster pace.


The irony of all of this is that most organizations currently have the tools to make this happen tomorrow. If you have System Center 2012, you have System Center Orchestrator and you can standardize your reusable automation (especially since Orchestrator tightly integrates with other System Center products such as SCCM). It also generally enables cross-silo automation by creating a common execution platform for automation - and that leads to less territoriality.


Also, the automation suites popularly used by server ops have self-service portals - System Center uses Service Manager, vCloud Automation Center has a self-service portal. This means that not only can EUC operations teams leverage automation, but so can end users, to request things such as specific application packages or custom VDI specs. All this can in turn be fed into a change control workflow, enabling proper oversight and even cross-charging for it. Better yet, it usually requires no additional software - it's just using the tools you have better, and leveraging those tools to manage everything else in the environment.


I've been blessed to see very mature implementations of this during my time at AppSense, as some organizations have led this charge and prompted some unexpected innovations that have dramatically increased productivity (both in IT and with their users), and reduced project CapEx and OpEx. These innovations are also the key reason we published our APIs, and will continue to do so - rather than trying to create our own automation silo, we want to integrate with whichever platform our customers choose to use.


This can and will lead to a reduction in the skill level required to do mundane tasks - freeing the better qualified for more exciting work, and streamlining the time from inception to delivery of strategies that deliver value to users and the wider business. If we care, as EUC professionals, about both delivering to our businesses and delivering to our customers, then adopting cross-silo automation processes isn't a nice-to-have - it's mission critical.


    @Ian: For clarification - when I alluded to the fact that very smart people are prone to making mistakes, I was not thinking of 'fat finger syndrome' at all! :D I was actually thinking of a friend who has now twice proven over the last couple months, that AWS's  EC2 capacity planning algorithm is broken!


    Maybe we should come up with a DevOps 'manifesto' for VDI/RDSH .... and the guidelines in the manifesto should be aimed at business people as opposed to IT people ...


    1. Practice makes perfect


    2. Don't fall prey to shiny new technology. Understand the underlying concepts instead


    3. Fire your IT manager and make IT hiring a COO/CEO/CFO decision!!!


    Please feel free to add to the list! :D


    Hey Christoph,


your comment brings me to another point (endless story). It looks a little like an advertisement for our VSI benchmark tool, but it can be used for exactly that. Again, for large enterprises the question arises: how do you scale? You can go to your architects and they will read whitepapers etc. from vendors to see limitations or get input about capacity planning - this takes time and often does not match real-world data, especially for new stuff (I just saw a question about scaling for pure RDSH Gateway servers). So we believe in quickly creating a PoC environment with only a few design basics, and then testing - stress testing the infrastructure, finding bottlenecks and optimizing - a much faster and more realistic approach. You can also use this mechanism to automate your TAT, to see whether infrastructure changes will impact production. Find the right size for a building block and then build it n+1 times. This leads to extremely stable and predictable production environments. The funny thing is, the best way to convince customers to do this is when their huge farm for thousands of users broke because of a change impact, even though everything was HA.


Unfortunately it comes down to leadership. Places with CIOs/CTOs and IT directors that are stuck in the past or have too much job security don't care.


I've been to too many places where many smart people work under very outdated bosses and are forced to fix issues manually - or even with smart automation - just to go and put out fires.


Too many IT departments don't value solid work, because those who know what they're doing don't create problems, while the problem makers are the ones getting resources and promotions for putting out fires.


    It's all too common, from the biggest companies I ever worked for to the smallest places.  


    Excellent read, Matthias. (Belated birthday wishes, btw) I sympathize with your views and as well with the majority of the comments.


    As Dennis mentioned, the mindset will make or break it - taking skills and cleverness for granted of course.


DevOps denotes a cultural change, not only between developers and administrators but between all persons involved in the delivery process of software from end (idea) to end (bucks). If you will, it's about a mindset that helps businesses compete and win in our fast-paced times. And such a mindset won't do any harm at all when it comes to permanently applying the concepts you listed.


Businesses seriously need to embrace the need for change - carried by C-level executives that lead by example. Well, one can always dream. I rather share Christoph's opinion: it's more a matter of a generational change...

