So you decided to implement Virtual Desktop Infrastructure (VDI). Virtual desktops and app delivery sound sexy, but once you’ve started delving into the nitty gritty, you quickly realize that VDI has many variables. In fact, so many, that you start to feel overwhelmed.
At this point you have a couple of options. First, you can keep doing this yourself, but that will take valuable time. Second, you can hire a VDI engineer for your team, but finding a great engineer also takes time and money.
Another option is to hire a Value Added Reseller that has done VDI a hundred times. Great idea – I will love you forever, and will do great VDI for you. But I can be expensive.
One particular sticking point in VDI is the sizing of the hardware for the environment. If you undershoot the amount of compute, storage, memory or networking, you risk having unhappy users with underpowered virtual desktops. If you overshoot, you may be chastised for overspending.
Too often I have seen the user profile not properly examined, sized etc. The result is that the derived virtual desktop is low on memory or CPU. The user immediately blames the new technology, not even assigning blame to something they may have done. But the real performance problem culprit may lie somewhere else. However, the user just had his shiny physical machine taken away, and it was replaced with something intangible. Of course, all the problems, whether related to VDI or not, will be blamed on VDI, and possibly the VDI sizing. The bad buzz spreads through the company. Such buzz kills your VDI project faster than performance problems.
So, what is one way to avoid thinking about sizing? Hyper-Converged.
Hyper-Converged means a node in a cluster has a little bit of everything – compute, storage, memory, network. Each node is generally the same but there could be different types of nodes – for example, Simplivity has some nodes with everything, and some nodes only doing compute.
Since most nodes are the same, once you have figured out how many average Virtual Desktops of a specific profile fit on a node, you can just keep adding nodes for scalability.
In fact, Nutanix capitalized on that brilliantly when they announced the famous guarantee – once the customer says how many users they want to put on Nutanix, the vendor will provide enough Hyper-Converged nodes to have a great user experience. The guarantee was hard to enforce on both the customer end and Nutanix end. But the guarantee sure had lots of marketing power. Time and time again I heard it from customers and other VARs. The guarantee was a placebo for making VDI easier.
Consequently, you should not just rely on a guarantee for VDI sizing. Sizing should be verified with load simulation tools like LoginVSI and View Planner. Then, the profile of your actual user should be evaluated by collecting user experience data with a tool like Liquidware FIT or Lakeside SysTrack.
Once the data is collected and analyzed, you can decide what number of Hyper-Converged nodes to buy. Hyper-Converged makes the sizing easy because you always deal with uniform nodes.
Once you are in production, you should be monitoring user experience constantly with a tool like Liquidware UX. UX will allow you to always have a solid idea of what your user profiles do. As a result, you can confidently say, “On my Hyper-Converged node I can host up to 50 users.” Thus, if you grow to 100 users, you need 2 Hyper-Converged nodes.
Being able to say that is the holy grail of scalability. And therein lies the lure of Hyper-Converged – as a basic VDI building block. That is why Hyper-Converged companies started with strong VDI stories, and only later began marketing for Virtual Server Infrastructure.
And any technology that makes VDI easier, even by one iota, makes VDI more popular. Hail Hyper-Converged for VDI!
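To make the "50 users per node" arithmetic concrete, here is a minimal sketch of the node count calculation. The function name and the `spare_nodes` parameter are my own assumptions for illustration, not something any vendor tool provides; the users-per-node figure should come from your own assessment and monitoring data.

```python
def nodes_needed(total_users: int, users_per_node: int, spare_nodes: int = 0) -> int:
    """Return how many uniform Hyper-Converged nodes are needed.

    users_per_node is the figure you measured for your user profile.
    spare_nodes is a hypothetical extra allowance (for example, for
    failover); it is an assumption, not part of the original rule.
    """
    if total_users < 0 or users_per_node <= 0 or spare_nodes < 0:
        raise ValueError("user counts must be non-negative and users_per_node positive")
    # Integer ceiling division: a partially filled node still has to be bought.
    full_or_partial_nodes = -(-total_users // users_per_node)
    return full_or_partial_nodes + spare_nodes


print(nodes_needed(100, 50))  # 100 users at 50 per node
print(nodes_needed(120, 50))  # 120 users need a third, partially filled node
```

With 50 users per node, 100 users come out to 2 nodes, matching the example above, and 120 users round up to 3.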
For VSI, we established that using analysis tools was a necessity, and VMware provided the wonderful Capacity Planner tool. However, it soon became evident that for VDI, it is even more important to use analysis tools. That is because for VDI, when you buy hardware and software, the investment is generally higher. You need a lot more, faster storage. You need many servers and a fast network. So the margin of error is smaller.
Consequently, using Liquidware FIT or Lakeside SysTrack is essential. There are now a few more tools on the market, like ControlUp or Login PI. However, the new entrants have not been battle tested yet.
So how do you analyze your physical desktops for VDI?
First, buy a license for the Liquidware FIT tool (per user, inexpensive), or buy an engagement from your friendly Value Added Reseller or Integrator who is a Liquidware partner. If you buy a service from a partner, then usually up to 250 desktop licenses will be included with the service.
Here, I will talk about services of the partner because that is what I do. However, if you are doing this yourself, just apply the same steps.
You will need to provide your partner’s engineer with space for 2 small Liquidware virtual appliances. The only gotcha is that you want them on the fastest storage you have (SSD preferable). That is because on slower storage, it takes much longer to process any analysis or reports.
The engineer will come and install the 2 appliances into your vSphere. Then, the engineer will give you an EXE or MSI with an agent. Usually, you can use the same mechanism you already use to install software on your desktops to distribute the agent. For example, distribution tools like Microsoft SCCM, Symantec Altiris, LANDesk, and even Microsoft Group Policy will all be good. If you don’t have a mechanism for software distribution, then your engineer can use a script to install the agents on all PCs.
Make sure to choose a subset of your PCs, and at least some from each possible group of similar users (Accounting, Sales, IT, etc.). Your sample size could be about 10-25% of total user count. Obviously, the higher the analysis percentage, the more accuracy you get. But the goal here is not 100% accuracy – it’s impossible to achieve 100%. Assessment and performance analysis is an art as much as a science. Thus, you need just enough users to get a ballpark estimate of what hardware you need to buy. Also, run the assessment for 1 month preferably, or at a bare minimum 2 weeks. The collection period should be counted from the time you deploy the Liquidware agent to your last user.
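If you keep your PC inventory in a spreadsheet or script, picking the subset can be sketched as below. This is only an illustration under my own assumptions: the function name, the group names, and the PC names are hypothetical, and the 25% default is just the top of the 10-25% range mentioned above. It has nothing to do with how the Liquidware agent itself is deployed.

```python
import math
import random


def pick_assessment_sample(groups, fraction=0.25, seed=None):
    """Pick a random sample of PCs from every user group.

    groups maps a group name to a list of PC names (hypothetical data).
    Every non-empty group contributes at least one PC, so each type of
    user is represented in the assessment.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    rng = random.Random(seed)  # a seed makes the selection repeatable
    sample = {}
    for name, pcs in groups.items():
        if not pcs:
            continue  # nothing to sample from an empty group
        count = min(len(pcs), max(1, math.ceil(len(pcs) * fraction)))
        sample[name] = rng.sample(pcs, count)
    return sample


inventory = {
    "Accounting": [f"ACC-{i:03d}" for i in range(20)],
    "Sales": [f"SAL-{i:03d}" for i in range(40)],
    "IT": ["IT-001", "IT-002", "IT-003"],
}
for group, pcs in pick_assessment_sample(inventory, fraction=0.25, seed=1).items():
    print(group, len(pcs))
```

Rounding up per group slightly overshoots the target percentage for small groups, which is a deliberate choice here: a small department is better over-represented than missing from the data.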
Your partner engineer will need remote access, if possible, to check on the progress of the installation. First, the engineer will check if the agents are reporting successfully back to the Liquidware appliances. During the month, the engineer will make sure agents are reporting and data can be extracted from the appliance.
In the middle of the assessment, the engineer will do a so-called “normalization” of the data. That is to make sure the results are compatible with rules of thumb for analysis. If necessary, the engineer will readjust thresholds and recalculate the data back to the beginning.
At the end of 30 days, the engineer will generate a machine-made report on the overall performance metrics, and will present the report to you.
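Since the collection clock starts only when the last agent is deployed, it can help to work out the report date up front. A minimal sketch, assuming you have the agent deployment dates on hand; the function name and the dates are made up for illustration.

```python
from datetime import date, timedelta


def assessment_window(agent_deploy_dates, days=30):
    """Return (start, end) of the data collection period.

    The period starts on the day the last agent was deployed.
    days defaults to 30; anything under 14 is rejected because two
    weeks is the bare minimum suggested for an assessment.
    """
    if not agent_deploy_dates:
        raise ValueError("at least one deployment date is required")
    if days < 14:
        raise ValueError("run the assessment for at least 14 days")
    start = max(agent_deploy_dates)  # the last agent to go out
    return start, start + timedelta(days=days)


# Hypothetical deployment dates for three batches of PCs.
start, end = assessment_window([date(2024, 3, 1), date(2024, 3, 5), date(2024, 3, 4)])
print(start, end)
```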
At some partners, for an extra service price, the engineer will go further, and will analyze the report for the amount and performance parameters of hardware you need. In addition, the engineer will create a written report and present all the data to you.
In either case, you will know which desktops have the best score for virtualization, and which ones you should not virtualize. If you go with more advanced report services from your partner, then you will also understand how to map the results to hardware and further insights.
One way of mitigating bad VDI sizings is to also use a load simulation tool like LoginVSI. However, LoginVSI is only useful for clients who can afford to buy similar equipment for the lab that they will buy for production. Using LoginVSI, you can test robotic (fake) users doing tasks that normal users will do in VDI. LoginVSI allows you to have a ballpark hardware number that is good. However, the LoginVSI number does not have real user experience data. For that, you need tools like Liquidware FIT and associated work to determine proper VDI strategy.
Understanding what your current user experience is, and also how that experience could be accommodated with virtual desktops is essential to VDI. You should do this assessment before buying your hardware. Doing an assessment ensures that your users get the same experience or better on the virtual desktop as they have on the physical desktop (the holy grail of VDI).
It is now a common rule of thumb that when you are building Virtual Server Infrastructure (VSI), you must assess your physical environment with analysis tools. The analysis tools show you how to fit your physical workloads onto virtual machines and hosts.
The gold standard in analysis tools is VMware’s Capacity Planner. Capacity Planner used to be made by a company called AOG. AOG analyzed more than physical to virtual migrations; it did overall performance analysis of different aspects of the system. AOG’s tool was one of the first agentless data collection tools. Agentless was better because you did not have to touch each system in a drastic way, so there was less chance of drivers going bad or of a performance impact on the target system.
Thus, AOG partnered with HP and other manufacturers, and was doing free assessments for their customers, while getting paid by the manufacturer on the backend. AOG tried to sell itself to HP, but HP, stupidly, did not buy AOG. Suddenly, VMware came from nowhere and snapped up AOG. VMware at the time needed an analysis tool to help customers migrate to the virtual infrastructure faster.
When VMware bought AOG, VMware dropped AOG’s other analysis business, and made AOG’s product a free tool for partners to analyze migrations to the virtual infrastructure. It was a shame, because AOG’s tool, renamed to Capacity Planner, was really good. Capacity Planner relies solely on Windows Management Instrumentation (WMI) functions that are already built into Windows and are collecting information all the time. Normally, WMI discards information like performance, unless it is collected somewhere by choice. Capacity Planner just enabled that choice, and collected WMI performance and configuration data from each physical machine.
When VMware entered the Virtual Desktop Infrastructure (VDI) business with Horizon View, it lacked major pieces in the VDI ecosphere. One of the pieces was profile management, another piece was planning and analysis, another piece was monitoring. Immediately, numerous companies sprang to life to help VMware fill the need. Liquidware Labs (where the founder worked for VMware) was the first to come up with a robust planning and analysis tool in Stratusphere FIT, then with a monitoring tool in Stratusphere UX. Lakeside SysTrack came on the scene. VMware used both internally, although the preference was for Liquidware.
Finally, VMware realized that the lack of a VMware-made analysis tool for VDI was hindering them. But what they failed to realize was that such a tool had already existed at VMware for years – Capacity Planner. The Capacity Planner team was neglected, so updates to the tool were rare. However, since Capacity Planner could already analyze physical machines for performance, it was easy to modify the code to collect information on virtualizing physical desktops, in addition to servers.
Capacity Planner code was eventually updated with desktop analysis. All VMware partners were jumping with joy – we now had a great tool and we did not have to learn any new software. I remember that I eagerly collected my first data, and began to analyze the data. After analysis, the tool told me I needed something like twenty physical servers to hold 400 virtual desktops. Twenty desktops per server? That sounded wasteful. I was a beginner VDI specialist then, so I trusted the tool but still had doubts. Then I did a few more passes at the analysis, and kept getting wildly different numbers. Trusting my gut instinct, I decided to redo one analysis with Liquidware FIT.
Of course, Liquidware FIT has agents, so I used it, but always thought that it would be nice not to have agents. So VMware’s addition of desktop analysis to agentless Capacity Planner was very welcome. So, back to my analysis, after running Liquidware FIT, I came up with completely different numbers. I don’t remember what they were – perhaps 60 desktops per physical server, or something else. But what I do remember was that Liquidware’s analysis made sense, where Capacity Planner did not. My suspicions about Capacity Planner as a tool were confirmed by VMware’s own VDI staff, who, when asked if they use Capacity Planner to size VDI said, “For VDI, avoid Cap Planner like the plague, and keep using Liquidware FIT.”
As a result, I have kept using Liquidware FIT ever since, and never looked back. While FIT does have agents, now I understand that getting metrics like Application load times and User Login delay is not possible without agents. That is because Windows does not include such metrics in WMI. Therefore, a rich agent is able to pick up many more user experience items, and thus do much better modeling.
Greetings CIOs, IT Managers, VM-ers, Cisco-ites, Microsoftians, and all other End-Users out there… Yury here. Yury Magalif. Inviting you now to take another virtual trip with me to the cloud, or at least to your data center. As Practice Manager at CDI, your company is depending on my team of seven (plus or minus a consultant or two) to manage the implementation of virtualized computing including hardware, software, equipment, service optimization, monitoring, provisioning, etc. And you thought we were sitting behind the helpdesk and concerned only with front-end connectivity. Haha (still laughing) that’s a good one!
VDI: OUR JOURNEY BEGINS HERE
Allow me to paint a simple picture and add a splash of math to illustrate why your CIO expects so much from me and my team. Your company posted double-digit revenue growth for three years running and somehow, now, in Q2 of year four, finds itself in a long fourth down and 20 situation. (What? You don’t understand American football analogies? Okay, in the international language of auto-racing, we are 20 laps behind and just lost a wheel.) One thousand employees need new laptops, docking stations, flat panel displays, and related hardware. Complicating the matter are annual software licensing fees for a group of 200 but with only five concurrent users worldwide. At $1,500 per user times 1,000, plus the $100 fee, your CIO has to decide how to explain to the board a plan to spend another 1.5 million dollars on IT just after Q1 closed down 40 percent and Q2 is looking to be even worse.
To read the rest of this blog, where I try something different, please go to my work blog page:
“At the moment, vSphere Client does not support the renaming of virtual disks”
How do you get around the message?
- Look up the name of your Datastore and your VM in the GUI.
- Start the SSH service.
- Log in as root to your ESXi host.
- In an SSH session, type the following commands. Substitute the name of your Datastore for STORAGENAME and your VM for VMNAME.
- cd /vmfs/volumes/STORAGENAME/VMNAME
- Substitute the name of your old VMDK for OLDNAME and your new VMDK for NEWNAME. Remember – everything is case sensitive.
- vmkfstools -E ./OLDNAME.vmdk ./NEWNAME.vmdk
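If you rename disks often, you may want to generate the two commands above instead of typing them by hand. Here is a minimal sketch that only builds the command strings from the four names; it does not connect to the host or run anything. The function name and the example names (`datastore1`, `web01`, `web01_old`) are hypothetical.

```python
import shlex


def build_rename_commands(datastore, vm, old_name, new_name):
    """Build the cd and vmkfstools -E commands for renaming a VMDK.

    Names are used exactly as given because everything is case
    sensitive. Pass the VMDK names without the .vmdk extension.
    """
    names = {"datastore": datastore, "vm": vm,
             "old_name": old_name, "new_name": new_name}
    for label, value in names.items():
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty name without '/'")
    if old_name == new_name:
        raise ValueError("old and new VMDK names are identical")
    vm_dir = f"/vmfs/volumes/{datastore}/{vm}"
    old_path = f"./{old_name}.vmdk"
    new_path = f"./{new_name}.vmdk"
    # shlex.quote protects names that contain spaces or shell characters.
    return [
        f"cd {shlex.quote(vm_dir)}",
        f"vmkfstools -E {shlex.quote(old_path)} {shlex.quote(new_path)}",
    ]


for command in build_rename_commands("datastore1", "web01", "web01_old", "web01"):
    print(command)
```

Paste the printed lines into your SSH session after checking them against the names shown in the GUI.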