It's important to understand that Windows desktops use far more IOPS in normal operation than Windows and Linux servers. Storage solutions that work well for server implementations will often not work well for desktop OS VMs. The Windows desktop OS does a lot of disk write activity relative to servers, and consumes more I/O overall.
A typical desktop PC spinning hard disk drive delivers 50-100 IOPS, and the desktop OS has access to all of that. A desktop-class solid-state drive (SSD) can deliver 5,000 or more IOPS. If your VDI goal is to deliver performance comparable to the physical PCs you're using now, you need to consider the IOPS characteristics of your current PC fleet. If you plan to deliver 30 IOPS per VM, staff who are used to a 3,000-IOPS SSD will perceive VDI as a much slower solution and will be more resistant to using it.
How many IOPS do you need?
There's no hard and fast number you can point to in the literature and say "we'll need 120 IOPS per user" and be done with it. You may be in an organization where your disk usage is relatively light, and you can architect around 5 IOPS per user. You may be in an organization where 100 IOPS per user isn't enough. The only way to know is to gather actual usage data on the physical PCs in your environment. I'll discuss how you can do that later.
Windows 7 IOPS During Boot
Given a virtually "unlimited" number of IOPS, a Windows 7 desktop will consume up to 5,200 IOPS and boot in 12 seconds (per Atlantis Computing's Windows 7 IOPS for VDI - A Deep Dive). The entire startup process will consume around 24,134 I/O operations in total. To estimate how long the boot process will take for one of your VMs, divide that figure by the number of IOPS you're planning to offer per user. If you planned 50 IOPS per user, boot times will average around 483 seconds (or 8 minutes). That's a very slow boot time! (The vast majority of I/O operations in the boot process are reads.)
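To make that division concrete, here is a minimal Python sketch. The function name and the list of per-VM IOPS budgets are my own illustration; only the 24,134-operation figure comes from the Atlantis numbers quoted above, and the model assumes the VM gets its full IOPS budget for the entire boot.

```python
# Rough boot-time estimate: total boot I/O operations divided by the
# IOPS budgeted per VM. Assumes the budget is available for the whole boot.
BOOT_OPERATIONS = 24_134  # total I/O operations for a Windows 7 boot (quoted above)


def estimated_boot_seconds(iops_per_vm: float,
                           boot_operations: float = BOOT_OPERATIONS) -> float:
    """Return the estimated boot time in seconds for a given per-VM IOPS budget."""
    if iops_per_vm <= 0:
        raise ValueError("iops_per_vm must be positive")
    return boot_operations / iops_per_vm


# Hypothetical budgets to compare; pick values that match your own plan.
for budget in (25, 50, 100, 200):
    seconds = estimated_boot_seconds(budget)
    print(f"{budget:>4} IOPS per VM: {seconds:6.0f} s ({seconds / 60:.1f} min)")
```

Running it with your own planned budget is a quick sanity check before you commit to a storage design.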
When you hear VDI administrators talk about "boot storms," they are referring to this high-IOPS load situation, when many VMs are being booted at one time.
One way to mitigate or distribute the IOPS load associated with the startup of VDI virtual machines is to stagger the timing of their startups. For example, if you have 200 users who will arrive at the office between 8am and 9am, you might pre-boot 200 VMs during the hours of 6-8am to have them ready when the users arrive. If it takes 10 minutes for each of those VMs to boot, your users will never know this because they're up and running when they arrive. As long as large numbers of users don't try to reboot their VMs during normal work hours, you could survive on a lower number of peak IOPS in your storage solution.
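As a rough illustration of why staggering helps, the sketch below averages the total boot workload over the pre-boot window. It is a simplification: it assumes the boots are spread perfectly evenly across the window and reuses the 24,134-operation boot figure quoted earlier. The function name is hypothetical, and real-world peaks will be higher than this average.

```python
# Average storage load when boots are spread evenly over a pre-boot window.
# This is a lower bound on what the storage must sustain, not a peak figure.
BOOT_OPERATIONS = 24_134  # total I/O operations per Windows 7 boot (quoted earlier)


def average_staggered_boot_iops(vm_count: int,
                                window_seconds: float,
                                boot_operations: float = BOOT_OPERATIONS) -> float:
    """Total boot operations for all VMs divided by the length of the window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return vm_count * boot_operations / window_seconds


# 200 VMs pre-booted between 6am and 8am (a 2-hour window).
average = average_staggered_boot_iops(vm_count=200, window_seconds=2 * 3600)
print(f"Average load while pre-booting: {average:.0f} IOPS")
```

Compare that average with the load of booting the same 200 VMs all at once, and the appeal of pre-booting becomes obvious.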
Determining the number of IOPS you need is a matter of understanding how many VMs you expect to boot or reboot simultaneously, and the longest boot time you're willing to accept for those VMs. If you want a 12-second boot time, plan for 5,200 IOPS per VM being booted. If you're comfortable with a 60-second boot time, plan for 1,040 peak read IOPS per VM.
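Here is one way to sketch that sizing rule in Python. It assumes, as the 12-second and 60-second figures above imply, that the peak IOPS needed per VM scales inversely with the boot time you're willing to accept. The function names and the example of 20 concurrent boots are my own, not from the Atlantis paper.

```python
# Peak boot IOPS sizing, scaled from the 5,200 IOPS / 12-second reference point.
REFERENCE_PEAK_IOPS = 5_200   # per-VM peak for the fastest boot (quoted above)
REFERENCE_BOOT_SECONDS = 12   # boot time at that peak (quoted above)


def boot_iops_per_vm(target_boot_seconds: float) -> float:
    """Per-VM IOPS needed to hit the target boot time (inverse scaling assumed)."""
    if target_boot_seconds < REFERENCE_BOOT_SECONDS:
        raise ValueError("the reference data does not cover boots faster than 12 s")
    return REFERENCE_PEAK_IOPS * REFERENCE_BOOT_SECONDS / target_boot_seconds


def boot_storm_iops(concurrent_boots: int, target_boot_seconds: float) -> float:
    """Storage IOPS needed when several VMs boot at the same time."""
    return concurrent_boots * boot_iops_per_vm(target_boot_seconds)


print(boot_iops_per_vm(60))        # per-VM requirement for a 60-second boot
print(boot_storm_iops(20, 60))     # hypothetical: 20 VMs booting together
```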
Windows 7 IOPS During Logon
Windows 7 IOPS During Application Launches
The number of IOPS consumed during software launches will vary considerably from one application to another. Atlantis Computing studied the IOPS activity of launching Microsoft Word, Outlook, and Excel. Peak IOPS during these launches were just over 450 read IOPS and just under 150 write IOPS. This should give you some idea of what to expect during normal PC usage. The bulk of this activity is read IOPS.
Windows 7 "Steady State" IOPS
Because this steady-state condition is the one your users will be experiencing most often, this is probably the most important case you'll need to analyze and plan around. This represents the time period when users are actively operating the VMs and doing "real work" with them, and you want them to be as productive during these hours as possible.
Antivirus Scanning and Its Effect on IOPS
Windows 7 Logoff and Shutdown IOPS
Shutdown peaks at 450 write IOPS and about 70 read IOPS.
The Risk in Using Published IOPS Figures
- Basic: Only 2 apps are opened simultaneously, and those are some combination of Internet Explorer, Microsoft Word, and Microsoft Outlook. These users will need 6-8 steady-state IOPS and 3GB of disk space.
- Standard: Up to 5 applications are opened simultaneously. The applications used will include Outlook, Internet Explorer, Word, Excel, 7-zip, and a PDF reader. This user is estimated to need 9-10 IOPS in steady state, and 3.75GB of storage capacity.
- Premium: Up to 8 applications are opened simultaneously, from a mix of the same application types as above. This user is estimated to need 10-15 IOPS in steady state and 6GB of disk capacity.
Let's imagine that I decide all of my users are "Premium" users and need 15 IOPS each. I'll architect a storage solution with 3,000 total IOPS for 200 users. Given all the IOPS numbers above, how will this environment perform (assuming my users in steady-state never exceed 15 IOPS)?
- Boot Time: 5,200 IOPS for each of the 200 VMs is 1,040,000 IOPS total to boot the VMs. My 3,000-IOPS solution will boot them in about 347 seconds, or just under 6 minutes.
- Login: 2,000 IOPS for each of the 200 users is 400,000 IOPS. My storage will deliver those in about 133 seconds, or 2 minutes and 13 seconds.
- Application Launches: Loading Word, Outlook, and Excel consumes around 2,000 IOPS, so launching those applications will take my users around 2 minutes and 13 seconds also, assuming they launch them all at approximately the same time.
- Antivirus Scans: Assuming Avira antivirus and 309,004 IOPS per VM, that works out to an antivirus scan time of 343 minutes per VM, or an elapsed time of nearly 6 hours. If I used Microsoft Security Essentials with its estimated IOPS load, it would take almost 16 hours to scan the VMs. Ouch!
I can't speak for your users, but I'm pretty sure that if mine had to come in and wait 10 minutes for Windows to boot, the login to finish, and their applications to launch, they'd be pretty upset.
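The arithmetic behind those bullets can be reproduced with a short Python sketch. The per-VM figures are the ones used in the bullets above. The helper name is mine, and the model simply assumes all 200 VMs hit the storage at once and share its 3,000 IOPS evenly.

```python
# Reproduces the 200-user, 3,000-IOPS scenario from the bullets above.
STORAGE_IOPS = 3_000  # total IOPS the storage solution can deliver
USERS = 200           # VMs sharing that storage


def seconds_to_serve(ops_per_vm: float,
                     vms: int = USERS,
                     storage_iops: float = STORAGE_IOPS) -> float:
    """Time for the storage to serve ops_per_vm operations for every VM at once."""
    return ops_per_vm * vms / storage_iops


boot_s = seconds_to_serve(5_200)         # boot figure used above
login_s = seconds_to_serve(2_000)        # login figure used above
apps_s = seconds_to_serve(2_000)         # Word + Outlook + Excel launch figure
antivirus_s = seconds_to_serve(309_004)  # Avira scan figure used above

# What a user waits through first thing in the morning.
morning_wait_min = (boot_s + login_s + apps_s) / 60

print(f"Boot:      {boot_s:8.0f} s")
print(f"Login:     {login_s:8.0f} s")
print(f"App start: {apps_s:8.0f} s")
print(f"Antivirus: {antivirus_s / 60:8.0f} min")
print(f"Morning wait: {morning_wait_min:.1f} min")
```

Swap in your own user count and storage IOPS to see how your design would hold up under the same loads.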
Establishing Your Own IOPS Figures
- Launch Microsoft Word. Open a 50K document. Run a spell check. Save it to disk.
- Launch Microsoft Outlook. Create an email message. Send it.
- Launch Microsoft Excel. Open a 100K spreadsheet. Add cells to it, switch between worksheets, insert a chart. Save it to disk.
- Switch back to Word and start the process over.
This activity may be a great representation of what your real-world users do, or it might be very far off the mark. For example, if your typical documents and spreadsheets are much larger than those used in the sample, your IOPS will be correspondingly higher than those of the simulated users. If you use different applications with different I/O requirements, your IOPS load may differ considerably.
The best way to estimate your own steady-state needs is to monitor your own users. One free tool for doing this is Sysinternals Process Monitor from Microsoft. Launch Process Monitor at the start of a user's session and allow it to capture activity as the user works through a normal work day. Stop the capture at the end of their work day, then open File Summary from the Tools menu.
We see that for the session captured, which in this case was 5 minutes long, there were 3,752 reads and 17,114 writes. That's a total of 20,866 operations over 300 seconds, or 69.55 IOPS. Assuming that this was a typical user in my organization, this goes well beyond the typical "Premium" user in the Cisco document who used 10-15 IOPS. Had I used Cisco's estimate to specify steady-state storage needs for my users, my virtualization project might well have failed.
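If you want to turn your own capture totals into an IOPS figure, the calculation is simple enough to script. This is a minimal sketch with a function name of my own choosing; you'd type in the read and write counts by hand from the File Summary dialog, since the code doesn't read Process Monitor output directly.

```python
# Average IOPS from a capture: (reads + writes) / capture length in seconds.
def average_iops(reads: int, writes: int, duration_seconds: float) -> float:
    """Return the average I/O operations per second over the capture."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return (reads + writes) / duration_seconds


# Totals from the 5-minute session described above.
reads, writes, duration_seconds = 3_752, 17_114, 300

iops = average_iops(reads, writes, duration_seconds)
write_share = writes / (reads + writes)

print(f"Average: {iops:.2f} IOPS")
print(f"Writes:  {write_share:.0%} of all operations")
```

The write share is worth watching too, because it shows how write-heavy a desktop session can be compared with the read-heavy boot process.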
If you're thinking that this is just a theoretical case and that real-world numbers might be much closer to the Cisco figures, let me offer some anecdotal evidence. Last year, we received a trial of Liquidware Labs' Stratusphere FIT software. FIT is designed to help you analyze the real-world usage of PCs and VMs in your environment, enabling you to size your VDI environment more accurately. We used it to analyze nearly all of our desktop PC fleet to gather performance information. We learned that in order to meet the disk needs of 60% of our PC users, we needed to budget for 60 IOPS per user. That's four times what Cisco considers a "Premium" user case.
Our real-world measurement total over a 90-day period worked out to about 39,120 steady-state IOPS for those 952 users. Had we sized our storage based on that 10-15 IOPS Premium user, we'd have been in trouble. We'd have designed for 14,280 IOPS in steady-state. By the time we'd gotten to 348 users, we'd have saturated our storage unit and still had over 600 systems to virtualize!
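Here is a short Python sketch of that comparison, using only the figures from this paragraph. The variable names are mine, and it assumes every user consumes the measured per-user average, which is a simplification.

```python
import math

# Compare storage sized from the published 15-IOPS "Premium" figure
# against the measured steady-state load for the same user population.
USERS = 952
MEASURED_TOTAL_IOPS = 39_120   # measured steady-state total over 90 days
PUBLISHED_IOPS_PER_USER = 15   # top of the published "Premium" range

measured_per_user = MEASURED_TOTAL_IOPS / USERS
designed_capacity = USERS * PUBLISHED_IOPS_PER_USER

# The user whose arrival pushes the measured load past the designed capacity.
saturation_user = math.ceil(designed_capacity / measured_per_user)
remaining_systems = USERS - saturation_user

print(f"Measured average:  {measured_per_user:.1f} IOPS per user")
print(f"Designed capacity: {designed_capacity:,} IOPS")
print(f"Saturated at user: {saturation_user}")
print(f"Still to migrate:  {remaining_systems}")
```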