Wednesday, May 27, 2015

Implementing a Locked-Down Desktop

When my organization used Windows 95/98 systems, there wasn't a good solution for locking down the desktop.  Because of that, we had all kinds of problems.  Our users would do things like:
  • Download and install software they shouldn't (e.g., trial versions, products from home, unsupported tools, drivers).
  • Get malware infections that spread from one machine to another.
  • Uninstall or disable their antivirus software because they thought it slowed down the PC.
  • Uninstall or disable their firewall.
  • Modify security settings for Windows itself, making malware infection more likely.
  • Bring devices and drivers in from home that crashed the company PC.
When we switched to Windows NT 4.0, we made the decision to lock down the desktop and not grant users administrator access.  This decision implied a number of changes:
  • Users would have to call us for things that they might have done for themselves in the past.
  • We would have to install all software, updates, and drivers for the users.
  • Some staff would need to be granted power user or administrator access due to job specific needs.
  • Some staff would complain that they no longer had the freedom they had enjoyed with Windows 9x.
At the same time we implemented the lock-down, we also implemented remote control software to allow our help desk staff to actually see the customer's problem and help them without having to run to their offices.  In the end, fewer than 10% of the staff could justify the need for administrator access for business reasons.  A few of the remaining 90% complained about a loss of flexibility, but most were happy to be able to just call the help desk and get things fixed for them.

Some of the benefits we've accrued from the lockdown include:
  • Fewer malware infections overall
  • Because staff run from standard user accounts, malware is typically confined to a single user profile when it does infect a system, making cleanup easier
  • Certain infections, like the Conficker worm that spread from one machine to another, barely registered on our radar.  (I think maybe one or two systems got the infection, but it didn't spread.)
  • Since users can't change security settings, systems retain their secure default configuration
  • License compliance improved significantly, since users could not install "any old thing" anymore
  • Although the number of help desk calls went up briefly as users adjusted to the situation, the locked-down configuration actually reduced help desk calls by eliminating some of the sources of past problems (such as users downloading the wrong drivers and installing them, or applying incompatible software updates)
  • Support became easier, because you could assume (within reason) that every machine was configured to the corporate standard, used a consistent set of drivers, etc.
When I talk with colleagues at other companies, I'm often forced to shake my head in sympathy.  They tell me how a user installed infected software off the web that corrupted the machine so badly it had to be reimaged.  They talk about how a software audit left them in trouble because users installed things the company wasn't licensed for.  They talk about hackers getting in because someone opened an infected attachment that gave the hackers administrator control over the box.  They tell me about long nights spent cleaning up any number of problems.  I don't have many stories to share with them, because we don't have those kinds of problems.

A locked-down desktop isn't a panacea.  You do still wind up with issues you have to deal with:
  • Although Windows 7 prevents standard users from installing software into the Program Files directories and Windows itself, users can still install software into their own user profiles.
  • Software that may require licensing, but is delivered in a zip file, can be used without installing it.
  • Malware still gets in, and gets past antivirus - it's just (usually) easier to clean up
  • Some staff need applications that require administrator access, and you have to work around those, sometimes by adjusting file/registry permissions, sometimes by providing ways to launch the application with administrator permissions
  • Some staff try to fight the lock-down by bringing in devices from home (e.g., Macs or personal laptops), and you have to be prepared to deal with this - especially if the devices are connected to the corporate LAN
  • Despite corporate policies to the contrary, staff will still bring in USB devices with portable applications, portable hard drives, and other peripherals from home
  • No matter how flexible you try to be and how much power you offer users short of administrator access, there will always be some staff who complain that they would be "much more productive" if they had administrator access.  If they can demonstrate a true business case for it, our IT Security people will usually approve the request, so this is more a nuisance than a real problem.
If you are considering locking down your corporate desktops, here are some things you'll want to think about and do:
  • Determine the process and criteria for justifying the need for administrator access.  Under which conditions will you provide this access to your staff?  Who will approve the requests?  How will you create and name administrator accounts to distinguish them from standard accounts?  How, and how often, will you review the accounts and the need for them?
  • Consider disabling auto-update features in applications, as users may no longer be able to apply the updates.  If they are unable to apply the updates, this can cause annoyance (from the constant alerts to update) and frustration (at the inability to apply the updates).
  • Test all of your applications under a standard user account to ensure that they run properly.  You may find that you need to adjust file, folder, or registry permissions to get them working.  You might even have to install them outside the Program Files folders.  Failure to test the software properly could quickly kill your lock-down effort.  (Sysinternals Process Monitor can be a big help in identifying where permissions changes may be needed; see the sketch after this list.)
  • Test all of your standard peripherals and devices to ensure that they work properly.  Few devices trigger a UAC prompt for administrator credentials, but they do exist.
  • Establish a policy that administrator accounts should only be used as needed.  Web browsing, reading email, and other mundane activities should be conducted from a standard user account to avoid system-wide infection.  Decide how your organization will handle violations to this policy.
  • Consider implementing application control and privilege management software, such as Arellia, Avecto, or Dell Privilege Manager.  These products can prevent users from installing and running unauthorized software, and can also eliminate the need for many administrator accounts by automatically elevating specific applications to run with administrator rights for users who have only standard accounts.
  • For users who will be granted administrator accounts, how will you audit what those users do with the accounts?
  • How will you implement the lockdown?
    • Will you give users a standard account and an administrator account, then expire the administrator account after some period of time?
    • Will you deploy the lockdown in conjunction with an operating system upgrade (e.g., Windows 7 to Windows 10)?
    • Will you implement the lockdown gradually across the organization, or all at once?
    • How will you notify the users about the lockdown?
    • How will you identify and troubleshoot application problems resulting from the lockdown versus those occurring normally because of issues with the software?
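As an example of the permissions adjustments mentioned in the application-testing bullet above, here is a minimal sketch in Python that drives the built-in Windows icacls tool.  The folder paths and application name are hypothetical placeholders; identify the real locations with Process Monitor first, and treat this as a last resort when the vendor offers no better option.

    import subprocess

    # Hypothetical paths -- substitute the folders that Process Monitor shows
    # the application writing to when run under a standard user account.
    APP_DATA_DIRS = [
        r"C:\Program Files\LegacyApp\Config",
        r"C:\ProgramData\LegacyApp\Logs",
    ]

    def grant_users_modify(path):
        """Grant the built-in Users group Modify rights on a folder, inherited
        by its files (OI) and subfolders (CI), using the standard icacls tool."""
        subprocess.run(["icacls", path, "/grant", "Users:(OI)(CI)M"], check=True)

    for folder in APP_DATA_DIRS:
        grant_users_modify(folder)
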
You may identify other issues specific to your organization that you want to test or document before implementing a lockdown.

Wednesday, May 20, 2015

Should you use software auto update features?

A lot of security advice I've seen recently urges companies to enable auto-update mechanisms built into the software they use, such as the auto-update services for Adobe Flash Player and Oracle's Java Runtime Environment.  The reasoning behind this advice is sound.  It tends to be along the lines of "Once you turn on this mechanism, you won't have to worry as much.  The service will keep your PCs on the latest release, with all the latest bug fixes and anti-exploit technology, so you'll be much safer."  While I agree with this thinking in principle, I'm not sure that I'm as comfortable with it in practice.

Every few days, we hear about another security breach.  The web servers at Company X have been hacked, and the bad guys have stolen account information, passwords, credit card numbers, or personal data.  Sometimes the bad guys break into a site, like a popular blog or news site, not to steal anything from the site but to spread malware onto visitors' computers.  It's this point that makes me leery of auto-update tools.  Why?

Imagine that you're the system administrator responsible for 2,000 PCs.  You decide to make your job a little easier by allowing Adobe Flash Player to update itself as new releases become available.  With all due respect to Adobe, we've all seen how frequently the bad guys have been able to find exploits in Flash Player, Adobe Reader, and other products.  This is the same company that's maintaining a web site and/or FTP site to distribute Flash Player updates.  If the bad guys find a chink in Adobe's armor and manage to get their garbage into the auto-update repository, what happens next?  Your 2,000 PCs reach out to Adobe's site, find a new "update" and deploy it immediately.  Suddenly, all 2,000 of your PCs are infected and doing the bidding of that malware author.  That's a cleanup job none of us wants to tackle.

Now, let's imagine that the vendor's security is rock-solid and no one ever breaks into their update sites.  You're still managing that network of 2,000 PCs.  Let's imagine we're looking at the Oracle JRE now.  You've turned this on to allow automatic updates.  Overnight, Oracle releases a new JRE.  Your 2,000 machines dutifully download and install this update.  This time it's a perfectly malware-free update.  The problem is, you've got critical internal applications written in Java that are broken by a security fix contained within this new JRE.  Staff who don't use that application are still having trouble, because it's blocking certain Internet sites they use from running Java applets.  The help desk is flooded with calls and you're looking like the bad guy.  Yet again, you have another mess to clean up.

Even when neither of the above situations happens, these mechanisms present some other issues:


  • There are probably a hundred different applications in use at our company which have some kind of auto-update option, including products from Adobe, Mozilla, Google, Microsoft, and others.  If all these were enabled and running, PCs would be constantly polling for updates, consuming some amount of CPU, disk, and network resources.  All of this would be in addition to the components in our patch management system.
  • If you're following good security practice, your end users generally don't have administrator access.  Some of these auto-update mechanisms assume that the user does have administrator permission.  If you enable auto-updates for these users, they'll have to call your help desk every time an update comes out, so that it can be installed for them.
  • Although many of these auto-update mechanisms randomly disperse their update-checking times (e.g., PC1 might check at 3:00am, PC2 at 3:05am, and PC3 at 3:06am), some do not.  This means that all of your PCs will reach out at once to the vendor's site to check for, download, and apply the update.  If this happens during work hours, your network bandwidth could take a sizable hit.  (A sketch of this staggering idea follows this list.)
  • We've all seen stories about how one update or another has caused a problem.  A Windows Update might cause a blue-screen at reboot.  An antivirus update might cause the AV product to crash or classify itself as malware.  A bug in the update might cause the application to stop working.  When this happens to a few of your users, it's a nuisance.  When it happens to 2,000 at once, it's a crisis.
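To illustrate the staggering idea mentioned above, here is a minimal sketch of the kind of randomized scheduling a well-behaved updater (or your own deployment script) might use so that 2,000 machines don't all poll the vendor at the same moment.  The four-hour window and the machine name are arbitrary assumptions, not anything a particular vendor's updater actually does.

    import random
    import time

    CHECK_WINDOW_SECONDS = 4 * 60 * 60   # spread checks across a 4-hour window

    def wait_with_jitter(machine_name):
        """Sleep for a per-machine random offset before checking for updates,
        so the whole fleet doesn't hit the update server simultaneously."""
        # Seed from the machine name so each PC gets a stable but different slot.
        offset = random.Random(machine_name).uniform(0, CHECK_WINDOW_SECONDS)
        print(f"{machine_name}: checking for updates in {offset / 60:.0f} minutes")
        time.sleep(offset)
        # ... contact the update server here ...

    wait_with_jitter("PC0042")
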


I'm not suggesting that organizations should never, under any circumstances, turn on auto-updating.  All I'm saying is that you want to analyze and weigh that decision carefully.  Some of the factors you might want to consider include:


  • Do I think it's likely that this vendor's web or FTP sites could become compromised and used to distribute malware?  If so, it may be wise not to enable auto-updates.
  • Are the vendor's update installers digitally signed, or published with cryptographic hashes, so that even if a malware operator gets access to the update distribution point, it won't do them any good because their code will fail the signature or hash check?  (A minimal hash check is sketched after this list.)
  • How are update checks scheduled in the product?  Can we stagger them out so that the entire fleet of PCs isn't trying to update the software at one time?
  • Do the update installations require administrator access?  Do our users have that access?  If they don't, do we really want to field the calls this will create?
  • Is the risk of infection from an exploit of the product (e.g., Flash Player) greater to us than the risk of the distribution point being compromised by the bad guys?
  • Do I have the necessary tools, resources, and procedures to handle updating this software in a timely manner already?  If so, it may be better to use my existing tools.  If resources are limited, the potential risks of using auto-updates may be less than the costs of distributing the updates myself.
  • Are there corporate policies or regulations that require me to enable auto-updates, or require all updates to be applied to systems within a certain timeframe?  Can I meet that requirement without auto-updating?
  • How often are you seeing malware infections that exploit this particular software?  If you're seeing such infections daily, auto-updating might help reduce the frequency.  If you're rarely seeing the software exploited, you might leave auto-updating off.  (Note that this isn't always easy to know, because antivirus vendors don't always share information about the vulnerability a piece of malware is exploiting - and they may not even know for sure.)
  • How often do updates appear for this product?  You may decide that, over time, the effort of deploying every update for a given product yourself is still outweighed by the potential effort of cleaning up a hypothetical infection distributed through the auto-update mechanism.  Or perhaps the updates are so infrequent that you'd rather not be bothered with deploying them yourself.
  • Do you have large numbers of machines that are frequently disconnected from your LAN?  If so, auto-updating may be a valuable way to keep them current.  If all your machines are always connected and reachable, your existing patch management tools may serve them just as well.
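As a sketch of the hash-checking idea raised above: before staging an installer you've downloaded from a vendor, compare its SHA-256 hash against the value the vendor publishes through a separate, trusted channel.  The file name and expected hash below are placeholders only.

    import hashlib

    def sha256_of(path):
        """Compute the SHA-256 hash of a downloaded installer."""
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Placeholder values -- use the vendor's published hash for the real file.
    INSTALLER = "flashplayer_update.exe"
    EXPECTED = "0123456789abcdef..."   # from the vendor's site or advisory

    if sha256_of(INSTALLER).lower() == EXPECTED.lower():
        print("Hash matches -- safe to stage for deployment")
    else:
        print("Hash mismatch -- do NOT deploy this file")
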


You may have other criteria you wish to consider in your evaluation.

It's also important to note that auto-update mechanisms usually aren't an "all or nothing" or "every PC or no PC" option.  You might conclude that it's better to enable automatic updates for your overseas staff and remote workers, to ensure that the updates go to them in a timely manner - while turning them off for the bulk of systems inside your headquarters.  You might decide that it's worthwhile to have auto-updates enabled for Flash Player because it's a common infection vector in your organization and the risk of a problem is low, while disabling them for Java because you have too many applications that are dependent on specific older releases of the software.

In my organization, we have a centralized patch management tool which can deploy patches for any software we use.  We use that patch management tool to deploy updates to our entire fleet of PCs and virtual machines.  The most serious patches are deployed as soon as possible after release.  Slightly-less-critical patches are deployed within 48 hours of release.  Patches with low criticality are deployed gradually over a period of a week or two from release.  We disable most auto-update mechanisms by default, to reduce the incidence of software compatibility, infection, or crash problems, but do enable some auto-updates.  For example, sales staff, remote workers outside our main offices, and machines not joined to our Windows domain have auto-updates enabled for certain software that is often exploited.  These machines are typically harder for us to patch with our normal tools and processes, so the auto-update mechanisms enable us to keep those machines current even when we can't reach them.  Antivirus updates are automatic, too, to maximize the protection they provide.

My point isn't that auto-update mechanisms are inherently bad or good, or that they should always or never be activated.  I'm only suggesting that before you enable one, you take a serious look at the potential risks, costs, and benefits of enabling it.  If you decide that the risk and cost are low relative to the benefit, then by all means enable it.  If not, turn it off.

Wednesday, May 13, 2015

Atlantis Computing ILIO for VDI Storage

In 2014, my employer asked me to look into the Atlantis Computing ILIO product.  There's not a lot out there on the Internet about it.  In this post, I want to share some of what I learned about ILIO during an extended trial period with the software.

What is ILIO?

Atlantis ILIO is a software product designed to reduce physical disk I/O and storage requirements associated with Virtual Desktop Infrastructure (VDI) implementations.  It achieves these goals by acting as a software layer between the virtual machine's host (in my case VMware ESX) and the physical disk storage (such as NAS or SAN).  As VMs read and write to their virtual hard disks, the hypervisor passes those I/O requests to ILIO, which performs in-line deduplication and compression of the data.

There are two ILIO variants:  Diskless and Persistent.  A Diskless ILIO instance uses RAM to store all the VM hard disk information.  If the ILIO VM is rebooted, the VMs within it are lost.  Diskless ILIO is a good solution for non-persistent VDI implementations where users have no expectation of VM changes being retained across sessions.  Persistent ILIO uses physical storage to retain the compressed and deduplicated data permanently.  It's a better fit for persistent VDI implementations, where VM configurations are intended to be retained for longer periods of time.

A Diskless ILIO instance can use physical storage for the "SnapClone" feature.  The VM stored in this disk space is updated every four hours from the in-RAM image.  In the event that the instance is shut down or fails, the disk-based image can be used to restore the VM to operation quickly.

How Does ILIO Work?

If you think about it, there are many, many redundant files stored on the physical PCs and VMs on your corporate LAN.  For instance, every Windows 7 PC with Microsoft Office contains a full copy of the Windows 7 OS and the files associated with Office.  If you virtualized all those PCs into persistent VMs, you'd need enough storage to keep that many copies of Windows and Office.  If five of these VMs were powered on, each one would read the same Windows 7 startup files from their respective virtual hard disks.  That's five times each program and DLL is loaded from the physical disk.

ILIO sits between the Windows VM and the physical disk.  It stores only one copy of all those DLLs and programs.  If five VMs are booted with ILIO in the middle, ILIO loads one copy of those files from the physical disk. It then caches that in RAM and feeds it to the other four VMs when they ask for it.  Reading a file from RAM is much, much faster than reading it from disk.  ILIO therefore reduces physical I/O (keeping only one copy of the file on disk and only reading it once) and speeds up I/O for the virtual machines (by delivering data from RAM as often as possible).
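To make the idea concrete, here is a toy illustration of the two tricks described above: content-addressed deduplication, so identical blocks from different VMs are stored only once, and serving repeat reads from memory rather than physical disk.  This is my own simplification for illustration, not Atlantis's actual implementation.

    import hashlib

    class ToyDedupStore:
        """Toy block store: identical blocks are kept once, keyed by hash,
        and served from RAM on subsequent reads."""

        def __init__(self):
            self.blocks = {}    # hash -> block data (the single stored copy)
            self.vm_maps = {}   # (vm, block number) -> hash

        def write(self, vm, block_no, data):
            key = hashlib.sha256(data).hexdigest()
            self.blocks.setdefault(key, data)      # store each unique block once
            self.vm_maps[(vm, block_no)] = key

        def read(self, vm, block_no):
            return self.blocks[self.vm_maps[(vm, block_no)]]

    store = ToyDedupStore()
    boot_block = b"identical Windows 7 startup data"
    for vm in ["vm1", "vm2", "vm3", "vm4", "vm5"]:
        store.write(vm, 0, boot_block)

    # Five VMs wrote the "same" block, but only one copy is kept.
    print(len(store.blocks))   # -> 1
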

ILIO is implemented as a Linux virtual machine running on the VMware host.  This Linux VM is configured to look like a data storage device to the VMware host.  As far as ESX is concerned, the ILIO VM is just another disk it can write to.  Configuring the ILIO VM is done through the ESX console at a text-based interface on the VM itself.

ILIO VMs communicate with an "ILIO Center" VM that provides a web-based management and monitoring interface to the product.  This monitor provides not only status information, but allows you to see I/O activity and how ILIO is minimizing that.  You can also use the web interface to fast-clone VMs within ILIO.

Does ILIO Deliver on Its Promises?

When Diskless ILIO was used to provision virtual machines, we saw a reduction in provisioning time from 16 minutes on our NAS system to 23 seconds with ILIO.  That's a reduction of approximately 98%.  That means the difference between cloning a group of VMs in minutes versus hours.

ILIO reduced physical read I/O on our NAS by approximately 90%.  Write I/Os were reduced by 70%, and overall I/O was reduced by 80%.  This reduction in I/O activity is like increasing the read/write capacity of your physical storage by 4-5x.
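As a quick check on that claim, the effective capacity gain is simply the reciprocal of what's left after the reduction:

    def effective_capacity_multiplier(io_reduction):
        """If the layer absorbs a given fraction of I/O, the back-end array
        effectively gains this much capacity."""
        return 1 / (1 - io_reduction)

    print(effective_capacity_multiplier(0.80))   # overall I/O: 5.0x
    print(effective_capacity_multiplier(0.90))   # reads alone: 10.0x
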

ILIO reduced the physical storage requirements for nearly-identical VMs by 95%.

ILIO reduced the "boot to desktop" times for VMs from 53 seconds to under 30 seconds.  (Bear in mind that our servers were not purpose-built for VDI and aren't ideally suited to it, or boot times might have been significantly better.)

ILIO increased disk throughput for our individual VMs from a maximum of approximately 25 MB/second on SAN storage to approximately 80MB/second.  Diskless ILIO achieved 120MB/second storage performance on our hardware.

ILIO increased total IOPS on our NAS storage from 5,000 to around 18,000.  Write IOPS were increased from around 1,000 to 15,000.  Write response times decreased from 13ms for NAS to under 1ms for NAS with ILIO.

Overall VM performance scores in the PCMark7 benchmark increased from 1642 for a NAS-hosted VM to 2249 for a NAS+ILIO hosted VM, and a maximum of 2396 for a Diskless ILIO VM.

In my experience, ILIO delivered on every promise made by Atlantis Computing.

What Else Should I Know About ILIO?

ILIO instances require at least 32GB of server RAM.  If your servers aren't equipped with that much extra memory, you'll want to consider the cost of adding RAM to them.  Generally speaking, a Diskless ILIO instance needs twice the RAM of a Persistent ILIO instance, since it has no disk to which it can offload data that isn't actively needed.

ILIO instances require at least one physical CPU to be reserved for them, in order to ensure optimum performance.  When running lots of I/O activity, an ILIO instance may consume an additional CPU core.  This is not unreasonable, but is something you need to consider when designing with ILIO as a solution.  You may have to give up a couple of cores to get the performance you want.  That reduces the number of VMs you can deploy per host.

ILIO instances and the VMs they host must run on the same server.  This doesn't rule out things like VMware Distributed Resource Scheduling (DRS) or High Availability (HA), but it does add some complexity to those activities.  VMs must "move" with their ILIO instance.

VMs are completely dependent on their ILIO instance.  If the ILIO instance crashes or reboots, the VMs hosted within that instance stop working (just like having a NAS box crash or become unplugged in a non-ILIO situation).

In my limited experience with ILIO, it seemed somewhat picky about its network configuration.  Seemingly minor changes to the network configuration (e.g., changing the subnet or IP address) seemed to cause the instance to stop working.  This required some troubleshooting at the Linux command prompt to resolve.  Getting the ILIO VMs to talk with the ILIO Center management console seemed problematic at times, also.  This may have been my inexperience with it.

A single ILIO instance seems to deliver the best performance when it's servicing a nearly-identical set of VMs.  This means that you might need multiple ILIO instances per host when running groups of differently configured VMs.

In our environment, a 64GB RAM Diskless ILIO instance was able to house 79 VMs with our standard Windows 7 configuration.  A 32GB RAM Persistent ILIO instance with 150GB of physical storage housed the same number of VMs.

ILIO becomes a point of failure in your VDI environment, just like any other technology you implement.  And because ILIO deduplicates and compresses the data stored on physical disk, any corruption of that physical disk is likely to affect a larger share of the stored VM data than it would with conventional storage.

Conclusion

Atlantis ILIO promises to exchange some CPU, RAM, and disk space on your hypervisor host for greatly improved IOPS and reduced disk storage requirements.  In these respects, it seems to deliver on those promises.  We saw significant reductions in disk usage and corresponding increases in disk IOPS regardless of the back-end storage used.  

In terms of total IOPS, EMC Isilon storage combined with an Avere edge filer delivered performance comparable to ILIO with NAS: approximately 17,000 total IOPS for Isilon+Avere versus around 18,000-19,000 IOPS for ILIO+NAS.  How the costs of those two solutions compare, I can't say.  However, since the Isilon+Avere solution doesn't require a VM on the host and behaves like any other shared storage, it wouldn't complicate HA and DRS activity in the VMware environment.

I like Atlantis ILIO quite a bit.  It delivers on its performance promises, even on the relatively old hardware we tested it with.  It was relatively easy to set up and use, and didn't crash or fail during the time we tested it.  If you need to accelerate your VDI storage and the ILIO design fits in with your VDI architecture, I'd recommend checking it out.  They offer free trials, which should give you the opportunity to see how it works for you.

Wednesday, May 6, 2015

Choosing Storage for a VDI Implementation - Part 4 - Selecting a Storage Type

In the previous three posts, we discussed how to identify the number of IOPS you will need for your VDI implementation, and how to begin estimating the capacity of storage you'll need to house your VMs.  Now that you know "how much" and "how fast" your storage should be, you're ready to look at storage solutions that meet those needs.

In this post, I'll try to cover some of the many storage options available on the market.  For each one, I'll try to explain what to expect from that solution with respect to its performance, its management requirements, its scalability, and its cost.  Bear in mind that this isn't meant to be an exhaustive analysis of every possible storage option, nor is it meant to convey that you must choose a specific type or vendor.  I'm providing some general guidelines and information that you'll want to consider when making your own analysis of the solution that best meets your performance, capacity, and cost requirements.

When you select storage for your VDI implementation, you should consider:

  • Performance:  Plan for storage that is dedicated to VDI if possible.  Mixing VDI and non-VDI workloads often results in performance issues.
  • Management:  Decide how your storage solution will be managed.  Who will be responsible for it?  What are their skills and experience levels with storage management?  This will help you choose a solution that you have (or can quickly develop) the skills to manage.
  • Scaling:  Try to estimate the eventual number of VDI desktops you will implement.  Some storage solutions work well for small VDI implementations but fail for large implementations. Some are too costly for small implementations but become cheaper for large environments.
  • Cost:  Consider the above criteria in conjunction with the cost per VDI desktop of the storage solution.  A solution that performs 10% better but costs 50% more may not be a good choice.  A solution that is economical for a 2,000 desktop implementation might be too expensive for a smaller implementation.
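The cost comparison in the last bullet is easiest to reason about on a per-desktop basis.  Here is a trivial sketch, with entirely made-up figures, of the kind of comparison to run for each candidate solution at your own scale:

    # Hypothetical figures -- substitute real quotes and your own desktop counts.
    solutions = {
        "Traditional array": {"acquisition": 250_000, "desktops": 500},
        "Converged storage": {"acquisition": 400_000, "desktops": 2_000},
        "All-flash array":   {"acquisition": 600_000, "desktops": 2_000},
    }

    for name, s in solutions.items():
        per_desktop = s["acquisition"] / s["desktops"]
        print(f"{name}: ${per_desktop:,.0f} of storage cost per desktop")
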
Below are some of the storage options in the marketplace at the time of this writing (early 2015).  For each, we'll look at performance, management, scale, and cost relative to the others.  
  • Traditional Enterprise Storage Arrays:  These provide decent performance but can become overwhelmed with the number of writes needed by Windows desktop OS VMs.  Management of these arrays is often complex and may be too much for VDI teams to add to their workload.  They are expensive to acquire and operate.  For small deployments in organizations where the storage already exists, they may be the only option available.  They're best suited to VDI implementations with fewer than 500 clients, or deployments up to 1500 clients if there are only light IOPS requirements.
  • Software Aggregated Storage:  This is where a layer of software, such as VMware's Virtual SAN (vSAN), abstracts the physical storage from the VDI environment.  These can provide good performance without the write limitations of traditional storage arrays.  Management of the solution is usually handled through VDI consoles and hypervisors, and thus is compatible with existing VDI team skills.  Cost is usually lower and can be scaled as the deployment grows.
  • Converged Storage:  These are generally based on storage physically incorporated into the VDI server.  As new servers are added to the environment, storage comes along with them and adds to capacity.  Management of these converged systems usually happens through a single console, so it's easier for VDI teams to handle.  Servers with converged storage tend to be more expensive than traditional servers, but don't require separate storage arrays.  They might also reduce power, cooling, and space requirements.  They work for all but the smallest VDI deployments, where the cost of the added storage makes the per-desktop cost high.
  • Solid-State Storage Arrays:  These provide high performance, but also at high cost.  Management is usually easy, with centralized tools provided by the vendor.  The cost is usually out of reach for small VDI implementations, but when the implementation is large enough, the per-desktop cost can become reasonable.  They work best when there is a high peak or steady-state IOPS workload.

Depending on the scale of your VDI implementation, you'll want to look at one or more of these options and decide what's within your budget - as well as what meets your needs.


Monday, May 4, 2015

Fixing Cisco AnyConnect Error "The VPN client driver encountered an error." on Windows 10

In an earlier Windows 10 build, the Cisco AnyConnect VPN software stopped working.  In this case, we saw the error "Failed to initialize connection subsystem."  It was necessary in that release to use Windows 10 File Explorer (a.k.a. Windows Explorer) to navigate to the Program Files, Cisco, Cisco AnyConnect Secure Mobility Client directory and then right-click on the vpnui.exe program to configure it for Windows 8 compatibility mode.


This fix worked until somewhere around Windows 10 Preview Build 10069.  Since then, up to and including Build 10074, AnyConnect has not worked properly.  Any attempt to initiate a VPN connection generates these error messages:

  • The VPN client driver encountered an error. Please restart your computer or device, then try again.
  • AnyConnect was not able to establish a connection to the specified secure gateway. Please try connecting again.
There turns out to be a simple fix for this issue as well.

First, boot the PC and log in with an account that has administrator privileges (or ensure that you have administrator privileges on the system in question).  In the Cortana "Ask me anything" box, enter:
Network and sharing
Select the Network and Sharing Center control panel from the search results Cortana offers.  (You can also get to this by right-clicking the wired or wireless network icon in the system tray near the clock display.)


In this window, in the upper-left-hand portion, you'll see "Change Adapter Settings".  Click that link.


Locate the "Cisco AnyConnect Secure Mobility Client Connection" icon in the list.  Right-click that icon and choose Properties.



Locate the "Internet Protocol Version 4 (TCP/IPv4)" item in the list.  Click to select it and then click the Properties button.


Most likely, you will see static IP address information filled in on this properties page rather than the automatic settings.  If so, this is most likely your problem.

Click the "Obtain an IP address automatically" button and (if available) the "Obtain DNS server address automatically" button.  Click OK to close the properties.

If you have and use a TCP/IPv6 network, do the above steps for the "Internet Protocol Version 6 (TCP/IPv6)" item in the connection properties as well.  If you don't use TCP/IPv6 or don't know what I'm talking about, you can probably skip that step.
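If you prefer the command line to the steps above (or need to fix several machines), the same IPv4 change can be scripted.  Here is a hedged sketch that uses Python to drive the built-in netsh tool from an elevated prompt; the adapter name below is the default one AnyConnect creates, so confirm it matches what you see under Change Adapter Settings before running it.

    import subprocess

    # Adapter name as it appears in Network Connections -- verify on your system.
    ADAPTER = "Cisco AnyConnect Secure Mobility Client Connection"

    def reset_to_dhcp(adapter):
        """Switch the adapter's IPv4 address and DNS servers back to automatic.
        Must be run from an elevated (administrator) prompt."""
        subprocess.run(
            ["netsh", "interface", "ipv4", "set", "address",
             f"name={adapter}", "source=dhcp"],
            check=True)
        subprocess.run(
            ["netsh", "interface", "ipv4", "set", "dnsservers",
             f"name={adapter}", "source=dhcp"],
            check=True)

    reset_to_dhcp(ADAPTER)
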

Close all of the Network Connection windows that you've opened in Windows 10 to this point.

Attempt to make a VPN connection now.

If the VPN server on the other end updates your AnyConnect software to a new version, you may need to perform the above steps a second time in order to get past the problem (as the update can cause the settings to "stick" again).



Wednesday, April 29, 2015

Choosing Storage for a VDI Implementation - Part 3 - Capacity

In the previous two posts in this series, I talked about IOPS (I/O Operations Per Second).  We discussed how Windows 7 generates disk I/O during startup, login, application launch, general use, logoff, and shutdown.  We discussed how to use tools to measure your own real-world IOPS usage, and how decisions you make in managing your VDI infrastructure can impact your IOPS needs.

IOPS tells you how fast your storage should be in order to meet your design goals and your users' expectations for performance.  Storage capacity is the other half of that discussion, helping you decide just how much of this storage you'll need.

Estimating your storage capacity is not as simple as picking a per-user figure and multiplying it by the number of users you're planning to virtualize.  Just as our discussion of IOPS included a need to identify how we planned to handle patching, antivirus scans, and other management issues, our discussion of storage capacity must do the same.

Following are some of the elements of your VDI infrastructure design that you'll want to consider when estimating storage:

  • Windows Configuration:  Ideally, you'll want to strip as many of the optional features out of the Windows desktop OS image as possible.  You'll leave in the features that your users need, but remove those they don't.  For example, you will probably want to leave out the Windows Backup component, since you'll most likely back up your VDI infrastructure as part of your datacenter backup strategy.  Backing up the individual VMs is probably redundant.
  • Application Deployment:  If you plan to load applications onto the VMs as you would in a physical PC implementation, you'll need to set aside enough storage to hold all those applications in addition to the OS itself.  If you plan to handle application deployment another way (such as providing a desktop shortcut to a ThinApp virtualized version of the application), this may reduce your disk capacity requirement for VDI.
  • Patch Strategy:  If you plan to keep your VMs current with respect to patches from Microsoft and the application vendors, you'll want to estimate how much storage is consumed by patches over time.  You'll need to incorporate room in your storage architecture for growth related to these patches and updates.
  • Persistent vs. Non-Persistent Images:  Most VDI solutions allow for both persistent VMs and non-persistent VMs.  A persistent VM is created, remains in use for an extended period of time, and is only destroyed when it's no longer needed or develops a problem.  A non-persistent VM is created when it's needed and destroyed when the user logs off (or on some similar schedule).  Non-persistent VMs only require storage while they're active, and will generally require less storage than their persistent counterparts.
  • User Data Storage:  If user data is kept in a folder redirection share or is not stored within the VM itself, you'll need less storage on the VM.  If the VMs are persistent and user data is stored directly on the VM, you'll need to allocate enough storage to enable the users to save the files they need directly on the VM.  You'll also need to account for the growth of the users' data over time.
  • Antivirus Strategy:  If you'll be running an antivirus package inside the VM, you'll need storage space for that.  You'll also need storage space for virus definition files, logs, virus quarantines, etc.  This can be estimated by examining the storage used in your current physical PCs and monitoring its growth over time.  If you plan to use an antivirus tool at the hypervisor level, you can reduce your storage capacity associated with the antivirus client.
  • Number and Types of Images:  If you'll use several different VM images, configured for different users or groups of users, you'll need to allocate storage for those images.  Remember to account for storage for the images themselves, as well as VMs cloned from the images.
  • VDI Deployment Technology:  Some VDI technologies allow for the use of a single full "master" image on disk.  When VMs are created from the master, a relatively small "delta" file is created for each VM.  When something is saved to that VM's hard disk by the OS or an application, the changed data is saved to this "delta" disk rather than to the master image.  This design is common in non-persistent VM architectures.  Because only a single master image is stored, the storage capacity needed is lower.  Consult the vendor's documentation for help estimating your capacity requirements.
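Pulling several of these factors together, here is a rough back-of-the-envelope sketch of a per-VM capacity estimate for persistent VMs.  Every figure is a hypothetical placeholder; substitute measurements from your own reference image and pilot users.

    # All figures are hypothetical placeholders (GB).
    base_image          = 25    # stripped-down Windows desktop image
    installed_apps      = 10    # applications installed directly in the VM
    patch_growth_per_yr = 5     # patches and updates accumulated per year
    user_data           = 15    # only if data isn't redirected off the VM
    antivirus           = 2     # AV client, definitions, quarantine, logs
    years_of_service    = 3
    vm_count            = 500

    per_vm = (base_image + installed_apps + user_data + antivirus
              + patch_growth_per_yr * years_of_service)
    total_tb = per_vm * vm_count / 1024

    print(f"Estimated {per_vm} GB per persistent VM, "
          f"about {total_tb:.1f} TB for {vm_count} VMs")
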

As you architect your VDI solution, you may encounter other design elements that impact storage capacity.  Be sure to go back and adjust your estimates when this happens, to ensure that you budget for enough storage to handle your needs.


Wednesday, April 22, 2015

Choosing Storage for a VDI Implementation - Part 2 - More on IOPS

In the previous post, we talked about how to estimate your need for peak and steady-state IOPS by identifying a target boot time for your VMs, target login time, and steady-state disk usage.  In this post, we'll look at other factors that impact IOPS.  Many of these are situations unique to virtual desktops when compared with physical desktops, while some may apply to both.  My guidance here is going to be a bit more general, in part because there are many ways each of these things can be handled.  The exact products, design options, etc., you select will have a material impact on the IOPS needed for VDI.

For example, you'll want to consider:
  • Patching and Rebooting Strategy:  If you plan to patch all of the VMs at the same time, this will increase the total number of IOPS you'll need for the environment.  If you randomize the patching of VMs, you can reduce your IOPS need but will increase the length of your patch window.  If you plan to reboot at the end of patching, you'll increase IOPS requirements, too.  Rebooting all your VMs at one time will require a lot more IOPS than staggering the schedule.
  • Antivirus:  How you choose to handle antivirus in your VDI environment also affects IOPS.  Running a traditional full desktop antivirus tool in your VMs means that there will be regular antivirus scans (consuming a lot of IOPS), regular downloading and application of signature and software updates (consuming more IOPS), etc.  Running an antivirus tool on the hypervisor level to scan VMs can reduce the IOPS consumed per host somewhat, but may increase the cost of the licensing.
  • Application Deployment:  If applications will be installed directly on the VMs, as they are in a traditional physical PC, more storage capacity will be needed (since each VM will store a full copy of the application).  If the applications will be virtualized and run from another server, storage requirements will be reduced considerably, as only one copy of the application is being stored.  If applications are layered, the storage and IOPS impact will depend on the solution being used.  You'll want to measure the IOPS and disk capacity associated with your selected application deployment strategy.
  • RAID Usage:  If your storage solution will incorporate RAID 5 protection, you need to account for the additional IOPS associated with RAID activity.  Given the typical 80% write load of Windows desktop IOPS, you need to account for 4 times the number of writes to physical storage in a RAID 5 configuration. So, if your desktops need 20 IOPS to perform properly, you'll need 4 read IOPS and 16 write IOPS. With RAID 5, that becomes a back-end IOPS load of 4 reads and 64 writes, or 68 IOPS per desktop!  You'll need to size the environment accordingly.
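The RAID arithmetic in that last bullet generalizes to a simple formula: back-end IOPS = reads + (writes x write penalty), where the penalty is 4 for RAID 5 (and 2 for RAID 1/10).  A small sketch that reproduces the 68-IOPS figure from the example:

    def backend_iops(front_end_iops, write_fraction, write_penalty):
        """Translate front-end (guest) IOPS into back-end (physical disk) IOPS."""
        reads = front_end_iops * (1 - write_fraction)
        writes = front_end_iops * write_fraction
        return reads + writes * write_penalty

    # The example from the text: 20 IOPS per desktop, 80% writes, RAID 5 (penalty 4)
    print(backend_iops(20, 0.80, 4))   # -> 68.0 back-end IOPS per desktop
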
Choosing to patch your VMs on a gradual strategy, using a hypervisor-side antivirus scanner, and using applications hosted on a server can all reduce your IOPS needs.  Choosing to patch all your VMs at one time and reboot them immediately will improve your security, but also results in more IOPS consumed.

If at all possible, do a proof-of-concept or pilot VDI implementation using the same patch/reboot strategy, antivirus tool, and application deployment technologies you plan to use in production.  Use the same IOPS analysis tools during the pilot to validate how those decisions affect your IOPS consumption.

A Warning About Analysis Tools

In the previous post, I talked about workload analysis tools like Sysinternals Process Monitor and Liquidware Labs Stratusphere FIT.  These tools can provide an accurate estimate of the IOPS consumed by your users in their real-world PC usage, and provide an extremely valuable input into your storage design process for VDI.

Be careful when using tools like these that you consider your typical user's workday.  For example, if you have the tool monitoring the PC 24 hours a day, 7 days a week, but your staff only work from 8am to 5pm on weekdays, your average and minimum IOPS estimates are likely to be extremely low.

Take the following example.  We have a group of users who work from 8am to 5pm daily, taking a lunch break from 12pm to 1pm.  While they're actively working with software on their PCs, they are generating 40 IOPS of disk activity.  When they're at lunch or at home, the PC is using a token amount of about 3 IOPS.  If we monitored this on a 24-hour basis, we'd see:

  • 8-9am: 40 IOPS
  • 9-10am: 40 IOPS
  • 10-11am: 40 IOPS
  • 11am-12pm: 40 IOPS
  • 12pm-1pm: 3 IOPS (the user is at lunch)
  • 1pm-2pm: 40 IOPS
  • 2pm-3pm: 40 IOPS
  • 3pm-4pm: 40 IOPS
  • 4pm-5pm: 40 IOPS
  • 5-6pm: 3 IOPS (the user has gone home)
  • 6-7pm: 3 IOPS
  • 7-8pm: 3 IOPS
  • 8-9pm: 3 IOPS
  • 9-10pm: 3 IOPS
  • 10-11pm: 3 IOPS
  • 11pm-12am: 3 IOPS
  • 12-1am: 3 IOPS
  • 1-2am: 3 IOPS
  • 2-3am: 3 IOPS
  • 3-4am: 3 IOPS
  • 4-5am: 3 IOPS
  • 5-6am: 3 IOPS
  • 6-7am: 3 IOPS
  • 7-8am: 3 IOPS

If our tool averages the user's IOPS over the 24-hour period, it will report an average usage of 15 IOPS for this user.  If instead we look at the user's typical work day of 8am to 5pm, and ignore the hours when the user isn't present (including their lunch hour), the figure is 40 IOPS. That's a huge difference!  If we architect our storage needs around the 24-hour 15-IOPS average, we're going to experience storage issues well before we get all our users virtualized.
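A quick check of that arithmetic, using the hourly figures from the table above:

    work_hours = 8    # 8am-12pm and 1pm-5pm at 40 IOPS
    idle_hours = 16   # lunch hour plus evenings and nights at ~3 IOPS

    avg_24h = (work_hours * 40 + idle_hours * 3) / 24
    avg_work = 40

    print(f"24-hour average:   {avg_24h:.0f} IOPS")   # ~15 IOPS
    print(f"Working-day usage: {avg_work} IOPS")
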

Unless you can adjust your users' work schedules so that they're able to make use of the 24 hours in a day, rather than the usual 8-hour workday, make sure that the tool you're using to estimate a user's workload is looking at the hours when that person is actually using the PC.  Otherwise you will grossly underestimate your IOPS requirements.

Conclusion

The way you plan to manage your VDI infrastructure impacts your IOPS requirements.  You'll want to make some decisions around that before choosing your storage solution, and try to validate the effect of those changes with analysis tools.  Just be sure that your analysis tools are measuring the times of day when the VMs are actually being used, and not the full 24x7 period - or you will wind up drastically underestimating your IOPS needs.