
Yellow Bricks

by Duncan Epping



Which isolation response should I use?

Duncan Epping · May 31, 2012 ·

I wrote this article about split brain scenarios for the vSphere Blog. Based on this article I received some questions about which “isolation response” to use. This is not something that can be answered with a single recommended practice that applies to all scenarios out there. The answer has everything to do with your infrastructure: Are you using IP-based storage? Do you have a converged network? All of these factors impact the decision around the isolation response.

The following table, however, could be used to make a decision:

| Likelihood that host will retain access to VM datastores | Likelihood that host will retain access to VM network | Recommended isolation policy | Explanation |
|---|---|---|---|
| Likely | Likely | Leave Powered On | The VM is running fine, so why power it off? |
| Likely | Unlikely | Either Leave Powered On or Shutdown | Choose Shutdown to allow HA to restart the VMs on hosts that are not isolated and hence are likely to have access to storage. |
| Unlikely | Likely | Power Off | Use Power Off to avoid having two instances of the same VM on the VM network. |
| Unlikely | Unlikely | Leave Powered On or Power Off | Leave Powered On if the VM can recover from the network/datastore outage without being restarted; Power Off if it likely cannot. |
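
For reference, here is a minimal PowerCLI sketch of how the cluster-wide default isolation response could be changed through the vSphere API. The cluster name is an example and an existing Connect-VIServer session is assumed; valid values are "none" (Leave Powered On), "powerOff" and "shutdown".

```powershell
# Sketch only: set the cluster-wide default isolation response via the vSphere API.
# "Cluster01" is an example name; an existing PowerCLI connection is assumed.
$cluster = Get-Cluster -Name "Cluster01"

$spec = New-Object VMware.Vim.ClusterConfigSpecEx
$spec.DasConfig = New-Object VMware.Vim.ClusterDasConfigInfo
$spec.DasConfig.DefaultVmSettings = New-Object VMware.Vim.ClusterDasVmSettings
$spec.DasConfig.DefaultVmSettings.IsolationResponse = "powerOff"   # none | powerOff | shutdown

# Reconfigure the cluster; $true means "modify", i.e. merge with the existing configuration
$cluster.ExtensionData.ReconfigureComputeResource_Task($spec, $true)
```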

An introduction to Storage DRS

Duncan Epping · May 22, 2012 ·

Today someone asked for a Storage DRS intro, I wrote one for our book a year ago and figured I would share it with the world. I still feel that Storage DRS is one of the coolest features in vSphere 5.0 and I think that everyone should be using this! I know there are some caveats (1, 2) when you are using specific array functionality or for instance SRM, but nevertheless… this is one of those features that will make an admin’s life that much easier! If you are not using it today, I highly suggest evaluating this cool feature.

*** outtake from the vSphere 5.0 Clustering Deepdive ***

vSphere 5.0 introduces many great new features, but everyone will probably agree with us that vSphere Storage DRS is the most exciting new feature. vSphere Storage DRS helps resolve some of the operational challenges associated with virtual machine provisioning, migration and cloning. Historically, monitoring datastore capacity and I/O load has proven to be very difficult. As a result, it is often neglected, leading to hot spots and over- or underutilized datastores. Storage I/O Control (SIOC) in vSphere 4.1 solved part of this problem by introducing a datastore-wide disk scheduler that allows for allocation of I/O resources to virtual machines based on their respective shares during times of contention.

Storage DRS (SDRS) brings this to a whole new level by providing smart virtual machine placement and load balancing mechanisms based on space and I/O capacity. In other words, where SIOC reactively throttles hosts and virtual machines to ensure fairness, SDRS proactively makes recommendations to prevent imbalances from both a space utilization and latency perspective. More simply, SDRS does for storage what DRS does for compute resources.

There are five key features that SDRS offers:

  • Resource aggregation
  • Initial Placement
  • Load Balancing
  • Datastore Maintenance Mode
  • Affinity Rules

Resource aggregation enables grouping of multiple datastores into a single, flexible pool of storage called a Datastore Cluster. Administrators can dynamically populate Datastore Clusters with datastores. The flexibility of separating the physical from the logical greatly simplifies storage management by allowing datastores to be efficiently and dynamically added to or removed from a Datastore Cluster to deal with maintenance or out-of-space conditions. The load balancer will take care of initial placement as well as future migrations based on actual workload measurements and space utilization.
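
As an illustration of resource aggregation, the sketch below shows how a Datastore Cluster could be created and populated with existing datastores using PowerCLI. All names are examples and the cmdlets assume a PowerCLI release that supports datastore clusters (5.0 or later).

```powershell
# Sketch only: create a datastore cluster and move existing datastores into it.
# "Datacenter01", "DatastoreCluster01" and the "DS0*" pattern are example names.
$dsc = New-DatastoreCluster -Name "DatastoreCluster01" -Location (Get-Datacenter -Name "Datacenter01")

# Dynamically populate the datastore cluster with existing datastores
Get-Datastore -Name "DS0*" | Move-Datastore -Destination $dsc
```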

The goal of Initial Placement is to speed up the provisioning process by automating the selection of an individual datastore and leaving the user with the much smaller-scale decision of selecting a Datastore Cluster. SDRS selects a particular datastore within a Datastore Cluster based on space utilization and I/O capacity. In an environment with multiple seemingly identical datastores, initial placement can be a difficult and time-consuming task for the administrator. Not only will the datastore with the most available disk space need to be identified, but it is also crucial to ensure that the addition of this new virtual machine does not result in I/O bottlenecks. SDRS takes care of all of this and substantially lowers the amount of operational effort required to provision virtual machines; that is the true value of SDRS.
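
To illustrate initial placement, a provisioning request can simply target the Datastore Cluster and let SDRS pick the datastore. The sketch below is an example under assumptions: the VM, cluster and datastore cluster names are made up, and the parameter set assumes a recent PowerCLI release.

```powershell
# Sketch: point New-VM at the datastore cluster instead of an individual
# datastore; SDRS selects the actual datastore at creation time.
New-VM -Name "Web01" `
       -ResourcePool (Get-Cluster -Name "Cluster01") `
       -Datastore (Get-DatastoreCluster -Name "DatastoreCluster01") `
       -NumCpu 2 -MemoryGB 4 -DiskGB 40
```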

However, it is probably safe to assume that many of you are most excited about the load balancing capabilities SDRS offers. SDRS can operate in two distinct modes: No Automation (manual mode) or Fully Automated. Where initial placement reduces complexity in the provisioning process, load balancing addresses imbalances within a datastore cluster. Prior to vSphere 5.0, placement of virtual machines was often based on current space consumption or the number of virtual machines on each datastore. I/O capacity monitoring and space utilization trending was often regarded as too time-consuming. Over the years, we have seen this lead to performance problems in many environments, and in some cases, even result in downtime because a datastore ran out of space. SDRS load balancing helps prevent these unfortunately common scenarios by making placement recommendations based on both space utilization and I/O capacity when the configured thresholds are exceeded. Depending on the selected automation level, these recommendations will be automatically applied by SDRS or will need to be applied by the administrator.

Although we see load balancing as a single feature of SDRS, it actually consists of two separately configurable options. When either of the configured thresholds for Utilized Space (80% by default) or I/O Latency (15 milliseconds by default) is exceeded, SDRS will make recommendations to prevent problems and resolve the imbalance in the datastore cluster. I/O load balancing can even be explicitly disabled.
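
The automation level and both thresholds are configurable per datastore cluster. The hedged sketch below simply restates the defaults mentioned above and assumes the Set-DatastoreCluster parameters available in recent PowerCLI releases; the datastore cluster name is an example.

```powershell
# Sketch: set the SDRS automation level and the two load balancing thresholds
# (these values are the defaults mentioned above).
Get-DatastoreCluster -Name "DatastoreCluster01" |
    Set-DatastoreCluster -SdrsAutomationLevel FullyAutomated `
                         -SpaceUtilizationThresholdPercent 80 `
                         -IOLatencyThresholdMillisecond 15 `
                         -IOLoadBalanceEnabled $true
```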

Before anyone forgets, SDRS can be enabled on fully populated datastores and environments. It is also possible to add fully populated datastores to existing datastore clusters. It is a great way to solve actual or potential bottlenecks in any environment with minimal required effort or risk.

Datastore Maintenance Mode is one of those features that you will typically not use often, but you will appreciate it when you need it. Datastore Maintenance Mode can be compared to Host Maintenance Mode: when a datastore is placed in Maintenance Mode, all registered virtual machines on that datastore are migrated to the other datastores in the datastore cluster. Typical use cases are data migration to a new storage array or maintenance on a LUN, such as migration to another RAID group.
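
A quick, hedged example of how a datastore could be placed into Maintenance Mode with PowerCLI; the datastore name is made up and the -MaintenanceMode parameter is assumed to be available in your PowerCLI version.

```powershell
# Sketch: put a datastore that is part of a datastore cluster into SDRS
# maintenance mode; its registered VMs are then moved to the other datastores.
Get-Datastore -Name "DS01" | Set-Datastore -MaintenanceMode $true
```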

Affinity Rules enable control over which virtual disks should or should not be placed on the same datastore within a datastore cluster in accordance with your best practices and/or availability requirements. By default, a virtual machine’s virtual disks are kept together on the same datastore.

For those who want more details, Frank Denneman wrote an excellent series about Datastore Clusters which might interest you:

Part 1: Architecture and design of datastore clusters.
Part 2: Partially connected datastore clusters.
Part 3: Impact of load balancing on datastore cluster configuration.
Part 4: Storage DRS and Multi-extents datastores.
Part 5: Connecting multiple DRS clusters to a single Storage DRS datastore cluster.
Part 6: Aggregating datastores from multiple storage arrays into one Storage DRS datastore cluster.

Some other articles that might be of use:

  • SDRS and Auto-Tiering solutions – The Injector (Duncan)
  • Storage DRS Load Balance Frequence (Frank)
  • SDRS Out-Of-Space avoidance (Frank)
  • Storage vMotion and the mirror-mode driver (Duncan)

The following video gives an overview of the above-mentioned features and is worth checking out.

Scripts release for Storage vMotion / HA problem

Duncan Epping · Apr 17, 2012 ·

Last week when the Storage vMotion / HA problem went public I asked both William Lam and Alan Renouf if they could write a script to detect the problem. I want to thank both of them for their quick response and turnaround; they cranked the scripts out in literally hours. The scripts were validated multiple times in a VDS environment and worked flawlessly. Note that these scripts can detect the problem in an environment using either a regular Distributed vSwitch or a Nexus 1000v, but they can only mitigate the problem in a Distributed vSwitch environment. Here are the links to the scripts:

  • Perl: Identifying & Fixing Virtual Machines Affected By SvMotion / VDS Issue (William Lam)
  • PowerCLI – Identifying and fixing VMs Affected By SvMotion / VDS Issue (Alan Renouf)

Once again thanks guys!

Limiting stress on storage caused by HA restarts by lowering restart concurrency?

Duncan Epping · Apr 16, 2012 ·

I had a question last week, and it had me going for a while. The question was whether “das.perHostConcurrentFailoversLimit” could be used to lower the hit on storage during a boot storm. By default this advanced option is set to 32, meaning that HA will initiate a maximum of 32 concurrent VM restarts on a single host. The question was whether lowering this value to, for instance, 16 would help reduce the stress on storage when multiple hosts fail, for instance in a blade environment when a chassis fails.

At first you would probably say “Yes, of course it will”. Having only 16 concurrent restarts instead of 32 should cut the stress in half… Well, not exactly. The point here is that this setting is:

  1. A per-host setting, not a cluster-wide setting
  2. A limit on power-on attempts

So what is the problem with that exactly? Well, in the case of the per-host setting: if you have a 32-node cluster and 8 hosts fail, there could still be up to 384 concurrent power-on attempts, that is (32 hosts – 8 failed hosts) * 16 VMs per host. Yes, that is a lot better than 768, but it is still a lot of VMs hitting your storage.

But more importantly, we are talking about power-on attempts here! A power-on attempt does not equal the boot process of the virtual machine! It is just the initial process that flips the switch of the VM from “off” to “on”; check vCenter when you power on a VM and you will see the task reported as completed while your VM is still booting. Reducing this number will reduce the stress on hostd, but that is about it. In other words, if you lower it to 16 you will have fewer concurrent power-on attempts, but they will be handled quickly by hostd and before you know it the next 16 power-on attempts will be issued, near simultaneously!
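
For those who still want to experiment with the setting, here is a hedged PowerCLI sketch of how das.perHostConcurrentFailoversLimit could be inspected and lowered; the cluster name is an example.

```powershell
# Sketch: view and lower das.perHostConcurrentFailoversLimit on an HA cluster.
$cluster = Get-Cluster -Name "Cluster01"

# Show the current value, if the advanced option has been set before
Get-AdvancedSetting -Entity $cluster -Name "das.perHostConcurrentFailoversLimit"

# Create or overwrite the per-host concurrent power-on limit
New-AdvancedSetting -Entity $cluster -Type ClusterHA `
    -Name "das.perHostConcurrentFailoversLimit" -Value 16 -Force -Confirm:$false
```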

The only way you can really limit the hit on storage, and on the virtual machines sharing that storage, is by enabling Storage I/O Control. SIOC will ensure that all VMs that need storage resources get them in a fair manner. The other option is to ensure that you are not overloading your datastores with a massive number of VMs without the IOPS to back up the boot storm. I guess there is no real need to be overly concerned here though… How often does it happen that 50% of your environment fails? And if it does, are you worried about that 15-minute performance hit, or about those 50% of the VMs being down?
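
Enabling SIOC is a per-datastore setting; a minimal sketch follows (the datastore name pattern is an example, so filter the list for your own environment).

```powershell
# Sketch: enable Storage I/O Control on a set of datastores so that storage
# resources are divided fairly between VMs during contention (e.g. a boot storm).
Get-Datastore -Name "DS0*" | Set-Datastore -StorageIOControlEnabled $true
```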

Clarifying the SvMotion / VDS problem

Duncan Epping · Apr 14, 2012 ·

<Update>I asked William Lam if he could write a script to detect this problem and possibly even mitigate it. William worked on it over the weekend and just posted the result! Head over to his blog for the script! Thanks William for cranking it out this quick! For those who prefer PowerCLI… Alan Renouf just posted his version of the script! Both scripts provide the same functionality though!</Update>

I think there is some confusion around the SvMotion / VDS problem I described a couple of days back. Let me try to clarify it in a couple of simple steps.

First of all, this only applies to virtual machines that have been Storage vMotioned by vCenter 5.0 and are connected to a Distributed vSwitch. This could be done either manually or by Storage DRS. So what is the exact problem?

  • When a VM is attached to a dvPortgroup it is connected to a port. This information is stored locally on the host and on the VMFS volume on which the VM is stored.
  • This volume contains a file whose name equals the port number of this VM.
  • When the VM is Storage vMotioned to a different datastore, this file is not created on the destination datastore.
  • When the host on which the Storage vMotioned VM resides fails, HA will attempt to restart that VM.
  • In order for HA to restart the VM and connect it to the dvPortgroup, this file is required.
  • As the file is not available, the restart fails.

You can simply resolve this by temporarily connecting the impacted VMs to a different dvPortgroup and then reconnecting them to the original portgroup. As soon as you have done that, the file will be created on the datastore. For now this is a manual task, but I am sure some of my team members are working on a scripted solution as we speak… right Alan / William? 🙂
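
For completeness, a minimal PowerCLI sketch of that manual workaround is shown below. VM and portgroup names are examples, the distributed switch cmdlets are assumed to be available in your PowerCLI version, and the validated scripts by William and Alan linked above remain the recommended route.

```powershell
# Sketch: flip the affected VM's network adapter to a temporary portgroup and
# back so the dvPort file is recreated on the destination datastore.
$vm  = Get-VM -Name "VM01"
$nic = Get-NetworkAdapter -VM $vm

# Move to a temporary dvPortgroup...
Set-NetworkAdapter -NetworkAdapter $nic -Portgroup (Get-VDPortgroup -Name "Temp-PG") -Confirm:$false
# ...and back to the original dvPortgroup
Set-NetworkAdapter -NetworkAdapter $nic -Portgroup (Get-VDPortgroup -Name "Prod-PG") -Confirm:$false
```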

