vsphere metro storage cluster

New whitepaper available: vSphere Metro Storage Cluster Recommended Practices (6.5 update)

Duncan Epping · Oct 24, 2017 ·

I had many requests for an updated version of this paper, so the past couple of weeks I have been working hard. The paper was outdated as it was last updated around the vSphere 6.0 timeframe, and it was only a minor update. I looked at every single section and added in new statements and guidance around vSphere HA Restart Priority for instance. So for those running a vSphere Metro Storage Cluster / Stretched Cluster of some kind, please read the brand new vSphere Metro Storage Cluster Recommended Practices (6.5 update) white paper.

It is available on storagehub.vmware.com in PDF and for reading within your browser. Any questions and comments, please do not hesitate to leave them here.

VMs not getting killed after vMSC partition has lifted

Duncan Epping · Jan 12, 2017 ·

I was talking to a VMware partner over the past couple of weeks about challenges they had in a new vSphere Metro Storage Cluster (vMSC) environment. In their particular case they simulated a site partition. During the site partition three things were expected to happen:

VMs that were impacted by APD (or PDL) should be killed by vSphere HA Component Protection
- If HA Component Protection does not work, vSphere should kill the VMs when the partition is lifted
VMs should be restarted by vSphere HA

The problems faced were two-fold, VMs were restarted by vSphere HA, however:

vSphere HA Component Protection did not kill the VMs
When the partition was lifted vSphere did not kill the VMs which had lost the lock to the datastore either

It took a while before we figured out what was going on, at least for one of the problems. Lets start with the second problem first, why aren’t the VMs killed when the partition is lifted? vSphere should do this automatically. Well vSphere does this automatically, but only when there’s a Guest Operating system installed and an I/O is issued. As soon as an I/O is issued by the VM then vSphere will notice the lock to the disk is lost and obtained by another host and kill the VM. If you have an “empty VM” then this won’t happen as there will not be any I/O to the disk. (I’ve filed a feature request to kill VMs as well even without disk I/O or without a disk.) So how do you solve this? If you do any type of vSphere HA testing (with or without vMSC) make sure to install a guest OS so it resembles real life.

Now back to the first problem. The fact that vSphere HA Component Protection does not kick in is still being debated, but I think there is a very specific reason for it. vSphere HA Component Protection is a feature that kills VMs on a host so they can be restarted when an APD or a PDL scenario has occurred. However, it will only do this when it is:

Certain the VM can be restarted on the other side (conservative setting)
There are healthy hosts in the other partition, or we don’t know (Aggressive)

First one is clear I guess (more info about this here), but what does the second one mean? Well basically there are three options:

Availability of healthy host: Yes >> Terminate
Availability of healthy host: No >> Don’t Terminate
Availability of healthy host: Unknown >> Terminate

So in the case you where you have VMCP set to “Aggressively” failover VMs, it will only do so when it knows hosts are available in the other site or when it does not know the state of the hosts in the other site. If for whatever reason the hosts are deemed as unhealthy the answer to the question if there are healthy hosts available or not will be “No”, and as such the VMs will not be killed by VMCP. The question remains, why are these hosts reported as “unhealthy” in this partition scenario, that is something we are now trying to figure out. Potentially it could be caused by misconfigured Heartbeat Datastores, but this is still something to be confirmed. If I know more, I will update this article.

Just received confirmation from development, heartbeat datastores need to be available on both sites for vSphere HA to identify this scenario correctly. If there are no heartbeat datastores available on both sites then it could happen that no hosts are marked as healthy, which means that VMCP will not instantly kill those VMs when the APD has occured.

vMSC and Disk.AutoremoveOnPDL on vSphere 6.x and higher

Duncan Epping · Mar 21, 2016 ·

I have discussed this topic a couple of times, and want to inform people about a recent change in recommendation. In the past when deploying a stretched cluster (vMSC) it was recommended by most storage vendors and by VMware to set Disk.AutoremoveOnPDL to 0. This basically disabled the feature that automatically removes LUNs which are in a PDL (permanent device loss) state. Upon return of the device a rescan would then allow you to use the device again. With vSphere 6.0 however there has been a change to how vSphere responds to a PDL scenario, vSphere does not expect the device to return. To be clear, the PDL behaviour in vSphere was designed around the removal of devices, they should not stay in the PDL state and return for duty, this did work however in previous version due to a bug.

With vSphere 6.0 and higher VMware recommends to set Disk.AutoremoveOnPDL to 1, which is the default setting. If you are a vMSC / stretched cluster customer, please change your environment and design accordingly. But before you do, please consult your storage vendor and discuss the change. I would also like to recommend testing the change and behaviour to validate that the environment returns for duty correctly after a PDL! Sorry about the confusion.

KB article backing my recommendation was just posted: https://kb.vmware.com/kb/2059622. Documentation (vMSC whitepaper) is also being updated.

Stretched Clusters: Disable failover of specific VMs during full site failure

Duncan Epping · Oct 21, 2015 ·

Last week at VMworld when presenting on Virtual SAN Stretched Clusters someone asked me if it was possible to “disable the fail-over of VMs during a full site failure while allowing a restart during a host failure”. I thought about it and said “no, that is not possible today”. Yes you can “disable HA restarts” on a per VM basis, but you can’t do that for a particular type of failure.

The last statement is correct, HA does not allow you to disable restarts for a site failure. You can fully disable HA for a particular VM though. But when back at my hotel I started thinking about this question and realized that there is a work around to achieve this. I didn’t note down the name of the customer who asked the question, so hopefully you will read this.

When it comes to a stretched cluster configuration typically you will use VM/Host rules. These rules will “dictate” where VMs will run, and typically you use the “should” rule as you want to make sure VMs can run anywhere when there is a failure. However, you can also create “must” rules, and yes this means that the rules will not be violated and that those VMs can only run within that site. If a host fails within a site then the impacted VMs will be restarted within the site. If the site fails then the “must rule” will prevent the VMs from being restarted on the hosts in the other location. The must rules are pushed down to the “compatibility list” that HA maintains, which will never be violated by HA.

Simple work-around to prevent VMs from being restarted in another site.

vSphere Metro Storage Cluster with vSphere 6.0 paper released

Duncan Epping · Jul 8, 2015 ·

I’d already blogged about this on the VMware blog, but I figured I would share it here as well. The vSphere Metro Storage Cluster with vSphere 6.0 white paper has been released. I worked on this paper together with my friend Lee Dilworth, it is an updated version of the paper we did in 2012. It contains all of the new best practices for vSphere 6.0 when it comes to vSphere Metro Storage Cluster implementations, so if you are looking to implement one or upgrade an existing environment make sure to read it!

VMware vSphere Metro Storage Cluster Recommended Practices

VMware vSphere Metro Storage Cluster (vMSC) is a specific configuration within the VMware Hardware Compatibility List (HCL). These configurations are commonly referred to as stretched storage clusters or metro storage clusters and are implemented in environments where disaster and downtime avoidance is a key requirement. This best practices document was developed to provide additional insight and information for operation of a vMSC infrastructure in conjunction with VMware vSphere. This paper explains how vSphere handles specific failure scenarios, and it discusses various design considerations and operational procedures. For detailed information about storage implementations, refer to documentation provided by the appropriate VMware storage partner.