Yellow Bricks

by Duncan Epping

vsphere metro storage cluster

vSphere HA respecting VM-Host should rules?

Duncan Epping · Mar 5, 2015 ·

A long time ago I authored this white paper around stretched clusters. During our testing, the one area where we felt HA was lacking was that it would not respect VM-Host should rules. So if you had these configured in a cluster and a host failed, VMs could be restarted on ANY given host in the cluster. The first time DRS ran after that, it would move the VMs back to where they belong according to the configured VM-Host should rules.

I guess one of the reasons for this was that the affinity and anti-affinity rules were originally designed to be DRS rules. Over time I guess we realized that these are not DRS rules but rather cluster rules. Based on our findings while authoring the white paper we filed a bunch of feature requests, and one of them just made it into vSphere 6.0. As of vSphere 6.0 it is possible to have vSphere HA respect VM-Host should rules through the use of an advanced setting called “das.respectVmHostSoftAffinityRules”.

When “das.respectVmHostSoftAffinityRules” is configured, vSphere HA will try to respect the rule when it can. So if any host that belongs to the VM-Host group referenced by the rule is available, HA will restart the respective VM on one of those hosts. Of course, as this is a “should rule”, HA has the ability to ignore the rule when needed. You can imagine a scenario where none of the hosts in the VM-Host should rule is available; in that case HA will restart the VM on any other host in the cluster. Useful? Yes, I think so!
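The behavior described above can be sketched as a small placement model. This is only an illustration of the "prefer the rule's hosts, fall back to any host" logic, not how vSphere HA is actually implemented; the names `Host`, `ShouldRule` and `pick_restart_host` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    alive: bool = True


@dataclass
class ShouldRule:
    """A VM-Host 'should run on' rule: VMs in vm_group prefer hosts in host_group."""
    vm_group: set
    host_group: set


def pick_restart_host(vm, hosts, rules, respect_soft_affinity=True):
    """Return the name of a host to restart `vm` on, or None if no host is alive.

    Hypothetical model: with respect_soft_affinity=True (standing in for
    das.respectVmHostSoftAffinityRules), surviving hosts in the rule's host
    group are preferred; if none survive, any surviving host is used.
    """
    candidates = [h.name for h in hosts if h.alive]
    if not candidates:
        return None
    if respect_soft_affinity:
        for rule in rules:
            if vm in rule.vm_group:
                preferred = [n for n in candidates if n in rule.host_group]
                if preferred:
                    return preferred[0]
    # Rule ignored (setting off) or no preferred host left: take any host.
    return candidates[0]


hosts = [Host("esx-b1"), Host("esx-a1", alive=False), Host("esx-a2")]
rules = [ShouldRule(vm_group={"vm01"}, host_group={"esx-a1", "esx-a2"})]

print(pick_restart_host("vm01", hosts, rules))                               # esx-a2
print(pick_restart_host("vm01", hosts, rules, respect_soft_affinity=False))  # esx-b1
```

In the example, esx-a1 has failed. With the setting modeled as enabled, the VM lands on the remaining host in its group (esx-a2); without it, the VM can land on any host, here esx-b1.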

VPLEX Geosynchrony 5.2 supporting up to 10ms latency with HA/DRS

Duncan Epping · May 14, 2014 ·

I was just informed that as of last week VPLEX Metro with Geosynchrony 5.2 has been certified for a round-trip time (RTT) latency of up to 10ms while running HA/DRS in a vMSC solution. So far all vMSC solutions had been certified with 5ms RTT, and this is a major breakthrough if you ask me. Great to see that EMC spent the time certifying this, including support for HA and DRS across this distance.

Round-trip-time for a non-uniform host access configuration is now supported up to 10 milliseconds for VPLEX Geosynchrony 5.2 and ESXi 5.5 with NMP and PowerPath

More details on this topic can be found here:

  • http://kb.vmware.com/kb/2007545
  • http://logicalblock.wordpress.com/2014/05/13/vmsc-support-now-extended-to-10-msec-rtt/

VMworld session on vSphere Metro Storage Cluster on youtube!

Duncan Epping · Nov 16, 2013 ·

I didn’t even realize this, but I just found out that the session Lee Dilworth and I did at VMworld on the subject of vSphere Metro Storage Clusters can actually be viewed for free on YouTube!

There are some more sessions up on YouTube, so make sure you have a look around!

Disable “Disk.AutoremoveOnPDL” in a vMSC environment!

Duncan Epping · Nov 8, 2013 ·

** UPDATE 20-March 2016 **

When using vSphere 6.0 or higher, please be advised that Disk.AutoremoveOnPDL needs to be set to 1 (the default value) in order for “PDL scenarios” to be handled correctly in vMSC-based infrastructures. Please do not change the default value, and when upgrading to vSphere 6.x please set this value back to 1 if it was changed in a previous version.

** UPDATE 20-March 2016 **

Last week I tweeted the recommendation to disable the advanced setting Disk.AutoremoveOnPDL in a vSphere 5.5 vMSC environment:

If you are upgrading to 5.5 in a vMSC environment, please for now set “Disk.AutoremoveOnPDL” to “0”. So that it is disabled.

— Duncan Epping (@DuncanYB) October 28, 2013

Based on this tweet I received a whole bunch of questions. Before I explain why, I want to point out that I have contacted the folks in charge of the vMSC program and have requested that they publish a KB article on this subject asap.

With vSphere 5.5 a new setting was introduced called “Disk.AutoremoveOnPDL”. When you install 5.5 this setting is set to 1, which means it is enabled. What it does is the following:

The host automatically removes the PDL device and all paths to the device if no open connections to the device exist, or after the last connection closes. If the device returns from the PDL condition, the host can discover it, but treats it as a new device. Data consistency for virtual machines on the recovered device is not guaranteed.

(Source: http://pubs.vmware.com/vsphere-55/index.jsp?topic=%2Fcom.vmware.vsphere.storage.doc%2FGUID-45CF28F0-87B1-403B-B012-25E7097E6BDF.html)

In a vMSC environment, removing devices that are in a PDL state is not desired. When the issue that caused the PDL has been solved (from a networking or array perspective), customers expect the LUNs to automatically appear again. However, because the devices have been removed, a “rescan” is needed to show them again instantly, or you will need to wait for the vSphere periodic path evaluation to occur. As you can imagine, in a vSphere Metro Storage Cluster environment (stretched storage) you expect devices to always be there instantly on recovery; even when they have been in a PDL or APD state, they should be available instantly once the situation has been resolved.

For now, I recommend setting Disk.AutoremoveOnPDL to 0 instead of the default of 1.
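As a rough sketch, the recommendation in this post and the 2016 update at the top can be captured in a small helper: 0 on vSphere 5.5, 1 (the default) on 6.0 and later. The function names are hypothetical and only for illustration. The generated command uses the `esxcli system settings advanced set` form with the option path `/Disk/AutoremoveOnPDL`; treat that as an assumption to verify against the KB and documentation for your release before running anything on a host.

```python
def recommended_autoremove_on_pdl(vsphere_version: str) -> int:
    """Return the Disk.AutoremoveOnPDL value this post recommends for vMSC.

    Hypothetical helper: 5.5 -> 0 (disabled), 6.0 and later -> 1 (default).
    """
    parts = vsphere_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    if (major, minor) >= (6, 0):
        return 1
    if (major, minor) >= (5, 5):
        return 0
    # The setting was introduced with vSphere 5.5.
    raise ValueError(f"Disk.AutoremoveOnPDL does not exist in vSphere {vsphere_version}")


def esxcli_command(vsphere_version: str) -> str:
    """Build the esxcli command line that applies the recommended value."""
    value = recommended_autoremove_on_pdl(vsphere_version)
    return f"esxcli system settings advanced set -o /Disk/AutoremoveOnPDL -i {value}"


print(esxcli_command("5.5"))  # esxcli system settings advanced set -o /Disk/AutoremoveOnPDL -i 0
print(esxcli_command("6.0"))  # esxcli system settings advanced set -o /Disk/AutoremoveOnPDL -i 1
```

The helper only builds a string; applying it to every host in the cluster (and confirming the result) is left to whatever tooling you already use.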

Hopefully soon this KB on the topic of Disk.AutoremoveOnPDL will be updated to reflect this.

EMC VPLEX and Storage DRS / Storage IO Control

Duncan Epping · Nov 1, 2013 ·

At VMworld various people asked me why VMware did not support the use of Storage DRS and Storage IO Control in a VPLEX Metro environment. This was something new to me and when someone pointed me to a KB article I started digging.

After discussing it with the various teams, the following is what we concluded for EMC VPLEX, and this is what I drafted up. I have requested that the KB be updated in a more generic fashion (text all the way down below) so that the support statement will apply to all vMSC configurations. Hopefully it will be published soon. The EMC-specific statement, which I provided to the EMC VPLEX team, will look roughly as follows:

EMC VPLEX supports three different configurations, namely VPLEX Local, VPLEX Metro and VPLEX Geo. This KB article describes the supported configurations for VPLEX Local and VPLEX Metro with regards to Storage DRS (SDRS) and Storage I/O Control (SIOC). VMware supports Storage DRS and Storage IO Control on EMC VPLEX in each of the two configurations with the restrictions described below.

VPLEX Local:
In a VPLEX Local configuration VPLEX volumes are contained within a single site/location. In this configuration the following restrictions apply:
– Storage IO Control is supported
– Storage DRS is supported
– A Datastore Cluster should only be formed out of similar volumes
– It is recommended to run Storage DRS in “Manual Mode” to control the point in time at which migrations occur

VPLEX Metro:
In a VPLEX Metro configuration VPLEX volumes are distributed across locations/sites. In this configuration the following restrictions apply:
– Storage IO Control is not supported
– Storage DRS is only supported when “IO Metric” is disabled
– It is recommended to run Storage DRS in “Manual Mode” to control the point in time at which migrations occur
– Each location/site should have a Datastore Cluster formed only out of dvols (Distributed VPLEX volumes) which are part of the same consistency groups, and only with site bias to that particular location / site!
– Example: Site A will have Datastore Cluster A which contains all dvols with bias to Site A.

The more generic support statement will roughly look like this:

This KB article describes the supported configurations for vSphere Metro Storage Cluster (vMSC) environments with regards to Storage DRS (SDRS) and Storage I/O Control (SIOC). VMware supports Storage DRS and Storage IO Control with the restrictions described below.

In a vMSC configuration volumes are distributed across locations/sites. In both uniform and non-uniform configurations the following restrictions apply:
– Storage IO Control is not supported
– Storage DRS is only supported when “IO Metric” is disabled
– It is recommended to run Storage DRS in “Manual Mode” to control the point in time at which migrations occur
– Each location/site should have a Datastore Cluster formed only out of stretched datastores and only with site bias to that particular location / site
– Example: Site A will have Datastore Cluster A which contains all stretched datastores with bias to Site A.
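The generic restrictions above lend themselves to a simple configuration check. The sketch below is a plain Python model, not a call into any vSphere API; the `Datastore` and `DatastoreCluster` classes, their field names, and `check_vmsc_datastore_cluster` are all hypothetical and stand in for whatever inventory data you have at hand. It separates the "not supported" items from the "recommended / should" items in the list.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    name: str
    site_bias: str          # site this datastore is biased to, e.g. "A"
    stretched: bool = True  # True for a stretched (distributed) datastore


@dataclass
class DatastoreCluster:
    name: str
    datastores: list
    sioc_enabled: bool
    sdrs_io_metric_enabled: bool
    sdrs_automation: str    # "manual" or "automated"


def check_vmsc_datastore_cluster(cluster):
    """Return (unsupported, advisories) for a datastore cluster in a vMSC setup."""
    unsupported, advisories = [], []
    if cluster.sioc_enabled:
        unsupported.append("Storage IO Control is not supported")
    if cluster.sdrs_io_metric_enabled:
        unsupported.append("Storage DRS is only supported with IO Metric disabled")
    if cluster.sdrs_automation != "manual":
        advisories.append("Run Storage DRS in Manual Mode")
    if any(not ds.stretched for ds in cluster.datastores):
        advisories.append("Form the cluster only out of stretched datastores")
    if len({ds.site_bias for ds in cluster.datastores}) > 1:
        advisories.append("Use only datastores with bias to a single site")
    return unsupported, advisories


cluster_a = DatastoreCluster(
    name="Datastore Cluster A",
    datastores=[Datastore("ds-a-01", "A"), Datastore("ds-b-01", "B")],
    sioc_enabled=False,
    sdrs_io_metric_enabled=True,
    sdrs_automation="manual",
)
unsupported, advisories = check_vmsc_datastore_cluster(cluster_a)
print(unsupported)  # ['Storage DRS is only supported with IO Metric disabled']
print(advisories)   # ['Use only datastores with bias to a single site']
```

Splitting the result in two mirrors the wording of the statement: SIOC and the IO Metric are support boundaries, while Manual Mode and the per-site cluster layout are phrased as recommendations.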

Hopefully this will help the folks implementing vMSC today to make the decision around the usage of SDRS. The KB team has informed me they are working on the update, and as soon as it has been published I will update this article.

** KB Article has been updated: http://kb.vmware.com/kb/2042596 **


About the author

Duncan Epping is a Chief Technologist in the Office of CTO of the Cloud Platform BU at VMware. He is a VCDX (# 007), the author of the "vSAN Deep Dive", the “vSphere Clustering Technical Deep Dive” series, and the host of the "Unexplored Territory" podcast.


Copyright Yellow-Bricks.com © 2023