VSAN and Network IO Control / VDS part 2

About a week ago I wrote this article about VSAN and Network IO Control. I originally wrote a longer article that contained more options for configuring the network part, but decided to leave a section out for simplicity's sake. I figured that as more questions came in I would publish the rest of the content I had developed. I guess now is the time to do so.

In the configuration described below we will have two 10GbE uplinks teamed (often referred to as “etherchannel” or “link aggregation”). Thanks to the physical switch capabilities, the configuration of the virtual layer will be extremely simple. We will take the following recommended minimum bandwidth requirements into consideration for this scenario:

  • Management Network –> 1GbE
  • vMotion VMkernel –> 5GbE
  • Virtual Machine PG –> 2GbE
  • Virtual SAN VMkernel interface –> 10GbE

When the physical uplinks are teamed (Multi-Chassis Link Aggregation), the Distributed Switch load balancing mechanism must be configured as one of the following:

  1. IP-Hash
    or
  2. LACP

All portgroups and VMkernel interfaces on the same Distributed Switch must be configured to use either LACP or IP-Hash, depending on the type of physical switch used. Please note that all uplinks should be part of the same etherchannel / LAG. Do not try to create anything fancy here, as a physically and virtually incorrectly configured team can and probably will lead to more downtime!

  • Management Network VMkernel interface = LACP / IP-Hash
  • vMotion VMkernel interface = LACP / IP-Hash
  • Virtual Machine Portgroup = LACP / IP-Hash
  • Virtual SAN VMkernel interface = LACP / IP-Hash
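
For the static etherchannel / IP-Hash case, the sketch below shows what this could look like when scripted with pyVmomi. It is a hedged example that is not part of the original article: the vCenter name, credentials and portgroup names are placeholders, and the LACP variant (which additionally requires a LAG to be configured on the Distributed Switch) is not covered.

```python
# Sketch: set "Route based on IP hash" as the teaming policy on selected dvPortgroups.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholder names for the four portgroups / VMkernel portgroups listed above.
TARGET_PORTGROUPS = {"Management Network", "vMotion", "Virtual Machines", "Virtual SAN"}

ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.DistributedVirtualPortgroup], True)
    for pg in view.view:
        if pg.name not in TARGET_PORTGROUPS:
            continue
        # "loadbalance_ip" is the API value for the IP-Hash load balancing policy.
        teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy(
            policy=vim.StringPolicy(value="loadbalance_ip"))
        spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
            configVersion=pg.config.configVersion,
            defaultPortConfig=vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
                uplinkTeamingPolicy=teaming))
        pg.ReconfigureDVPortgroup_Task(spec=spec)
        print(f"Reconfigured {pg.name} to IP-Hash")
finally:
    Disconnect(si)
```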

As various traffic types will share the same uplinks, we also want to make sure that no traffic type can push out other types of traffic during times of contention; for that we will use the Network IO Control shares mechanism.

We will work under the assumption that only one physical port is available and that all traffic types share that same physical port for this exercise. Taking a worst-case scenario into consideration will guarantee performance even in a failure scenario. By taking this approach we can ensure that Virtual SAN always has 50% of the bandwidth at its disposal, while leaving the remaining traffic types with sufficient bandwidth to avoid a potential self-inflicted DoS. When both uplinks are available this equates to 10GbE; when only one uplink is available the bandwidth available to Virtual SAN is also cut in half, to 5GbE. It is recommended to configure shares for the traffic types as follows:

Traffic Type                      Shares   Limit
Management Network                 20      n/a
vMotion VMkernel Interface         50      n/a
Virtual Machine Portgroup          30      n/a
Virtual SAN VMkernel Interface    100      n/a
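
To make the arithmetic behind these share values explicit, here is a minimal sketch (not part of the original article) of how Network IO Control divides the available bandwidth proportionally to shares during contention:

```python
# Worst case: every traffic type is active at the same time, so each one gets
# shares/total_shares of whatever bandwidth the uplinks currently provide.
shares = {
    "Management Network": 20,
    "vMotion VMkernel Interface": 50,
    "Virtual Machine Portgroup": 30,
    "Virtual SAN VMkernel Interface": 100,
}

def worst_case_allocation(available_gbe, shares):
    """Divide the available bandwidth proportionally to the configured shares."""
    total = sum(shares.values())
    return {name: available_gbe * value / total for name, value in shares.items()}

for label, available in (("Both 10GbE uplinks available", 20), ("One uplink failed", 10)):
    print(f"{label} ({available}GbE aggregate):")
    for name, gbe in worst_case_allocation(available, shares).items():
        print(f"  {name:<32} {gbe:4.1f} GbE")
```

With these values Virtual SAN ends up with 100 of the 200 total shares, i.e. 50% of whatever bandwidth is left: 10GbE when both uplinks are up and 5GbE when only one survives.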

The following diagram depicts this configuration scenario.

Disable “Disk.AutoremoveOnPDL” in a vMSC environment!

Last week I tweeted the recommendation to disable the advanced setting Disk.AutoremoveOnPDL in a vSphere 5.5 vMSC environment.

Based on this tweet I received a whole bunch of questions. Before I explain why, I want to point out that I have contacted the folks in charge of the vMSC program and have requested that they publish a KB article on this subject as soon as possible.

With vSphere 5.5 a new advanced setting was introduced called “Disk.AutoremoveOnPDL”. When you install 5.5 this setting is set to 1, which means it is enabled. What it does is the following:

The host automatically removes the PDL device and all paths to the device if no open connections to the device exist, or after the last connection closes. If the device returns from the PDL condition, the host can discover it, but treats it as a new device. Data consistency for virtual machines on the recovered device is not guaranteed.

(Source: http://pubs.vmware.com/vsphere-55/index.jsp?topic=%2Fcom.vmware.vsphere.storage.doc%2FGUID-45CF28F0-87B1-403B-B012-25E7097E6BDF.html)

In a vMSC environment, you can understand that removing devices which are in a PDL state is not desired. When the issue that caused the PDL has been resolved (from a networking or array perspective), customers expect the LUNs to automatically appear again. However, as the devices have been removed, a rescan is needed to show them again instantly, or you will need to wait for the periodic vSphere path evaluation to occur. As you can imagine, in a vSphere Metro Storage Cluster (stretched storage) environment you expect devices to be there instantly on recovery… even when they have been in a PDL or APD state, they should be available as soon as the situation has been resolved.

For now, I recommend setting Disk.AutoremoveOnPDL to 0 instead of the default of 1:
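
One hedged way to do this across all hosts, sketched with pyVmomi (not part of the original post; the vCenter name and credentials are placeholders):

```python
# Sketch: audit Disk.AutoremoveOnPDL on every host and set it to 0 where needed.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

KEY = "Disk.AutoremoveOnPDL"

ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        opt_mgr = host.configManager.advancedOption  # per-host advanced settings
        current = opt_mgr.QueryOptions(KEY)[0].value
        print(f"{host.name}: {KEY} = {current}")
        if current != 0:
            # If the call is rejected over a value-type mismatch, the same change can be
            # made per host with: esxcli system settings advanced set -o /Disk/AutoremoveOnPDL -i 0
            opt_mgr.UpdateOptions(changedValue=[vim.option.OptionValue(key=KEY, value=0)])
            print(f"{host.name}: {KEY} set to 0")
finally:
    Disconnect(si)
```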

Hopefully this KB on the topic of Disk.AutoremoveOnPDL will soon be updated to reflect this.

Startup News Flash part 8

Part 8 of the Startup News Flash already. It is just a short one; not too many new things, but some worth mentioning in my opinion.

Infinio just announced Infinio Accelerator 1.0. I wrote about what Infinio is and does in this article; in short: Infinio has developed a virtual appliance that sits in between your virtual machine storage traffic and your NFS datastore and enhances storage performance by caching IO, primarily in memory. Infinio’s primary marketing message is: “100% software only – No new hardware, no reboots, no downtime”. It will accelerate any workload type running on NFS and is available for a shockingly (if you ask me) low price of 499, and they offer a free 30-day trial.

Recently a new startup was revealed named Coho Data, formerly known as Convergent.io. I wrote an introduction about them a couple of weeks ago, which I suggest reading to get a better understanding of what Coho is and does. In short: Coho Data built a scale-out hybrid storage solution (NFS for VM workloads), with hybrid meaning a mix of SATA and SSD. This for obvious reasons: SATA brings you capacity and flash provides raw performance. Today I read an article about a new round of funding, 25 million led by Andreessen Horowitz. Yes, that is no pocket change indeed. Hopefully this new round of funding will allow Coho to bring things to the next level! Congratulations…

Just a short one this round; hopefully more news next time… I would suspect so, as Storage Field Day 4 is scheduled for the week of the 13th.

EMC VPLEX and Storage DRS / Storage IO Control

At VMworld various people asked me why VMware did not support the use of Storage DRS and Storage IO Control in a VPLEX Metro environment. This was news to me, and when someone pointed me to a KB article I started digging.

After discussing it with the various teams, the following is what we concluded for EMC VPLEX; this is what I drafted up. I have requested that the KB be updated in a more generic fashion (text all the way at the bottom) so that the support statement applies to all vMSC configurations. Hopefully it will be published soon. The EMC-specific statement, which I provided to the EMC VPLEX team, will look roughly as follows:

EMC VPLEX supports three different configurations, namely VPLEX Local, VPLEX Metro and VPLEX Geo. This KB article describes the supported configurations for VPLEX Local and VPLEX Metro with regard to Storage DRS (SDRS) and Storage I/O Control (SIOC). VMware supports Storage DRS and Storage IO Control on EMC VPLEX in each of these two configurations, with the restrictions described below.

VPLEX Local:
In a VPLEX Local configuration VPLEX volumes are contained within a single site/location. In this configuration the following restrictions apply:
- Storage IO Control is supported
- Storage DRS is supported
- A Datastore Cluster should only be formed out of similar volumes
- It is recommended to run Storage DRS in “Manual Mode” to control the point in time at which migrations occur

VPLEX Metro:
In a VPLEX Metro configuration VPLEX volumes are distributed across locations/sites. In this configuration the following restrictions apply:
- Storage IO Control is not supported
- Storage DRS is only supported when “IO Metric” is disabled
- It is recommended to run Storage DRS in “Manual Mode” to control the point in time at which migrations occur
- Each location/site should have a Datastore Cluster formed only out of dvols (Distributed VPLEX volumes) which are part of the same consistency groups, and only with site bias to that particular location / site!
- Example: Site A will have Datastore Cluster A which contains all dvols with bias to Site A.

The more generic support statement will roughly look like this:

This KB article describes the supported configurations for vSphere Metro Storage Cluster (vMSC) environments with regard to Storage DRS (SDRS) and Storage I/O Control (SIOC). VMware supports Storage DRS and Storage IO Control with the restrictions described below.

In a vMSC configuration volumes are distributed across locations/sites. In both uniform and non-uniform configurations the following restrictions apply:
- Storage IO Control is not supported
- Storage DRS is only supported when “IO Metric” is disabled
- It is recommended to run Storage DRS in “Manual Mode” to control the point in time at which migrations occur
- Each location/site should have a Datastore Cluster formed only out of stretched datastores and only with site bias to that particular location/site
- Example: Site A will have Datastore Cluster A which contains all stretched datastores with bias to Site A.

Hopefully this will help the folks implementing vMSC today make the decision around the usage of SDRS. The KB team has informed me that they are working on the update, and as soon as it has been published I will update this article.
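
For those who want to script the two Storage DRS settings above (Manual Mode and the “IO Metric” disabled) against an existing datastore cluster, here is a hedged pyVmomi sketch. It is not part of the KB text; the vCenter name, credentials and datastore cluster name are placeholders, and it assumes the datastore cluster already exists (one per site, as described above).

```python
# Sketch: set Storage DRS to Manual and disable I/O load balancing on a datastore cluster.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; validate certificates in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.StoragePod], True)
    pod = next(p for p in view.view if p.name == "DatastoreCluster-SiteA")  # placeholder name

    pod_cfg = vim.storageDrs.PodConfigSpec(
        enabled=True,
        defaultVmBehavior="manual",     # Storage DRS in "Manual Mode"
        ioLoadBalanceEnabled=False)     # "IO Metric" disabled
    spec = vim.storageDrs.ConfigSpec(podConfigSpec=pod_cfg)

    content.storageResourceManager.ConfigureStorageDrsForPod_Task(pod=pod, spec=spec, modify=True)
    print(f"Reconfigured {pod.name}: SDRS manual, IO metric disabled")
finally:
    Disconnect(si)
```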

** KB Article has been updated: http://kb.vmware.com/kb/2042596 **

vSphere Metro Storage Cluster using Virtual SAN, can I do it?

This question keeps coming up over and over again lately: vSphere Metro Storage Cluster (vMSC) using Virtual SAN, can I do it? (I guess the real question is “should you do it”.) Virtual SAN/VSAN is getting more and more traction, even though it is still in beta, and people are coming up with all sorts of interesting use cases. At VMworld various people asked during my sessions if they could use VSAN to implement a vMSC solution, and over the last couple of weeks this question just keeps coming up in emails etc.

I guess if you look at what VSAN is and does, it makes sense for people to ask this question. It is a distributed storage solution with a synchronous distributed caching layer that allows for high resiliency. You can specify the number of copies required of your data and VSAN will take care of the magic for you; if a component of your cluster fails, VSAN can respond accordingly. This is what you would like to see, I guess:

Now let it be clear: the above is what you would like to see in a stretched environment, but unfortunately it is not what VSAN can do in its current form. I guess if you look at the following it becomes clear why it might not be such a great idea to use VSAN for this use case at this point in time.

The problems here are:

  • Object placement: You will want that second mirror copy to be in Location B, but you cannot control this today as you cannot define “failure domains” within VSAN at the moment.
  • Witness placement: Essentially you want the ability to have a 3rd site that functions as a tiebreaker when there is a partition / isolation event.
  • Support: No one has tested/certified VSAN over distance, in other words… not supported

For now, the answer to the question “can I use Virtual SAN to build a vSphere Metro Storage Cluster?” is: no, it is not supported to span a VSAN cluster over distance. The feedback and requests from many of you have been heard loud and clear by our developers and PM team… At VMworld one of the developers already mentioned that he was intrigued by the use case and would be looking into it in the future. Of course, there was no mention of when this would happen or even if it would ever happen.