I had a couple of questions about the exact settings for vSphere Metro Storage Clusters (vMSC) with vSphere 5.5. It was the third time in two weeks that I shared the same info about vMSC with vSphere 5.5, so I figured I would write a quick blog post to make the information a bit easier to find through Google. Below you can find the settings required for a vSphere Metro Storage Cluster with vSphere 5.5. Note that in-depth details around operations / testing can be found in this white paper: version 5.x // version 6.0.
- VMkernel.Boot.terminateVMOnPDL = True
- Das.maskCleanShutdownEnabled = True
- Disk.AutoremoveOnPDL = 0
I want to point out that if you migrate from 5.0 or 5.1, the host advanced setting “VMkernel.Boot.terminateVMOnPDL” replaces disk.terminateVMOnPDLDefault (/etc/vmware/settings). Das.maskCleanShutdownEnabled is actually set to “true” by default as of vSphere 5.1, but personally I prefer to set it anyway so that I know for sure it has been configured correctly. Then there is Disk.AutoremoveOnPDL, which is new in vSphere 5.5, as discussed here. Make sure to disable it: PDLs are likely to be temporary, so there is no point in removing the devices and then having to do a rescan to have them reappear. It only slows down your recovery process. (EMC also recommends this, by the way; see page 21 of this PDF on vMSC/VPLEX.)
** UPDATE 20-March 2016 **
When using vSphere 6.0 or higher, please be advised that Disk.AutoremoveOnPDL needs to be set to 1 (the default value) in order for “PDL scenarios” to be handled correctly in vMSC-based infrastructures. Please do not change the default value, and when upgrading to vSphere 6.x, set this value back to 1 if it was changed in a previous version.
** UPDATE 20-March 2016 **
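
Because the recommended value of Disk.AutoremoveOnPDL flips between vSphere 5.5 and 6.x, it can help to write the expectations down as data and compare them against whatever you exported from your hosts and cluster. The following is only a minimal sketch of that idea: the function names and the shape of the `actual` dictionary are made up for illustration, and nothing here talks to vCenter or ESXi. Only the setting names and values come from this post.

```python
# Settings for vMSC on vSphere 5.5, as listed in this post.
# Names are spelled as in the post; all helper names below are hypothetical.
VMSC_55_SETTINGS = {
    "VMkernel.Boot.terminateVMOnPDL": True,
    "Das.maskCleanShutdownEnabled": True,
    "Disk.AutoremoveOnPDL": 0,
}


def expected_autoremove_on_pdl(major, minor):
    """Return the Disk.AutoremoveOnPDL value this post recommends."""
    if major >= 6:
        return 1  # the default; required for PDL scenarios on 6.x
    if (major, minor) == (5, 5):
        return 0  # disabled on 5.5
    # The setting was introduced in 5.5, so older releases do not have it.
    raise ValueError("Disk.AutoremoveOnPDL does not exist before vSphere 5.5")


def find_drift(actual, expected):
    """Return {name: (actual_value, expected_value)} for every mismatch.

    A setting that is missing from `actual` is reported with None.
    """
    drift = {}
    for name, want in expected.items():
        have = actual.get(name)
        if have != want:
            drift[name] = (have, want)
    return drift


if __name__ == "__main__":
    # Example input: a host that was upgraded but still has the 6.x default.
    exported = {
        "VMkernel.Boot.terminateVMOnPDL": True,
        "Das.maskCleanShutdownEnabled": True,
        "Disk.AutoremoveOnPDL": 1,
    }
    for name, (have, want) in find_drift(exported, VMSC_55_SETTINGS).items():
        print(f"{name}: found {have}, expected {want}")
```

For a 6.x environment you would build the expected dictionary with `expected_autoremove_on_pdl(6, 0)` instead of reusing the 5.5 values. This post only covers Disk.AutoremoveOnPDL for 6.x, so the sketch deliberately says nothing about the other two settings there.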

This week I had the pleasure of talking to fellow dutchy
The conversation of course didn’t end there, so let’s get into some more details. We discussed the use case first. PeaSoup is a hosting / cloud provider. Today they have two clusters running on Virtual SAN. They have a management cluster, which hosts all components needed for a vCloud Director environment, and then they have a resource cluster. The great thing for PeaSoup was that they could start out with a relatively low investment in hardware and scale fast when new customers come on board or when existing customers require new hardware.
Harold pointed out that the only downside of this particular Fujitsu configuration was the fact that it only came with a disk controller that is limited to “RAID 0” only, no passthrough. I asked him if they had experienced any issues around that, and he mentioned that they had had 1 disk failure so far and that it resulted in having to reboot the server in order to recreate a RAID-0 set for the new disk. Not too big of a deal for PeaSoup, but of course, if possible, he would prefer to avoid this reboot. The disk controller, by the way, is based on the LSI 2208 chipset, and it is one of the things PeaSoup was very thorough about, making sure it was supported and that it had a high queue depth. The “HCL” came up multiple times during the conversation, and Harold felt that although doing a lot of research up front and creating a scalable and repeatable architecture takes time, it also results in a very reliable environment with predictable performance. For a cloud provider, reliability and user experience are literally your bread and butter; they couldn’t afford to “guess”. That was also one of the reasons they selected a VSAN Ready Node configuration as a foundation and tweaked it where their environment and anticipated workload required it.



