I’d already blogged about this on the VMware blog, but I figured I would share it here as well. The vSphere Metro Storage Cluster with vSphere 6.0 white paper has been released. I worked on this paper together with my friend Lee Dilworth; it is an updated version of the paper we did in 2012. It contains all of the new best practices for vSphere 6.0 when it comes to vSphere Metro Storage Cluster implementations, so if you are looking to implement one or upgrade an existing environment, make sure to read it!
VMware vSphere Metro Storage Cluster Recommended Practices
VMware vSphere Metro Storage Cluster (vMSC) is a specific configuration within the VMware Hardware Compatibility List (HCL). These configurations are commonly referred to as stretched storage clusters or metro storage clusters and are implemented in environments where disaster and downtime avoidance is a key requirement. This best practices document was developed to provide additional insight and information for operation of a vMSC infrastructure in conjunction with VMware vSphere. This paper explains how vSphere handles specific failure scenarios, and it discusses various design considerations and operational procedures. For detailed information about storage implementations, refer to documentation provided by the appropriate VMware storage partner.
Right on Duncan, I was just looking for this information today. Thx
Hello Duncan
Thanks for the link. While I was reading this paper, I wondered why there is no way in (S)DRS to make ‘should rules’ linking DRS VM groups with datastores or DS clusters?
Any clue if this is in the vSphere 6 roadmap?
Chris
Not to my knowledge, Chris. I will put in a request.
Thanks Duncan, that would be great for matching with VPLEX consistency groups or similar solutions.
Hi Duncan
I am also looking for a way to link DRS VM groups to SDRS. Basically, I would want to force DRS to always svMotion the files along with any vMotion it initiates. This would simplify a VPLEX-based Metro cluster setup considerably. At the moment we round-robin place new VMs manually in the corresponding clusters in the same site so that the VM and the files are always in the same site. This limits the value of DRS and makes automation with vRO rather difficult.
Have you got any news on whether this is on the roadmap?
Thanks!
Yes, understood, and we are working on enhancements in this space, but unfortunately I cannot comment at this point on when, if, or how they will be released.
Duncan, I recently discovered a vCenter-related bug with vSphere 6.0 / Metro Cluster with VPLEX during an implementation.
Essentially, when there’s an APD event, VMs randomly either restart or don’t restart. They enter an invalid state with a VM Hardware version 1 error. The case is presently with engineering and I am quite surprised at the slow or limited feedback. Needless to say, not a very happy end customer.
I am more than willing to provide the details offline.
Is there a licensing requirement for this? Would this work with a ROBO Advanced site license?
Considering the high costs from a hardware perspective, I have never met a customer who used anything lower than vSphere Enterprise. Stretched storage typically isn’t cheap. Yes, this would work with lower SKUs as well, but some functionality like VM/Host rules may not work as expected.
Hi Duncan,
What is the maximum number of hosts in a stretched cluster in vSphere 6?
Best regards
I am not sure, to be honest. I assumed it is 64, just like a regular vSphere HA/DRS cluster. I have not seen anything else.
I had read that the maximum was 32 for stretched clusters, but I don’t remember where 🙁 and I didn’t find any further info.
Just found this KB that says 64: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007545
Thanks a lot!! 🙂
Hello Duncan! Could you please give me a brief, simple explanation of when it is better to use a metro storage cluster instead of a disaster recovery solution (SRM)? I have found some information in the “Stretched Clusters and VMware vCenter Site Recovery Manager” document,
http://www.vmware.com/resources/techresources/10262
but it is pretty old: Mar 20, 2012.
Hello Duncan,
I always see the vSphere Metro Storage Cluster example with datacenter A and datacenter B (as the DR site), but is this solution possible if we want two active-active datacenters (DC A and DC B) and also a third site for disaster recovery?
If it is possible, how would it look?
Thanks a lot
Sure, this is possible; we have customers doing this today. It is just like a DR solution, but attached to a stretched cluster, so you just combine the two…
You mean, for example, implement vMSC between Datacenter 1 and Datacenter 2 and use SRM with the DR site? Is this the best practice? Or is it better to keep DC1 and DC2 without the “vMSC solution” (both just active) and use vMSC with the DR site (replicate both DCs to the DR site)?
I do not know if I explained myself well 🙂
Thanks a lot
Duncan, your paper calls for disabling DiskAutoremoveOnPDL, yet the VMware KB (https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2059622) says that for vMSC it should be enabled in vSphere 6+. I’ve had mixed results: in my VPLEX vMSC it works fine, but in my VMAX SRDF/Metro vMSC it causes problems (the device fails to come back except through a reboot). Just curious why the recommendations differ. Thanks.
Please read this blog post:
http://www.yellow-bricks.com/2016/03/21/vmsc-disk-autoremoveonpdl-vsphere-6-x-higher/
I requested the KB and I requested an update of the paper. Unfortunately the white paper process is slow.
Hi Duncan,
We are looking into Metro cluster but are currently using SRM. We run vSphere 6. Are they still mutually exclusive? I see no new documentation on their interaction since the 2012 paper quoted earlier.
Even though I do not see how they could work together, you never know.
You can actually use a metro cluster with SRM; however, it will not be a single vCenter Server based stretched solution. It will be a stretched storage solution with 2 vCenter instances and 2 clusters.
So each cluster uses either SRM or the stretched cluster function, but not both at the same time? Still no interaction between SRM and Geo cluster… they just coexist in the same environment but not on the same cluster… Am I right?
Read this one: https://www.vmware.com/files/pdf/products/SRM/vmware-site-recovery-manager-whats-new.pdf
Very enlightening. Thanks Duncan. This will help me understand better, as this is the next step the architecture people want to dig into…