Stretched Clusters and Site Recovery Manager

Duncan Epping · Mar 23, 2012

My colleague Ken Werneburg, also known as “@vmKen”, just published a new white paper. (Follow him if you aren’t yet!) This white paper covers both SRM and stretched cluster solutions and explains the advantages and disadvantages of each. In my opinion it provides a great overview of when a stretched cluster should be implemented and when SRM makes more sense. Various goals and concepts are discussed, and I think this is a must-read for everyone exploring a stretched cluster or SRM implementation.

http://www.vmware.com/resources/techresources/10262

This paper is intended to clarify concepts involved with choosing solutions for vSphere site availability, and to help understand the use cases for availability solutions for the virtualized infrastructure. Specific guidance is given around the intended use of DR solutions like VMware vCenter Site Recovery Manager and contrasted with the intended use of geographically stretched clusters spanning multiple datacenters. While both solutions excel at their primary use case, their strengths lie in different areas which are explored within.

Comments

Duncan says

23 March, 2012 at 11:51

Yes, DR but no HA... chalk and cheese. L2 stretched clusters provide HA across the entire lifecycle, including OS patching.

I would not recommend confusing the two capabilities. SRM does not provide a complete HA solution.
Marko says

26 March, 2012 at 08:44

Duncan,

Thank you for sharing the link to this white paper. Could you explain a little more why a stretched cluster and SRM couldn’t be used together?

Thanks, Marko
Duncan says

26 March, 2012 at 14:53

Because a stretched cluster means you only have 1 vCenter Server. SRM requires two vCenter Server instances 🙂
Marko says

26 March, 2012 at 17:31

Okay, let’s assume there are 3 (three) datacenters, DC A, DC B and DC C. A stretched cluster between A and B is configured so I can’t use SRM at the same time for A and B.
If there is a stretched cluster between A and B it should be possible to use SRM to replicate (AB) to DC C. Right?
- Dave Gogerly says
  
  25 May, 2014 at 13:50
  
  Hi,
  Has anybody configured the above solution with VPLEX, please? For example, Sites A and B form a stretched cluster, and Site C is the DR site for both Sites A and B. In case either Site A or B fails, the failed site can be recovered at Site C through SRM. If so, can someone please give me a breakdown of the configuration?
  Many thanks,
  Dave
Doug says

26 March, 2012 at 17:38

I think the key discussion points I have, once we get past explaining the requirements and differences, are some limitations of the ‘magic’ with stretched HA/DRS clusters:

1) Not site aware (a VM and its storage may be at opposite sites. You potentially take the latency hit for EACH I/O)

2) Will not automatically keep workloads at their appropriate/optimal site (a majority of your users may be accessing the VM from site A, but DRS could relocate the VM to site B to balance load. Now, your users get to take the latency hit… makes troubleshooting ‘slowness’ a lot of fun!)

3) Cannot handle dependencies between virtual machines (consider a 3-tier application/service. What happens when the web tier and database tier are at site A and the app tier is at site B? What if the storage for the DB tier is also at site B?)

Fun stuff, especially considering lack of integration with the underlying storage.
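
To make the three points above concrete, here is a minimal Python sketch that flags each kind of misplacement in a toy inventory. It is plain data only and makes no vSphere API calls; the `VM` fields, the `placement_issues` helper, and the sample VM names are all hypothetical, chosen just for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    app: str           # application/service the VM belongs to (hypothetical field)
    host_site: str     # site where the VM is currently running
    storage_site: str  # site where the VM's storage is active
    user_site: str     # site from which most users access the VM


def placement_issues(vms):
    """Return (subject, problem) pairs for the three limitations listed above."""
    issues = []
    for vm in vms:
        # 1) VM and its storage at opposite sites: every I/O crosses the link.
        if vm.host_site != vm.storage_site:
            issues.append((vm.name, "storage is at the other site"))
        # 2) VM moved away from the site where most of its users are.
        if vm.host_site != vm.user_site:
            issues.append((vm.name, "users are at the other site"))
    # 3) Tiers of one application spread over more than one site.
    sites_by_app = {}
    for vm in vms:
        sites_by_app.setdefault(vm.app, set()).add(vm.host_site)
    for app, sites in sorted(sites_by_app.items()):
        if len(sites) > 1:
            issues.append((app, "tiers are split across sites"))
    return issues


# Toy 3-tier service: web and DB run at site A, app tier runs at site B,
# and the DB storage is active at site B.
example_vms = [
    VM("web01", "shop", "A", "A", "A"),
    VM("app01", "shop", "B", "B", "A"),
    VM("db01", "shop", "A", "B", "A"),
]
```

Calling `placement_issues(example_vms)` on this toy inventory reports the app tier running away from its users, the DB VM running away from its storage, and the service as a whole being split across sites.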
Duncan Epping says

26 March, 2012 at 18:30

Yes that should be possible Marko
Marko says

27 March, 2012 at 07:46

@Duncan
Thank you, now it’s clear to me.

@Doug
I know, but sometimes you need to accept some discomfort to get a bigger solution running.
Imho 1), 2) and 3) should be manageable by a VMware solution. Maybe we will see such features in the future? Duncan?
Duncan says

27 March, 2012 at 15:54

All of the issues mentioned can be worked around by simple use of DRS Affinity Rules and Datastore Clusters. I am writing a whitepaper on the topic as we speak which will give architectural and operational guidance. Hopefully out in a couple of weeks.
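
As a rough illustration of the affinity-rule idea (not the content of the upcoming white paper), the grouping could be derived like this: put each VM in a group for the site that holds its storage, and pair that group with the hosts at the same site in a "should run on" VM-to-host rule. The sketch below is plain Python with no vSphere API calls; `build_site_rules`, the rule dictionary layout, and all VM and host names are assumptions made for the example.

```python
def build_site_rules(vm_storage_site, host_site):
    """Build one 'should run on' rule per site as plain dictionaries.

    vm_storage_site: maps VM name -> site where its storage is active
    host_site:       maps host name -> site where the host is located
    """
    rules = []
    for site in sorted(set(vm_storage_site.values())):
        vm_group = sorted(vm for vm, s in vm_storage_site.items() if s == site)
        host_group = sorted(h for h, s in host_site.items() if s == site)
        if not host_group:
            # A VM group with no matching hosts cannot be expressed as a rule.
            raise ValueError(f"no hosts at site {site!r}")
        rules.append({
            "name": f"site-{site}-affinity",  # hypothetical naming scheme
            "type": "should",                 # preference, not a hard requirement
            "vms": vm_group,
            "hosts": host_group,
        })
    return rules


rules = build_site_rules(
    {"db01": "B", "web01": "A", "app01": "B"},
    {"esx-a1": "A", "esx-b1": "B", "esx-a2": "A"},
)
```

The sketch uses "should" rather than "must" on the assumption that a soft preference is what you want here, so that VMs can still be restarted at the surviving site after a site failure.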
Manuel says

11 April, 2012 at 12:39

@ Duncan
As usual, your posts are very interesting and helpful. Thanks a lot.

I am designing a new vSphere 5 environment and evaluating failover options: stretched clusters for short-distance failover and SRM for long-distance failover, as Marko mentioned. I am therefore very interested in the white paper you mentioned.

Please let us know as soon as your work is done.
Duncan Epping says

11 April, 2012 at 13:09

I will let you guys know for sure. Working on editing the docs right now.
Bob Greenway says

12 April, 2012 at 15:51

While I do have reservations regarding stretched clusters (mainly down to the I/O latency/LUN location issues already mentioned), DRS groups are a way to effectively allow HA of a site while still maintaining site affinity for specific servers.

And having implemented both, as the WP says, they each have their pros and cons.
Justin says

3 May, 2012 at 00:22

I’m actually looking to do this for a DR strategy until VPLEX and SRM are supported together.

I plan on using VPLEX Metro for replication and using the stretched cluster to manage the hosts/VMs at the prod and DR locations.

Disclaimer: it shouldn’t be called DR since the locations are 15 miles apart 🙂 More like TR
Munishpal says

24 May, 2012 at 08:42

If I am not wrong, with the latest version of VPLEX (5.1), both stretched vSphere clusters and DR with SRM can be achieved.

http://virtualgeek.typepad.com/virtual_geek/2012/05/stretched-vsphere-clusters-dr-with-srm-why-not-both.html
Munishpal says

24 May, 2012 at 09:16

I have also started an open discussion on the same topic. Please feel free to post your comments.

http://vcommunique.blogspot.in/2012/05/emc-vplex-srm.html
Duncan Epping says

24 May, 2012 at 09:29

Yes, the latest VPLEX release will support that…
Ratnadeep Bhattacharya says

26 June, 2012 at 08:47

Hi Duncan,

I have one question. It may be silly but I don’t have the infrastructure at my disposal currently to work this one out.

Let’s say I create a stretched cluster. Would it be possible to set up SRM on this cluster for failover to my DR site which is, say, a third vCenter?

Would be interesting to find out.

Regards,
Deep
Duncan says

14 July, 2012 at 09:29

@Ratnadeep Bhattacharya: Yes that should be possible.
