VMworld #NotSupported lightning talk slides – Hacking SRM

I presented this 15-minute talk at VMworld about hacking SRM, or rather hacking the Storage Replication Adapter (SRA), which is part of SRM. I noticed William Lam shared his slides, so I figured I would do the same. This slide deck is based on two articles I wrote a while back about hacking the SRA; you might want to read them as well. (1, 2)

I hope they are useful. Once again, thanks to Randy Keener for coming up with this excellent idea, and thanks to the brownbag guys for helping host this great initiative. Let’s hope we will see more of this next year at VMworld.

Where should you get SRAs from?

I received Michael White’s (VMware BCDR Specialist SE) weekly newsletter over the weekend, and the following is a question I also receive on a regular basis, so why not blog it?!

I had a disagreement with a friend about where to get SRAs from. He was under the impression that we didn’t have the arrays on our premises for all of the SRAs on the market, and so it was OK to take an SRA from a vendor, as they could test it. The fact is we do have most, or all, of the arrays for each SRA in-house, but that is actually not relevant. It is important to only take SRAs from the VMware web site for a different reason. When a vendor finishes updating or writing an SRA, it is run against a special program that produces a log. The SRA and log are sent to VMware and we check them out. Sometimes they are sent back for improvements or fixes. This continues until the SRA passes, and then it is posted on our web site. If you took the SRA from the vendor, you may accidentally get an SRA that in a week or a month we might decline and send back to be fixed. So please, make sure you get the only safe copy of an SRA available, and that is from our web site!

SRM 4.0; where is my SRA?

I received the following question a couple of times already; where is my <random vendor+array> SRA?

The answer is simple: if it is not on the VMware website, it has not been certified yet. Every SRA needs to go through a quality assurance process to validate/certify it. If this hasn’t been completed, the SRA will not be available for download on the VMware website.

Most of the questions I received were about the EMC SRDF SRA, by the way. I have been told that it should be available really soon. I am not allowed to disclose any details, but who knows, Chad might chip in and give a hint…

SRM 4.0, Howto’s by Cormac Hogan

One of my colleagues, Cormac Hogan, posted some excellent guides for SRM 4.0:

  1. Steps to setup VMware vCenter Site Recovery Manager 4.0 with IBM SVC
    VMware vCenter Site Recovery Manager 4.0 is a disaster recovery product which uses array replication technologies to fail over from one site to another. This particular document looks at one particular vendor (IBM) and one particular array model (SVC). The document will take you through the replication and snapshot setup steps.
  2. Steps to setup VMware vCenter Site Recovery Manager 4.0 with EMC Celerra NAS Replications
    VMware vCenter Site Recovery Manager 4.0 is a disaster recovery product which uses array replication technologies to fail over from one site to another. This particular document looks at one particular vendor (EMC) and one particular array model (Celerra). This document also focuses on configuring a new feature introduced in SRM 4.0, namely NAS (NFS) replications. The document will take you through the replication and snapshot setup steps using only the command line interface (CLI).
  3. Steps to setup VMware vCenter Site Recovery Manager 4.0 with NetApp NAS Replications
    VMware vCenter Site Recovery Manager 4.0 is a disaster recovery product which uses array replication technologies to fail over from one site to another. This particular document looks at one particular vendor (NetApp) and uses their FAS simulator. However, the steps are also applicable to their standard arrays. This document also focuses on configuring a new feature introduced in SRM 4.0, namely NAS (NFS) replications. The document will take you through the replication and snapshot setup steps using only the command line interface (CLI).

Make sure to read them if you are implementing SRM in combination with one of the mentioned arrays!

SRDF SRA and the SPC-2 bit

I was at a customer site yesterday helping configure SRM. During the configuration of the EMC SRDF SRA (Storage Replication Adapter) we ran into a weird issue. Although we could see the paired arrays with a green “okay sign”, we did not see any replicated LUNs. The first things I usually check in these cases are:

  1. Is the LUN already formatted as VMFS?
  2. Does it hold any VMs?

In this case we met both requirements. After checking all the configuration settings on the SRA side, the SRM side and the SAN, we noticed that the SPC-2 bit was not enabled. Of course we knew that it was a required setting according to the FC SAN Config Guide (page 57), but this is definitely not the kind of behavior I would expect to see when it’s disabled. Anyway, I did a quick search on our internal mailing list, and as it turns out we were not the first to encounter these issues.
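To make the order of those checks concrete, here is a minimal sketch of the checklist in Python. It is only an illustration: `ReplicatedLun`, its fields and `missing_prerequisites` are hypothetical names I made up for this post, not part of SRM or the SRDF SRA, and the values would have to be filled in by hand from what you see on the array and in vCenter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReplicatedLun:
    """Hypothetical record of what we know about one replicated LUN."""
    name: str
    formatted_as_vmfs: bool
    holds_vms: bool
    spc2_bit_enabled: bool


def missing_prerequisites(lun: ReplicatedLun) -> List[str]:
    """Return the checks from this post that the LUN fails, in the order I run them."""
    problems = []
    if not lun.formatted_as_vmfs:
        problems.append("LUN is not formatted as VMFS")
    if not lun.holds_vms:
        problems.append("LUN does not hold any VMs")
    if not lun.spc2_bit_enabled:
        problems.append("SPC-2 bit is not enabled")
    return problems


if __name__ == "__main__":
    # The situation described above: VMFS and VMs are fine, SPC-2 is off.
    lun = ReplicatedLun("example-lun", True, True, False)
    for problem in missing_prerequisites(lun):
        print(problem)
```

The point of the sketch is simply that the SPC-2 bit belongs on the same list as the two usual suspects when replicated LUNs do not show up.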

The SPC-2 bit is something that comes up every now and then, so if you’ve got EMC Symmetrix storage and you are not sure whether you have applied VMware’s recommendations, please read the FC SAN Config Guide and avoid future problems. Please bear in mind, though, that when you set the SPC-2 bit you might, and probably will, need to re-signature the disk.

SRM: Three updates, EMC / LSI / Hitachi SRA

Just noticed three SRAs were updated. Might be worth checking out:

  1. EMC Symmetrix Storage Replication Adapter
    Version 2.0.0.21 | Released 04/20/2009
    Fixed : SRDF Adapter for VMWARE Site Recovery Manager (SRM) times out while trying to enumerate LUNs from a local Symmetrix array.
  2. LSI Storage Replication Adapter
    Version 1.00.30.12 | Released 04/01/2009
  3. Hitachi Storage Replication Adapter 2
    Version 1.0.8 | Released 03/24/2009

I listed one of the fixes included with the EMC Symmetrix SRA because I blogged about the issue a couple of months ago, and I’m glad it has been solved.

SRM 1.0 Update 1, patch 3

I just noticed this SRM update. It’s just a small patch, but looking at the problems it corrects, it might be beneficial to upgrade!

Site Recovery Manager 1.0 Update 1 Patch 3 Latest Release Version: 1.0.1.2587 | 04/09/09 | 154949

This patch corrects the following problems:

  • a problem that prevents protected virtual machines from following recommended Distributed Resource Scheduler (DRS) settings when recovering to more than one DRS cluster.
  • a problem observed at sites that support more than seven ESX hosts. If you refresh inventory mappings when connected to such a site, the display becomes unresponsive for up to ten minutes.
  • a problem that could prevent SRM from computing LUN consistency groups correctly when one or more of the LUNs in the consistency group did not host any virtual machines.
  • a problem that could cause the client user interface to become unresponsive when creating protection groups with over 300 members.
  • several problems that could cause SRM to log an error message vim.fault.AlreadyExists when recomputing datastore groups.
  • a problem that could cause SRM to log an Assert Failed: “ok” @ src/san/consistencyGroupValidator.cpp:64 error when two different datastores match a single replicated device returned by the SRA.
  • a problem that could cause SRM to remove static iSCSI targets with non-test LUNs during test recovery.
  • several problems that degrade the performance of inventory mapping.