
Yellow Bricks

by Duncan Epping


BC-DR

Failover using SRM might be slow…

Duncan Epping · Jan 26, 2009 ·

I was just reading an excellent weekly technical digest by VMware’s Michael White and noticed the mention of a KB article on SRM. This KB article has the following description:

With VMware Site Recovery Manager 1.0 Update 1, recovery of a VM might take a long time.  The recovery time during a test or real recovery will be longer when more VM’s are involved.  The Change Network Settings task might time out during the test or real failover.  This is due to the serial fashion in which Site Recovery Manager waits until a guest heartbeat is seen prior to customizing the VM.

This problem can be encountered when running the following ESX versions:

  • ESX 3.5 Update 2 and Update 3
  • ESX 3.0.2 and 3.0.3

In other words, the behaviour of ESX has changed, and for SRM’s sake it is worth changing it back. We are talking about a 5-minute delay, and that is 5 minutes for each VM. You can imagine that running a recovery plan can and will take a long time when this setting isn’t changed. Here’s the solution, which is also outlined in the KB article.

Set hostd heartbeat delay to 40.
Disconnect the host from VC (right-click the host in the VI Client and select “Disconnect”).
Log in as root to the ESX Server with SSH.
Using a text editor such as nano or vi, edit the file /etc/vmware/hostd/config.xml.
Set the “heartbeatDelayInSecs” tag under “vmsvc” to 40 seconds as shown here:

<vmsvc>
<heartbeatDelayInSecs>40</heartbeatDelayInSecs>
<enabled>true</enabled>
</vmsvc>

Restart the management agents for this change to take effect. See Restarting the Management agents on an ESX Server (1003490).
Reconnect the host in VC (right-click the host in the VI Client and select “Connect”).
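
If you have more than a handful of hosts, the edit itself can be scripted. Below is a minimal Python sketch of the idea, not something taken from the KB article. It assumes config.xml is well-formed XML with a “vmsvc” element somewhere below the root, and that you run it against a copy of the file on a workstation with a recent Python. The function name and the trimmed sample document are made up for illustration. Note that ElementTree drops XML comments when it rewrites a file, so keep a backup of the original.

```python
import xml.etree.ElementTree as ET


def set_heartbeat_delay(config_xml: str, seconds: int = 40) -> str:
    """Return config_xml with <vmsvc>/<heartbeatDelayInSecs> set to `seconds`."""
    root = ET.fromstring(config_xml)
    # Assumption: <vmsvc> sits somewhere below the root element.
    vmsvc = root.find(".//vmsvc")
    if vmsvc is None:
        raise ValueError("no <vmsvc> element found")
    delay = vmsvc.find("heartbeatDelayInSecs")
    if delay is None:
        # The tag may not be present yet, so create it.
        delay = ET.SubElement(vmsvc, "heartbeatDelayInSecs")
    delay.text = str(seconds)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed stand-in for the real config.xml.
sample = "<config><vmsvc><enabled>true</enabled></vmsvc></config>"
updated = set_heartbeat_delay(sample)
print(updated)
```

You would still need to disconnect the host, restart the management agents and reconnect the host as described above; the sketch only covers the file edit.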

Storage Replication Adapter: discoverLuns…

Duncan Epping · Jan 20, 2009 ·

Today I was implementing Site Recovery Manager with a colleague (thanks Andy!). During the configuration of the HP EVA SRA (Storage Replication Adapter) we received the following error:

discoverLuns script failed to execute properly

The error indicates that the first part of the SRA configuration, “discoverArrays”, worked, but the discovery of the LUNs and their replicas bailed out (at 23%). So after checking the config files and log files we decided to manually run the script file that the SRA uses and see what happens.

First we created an XML file to feed to the script. The XML file contained the following, which can be copied from the SRM log files:

<?xml version="1.0" encoding="ISO-8859-1"?>
<Command>
<Name>discoverLuns</Name>
<ConnectSpec>
<Name>HP StorageWorks EVA Virtualization Adapter</Name>
<Address>san.yellow-bricks.com</Address>
<Username>user</Username>
<Password>password</Password>
</ConnectSpec>
<ArrayId>YB-SAN-01</ArrayId>
<OutputFile>C:\TEMP\SAN.Log</OutputFile>
<LogLevel>trivia</LogLevel>
</Command>

Now we were able to run the script with the XML file as input:

perl command.pl < file.xml
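
If you need to repeat this test for several arrays, the input file can be generated rather than copied from the logs. The following is a rough Python sketch under a few assumptions: the element names are taken from the XML above, the function names are made up, and the actual Perl call is left commented out because it only makes sense on the SRM server in the directory that holds the SRA’s command.pl.

```python
import subprocess
import xml.etree.ElementTree as ET


def build_discover_luns(address, username, password, array_id, output_file):
    """Build the discoverLuns command document as ISO-8859-1 bytes."""
    command = ET.Element("Command")
    ET.SubElement(command, "Name").text = "discoverLuns"
    spec = ET.SubElement(command, "ConnectSpec")
    ET.SubElement(spec, "Name").text = "HP StorageWorks EVA Virtualization Adapter"
    ET.SubElement(spec, "Address").text = address
    ET.SubElement(spec, "Username").text = username
    ET.SubElement(spec, "Password").text = password
    ET.SubElement(command, "ArrayId").text = array_id
    ET.SubElement(command, "OutputFile").text = output_file
    ET.SubElement(command, "LogLevel").text = "trivia"
    # A non-UTF-8 encoding makes ElementTree emit the XML declaration.
    return ET.tostring(command, encoding="ISO-8859-1")


def run_sra_script(xml_bytes, script="command.pl"):
    """Equivalent of `perl command.pl < file.xml`; returns the exit code."""
    result = subprocess.run(["perl", script], input=xml_bytes)
    return result.returncode


payload = build_discover_luns(
    "san.yellow-bricks.com", "user", "password", "YB-SAN-01", r"C:\TEMP\SAN.Log"
)
print(payload.decode("ISO-8859-1"))
# run_sra_script(payload)  # uncomment on the SRM server, in the SRA script directory
```

Run it under the same account you want to test with, since the whole point of the manual run is to see whether the script behaves differently for different accounts.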

In our case running the script manually with the XML file as input didn’t return an error. This gave us the idea that it might be account or permissions related. During the configuration of the SRA we entered domain credentials, which were the same as the account being used during the manual run of the script. So it wasn’t the SRA account that was causing these problems.

After diving into the configuration we stumbled upon the SRM service. The SRM service was started with the Local System account. We decided to change the account used for the service from “Local System” to a domain account… and indeed, problem solved.
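
To check quickly which account a Windows service runs under, you can look at the SERVICE_START_NAME line that `sc qc` prints. Here is a small Python sketch of that check. The service name in the sample is made up (look up the real name of your SRM service in the Services console), the sample output is trimmed, and the helper names are mine.

```python
import subprocess


def parse_start_name(sc_qc_output: str):
    """Return the SERVICE_START_NAME value from `sc qc` output, or None."""
    for line in sc_qc_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SERVICE_START_NAME":
            return value.strip()
    return None


def service_account(service_name: str):
    """Query a Windows service's logon account via `sc qc` (Windows only)."""
    out = subprocess.run(
        ["sc", "qc", service_name], capture_output=True, text=True
    ).stdout
    return parse_start_name(out)


# Trimmed example of what `sc qc` prints; the service name is hypothetical.
sample = """SERVICE_NAME: ExampleSrmService
        START_TYPE         : 2   AUTO_START
        SERVICE_START_NAME : LocalSystem
"""
account = parse_start_name(sample)
print(account)  # prints "LocalSystem"
```

If this reports LocalSystem while your SRA is configured with domain credentials, you are in the same situation we were in.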

One would expect this to be part of the SRA documentation, but it isn’t. We contacted VMware Support and they had the same configuration running in their test environment except for the fact that they weren’t using AD authentication. In their case the Local System account just worked fine.

I’ve emailed Support all the log files and according to them our suspicion was correct. It seems to be related to the HP EVA SRA. The HP SRA seems to use the wrong account for authentication at one point during the script. Next up: Contact HP Support and let’s see if they can a) fix this or b) update their documentation.

Site Recovery Manager Client Patch

Duncan Epping · Jan 15, 2009 ·

Yesterday VMware released a patch for the Site Recovery Manager Client:

Client Patch for Site Recovery Manager 1.0 Update 1
This client patch for Site Recovery Manager 1.0 Update 1 corrects a performance problem observed at sites that support more than seven ESX hosts.

If you are running SRM, it might be worth downloading the patch and installing it! The download page contains a section on how to apply this patch; be sure to read it, because it’s not a case of “next, next, finish”.

How to use trusted certificates with SRM

Duncan Epping · Jan 15, 2009 ·

When we were playing around with Site Recovery Manager last week we had the opportunity to ask Lee Dilworth a bunch of questions. Lee is a Specialist System Engineer for Site Recovery Manager. During the discussion Lee told us about a document that Horst Mundt, also a VMware employee, wrote about using trusted certificates. We received the document via email and I wanted to share it with you. After a quick search on the internet I noticed that Horst had already uploaded his document to VI:OPS:

SRM establishes a secure connection between the protected and the recovery site.

There are two options for authentication: Credential based or certificate based.

If you install SRM into an existing environment, make sure to choose the method that is appropriate for your environment.

If you have not changed the default certificates that were installed by the VMware vCenter server setup, then go for credential based authentication. You do not need to read this document.

If you have installed SSL certificates issued by a trusted CA on your VMware vCenter servers, then go for certificate based authentication. The document explains how certificates need to be set up in order for this to work.

Site Recovery Manager and MSCS

Duncan Epping · Jan 13, 2009 ·

When reading several SRM docs I was wondering if Microsoft Clustering was supported or not. I knew that in version 1.0 it wasn’t supported. When reading the Release Notes I noticed the following:

Full Support for RDM devices
SRM now provides full support for virtual machines that use raw disk mapping (RDM) devices. This enables support of several new configurations, including Microsoft Cluster Server. (Virtual machine templates cannot use RDM devices.)

Microsoft Clustering Services is supported as of Update 1. But keep in mind when creating your Recovery Plan that all nodes of the cluster will belong to the same Protection Group and can possibly be started up or shut down at the same time… I haven’t configured SRM in combination with MSCS so far; if any of you have any tips/tricks, let me know.


About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.

Copyright Yellow-Bricks.com © 2025