I regularly check William Lam’s section on the VMTN communities. William is definitely one of the most active contributors in terms of Perl / vMA scripting. William wrote the famous ghettoVCB script which basically enables you to do full image level backups of your VMs. But that’s not the only script William wrote. He’s also written scripts for creating screenshots of VMs, resizing your vMA disks, hotplugging memory and CPUs, suspending VMs and a whole lot more. Definitely worth the bookmark: http://communities.vmware.com/docs/DOC-9852
vSphere CPU Scheduler whitepaper, this is it!!
This is the whitepaper I’ve been waiting for. By now we all know that the CPU Scheduler has changed. The only problem is that there wasn’t any official documentation about what changed and where we would benefit. Well this has changed. VMware just published a new whitepaper titled “The CPU Scheduler in VMware® ESX™ 4“.
The CPU scheduler in VMware ESX 4 is crucial to providing good performance in a consolidated environment. Since most modern processors are equipped with multiple cores per processor, systems with tens of cores running hundreds of virtual machines are common. In such a large system, allocating CPU resource efficiently and fairly is critical. In ESX 4, there are significant changes to the ESX CPU scheduler that improve performance and scalability. This paper describes these changes and their impact. This paper also provides details of the CPU scheduling algorithms in the ESX server.
I can elaborate all I want but I need you guys to read the whitepaper to understand why vSphere is performing a lot better than VI 3.5. (I will give you a hint: “cell”.)
Another whitepaper that’s definitely worth reading is “Virtual Machine Monitor Execution Modes: in VMware vSphere 4.0“.
The monitor is a thin layer that provides virtual x86 hardware to the overlying operating system. This paper contains VMware vSphere 4.0 default monitor modes chosen for many popular guests running modern x86 CPUs. While most workloads perform well under these default settings, a user may derive performance benefits by overriding the defaults. The paper examines situations where manual monitor mode configuration may be practical and provides two ways of changing the default monitor mode of the virtual machine in vSphere.
And while you are already taking the time off to educate yourself you might also want to read the “FT Architecture and Performance” whitepaper. Definitely worth reading!
New book in town: vSphere Quick Start Guide
Months ago I shared an idea on Twitter about starting a “super blog”. This “super blog” should have contained the top articles of some of the best VMware or virtualization bloggers around. I asked myself what the added value would be and came to the conclusion that there already were many “super blogs” around. Although the “super blog” idea died a slow death the urge to collaborate with others on a challenging project still lived on. (Although Stephen Foskett picked it up and created Gestalt IT at the time.)
This is when the idea of a book series was born, which I for obvious reasons did not share with the Twitter community. This book, and hopefully the rest of the following books, is written by six well known VMware community members. I don’t think they need further introduction so here are their names: Bernie Baker (thanks Chad for the tip), Thomas Bryant III, Stu Radnidge, Dave Mishchenko, Alan Renouf and myself of course.
The original idea was to publish a series of short topics, deep-dives, with each book limited to 150 pages, pocket size, sold at a minimum price and completely “diy”. For those unfamiliar with the term “diy” it means “do it yourself”. In other words we do not have a publisher helping us or funding it. We also don’t have a marketing budget, so we are relying on you guys to spread the word and we will be relying on Lulu.com for distributing/selling it.
When we started outlining our first book Ron Oglesby gave us the opportunity to rewrite the Quick Start Guide he and his team at RapidApp published in March 2007. We all agreed that this would be a good start of the series. Thanks again Ron! Of course this book will exceed the limit we agreed on of 150 pages, but it is worth it. We hope to get it done before VMworld and bring a couple of copies to VMworld but the clock is ticking fast and we are not there yet. We will keep you informed!
I just hope we can get all you guys as excited about new technology and VMware products in general as we are!
HA and Slot sizes
This has always been a hot topic: HA and slot sizes/admission control. One of the most extensive (non-VMware) articles is by Chad Sakac aka Virtual Geek, but of course a couple of things have changed since then. Chad commented on my HA Deepdive asking if I could address this topic, so here you go Chad.
Slot sizes
Let’s start with the basics.
What is a slot?
A slot is a logical representation of the memory and CPU resources that satisfy the requirements for any powered-on virtual machine in the cluster.
In other words a slot size is the worst case CPU and Memory reservation scenario in a cluster. This directly leads to the first “gotcha”:
HA uses the highest CPU reservation of any given VM and the highest memory reservation of any given VM.
If VM1 has 2GHz and 1024MB reserved and VM2 has 1GHz and 2048MB reserved, the slot size for memory will be 2048MB + memory overhead and the slot size for CPU will be 2GHz.
Now how does HA calculate how many slots are available per host?
Of course we need to know what the slot size for memory and CPU is first. Then we divide the total available CPU resources of a host by the CPU slot size and the total available memory resources of a host by the memory slot size. This leaves us with a number of slots for both memory and CPU. The most restrictive number is the number of slots for this host. If you have 25 CPU slots but only 5 memory slots the number of available slots for this host will be 5.
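To make the arithmetic above concrete, here is a minimal Python sketch. The host capacity, the reservations and the function names are made up for illustration, and memory overhead is ignored for simplicity. It only mirrors the calculation described here, it is not how HA is actually implemented.

```python
def slot_size(reservations):
    """Slot size is the highest reservation of any powered-on VM."""
    return max(reservations)


def slots_per_host(host_cpu_mhz, host_mem_mb, cpu_slot_mhz, mem_slot_mb):
    """Divide host capacity by slot size; the most restrictive number wins."""
    cpu_slots = host_cpu_mhz // cpu_slot_mhz
    mem_slots = host_mem_mb // mem_slot_mb
    return min(cpu_slots, mem_slots)


# Hypothetical VMs: VM1 reserves 2000 MHz / 1024 MB, VM2 reserves 1000 MHz / 2048 MB.
cpu_slot = slot_size([2000, 1000])  # 2000 MHz
mem_slot = slot_size([1024, 2048])  # 2048 MB (memory overhead ignored in this sketch)

# Hypothetical host with 50000 MHz and 10240 MB available:
# 25 CPU slots, 5 memory slots, so 5 slots for this host.
print(slots_per_host(50000, 10240, cpu_slot, mem_slot))
```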
As you can see this can lead to very conservative consolidation ratios. With vSphere this is something that’s configurable. If you have just one VM with a really high reservation you can set the following advanced settings to lower the slot size being used during these calculations: das.slotCpuInMHz or das.slotMemInMB. To avoid not being able to power on the VMs with high reservations, these VMs will take up multiple slots. Keep in mind that when you are low on resources this could mean that you are not able to power on this high reservation VM, as resources are fragmented throughout the cluster instead of located on a single host.
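A rough sketch of what "taking up multiple slots" means, in Python. The slot sizes stand in for values you might set with das.slotCpuInMHz and das.slotMemInMB, the reservations are invented, and the rule that a VM counts as the larger of its CPU and memory slot needs is my assumption for illustration rather than documented HA behavior.

```python
def slots_needed(vm_cpu_mhz, vm_mem_mb, cpu_slot_mhz, mem_slot_mb):
    """Number of slots a VM occupies when its reservation exceeds the slot size."""
    cpu_slots = -(-vm_cpu_mhz // cpu_slot_mhz)  # ceiling division
    mem_slots = -(-vm_mem_mb // mem_slot_mb)    # ceiling division
    # Assumption: the larger of the two counts, and every VM takes at least one slot.
    return max(1, cpu_slots, mem_slots)


# Hypothetical slot sizes: das.slotCpuInMHz = 1000, das.slotMemInMB = 1024.
print(slots_needed(2000, 4096, 1000, 1024))  # high reservation VM: 4 slots
print(slots_needed(0, 0, 1000, 1024))        # VM without reservations: 1 slot
```

All of those slots have to be found on a single host, which is why a fragmented cluster can still refuse to power on the VM.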
Host Failures?
Now what happens if you set the number of allowed host failures to 1?
The host with the most slots will be taken out of the equation. If you have 8 hosts with 90 slots in total but 7 hosts each have 10 slots and one host 20 this single host will not be taken into account. Worst case scenario! In other words the 7 hosts should be able to provide enough resources for the cluster when a failure of the “20 slot” host occurs.
And of course if you set it to 2 the next host that will be taken out of the equation is the host with the second most slots and so on.
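The worst-case reasoning above fits in a few lines of Python. This is only an illustration using the numbers from the example; the function name is hypothetical.

```python
def failover_capacity(slots_by_host, host_failures):
    """Slots left when the hosts with the most slots are taken out of the equation."""
    keep = max(0, len(slots_by_host) - host_failures)
    # Sort ascending and drop the largest hosts: the worst-case failures.
    return sum(sorted(slots_by_host)[:keep])


hosts = [10] * 7 + [20]              # 8 hosts, 90 slots in total
print(failover_capacity(hosts, 1))   # the "20 slot" host is removed: 70
print(failover_capacity(hosts, 2))   # the next largest host goes as well: 60
```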
What more?
One thing worth mentioning: as Chad stated, with vCenter 2.5 the number of vCPUs for any given VM was also taken into account. This led to very conservative and restrictive admission control. This behavior was modified in vCenter 2.5 U2; the number of vCPUs is no longer taken into account.
SRM FAQ!
I receive this great newsletter via email every week from Michael White, he’s one of our Specialist SE’s. Michael created a great VMware SRM document and this FAQ is part of it. I want to thank Michael for sharing it with the rest of the world.
Generic
I want to install SRM, what do I need to do?
It is important to understand the SRM installation overview. You must install using the order of operation as shown in the lab section of this document. You must do this on the protected site first, followed by the recovery side. Here is the outline:
- SRM application installed at Protected Site
- SRM application plug in installed in VI clients that connect with the Protected Site
- SRA installed at the Protected Site
- SRM application installed at Recovery Site
- SRM application plug in installed in VI clients that connect with the Recovery Site
- SRA installed at the Recovery Site
- SRM configured at the Protected Site
- SRM server pairing
- Array Configured – both Protected Site and Recovery Site
- Inventory Mapping
- Protection Group
- SRM configured at the Recovery Site
- Recovery Plan created
You should now test and tweak SRM. Remember the goal is to have the required VM’s running at the recovery site in the least amount of time.
What does an SRM lab require?
The ideal SRM lab requires the following:
- Two VirtualCenter servers
- Each VirtualCenter server would require at least one ESX server, and the Recovery should have two to show the integration with DRS as part of a recovery plan.
- Each of the two sites requires shared storage that replicates, and it needs to be on the compatibility list. This shared storage can be the NetApp simulator, HP / LeftHand VSA, or the EMC Simulator. It can also be actual hardware based shared storage that can replicate.
Some of the activities that can be shown would include:
- Test failover
- Actual failover
- Failover with IP customization
- Failover where multiple VM’s start on various ESX servers
- The use of a virtual switch that can connect VM’s on different ESX servers to a private network. This is very useful for testing. By using a VLAN, testing is possible without impacting the public network. Remember that the test bubble network that can be used in SRM only provides for communications on a per ESX host basis.
When does SRM raise VC events?
SRM will raise VC events for the following conditions:
- Disk space low
- CPU use exceeded limit
- Memory low
- Remote Site not responding
- Remote Site heartbeat failed
- Recovery Plan Test started, ended, succeeded, failed, or cancelled
- Virtual Machine Recovery started, ended, succeeded, failed, or reports a warning
What are the recommended minimum alarm notifications?
We suggest the following alarm notifications. You can set them on the Alarm tab of the SRM status summary page. Most organizations will utilize email notifications but there are other choices as well. Remember to set these suggested alarm notifications at both sites.
- Remote Site Down
- Remote Site Ping Failed
- Replication Group Removed
- Recovery Plan Destroyed
- License Server Unreachable
How do I plan for disk utilization due to SRM database?
Recently we brought out the database sizing tool. Find it at http://www.vmware.com/files/pdf/Site_Recovery_Manager_1.0U1_Database_Sizing_Calculator.xls.
Where can I find help for installing different array products?
The obvious answer is that you can always visit the vendor of the array for help in the form of documents. You can also find information elsewhere. A VMware support person has written how-to guides for a variety of different arrays. They can be found at http://viops.vmware.com/home/people/chogan?view=overview . He has done an excellent job and I hope that his guides help you out.
How can I capture the log and configuration information for support to work with?
This is most easily done after Update 1 by the use of the “Generate Site Recovery Manager Log Bundle” command in the VMware VMware Site Recovery Manager Start Menu folder. Run this command on the SRM server. This command will produce a zipped file on your desktop. It will be in a MM-DD-YYYY-HH-MM.zip format, where it is Month – Day – Year – Hours – Minutes. Always provide the logs with your request for help!
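If you collect a lot of these bundles it can help to sort them by the timestamp in the name. Below is a small Python sketch that parses the MM-DD-YYYY-HH-MM.zip format described above. The file name in the example is made up, and I am assuming the name contains nothing but the timestamp.

```python
from datetime import datetime


def bundle_timestamp(filename):
    """Parse a log bundle name of the form MM-DD-YYYY-HH-MM.zip into a datetime."""
    stem = filename[:-4] if filename.lower().endswith(".zip") else filename
    return datetime.strptime(stem, "%m-%d-%Y-%H-%M")


# Hypothetical bundle name: 15 July 2009, 14:30.
print(bundle_timestamp("07-15-2009-14-30.zip"))  # 2009-07-15 14:30:00
```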
What is the account that is asked for during install used for?
The 1.0 installer prompted for a username during installation. This is the account SRM will use to communicate with the local VC server. Since SRM constantly monitors the local VC inventory, this user will be constantly logged into the local VC server. Changing the password for this account will make it impossible to use SRM. Please note that this should be an account in the Administrators group. By default, when you install SRM 1.0 or SRM 1.0 U1, all accounts in the Administrators group have complete access to SRM managed objects. Again, this has not changed with U1. Please try to use AD accounts when you install SRM, and when you log into SRM. Using local accounts can work, but it is a little tricky. If you need some guidance on using local accounts I can help. This account is NOT the account used by the system – the SRM service uses the Local System Account.
Can I change the IP information for the SRM server?
I would like to change the IP info for the SRM server once it is installed. Is this safe or is there a specific way to do this without issues? When changing the IP info for the SRM server, or if the credentials (account or password) need to be changed you will need to use a special utility to accomplish either of these changes. Once the change is done you will also need to pair the two sites again. You can find detailed info on how to do this on page 85, in Appendix C of the SRM Admin Guide.
How do I add a script to a Recovery Plan in a call out?
When you add a script to a call out in a recovery plan, it is an empty dialog. Use the information below to add a script that will work as expected. It is important to understand that the scripts or commands must be in the path of the VirtualCenter.
- Use full paths to all executables – for example “c:\windows\system32\cmd.exe” instead of “cmd.exe”.
- You can use .exe or .com files only! Command line scripts can only call executables.
- To run a batch file you should start the shell command with “c:\windows\system32\cmd.exe”. So it would look like “c:\windows\system32\cmd.exe /c c:\scripts\alarmscript.bat”.
How do I change the value for a script timeout?
You can increase or decrease this value by editing the SRM configuration file (vmware-dr.xml). Look for the following section:
<calloutCommandLineTimeout>600</calloutCommandLineTimeout>
Change value to the appropriate value.
During the configuration of SRM I receive a timeout after 300 seconds, how do I change the value for this timeout?
You can increase or decrease this value by editing the SRM configuration file (vmware-dr.xml). Look for the following section:
<CommandTimeout>300</CommandTimeout>
Change value to the appropriate value.
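If you would rather script the two timeout changes above than edit the file by hand, something along these lines should work. This is a minimal Python sketch: the `<Config>` wrapper in the sample is hypothetical (I am not reproducing the real layout of vmware-dr.xml), and the function works on a string so you can try it safely before touching the real file. Make a backup first.

```python
import xml.etree.ElementTree as ET


def set_timeout(xml_text, tag, seconds):
    """Set the text of the first <tag> element found anywhere below the root."""
    root = ET.fromstring(xml_text)
    element = root.find(f".//{tag}")
    if element is None:
        raise KeyError(f"<{tag}> not found")
    element.text = str(seconds)
    return ET.tostring(root, encoding="unicode")


# Hypothetical fragment; only the two element names come from the FAQ.
sample = (
    "<Config>"
    "<calloutCommandLineTimeout>600</calloutCommandLineTimeout>"
    "<CommandTimeout>300</CommandTimeout>"
    "</Config>"
)
print(set_timeout(sample, "CommandTimeout", 600))
```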
I would like to use trusted certificates with SRM – help!
You can use your own trusted certificates with SRM but it is more complicated than you might expect. There is some excellent information to help you be successful at http://viops.vmware.com/home/docs/DOC-1261 .
What happens if you move one of the protected VM’s to a datastore that is not part of the VM’s current protected group?
Protection will be revoked for the VM. It will have a small yellow triangle associated with it in its protection group. This will be true even if you move the VM (for example with Storage VMotion) to another datastore that is replicated to the recovery site.
Can network customization work for operating systems other than Windows?
Yes. This includes operating systems from Novell, and Red Hat. The specific version information can be found in the SRM Compatibility Matrix document.
Understanding order of operation for bringing VM’s back online.
During the recovery period, the order in which VMs are recovered is not as obvious as it may seem. Normal and Low priority protection groups (VMs) will be started one VM per ESX host at the same time. So you could have a number of Normal priority VM’s starting at the same time, but spread across various ESX servers. However, High priority starts VM’s serially regardless of how many hosts are involved. Misconfiguration of the security for storage arrays may impact the start order of VM’s. For example, if the security of the array means it cannot talk to a particular ESX host then that host will not be used to start VM’s during a recovery plan. It is possible to see this without any obvious error messages!
Can I fail-over VMs which have disks on two different arrays, for instance NetApp and EMC?
No. Although you can install SRA’s of multiple vendors, failing over a VM which has a disk on both arrays will not work.
What does the Repair button do?
The repair button is used when the protected site is not available, and some array reconfiguration is required. Normally it would be done at the protected site, but if it is not available then the repair button can be used.
Is it all over when the recovery plan fails?
You can have a recovery plan fail with some sort of error, but it will complete anything that it can complete. You could then address and solve the error, and run the recovery plan again, and if you have correctly addressed the error your test may in fact correctly complete this time. It will not redo things that it has done correctly already. Once I had a problem with a VM starting and I let the replication finish, did a manual HBA refresh, and tried again. The two VM’s that had already started were not touched, but the third VM, which had now finished replicating, was in fact started.
Troubleshooting
Where are the new Run and Test privileges?
After you update to Update 1 you should see a Run and a Test privilege in the roles and privileges area, but you may not. Restart VC and you will see them.
Where are the SRM server logs stored?
They can be found in:
C:\Documents and Settings\All Users\Application Data\VMware\VMware Site Recovery Manager\Logs
You will need to check the vmware-dr-index file to see what is the current log file.
I see a lot of recomputed datastore failures in my mixed 2.5 / 3.5 environment, what’s happening?
If you have ESX 2.5 hosts accessing a protected datastore you will see recomputed datastore failures. Remove the ESX 2.5 host from the datastore.
I’m having pairing issues and pairing fails at a specific percentage, why?
If you have an issue at approximately 24% it could be related to the license file not being live or installed. Reread the license file or restart the license service.
If you have an issue at approximately 82 or 84% you should make sure that the account you used to connect to the Recovery site has both VC and SRM admin rights. The specific role for SRM is Protected Site Administrator and on the Recovery Site it is called Recovery Site Administrator. This issue occurs most in a Microsoft domain world. The Administrator role includes both the Protected and Recovery site admin roles.
Things to check while troubleshooting pairing issues include firewalls between the sites and whether VC is running successfully at the recovery site.
I’m configuring the SRDF SRA and although we replicated storage and it contains VMs I still don’t see “replicated LUNs”.
After checking all the configuration settings on the SRA side, SRM side and the SAN we noticed that the SPC-2 bit was not enabled. This setting is mandatory according to the FC SAN Config Guide (page 57), and enabling it solved our issues.
“Failed to connect to the management system address when executing the discoverArrays command.”
You should not often see this but it can be addressed by making sure the SRA is in fact installed on the recovery side. You may also need to check routing between the sites (in particular to the Recovery side SRA / storage management interface).
How do I change the SRM change of power state time out values?
The default value is 120 seconds, which might not be long enough and could lead to issues when a VM is forcibly powered off. You can increase or decrease this value by editing the SRM configuration file (vmware-dr.xml). Look for the following section:
<Recovery>
<powerStateChangeTimeout>120</powerStateChangeTimeout>
</Recovery>
If this section is not in the .xml file add it. Don’t forget to restart the SRM Service.
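The "add it if it is missing" step can also be scripted. The Python sketch below assumes that `<Recovery>` sits directly under the root element of vmware-dr.xml, which I have not verified, and uses a hypothetical `<Config>` root in the example. Treat it as an illustration and keep a backup of the original file.

```python
import xml.etree.ElementTree as ET


def ensure_power_state_timeout(xml_text, seconds):
    """Set <Recovery>/<powerStateChangeTimeout>, creating either element if absent."""
    root = ET.fromstring(xml_text)
    recovery = root.find("Recovery")
    if recovery is None:
        # Assumption: <Recovery> is a direct child of the root element.
        recovery = ET.SubElement(root, "Recovery")
    timeout = recovery.find("powerStateChangeTimeout")
    if timeout is None:
        timeout = ET.SubElement(recovery, "powerStateChangeTimeout")
    timeout.text = str(seconds)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, otherwise empty configuration without a <Recovery> section.
print(ensure_power_state_timeout("<Config />", 300))
```

As mentioned above, the SRM service still needs a restart before the new value is picked up.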
Error: Failed to recover datastore:
This error usually indicates that the recovery side cannot communicate with the array on the recovery side. In the SRM logs on the recovery side you can see one or more Mapped LUN lines that will help you see what the protected side is mapped to on the recovery side. This will sometimes help you fix this error message.
We noticed a “SRM unlicensed error” in the logs but we have a good license installed.
If you change the SRM license file(s) you may have a small issue, as it is not the same process as changing an ESX or VC license. You would follow the normal steps of dropping the file in the license folder and rereading the license folder in the license tool. This would be enough for VC or ESX but is not enough for SRM. You could after these steps see the license in the VC Admin License view, but would still see the unlicensed errors in the SRM log. You need to restart the SRM service for the new license change to occur.
I cannot uninstall SRM successfully – what can I do?
Uninstalling SRM will normally require access to the VC that it is paired with. If you do not have that VC running it is hard to uninstall SRM. If you don’t cleanly uninstall SRM you cannot install it again. It is possible to uninstall with no VC if you read the screens carefully and answer appropriately, but I have seen where that doesn’t work. Use one of the ideas below to help if you need it. It is always best to use the Add Remove programs method to uninstall but if that doesn’t work the ideas below should.
msiexec.exe /qn /x {35A202EA-1549-4592-97A5-65F5E4CCDEC9}
Microsoft’s uninstall utility: http://support.microsoft.com/kb/29031
Only three Recovery Plans can run at the same time.
Not sure what the error message is if you try to do more than 3 but at least you now know that only 3 should be executed at the same time. This is due to the QA level of testing and will be significantly improved in the future.
Can I automatically rename my datastore back to its original name?
Edit the vmware-dr.xml file in the C:\Program Files\Site Recovery Manager\Config directory and look for a line that reads:
<fixRecoveredDatastoreNames>false</fixRecoveredDatastoreNames>
Change it to:
<fixRecoveredDatastoreNames>true</fixRecoveredDatastoreNames>
Can I change the administrator’s email address after the installation?
Extension.xml is the configuration xml file where you can change the Administrator Email:
<adminEmail>admin@yellow-bricks.com</adminEmail>
Why is Port 80 used in the install but port 443 later?
During the install of SRM port 80 is specified and you cannot type in 443, but after the install is complete SRM talks to VC on 443, so why is 80 specified in the install? Even though SRM uses SSL when it communicates with VC, it does not use port 443. SRM establishes a TCP connection to port 80, then uses an HTTP CONNECT request to establish a tunnel to the VC server, then does an SSL handshake with the VC over that tunneled connection. The SRM installation enforces these semantics.
I need to rescan my storage twice before I actually see my LUNs, can SRM also do this?
To enable the additional rescan, edit the vmware-dr.xml file at both the protected and recovery sites to add a <hostRescanRepeatCnt> element within the <SanProvider> element. Set the value of <hostRescanRepeatCnt> to 2, as shown in the following example:
<SanProvider>
.
.
.
<hostRescanRepeatCnt>2</hostRescanRepeatCnt>
</SanProvider>
For SQL server use, does the SRM DB user need the DB_OWNER permission?
For SQL server, the SRM DB user does not need the DB_OWNER permission. As long as the schema has the same name as the username, is the default schema for that user, and is owned by that user, you are ok.
Unexpected MethodFault (dr.san.fault.ManagementSystemNotFound)
This error occurs after you upgrade the EqualLogic PS Series Interface SRA adapter to the Dell EqualLogic PS Series Interface. You can uninstall the new SRA and install the old one as a work around, but there is another option. You can locate the manifest.xml file in the SRA installation directory, modify the SRA name in it, and restart the SRM service and you would be good to go.
The password of my SRM account has changed how do I change the password for SRM?
You may run into issues when changing account passwords after everything is working. In theory you can use the installcreds.exe file, but it has been reported to not always work. In the near future there will be an update to make this process easier, but for now you must use the srm-config.exe command. When it is complete you will be able to restart the SRM service and have communication between the SRM servers (you will need to repair the communication by doing the pairing again). The format for this command is complex. You must run it twice: the first time to obtain a thumbprint, and then a second time to actually make the change. Below is a sample command line. This utility is found in the bin directory of the c:\program files\VMware\VMware Site Recovery Manager\config folder. You can find parameter values (such as the value for -sitename) in the vmware-dr.xml file found in the config folder.
srm-config.exe -cmd confuserbased -sitename <local site name> -cfg <SRM configuration file> -u <username> -vc <host[:port]> [-thumbprint <sha-1 server certificate thumbprint>]
srm-config.exe -cmd confuserbased -sitename srm-primary -cfg vmware-dr-primary.xml -u administrator -vc 10.10.10.10 -thumbprint 96:E0:E8:F5:59:1C:BF:6D:81:6C:A2:AB:51:76:24:DE:31:D1:E8
Without the password you will need to use the thumbprint. So run this command the first time without the thumbprint parameter and you will be shown the thumbprint, then run it again with the thumbprint.
If your site name contains spaces enclose the name in quotes.
My recovery site is only using x number of hosts to start VM’s but it should be using y number.
When I experienced this, it was because the host that was not starting VM’s did not have access to the storage array. This was due to it not having a VMkernel port that LHN required. I have seen this with other vendors where there was no security between the ESX host in question and the storage array. There are no error messages associated with this situation so make sure you test for it. I have seen a similar error where the single host at the recovery site didn’t have an IP entered for the iSCSI array.
Priority Levels in Recovery Plan don’t reflect my changes.
You have made changes in the Protection Group to the priority level of some of your protected VM’s. But when you refresh the Recovery Steps you see your VM’s with the original priority and not the new one that you set in the Protection Group. This is correct behavior. It may be improved in the future. It is due to the difference in security permissions on both sides. It would be possible for someone on the Protected side to make changes that affect VM’s on the recovery side. This may or may not be appropriate. Until there is a good solution, just right click on the VM in question and use the Move Up or Move Down options to change its execution order priority.
Error:Expected virtual machine file path ….. vm-vmname/vm-vmname.vmx cannot be found
This can occur during test or recovery and it means quite simply that the VM referenced in the error is not in the replicated SAN datastore where it is expected. This most often occurs when you add another VM to the protected datastore and start a test recovery before it has had time to replicate. The solution is to wait until the replication catches up and try the test again.
Database access issues
Use Windows Authentication if the DB server is local to the SRM server, and SQL Authentication if the DB server is remote to the SRM server.
How can I tell the SRM version from the log files?
The first line of the SRM log files will hold the release info. The version=1.0.0 tells the version and build=build-97878 tells the build.
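If you want to pull these two values out of a pile of log files, a regular expression will do. This is a short Python sketch; the sample line is invented and only the version= and build=build- parts are taken from the description above.

```python
import re


def parse_release(first_line):
    """Return (version, build) from the first line of an SRM log, or None for missing parts."""
    version = re.search(r"version=(\d+(?:\.\d+)*)", first_line)
    build = re.search(r"build=build-(\d+)", first_line)
    return (
        version.group(1) if version else None,
        build.group(1) if build else None,
    )


# Hypothetical first line; the real line contains more than this.
line = "Log for VMware Site Recovery Manager, version=1.0.0, build=build-97878"
print(parse_release(line))  # ('1.0.0', '97878')
```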
Installation logs
You can create an installation log using the command line parameters of /s /V"lve installlog.txt". The command line will look like:
VMware-srm-1.0.0=.exe /s /V"lve installlog.txt"
How do I change the log level?
You can easily change the log level by editing a configuration file. However, to have that change read by SRM you will need to restart the SRM service. The file name is vmware-dr.xml and is found by default in C:\Program Files\VMware\VMware Site Recovery Manager\config . Remember that when you restart the service you will interrupt anyone working with SRM.
Look for the line that looks like:
C:\Documents and Settings\All Users\Application Data\VMware\VMware Site Recovery Manager\Logs
Below it you will find a line that looks like:
<level>verbose</level>
You can change verbose to trivia, which will generate more entries, or to info, which generates fewer. In the RC builds it didn’t seem to make much of a difference what the setting was.
No available Customization specifications found.
You can create customizations using the View Edit Customization command in the VI client. This is how you can change a network setting in a recovery. This is like sysprep, and you are required to fill in all of the necessary information, but only the network info will be used. You will need to create your customization specification on the recovery site. Remember that you can export and import customizations so if necessary it doesn’t take much to move them between your protected and recovery sites.
Net::SSLeay::load_error_strings
This comes from the Perl module for OpenSSL, which is required by some SRA’s (such as NetApp), and means that Perl is not installed on the recovery SRM server.
Is there a limitation of DR failover LUNs for some iSCSI arrays and some Hosts?
There is a hard limit of 64 iSCSI arrays per host. However, when using SRM there is a limit of approximately 23 recovery LUNs on the recovery side only. For more information about this please visit http://kb.vmware.com/kb/1005867 . This is not specific to SRM but to any DR setup you might test.
A general system error occurred – unable to get configuration information for the recovery VM
This error will occur when a VM has been added to a protected datastore, and is part of a recovery plan, but during test failover it has not been replicated so it is not available to the recovery side. This can happen during a non-test failover as well. This can happen with LHN but the error message is more obvious about the problem.
Failed to launch SAN integration scripts
If you are using SRDF and get the error below when configuring your array you have a path issue. The error is “Failed to launch SAN integration scripts to execute discoverArrays command.” The issue is that the SYMCLI folder is missing from the PATH. The solution is to add the path to the SYMCLI bin folder to the PATH environment variable under System variables. The default path is C:\Program Files\EMC\SYMCLI\bin and you will need to restart the SRM server service after the PATH change. This exact error came from an issue with SRDF, but it may occur with other SRA’s from the same or other vendors.
No visible LUN’s during configuration of the array
This will occur if there are NO VM’s in the protected datastore. Add a VM to the protected datastore and the LUN will be visible in the array configuration.
Null parameter name:key error
If you are adding a protection group and you get an error with a value of null parameter name:key in it, the solution at this time is to restart the SRM service on both the protected and recovery sites.
Missing testbubble switch on recovery host.
When you are checking your test recovery VM’s for network connectivity you find that one ESX host’s worth of VM’s can talk to each other, but on other ESX hosts there is no connectivity. Further checking shows that only one recovery ESX host has the testbubble switch and the other hosts do not have that switch even though the recovery VM’s are configured to use it. Therefore the VM’s configured to use a test bubble switch that doesn’t exist will not be able to communicate.
Review Replicate Datastores window of Array Manager is blank.
When you are configuring your SRA, the last step is to show you the replicated LUN’s. If you see nothing, you have a problem. Using the Rescan button doesn’t cause the LUN(s) to be displayed. To work around this issue, use the following steps:
- In the VI Client,
- Go to the ESX host configuration area
- Now select Storage
- In the upper right area select the Refresh option.
- Now return to the SRM Array Manager configuration,
- Select Rescan,
- Then select Back,
- Now select Next
- You should now see your LUN information displayed.