I was reading through some documentation and found a piece on creating a cold Standby vCenter server. This used to be a common practice with vCenter 2.5 and it worked well as vCenter itself was more or less stateless.
With vSphere 4.0 something changed. Although at first it might not seem substantial, it actually is. As of vSphere 4.0 VMware started using ADAM. ADAM is most commonly referred to as the component which enables Linked Mode. Linked Mode gives you the opportunity to manage multiple vCenter Servers from a single pane of glass.
Not only will you have a single pane of glass, you will also have a central store for roles and permissions. This is key! Roles and permissions are stored in ADAM.
Let's assume you have just a single vCenter Server and are not using Linked Mode. This will not impact the way vCenter Server stores its roles and permissions… it will still use ADAM. Even when the server is cloned daily, full consistency cannot be guaranteed, and as such I would personally not recommend using a cold Standby vCenter Server unless you are willing to take the risks and have fully tested it.
Just remember that you need at least AD LDS if your vCenter(s) are not in AD.
I’m hoping VMware gets things together and lets FT support multi vCPU machines and that this will no longer be an issue. If we can run FT on vCenter, then a lot of those worries will disappear.
And it seemed like VMware was planning on pushing that idea, but has fallen behind with support for FT.
So for someone looking to use linked mode for DR in an environment where there is no DR right now and only a single vCenter with its custom roles/permissions defined, would adding a new vCenter server in linked mode just replicate the roles/permissions to the new server? It’s not very clear in the documentation. On that note, what about using mixed mode, say vCenter 4.0 and 4.1 servers in linked mode, as a way to upgrade vCenter with no downtime?
If your vCenter server is completely self-contained how big an issue would this be?
Meaning, if ADAM and the vCenter database are together and you replicate at a set interval, do the same concerns about synchronicity and consistency still apply?
Please offer suggestions about how one might deal with these issues.
This is a major gotcha for SMB shops with limited resources and an obvious play by VMware to get people to buy SRM.
I have an SR open right now dealing with some ADAM issues. Each time I reboot a newly upgraded vCenter 4.1 server, ADAM has some issue where it acts like it is corrupt and does a full restore of the ADAM database (and seems to break some VUM functionality in the process). I was just about to ask the support rep what VMware components use ADAM and how. I noticed when digging around in ADSIEdit that roles/perms were in there. Kind of scary that I didn’t know this until now and am not even sure how I would recover from a failed server now.
Any info you can provide on the ins/outs of what uses ADAM and how, and how best to maybe backup/restore the ADAM database to a new server would be appreciated. We are considering a move from Windows 2003 EE 64-bit to 2008 EE 64-bit, but now I’m not sure how I would do this if not during a vCenter upgrade…
I just thought I would throw out that vCenter Server Heartbeat can be used as the Standby vCenter Server replacement whether vCenter Server is in standalone or linked mode. vCenter Server Heartbeat is supported over a LAN or a WAN and even supports the secondary vCenter Server having a different IP address than the primary vCenter Server. The only downside to this approach over, say, FT is cost.
Yes of course it can be used, but I guess the reason people were using Cold Standby servers in the first place was the simplicity and the costs 🙂
Is it possible to have an additional vCenter server and simply enter the documented roles and permissions onto the ‘new’ vCenter server as necessary?? This manual option works best for small shops with few hosts and VMs, but could a PowerShell script gather up all the info and then re-write it to the next vCenter server?? I am not a programmer, sigh…just lots of ideas and questions…
ADAM is VSS aware and has its own writer, so snapshots should make the server crash consistent. At worst you will lose any changes and performance metrics between your last snapshot and when you bring up the clone, but it should come up clean.
AFidel’s comment appears to mean that one could restore the original vCenter to another location and still be mostly okay so long as one had not changed roles/permissions etc. between the time of the backup and the time of the restore. Such changes are not very likely in many SMB environments.
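To make the exposure window from the two comments above concrete, here is a minimal Python sketch. It assumes you keep some record of when roles/permissions were changed; the `change_log` entries, dates, and the `changes_lost` helper are all hypothetical and only illustrate the reasoning, not anything vCenter or ADAM provides.

```python
from datetime import datetime


def changes_lost(change_log, snapshot_time, failover_time):
    """Return the changes made after the snapshot and up to the failover.

    These are the changes a clone restored from that snapshot would not contain.
    """
    return [c for c in change_log if snapshot_time < c["when"] <= failover_time]


# Hypothetical change record; vCenter does not hand you this list in this form.
change_log = [
    {"when": datetime(2010, 10, 1, 9, 0), "what": "role 'VM Operator' created"},
    {"when": datetime(2010, 10, 2, 14, 30), "what": "permission added on folder 'Prod'"},
    {"when": datetime(2010, 10, 3, 8, 15), "what": "role 'Backup Admin' edited"},
]

snapshot_time = datetime(2010, 10, 2, 1, 0)   # last nightly snapshot/clone
failover_time = datetime(2010, 10, 3, 10, 0)  # moment the clone is powered on

lost = changes_lost(change_log, snapshot_time, failover_time)
for change in lost:
    print(change["what"])
```

If the list comes back empty, the clone is as good as the original for roles and permissions; anything in it has to be re-applied by hand or by script.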
This is one of the things that we really would like VMware to change about VC. We don’t want to keep configuration data inside VC. We want to have cold VCs available and import anything and everything as needed.
At the moment we have a CMDB with details for the VMs, such as their resource pools and permissions, and upon loading up a “cold” VC, we export that data from the CMDB and use PowerShell to import it into VC.
Once the import is ready, VC is ready to go.
We also have the database component, which does not move nicely to another cold VC, which is why so much scripting is required.
I really feel VMware has missed an opportunity on this one. The ability to dial up a VC at some disaster site where things have really gone wrong just does not seem to be in the planning. My current plan is to be able to bring up 10,000 VMs in a remote location and have VC look just like production within 48 hours. I’ve a working prototype for this on a smaller scale. If things continue along the current path, it will be possible for us to get there.
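The export-then-import idea described above can be sketched roughly as follows. This is only a Python model of the logic, not the commenter's actual PowerShell: the data layout, the sample role and privilege names, and the `export_config` / `plan_import` helpers are assumptions made for illustration, and nothing here talks to a real vCenter.

```python
import json


def export_config(roles, permissions):
    """Serialize roles and permissions to JSON text (stands in for the CMDB export)."""
    return json.dumps({"roles": roles, "permissions": permissions}, sort_keys=True)


def plan_import(exported_json, target_roles, target_permissions):
    """Work out which roles and permissions a freshly started standby is missing."""
    data = json.loads(exported_json)
    missing_roles = {
        name: privileges
        for name, privileges in data["roles"].items()
        if name not in target_roles
    }
    existing = {(p["entity"], p["principal"], p["role"]) for p in target_permissions}
    missing_permissions = [
        p for p in data["permissions"]
        if (p["entity"], p["principal"], p["role"]) not in existing
    ]
    return missing_roles, missing_permissions


# Sample production data (illustrative only).
prod_roles = {
    "ReadOnly": ["System.Read"],
    "VM Operator": ["VirtualMachine.Interact.PowerOn", "VirtualMachine.Interact.PowerOff"],
}
prod_permissions = [
    {"entity": "Datacenter/Prod", "principal": "DOMAIN\\vm-ops", "role": "VM Operator"},
    {"entity": "Datacenter/Prod", "principal": "DOMAIN\\auditors", "role": "ReadOnly"},
]

# What the cold standby already has when it is first powered on.
standby_roles = {"ReadOnly": ["System.Read"]}
standby_permissions = [
    {"entity": "Datacenter/Prod", "principal": "DOMAIN\\auditors", "role": "ReadOnly"},
]

exported = export_config(prod_roles, prod_permissions)
roles_to_create, permissions_to_add = plan_import(exported, standby_roles, standby_permissions)
print(sorted(roles_to_create))
print(permissions_to_add)
```

Computing a plan first, rather than blindly re-creating everything, keeps the import safe to re-run against a standby that is already partly configured.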
Hello Wilson,
Please consider being generous to blog or post on VMware Communities how you are doing this on the smaller scale??
It would benefit many people…
Thank you, Tom
Another thought: importing a vCenter database backup into the DR site vCenter, would this work??
Hi All,
I actually came across a script to do a migration from one vCenter to another, including all permissions and resource pools etc.
http://technodrone.blogspot.com/2010/01/vcenter-powercli-migration-script.html
Perhaps something like that could be modified to transfer to a standby. Obviously not very elegant though.
-Doug
In a linked mode vCenter environment using the “cold standby” technique is surely a very low risk? ADAM replicates using multimaster replication between each member of the configuration set. If a vCenter server fails and a “cold clone” is powered up, ADAM will update all objects to the latest Update Sequence Number (USN) from the other ADAM members (vCenter servers).
Furthermore, if the failed vCenter server had its database on a separate server then there would be no data loss and minimal downtime to vCenter.
With this in mind, when designing a vSphere environment, is the most important design practice to separate vCenter and the vCenter database?
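The USN catch-up mentioned above can be pictured with a small Python model. This is a deliberately simplified sketch and not ADAM's actual implementation: it assumes a single surviving partner and ignores conflict resolution and per-replica bookkeeping, and the `Partner` and `ColdClone` classes are hypothetical names.

```python
class Partner:
    """A surviving replica that stamps every local write with an increasing USN."""

    def __init__(self):
        self.usn = 0
        self.log = []  # list of (usn, key, value) in write order

    def write(self, key, value):
        self.usn += 1
        self.log.append((self.usn, key, value))

    def changes_since(self, usn):
        """Return every logged change with a USN higher than the one given."""
        return [entry for entry in self.log if entry[0] > usn]


class ColdClone:
    """A stale copy that remembers the highest partner USN it has already seen."""

    def __init__(self, objects, watermark):
        self.objects = dict(objects)
        self.watermark = watermark

    def catch_up(self, partner):
        """Apply only the changes newer than the watermark; return how many were applied."""
        applied = 0
        for usn, key, value in partner.changes_since(self.watermark):
            self.objects[key] = value
            self.watermark = usn
            applied += 1
        return applied


partner = Partner()
partner.write("role:VM Operator", "v1")
partner.write("perm:Prod", "alice")

# The clone is taken here, so it holds everything up to USN 2.
clone = ColdClone({"role:VM Operator": "v1", "perm:Prod": "alice"}, partner.usn)

# Changes made after the clone was taken.
partner.write("role:VM Operator", "v2")
partner.write("perm:Test", "bob")

applied = clone.catch_up(partner)
print(applied, clone.watermark, clone.objects)
```

The point of the model is that the clone only needs the changes above its own watermark, which is why a powered-up cold clone in a linked mode group can converge without a full copy.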
If you are not using vCenter in a linked mode environment then you can always create another ADAM instance on a separate disk and still use the “cold clone” technique. See my article for more details on this solution:
http://communities.vmware.com/blogs/virtuallysi/2010/10/13/return-of-vcenter-cold-standby-solution
“If you are not using vCenter in a linked mode environment then you can always create another ADAM instance on a separate disk and still use the “cold clone” technique. See my article for more details on this solution:” I agree with this. I think it works and is less prone to failure.