Erik Zandboer just posted a topic on the VMTN forum about an HA error he received when he updated his machines to 3.5 U2. The error is one almost everyone has probably seen by now: “could not contact primary HA agent”. This is normally solved by pressing “Reconfigure for HA” or by disabling and re-enabling HA. This wasn’t the case this time. After some research Erik discovered that the hosts file entry for the ESX host did not match the DNS name: one of them started with a capital letter while the other did not. This caused HA to fail. After changing the hostname/DNS name and rebooting, everything worked fine again.
I can imagine this happens because VirtualCenter now effectively acts as the DNS/hosts file for HA. Inconsistent naming has always been, and probably always will be, a problem. So before upgrading, check your hostname and /etc/hosts file!
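To make that check a bit less manual, here is a minimal sketch in Python that looks for names in a hosts file that match the expected host name except for capitalization. It is not a VMware tool, only an illustration: the function names `parse_hosts` and `find_case_mismatches` and the sample host `esx01.example.local` are my own assumptions, and the script only compares text. It does not query DNS or VirtualCenter.

```python
def parse_hosts(text):
    """Parse hosts-file text into a list of (ip, [names]) tuples.

    Comments (everything after '#') and blank lines are ignored.
    """
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            entries.append((fields[0], fields[1:]))
    return entries


def find_case_mismatches(expected_name, hosts_text):
    """Return (ip, name) pairs whose name equals expected_name when
    compared case-insensitively, but is spelled with different case.

    expected_name is the name exactly as it is registered elsewhere
    (for example in DNS); it is supplied by the caller.
    """
    mismatches = []
    for ip, names in parse_hosts(hosts_text):
        for name in names:
            if name.lower() == expected_name.lower() and name != expected_name:
                mismatches.append((ip, name))
    return mismatches
```

To use it, read a copy of the host's /etc/hosts into a string and call `find_case_mismatches` with the name exactly as it appears in DNS. An empty list means no entry differs only by case; any returned pair is a candidate for the kind of mismatch Erik ran into.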
Previously, enabling VMware High Availability required DNS resolution of all ESX Server hosts in a High Availability cluster. This was done by configuring DNS records or by adding all of the host names and IP addresses to the /etc/hosts file on each server.
Starting with the ESX Server 3.5 Update 2 release, DNS resolution or /etc/hosts file entries are no longer required to configure High Availability. The host name and IP address information will now be provided by the managing VirtualCenter Server. (the source)
Eric Sloof says
The list includes a small, almost unnoticed, reference to VC 2.5 U2 not being as dependent on DNS for some configurations. This is actually a big change. I happened to speak with the Product Manager and one engineer for HA last week, who provided more detail about this. They said that HA is no longer dependent on the ESX server being correctly configured for DNS. During configuration and startup, HA now uses the VC Server’s hostname/IP info instead. The reason for this change is that they have found many customers do not have correct DNS setups, and it was causing a lot of service calls. They said HA still creates and uses the ft_hosts file during operation. (Steve Bradshaw)
Pieter Gerritse says
Okay, that’s better: not having HA depend on an “external” (but core) source like DNS. On the other hand, people should configure DNS correctly. Consistent naming and numbering of anything in DNS and VC (NICs, HBAs, hostnames, LUNs) is a must!
P van Oosterom says
I had the same issue, but mine was fixed by not having the VMkernel gateway for VMotion pointing to an existing IP address. Once I fixed this and re-enabled HA, it worked fine.
Ryan says
I was having the same issue and edited my /etc/hosts to include the non-FQDN name at the end. The host rebooted and HA came up. I didn’t change anything on the VI side.
e.g