HA, primary and secondary nodes?

Duncan Epping · Sep 9, 2008

Because I’ve been looking into HA myself, I wanted to clear a few things up, for you guys and for myself… writing is a good way of getting the facts straight. I’ve seen and received a lot of questions regarding HA, so I bundled the ones I got over the last couple of months…

How does a primary or secondary node get selected?

The first 5 hosts that join the VMware HA cluster are automatically selected as “primary nodes”.
All the others are automatically selected as “secondary nodes”.
When you do a “Reconfigure for HA”, the primary and secondary nodes are selected again; this selection is random.
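To make the join-order rule concrete, here is a minimal sketch in Python. It is a toy model of the rule described above, not anything HA actually runs; the function name and the host names are made up for illustration.

```python
MAX_PRIMARIES = 5  # from the post: the first 5 hosts to join become primaries


def assign_roles(hosts_in_join_order):
    """Map each host to 'primary' or 'secondary' based on join order only."""
    return {
        host: "primary" if index < MAX_PRIMARIES else "secondary"
        for index, host in enumerate(hosts_in_join_order)
    }


# Hypothetical 8-host cluster: esx01..esx05 end up primary, the rest secondary.
roles = assign_roles([f"esx{n:02d}" for n in range(1, 9)])
for host, role in roles.items():
    print(host, role)
```

Note that under this rule a cluster with 5 hosts or fewer consists of primaries only.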

What’s up with these primaries and secondaries?

Primary nodes hold the cluster settings and all node states, which are synced between primaries.
Secondary nodes send their state info (resource occupation) to the primary nodes.
Nodes send heartbeats to each other: primary nodes send heartbeats to other primary nodes only, and secondary nodes also send them only to primaries. They do this every second, which is a changeable value: das.failuredetectioninterval.

So what if a primary node fails, will a secondary be promoted?

No, a new primary will only be appointed when the failed one is removed from the cluster. At that point a secondary is promoted to primary at random.

But what if all my primary nodes fail?

This is an unaddressed issue, and it’s the reason why you can only account for 4 host failures within a cluster: there needs to be at least one primary!
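The last two answers can be put together in a small toy model. This is only a sketch of the behaviour described above, written in Python: the class, its method names and the host names are hypothetical, and picking the new primary only from healthy secondaries is my own assumption.

```python
import random


class HaCluster:
    """Toy model: failed primaries keep their role until they are removed."""

    def __init__(self, hosts_in_join_order, max_primaries=5):
        self.primaries = set(hosts_in_join_order[:max_primaries])
        self.secondaries = set(hosts_in_join_order[max_primaries:])
        self.failed = set()

    def fail(self, host):
        # A failure alone does not trigger a promotion.
        self.failed.add(host)

    def remove(self, host):
        # Removing a primary from the cluster promotes a random secondary.
        was_primary = host in self.primaries
        self.primaries.discard(host)
        self.secondaries.discard(host)
        self.failed.discard(host)
        candidates = sorted(self.secondaries - self.failed)  # assumption: healthy only
        if was_primary and candidates:
            promoted = random.choice(candidates)
            self.secondaries.remove(promoted)
            self.primaries.add(promoted)

    def has_live_primary(self):
        # At least one working primary is needed.
        return bool(self.primaries - self.failed)


cluster = HaCluster([f"esx{n}" for n in range(1, 9)])
for host in ["esx1", "esx2", "esx3", "esx4"]:
    cluster.fail(host)
print(cluster.has_live_primary())  # four primaries down, one still alive
cluster.fail("esx5")
print(cluster.has_live_primary())  # all five primaries down
```

In the worst case all failing hosts are primaries, which is why the model stops working at the fifth failure even though three secondaries are still healthy.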

So when does the gateway come into play?

Actually the gateway, which is the default “isolation address”, is only used when a host suspects it is isolated. So when the AAM client thinks it’s isolated, it checks the isolation addresses.
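As a rough sketch of that order of events, in Python: the isolation addresses are only checked once heartbeats have stopped. The function, the `ping` callback and the example address are hypothetical, and treating "no address answers" as "isolated" is my assumption about what follows the check.

```python
from typing import Callable, Iterable


def is_isolated(heartbeats_received: bool,
                isolation_addresses: Iterable[str],
                ping: Callable[[str], bool]) -> bool:
    """Return True when the host should consider itself isolated."""
    if heartbeats_received:
        # Still hearing from other nodes: the isolation addresses are not used.
        return False
    # Assumption: one reachable isolation address is enough to rule out isolation.
    return not any(ping(address) for address in isolation_addresses)


def gateway_down(address: str) -> bool:
    """Stand-in for a real ping; pretends the address never answers."""
    return False


# Hypothetical default gateway used as the only isolation address.
print(is_isolated(False, ["192.168.1.1"], gateway_down))
```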

So if anyone has a question just drop it here and I’ll try to answer it and update the above list…

Comments

John says

9 September, 2008 at 16:06

HA question –
What has a bigger performance impact on VC:

Number of cluster nodes or the number of machines? I can pretty much bury my VC at will when I put a node into maintenance mode.

When I turn off HA the impact seems to be less (still a huge drag) but less. Expected behavior?

I ended up turning HA off.

Also – I try to keep my clusters at 5-6 nodes. I have two 4 node clusters now (due to IP range restrictions). Is there any way to make the cluster aware of the different IP networks (beyond creating different VLAN names)?
Chris says

9 September, 2008 at 20:32

Two Questions:
1) Is it better to try and keep clusters to only a 4-5 node system with the nodes spread across two sites for DR purposes or keep them spread out and just have one large cluster?
2) How do you determine which nodes are primary and which are secondary?
Duncan Epping says

10 September, 2008 at 16:20

@ john:
1st question: I would suppose the number of nodes in a cluster, because all the resource state data needs to go around.
2nd question: this is awkward behavior that I haven’t seen personally. You should definitely check your log files for any weird messages regarding this. I would recommend posting a question on VMTN and phoning VMware support. I heard of a case like this once and it had to do with zoning on the SAN.
3rd question: it’s currently not possible to have different IP networks for the SC in the same cluster.
Duncan Epping says

10 September, 2008 at 16:24

@chris:
1) that totally depends on the infrastructural design, things like the network links. I’ve set up both variations and actually don’t have one that I prefer more… I always have the feeling that it’s safer to have 2 clusters, one at each site.
2) try:
“more /opt/LGTOaam512/log/aam_config_util_listnodes.log”
or
“more /var/log/vmware/aam/aam_config_util_listnodes.log”
Alastair says

11 September, 2008 at 03:33

I had understood that the number of primary nodes was the configured host failure number plus one, up to a maximum of five.
i.e. in a ten node cluster with configured failure level of 1 there would be only two primaries.
Does this differ from your understanding?
Bouke Groenescheij says

11 September, 2008 at 10:45

As always, excellent and spot-on info! Great job Duncan.
Duncan Epping says

11 September, 2008 at 15:47

@ Alastair, every node, up to the first 5, becomes primary!

@ Bouke, that’s a great compliment from the Guru that got me enthusiastic about VMware!
John says

11 September, 2008 at 20:38

Duncan, after upgrading to 2.5 U2 and dropping all the old performance data that wasn’t rolling up life seems to be much better. I was about ready to open an SR around it but how often do you really put hosts in M mode?
Thanks for the feedback!
Duncan Epping says

11 September, 2008 at 21:10

well around once every two months or so…
LucD says

15 December, 2008 at 21:53

Do you happen to know where (in which log) you can see if a re-election has taken place?

My idea was to use the ReconfigureComputeResource method to force a re-election after one or more host failures.

This could be done like this (PowerShell-VITK):

# Placeholder cluster name (hypothetical); replace it with your own
$clusterName = "MyCluster"
$clus = Get-View -Id (Get-Cluster -Name $clusterName).Id

# Build a spec that re-applies the cluster's current HA, DRS and DPM settings
$clsConfigInfoEx = New-Object VMware.Vim.ClusterConfigSpecEx
$clsConfigInfoEx.DasConfig = $clus.ConfigurationEx.DasConfig
$clsConfigInfoEx.DrsConfig = $clus.ConfigurationEx.DrsConfig
$clsConfigInfoEx.DpmConfig = $clus.ConfigurationEx.DpmConfig

# The second argument ($true) applies the spec as a modification
$clus.ReconfigureComputeResource($clsConfigInfoEx, $true)
Santosh Kumar says

21 December, 2011 at 17:09

Thanks for the useful information !!

I have one question. HA can hold a maximum of 5 primary nodes, which applies in a cluster that has more than 10-12 ESX hosts added to it.

But in a cluster with at most 3 or 4 ESX boxes, the first one added becomes a primary; will the second one added automatically be considered a primary or a secondary? In this scenario, how many primaries and secondaries will there be in the cluster?

Can anyone clarify this? I would be so grateful.
Senthil says

29 December, 2011 at 18:19

This post cleared up the primary/secondary concept of HA for me.
