
Yellow Bricks

by Duncan Epping



How long does it take before a host is declared failed?

Duncan Epping · Jan 26, 2021 · 5 Comments

I had a question this week around the failure of a host: how long does it take before a host is declared failed? Now let’s be clear, failed means “dead” in this case, not isolated or partitioned. The power could have failed, the host could have gone completely unresponsive, or anything else could have happened where there’s absolutely no response from the host whatsoever. In that scenario, how long does it take before HA declares the host dead? Note that the timeline below applies to a traditional infrastructure, and that it is theoretical, assuming everything is optimal.

  • T0 – Secondary Host failure.
  • T3s – The Primary Host begins monitoring datastore heartbeats for 15 seconds.
  • T10s – The host is declared unreachable and the Primary will ping the management network of the failed host.
    • This is a continuous ping for 5 seconds.
  • T15s – If no heartbeat datastores are configured, the host will be declared dead.
  • T18s – If heartbeat datastores are configured and there have been no heartbeats, the host will be declared dead, restarts will be initiated.

Now, when a Primary Host fails, the timeline looks a bit different. This is mainly because a new Primary Host first needs to be elected, and because we need to ensure that the new primary has received the latest state of all secondary hosts. Both timelines are captured in the small sketch after the list below.

  • T0 – Primary Host failure.
  • T10s – Primary election process initiated.
  • T25s – New primary elected and reads the protectedlist.
    • New primary waits for secondary hosts to report running VMs
  • T35s – Old primary declared unreachable.
  • T50s – Old primary declared dead, new primary initiates restarts for all VMs on the protectedlist which are not running.
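To make the two timelines easy to compare, here is a small illustrative Python sketch that simply encodes the moments described above as constants. It does not query a real cluster, and the function name and structure are mine rather than anything that ships with vSphere.

# Illustrative only: encodes the theoretical timelines above as constants.
def seconds_until_restarts_initiated(primary_failed: bool,
                                     heartbeat_datastores: bool) -> int:
    """Theoretical T at which the failed host is declared dead and HA starts restarts."""
    if primary_failed:
        # T10s election starts, T25s new primary elected and reads the protectedlist,
        # T35s old primary unreachable, T50s declared dead and restarts initiated.
        # (The timeline above does not distinguish on heartbeat datastores here.)
        return 50
    # Secondary host failure: T15s without heartbeat datastores,
    # T18s when heartbeat datastores are configured.
    return 18 if heartbeat_datastores else 15

if __name__ == "__main__":
    for primary in (False, True):
        for hb in (False, True):
            role = "primary" if primary else "secondary"
            t = seconds_until_restarts_initiated(primary, hb)
            print(f"{role} host failed, heartbeat datastores={hb}: restarts start at ~T{t}s")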

Keep in mind, this does not mean that VMs will be restarted within 18 seconds, or 50 seconds for that matter. When the host is declared dead, or a new primary is elected, the restart process starts. The VMs that need to be restarted will first need to be placed, and once placed, they will need to be restarted. All of these steps take time. On top of that, depending on the operating system and the apps running within the VM, the time it takes before the restart is fully completed can vary a lot between VMs. In other words, although the state is declared rather fast, the actual total time it takes to restart can vary and is definitely not an exact science.

HA Architecture Series – Datastore Heartbeating (3/5)

Duncan Epping · Jul 26, 2011

**disclaimer: Some of the content has been taken from the vSphere 5 Clustering Technical Deepdive book**

The first time I was playing around with vSphere 5.0, and particularly HA, I noticed a new section in the UI called Datastore Heartbeating.

[Screenshot: the Datastore Heartbeating section in the vSphere 5.0 HA cluster settings]

Those familiar with HA prior to vSphere 5.0 probably know that virtual machine restarts were always initiated, even if only the management network of the host was isolated and the virtual machines were still running. As you can imagine, this added an unnecessary level of stress to the host. This has been mitigated by the introduction of the datastore heartbeating mechanism. Datastore heartbeating adds a new level of resiliency and allows HA to make a distinction between a failed host and an isolated / partitioned host. Isolated vs Partitioned is explained in Part 2 of this series.

Datastore heartbeating enables a master to more correctly determine the state of a host that is not reachable via the management network. The new datastore heartbeat mechanism is only used when the master has lost network connectivity with the slaves, to validate whether the host has failed or is merely isolated/network partitioned. As shown in the screenshot above, two datastores are automatically selected by vCenter. You can rule out specific volumes if and when required, or even make the selection yourself. I would, however, recommend letting vCenter decide.

As mentioned, by default vCenter will select two datastores. It is possible, however, to configure an advanced setting (das.heartbeatDsPerHost) to allow more datastores to be used for datastore heartbeating. I can imagine this is something you would do when you have multiple storage devices and want to pick a datastore from each, but generally speaking I would not recommend configuring this option, as the default should be sufficient for most scenarios.
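If you do decide to change it, the option can also be set programmatically through the vSphere API. Below is a rough, untested pyVmomi sketch; the vCenter address, the credentials, and the cluster name "Cluster-01" are placeholders, and you should verify the exact spec properties against the API reference for your version before relying on it.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Lab-style connection; use proper certificate validation in production.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

# Find the cluster by name (placeholder name).
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster-01")
view.Destroy()

# Bump the number of heartbeat datastores per host from the default of 2 to 4.
spec = vim.cluster.ConfigSpecEx()
spec.dasConfig = vim.cluster.DasConfigInfo()
spec.dasConfig.option = [
    vim.option.OptionValue(key="das.heartbeatDsPerHost", value="4")
]
cluster.ReconfigureComputeResource_Task(spec, True)  # modify=True merges the change

Disconnect(si)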

How does this heartbeating mechanism work? HA leverages the existing VMFS filesystem locking mechanism. The locking mechanism uses a so-called “heartbeat region” which is updated as long as the lock on a file exists. In order to update a datastore heartbeat region, a host needs to have at least one open file on the volume. HA ensures there is at least one file open on this volume by creating a file specifically for datastore heartbeating. In other words, a per-host file is created on the designated heartbeating datastores, as shown in the screenshot below. HA will simply check whether the heartbeat region has been updated.
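Conceptually you can picture it like the toy sketch below: the master considers a slave alive as long as that host's heartbeat file keeps being updated. This is purely an illustration of the idea in Python; the real mechanism relies on the VMFS heartbeat region of the file lock rather than file modification times, and the directory, file name, and threshold here are made up.

import os
import time

HEARTBEAT_DIR = "/vmfs/volumes/datastore1/.vSphere-HA"  # placeholder path
STALE_AFTER_SECONDS = 15                                 # illustrative threshold only

def host_is_heartbeating(host_id: str) -> bool:
    """Treat a host as alive if its (hypothetical) heartbeat file was touched recently."""
    path = os.path.join(HEARTBEAT_DIR, f"host-{host_id}-hb")
    try:
        age = time.time() - os.path.getmtime(path)
    except FileNotFoundError:
        return False
    return age < STALE_AFTER_SECONDS

# Example: the master would only consider restarting VMs from a host that is
# unreachable over the network AND no longer heartbeating via the datastore.
if not host_is_heartbeating("4201a2b3"):
    print("No datastore heartbeat: host looks dead rather than isolated.")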

If you are curious which datastores have been selected for heartbeating, just go to the summary tab of your cluster and click “Cluster Status”; the third tab, “Heartbeat Datastores”, will reveal it.

 

** Disclaimer: This article contains references to the words master and/or slave. I recognize these as exclusionary words. The words are used in this article for consistency because it’s currently the words that appear in the software, in the UI, and in the log files. When the software is updated to remove the words, this article will be updated to be in alignment. **

HA Architecture Series – FDM (1/5)

Duncan Epping · Jul 22, 2011

With vSphere 5.0 comes a new HA architecture. HA has been rewritten from the ground up to shed some of the constraints that were enforced by AAM. HA as part of 5.0, also referred to as FDM (Fault Domain Manager), introduces less complexity and higher resiliency. From a UI perspective not a lot has changed, but a lot has changed under the covers: there is no more primary/secondary node concept, but a master/slave concept with an automated election process. Anyway, too much detail for now; I will come back to that in a later article. Let’s start with the basics first. As mentioned, the complete agent has been rewritten and the dependency on VPXA has been removed. HA talks directly to hostd instead of using VPXA as a translator, as it did with vSphere 4.1 and prior. This excellent diagram by Frank, also used in our book, demonstrates how things are connected as of vSphere 5.0.

[Diagram: how the FDM agent, hostd, and vCenter are connected as of vSphere 5.0]

The main point here though is that the FDM agent also communicates with vCenter and vCenter with the FDM agent. As of vSphere 5.0, HA leverages vCenter to retrieve information about the status of virtual machines and vCenter is used to display the protection status of virtual machines. On top of that, vCenter is responsible for the protection and unprotection of virtual machines. This not only applies to user initiated power-offs or power-ons of virtual machines, but also in the case where an ESXi host is disconnected from vCenter at which point vCenter will request the master HA agent to unprotect the affected virtual machines. What protection actually means will be discussed in a follow-up article. One thing I do want to point out is that if vCenter is unavailable, it will not be possible to make changes to the configuration of the cluster. vCenter is the source of truth for the set of virtual machines that are protected, the cluster configuration, the virtual machine-to-host compatibility information, and the host membership. So, while HA, by design, will respond to failures without vCenter, HA relies on vCenter to be available to configure or monitor the cluster.

Without diving straight into the deep end, I want to point out two minor changes but huge improvements when it comes to managing/troubleshooting HA:

  • No dependency on DNS
  • Syslog functionality

As of 5.0, HA is no longer dependent on DNS, as it works with IP addresses only. This is one of the major improvements that FDM brings. It also means that the character limit that HA imposed on the hostname has been lifted. (Pre-vSphere 5.0, hostnames were limited to 26 characters.) Another major change is the fact that the HA log files are part of the normal log functionality ESXi offers, which means that you can find the log file in /var/log and it is picked up by syslog!

There are a couple of other major changes that I want to explain in the upcoming posts. Here is what you can expect to be published in the upcoming week:

  • Primary Nodes (2/5)
  • Datastore Heartbeating (3/5)
  • Restarting VMs (4/5)
  • Advanced Settings (5/5)

 

** Disclaimer: This article contains references to the words master and/or slave. I recognize these as exclusionary words. The words are used in this article for consistency because it’s currently the words that appear in the software, in the UI, and in the log files. When the software is updated to remove the words, this article will be updated to be in alignment. **
