
Yellow Bricks

by Duncan Epping


high availability

Trigger APD on iSCSI LUN on vSphere

Duncan Epping · Jun 21, 2018

I was testing various failure scenarios in my lab today for the vSphere Clustering Deepdive session I have scheduled for VMworld. I needed some screenshots and log files of when a datastore hits an APD scenario; for those who don’t know, APD stands for All Paths Down. In other words: the storage is inaccessible and ESXi doesn’t know what has happened or why. vSphere HA has the ability to respond to that kind of failure. I wanted to test this, but my setup was fairly simple and virtual, so I couldn’t unplug any cables. I also couldn’t make configuration changes to the iSCSI array, as that would trigger a PDL (permanent device loss) instead. So how do you test an APD scenario?

After trying various things, like killing the iSCSI daemon (it gets restarted automatically with no impact on the workload), I bumped into the following command, which triggered the APD:

  • SSH into the host on which you want to trigger the APD and run the following command (a fuller sequence is sketched right after this list)
    esxcli iscsi session remove -A vmhba65
  • Make sure, of course, to replace “vmhba65” with the name of your iSCSI adapter
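A minimal sketch of that full sequence, assuming “vmhba65” is the iSCSI adapter and the default VMkernel log location; the two list commands are only there to verify the adapter and session names before you pull the trigger:

# find the name of your iSCSI adapter
esxcli iscsi adapter list
# optionally, list the active sessions on that adapter first
esxcli iscsi session list -A vmhba65
# remove all iSCSI sessions on the adapter, which triggers the APD
esxcli iscsi session remove -A vmhba65
# follow the APD being declared in the VMkernel log
tail -f /var/log/vmkernel.log | grep -i APD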

This triggered the APD, as witnessed in the fdm.log and vmkernel.log, and ultimately resulted in vSphere HA killing the impacted VM and restarting it on a healthy host. Anyway, I just wanted to share this, as I am sure there are others who would like to test APD responses in their labs or before their environment goes into production.

There may be other easy ways as well; if you know any, please share them in the comments section.

vSphere HA Restart Priority

Duncan Epping · Apr 4, 2018

I’ve seen more and more questions popping up about vSphere HA Restart Priority lately, so I figured I would write something about it. I already did in this post about what’s new in vSphere 6.5, and in the Stretched Cluster guide. It has always been possible to set a restart priority for VMs, but pre-vSphere 6.5 this priority simply referred to the scheduling of the restart of the VM after a failure. Each host in a cluster can restart 32 VMs at the same time, so you can imagine that if the restart priority is only about scheduling VM restarts, it doesn’t really add a lot of value. (Simply because we can schedule many at the same time, the priority as such would have no effect.)

As of vSphere 6.5 we have the ability to specify the priority and also to specify when HA should continue with the next batch. This last part especially is important, as it allows you to specify that HA starts with the next priority level when:

  1. Resources are allocated (default)
  2. VMs are powered on
  3. Guest heartbeat is detected
  4. App heartbeat is detected

I think these are mostly self-explanatory. Note though that “resources are allocated” means that a target host for the restart has been found by the master, so this happens within milliseconds. “VMs are powered on” is very similar: it also says nothing about when a VM is actually available, it literally is the “power on”. In some cases it could take 10-20 seconds for a VM to be fully booted and the apps to be available; in other cases it may take minutes. It all depends on the services that need to be started within the VM. So if it is important for the “service provided” by the VM to be available before starting the next batch, then option 3 or 4 would be your best pick. Note that with option 4 you will need to have VM/Application Monitoring enabled and an application heartbeat defined within the VM. Once you have made your choice around when to start the next batch, you can simply start adding VMs to a specific level.

Instead of the 3 standard restart “buckets” you now have 5: Highest, High, Medium, Low, Lowest. Why these funny names? Well, that was done in order to stay backwards compatible with vSphere 6 / 5 etc. By default all VMs have the “medium” restart priority, and no, it won’t make any difference if you change all of them to high. The restart priority is about the priority between VMs; it doesn’t change host response times etc. In other words, changing the restart priority only makes sense when you have VMs at different levels, and it usually only makes a big difference when you also change the “Start next priority VMs when” option.

So where do you change this? Well, that is pretty straightforward:

  • Click on your HA cluster and then the “Configure” tab
  • Click on “VM Overrides” and then click “Add”
  • Click on the green plus sign and select the VMs you would like to give a higher or lower priority
  • Then select the new priority and specify when the next batch should start

And if you are wondering: yes, the restart priority also applies when vCenter is not available, so you can even use it to ensure vCenter, AD and DNS are booted up first. All of this info is stored in the cluster configuration data. You can examine this on the command line, by the way, by running the following:

/opt/vmware/fdm/fdm/prettyPrint.sh clusterconfig

Note that the output is usually pretty big, so you will have to scroll through it to find what you need; if you search for “restartPriority” you should be able to find the VMs for which you changed the priority. Pretty cool right?!
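If you only care about those entries, you can of course filter the output straight away. A minimal sketch, assuming the BusyBox grep that ships with ESXi (the -B/-A options add a couple of lines of context around each match):

/opt/vmware/fdm/fdm/prettyPrint.sh clusterconfig | grep -B 2 -A 2 restartPriority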

Oh, if you didn’t know yet… Frank, Niels and I are actively updating the vSphere Clustering Deep Dive. Hopefully we will have something out “soon”, as in around VMworld.

Changing advanced vSphere FT related settings, is that supported?

Duncan Epping · Feb 1, 2018

This week I received a question about changing the values of some vSphere FT related advanced settings. This customer is working on an environment where uptime is key. Of course the application layer is one side of it, but they also want additional availability from an infrastructure perspective, which means vSphere HA and vSphere FT are key.

They have various VMs they need to enable FT on, and these are vSMP VMs (meaning, in this case, dual vCPU). Right now each host is limited to 4 FT VMs and at most 8 FT vCPUs; this is controlled by two advanced settings called “das.maxftvmsperhost” and “das.maxFtVCpusPerHost”. The values for these are, obviously, 4 and 8. The question was: can I edit these and still have a supported configuration? Also, why 4 and 8?

I spoke to the product team about this and the answer is: yes, you can safely edit these. These values were set based on the typical bandwidth and resource constraints customers have. An FT VM easily consumes between 1 and 3Gbps of bandwidth, meaning that if you dedicate a 10Gbps link to it you will fit roughly 4 VMs. I say roughly, as of course the workload matters: CPU, memory and IO pattern.

If you have a 40Gbps NIC and you have plenty of cores and memory, you could increase those maximum numbers of FT VMs and FT vCPUs per host. However, it must be noted that if you run into problems, VMware GSS may request you to revert to the defaults, just to ensure the issues that occur aren’t caused by this change, as VMware tests with the default values.
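If you want to double check which values a cluster is actually using, the prettyPrint.sh trick from the restart priority post above should work here too; note that this is an assumption on my side, namely that these das.* options show up in the FDM cluster configuration once they have been explicitly set:

/opt/vmware/fdm/fdm/prettyPrint.sh clusterconfig | grep -i maxft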

UPDATE to this content can be found here: https://www.yellow-bricks.com/2022/11/18/can-you-exceed-the-number-of-ft-enabled-vcpus-per-host-or-number-of-ft-enabled-vcpus-per-vm/

Can you use the management IPs as the isolation address for HA?

Duncan Epping · Aug 11, 2017

There was a question on VMTN this week about using the management IPs of the hosts in a “smaller” cluster as the isolation addresses for vSphere HA. The plan was to disable the default isolation address (the default gateway) and then add every management IP as an isolation address; in this case 5 or 6 IPs would be added. I had to think this through and went through the steps of what happens in the case of an isolation event:

  1. there is no traffic between the secondary hosts and the primary host (or vice versa, depending on whether the primary or one of the secondary hosts is isolated)
  2. if it is a secondary host that is potentially isolated, then that secondary will start a “primary election process”
  3. if it is the primary host that is potentially isolated, then the primary will try to ping the isolation addresses
  4. if it is a secondary and there is no response to the election process, then the secondary host will ping the isolation addresses after it has elected itself as primary host
  5. if there is no response to any of the pings (they happen in parallel), then isolation is declared and the isolation response is triggered

Now the question is: will there be a response when the host tries to ping itself while it is isolated? After all, you need to add all the management IP addresses as “isolation address” options for this scheme to make sense. And that is what I tested. The host will ping all isolation addresses. All but one will fail; the one that succeeds is the management IP address of the isolated host itself. (You can still ping your own IP even when the NICs are disconnected.) As a result, the VMs are left running, because one of the isolation addresses responded.
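You can easily reproduce that last observation from the ESXi shell; a minimal check, assuming vmk0 is your management vmkernel interface and using a placeholder for the host’s own address:

# ping the host's own management IP through the management vmkernel interface;
# this succeeds even when the physical uplinks are down
vmkping -I vmk0 <management IP of this host>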

In other words: don’t do this. The isolation address should be a reliable address outside of the ESXi hosts, preferably on the same network as the management network.

Where’s the HA enforce VM-Host and Affinity rules option in vSphere 6.5?

Duncan Epping · Apr 25, 2017

Last week on (VMware internal) Socialcast, someone asked where the UI option is in vSphere 6.5 that allows you to configure vSphere HA to respect VM-Host affinity and VM-VM anti-affinity rules. In vSphere 6.0 there is an option in the Rules part of the UI, as shown in the screenshot below.

In vSphere 6.5 that option has disappeared completely. The reason is that vSphere HA now respects these rules by default, as it appeared this is the behavior customers wanted anyway. Note that if, for whatever reason, vSphere HA cannot respect a rule, it will restart the VMs anyway (violating the rule); as these are non-mandatory rules, HA chooses availability over compliance in this situation.

If you would like to disable this behavior and don’t care about these rules during a failover, you can set either or both of the following advanced settings:

  • das.respectvmvmantiaffinityrules – set to “true” by default, set to “false” if you want to disable it
  • das.respectvmhostsoftaffinityrules – set to “true” by default, set to “false” if you want to disable it
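As with the other das.* options discussed above, you should be able to confirm what a cluster is actually using with the prettyPrint.sh trick from the restart priority post; again, the assumption on my part is that these options only appear in the FDM cluster configuration once they have been explicitly set:

/opt/vmware/fdm/fdm/prettyPrint.sh clusterconfig | grep -i respect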

I hope that helps those looking to make changes to this behavior.

