
Yellow Bricks

by Duncan Epping


Slight change in “restart” behavior for HA with vSphere 5.0 Update 1

Duncan Epping · Mar 27, 2012 ·

Although this is a corner case, I did want to discuss it to make sure people are aware of this change. Prior to vSphere 5.0 Update 1, a virtual machine would be restarted by HA when the master had detected that the state of the virtual machine had changed compared to the “protectedlist” file. In other words, a master would filter the VMs it thought had failed before trying to restart any. Prior to Update 1, a master used the protection state it read from the protectedlist file. If the master did not know the on-disk protection state for a VM, it did not try to restart that VM. Keep in mind that only one master can open the protectedlist file in exclusive mode.

In Update 1 this logic has changed slightly. HA can now retrieve the state information from either the protectedlist file stored on the datastore or from vCenter Server. So now multiple masters could try to restart a VM. If one of those restarts fails, for instance because a “partition” does not have sufficient resources, the master in the other partition might be able to restart it. Although these scenarios are highly unlikely, this behavior change was introduced as a safety net!
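
To make the difference concrete, here is a minimal sketch of the restart filter before and after Update 1. This is a simplified model and not the actual HA (FDM) implementation: the `VmView` class, its field names, and both functions are hypothetical, and treating a VM as protected when either source says so is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VmView:
    """What a single master knows about a VM it believes has failed (hypothetical model)."""
    name: str
    # Protection state read from the on-disk protectedlist file.
    # None means this master could not read it, for example because
    # another master holds the file open in exclusive mode.
    on_disk_state: Optional[bool]
    # Protection state as reported by vCenter Server; None if unknown.
    vcenter_state: Optional[bool]


def should_restart_pre_update1(vm: VmView) -> bool:
    # Before Update 1: only the on-disk state counts. If the master
    # does not know it, the master does not try to restart the VM.
    return vm.on_disk_state is True


def should_restart_update1(vm: VmView) -> bool:
    # Update 1: the state can come from the protectedlist or from vCenter Server.
    # Assumption: either source marking the VM as protected is enough.
    return vm.on_disk_state is True or vm.vcenter_state is True


# A master in a second partition that cannot read the protectedlist file:
vm = VmView(name="vm-a", on_disk_state=None, vcenter_state=True)
print(should_restart_pre_update1(vm), should_restart_update1(vm))
```

In this model the second master skips the VM before Update 1 but attempts the restart in Update 1, which is the safety net described above.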

 

** Disclaimer: This article contains references to the words master and/or slave. I recognize these as exclusionary words. The words are used in this article for consistency because it’s currently the words that appear in the software, in the UI, and in the log files. When the software is updated to remove the words, this article will be updated to be in alignment. **


BC-DR, Server 5.0, ha, update 1, VMware, vSphere

Comments

  1. Sketch says

    27 March, 2012 at 16:51

    Time to update the book!

  2. Andreas Peetz says

    27 March, 2012 at 18:30

    Hi Duncan,

    thanks for pointing this out. The title of your post reminded me of another “change in the restart behavior with 5.0 Update 1”. Not related to HA, but to the VM autostart functionality for stand-alone hosts. It looks like this is broken when using the free license:
    http://v-front.blogspot.de/2012/03/esxi-50-update-1-breaks-vm-autostarts.html
    Do you know if VMware is aware of this bug (I hope it is only a bug and not by intention?!) and working on a solution?

    – Andreas

  3. Duncan Epping says

    27 March, 2012 at 18:46

    Yes VMware is aware and working on it: http://bit.ly/GT2Gbn

  4. SteveB says

    28 March, 2012 at 00:00

    Hello Duncan great site!

    Sorry if this is off topic, but can you please pass the word to VMware, and whoever publishes the recent Technical White Paper documentation, to stop using the 2 inch left margin of white space in their PDF files? Reading them on any e-reader, you have to zoom in more often to read the papers. What is the reason for the big margin on the left side?

    Keep up the great work. Steve B.

  5. Martin says

    29 March, 2012 at 08:27

    Hi,

    In case of a partition scenario, would it be possible for both masters to successfully start up the VM?

  6. Duncan says

    29 March, 2012 at 14:46

    No, as soon as the first master powers on the VM the files will be locked.

  7. xzearik says

    16 September, 2013 at 11:32

    Hi,
    I would like to know the exact, vSphere-defined timing of VMware HA in the following scenario.
    Two hosts in an HA cluster running two VMs with the default restart priority. When the power cord is removed from one server, what should be the standard response time?
    – How many seconds does HA need to conclude that the host is down?
    – How many seconds will both VMs take to move from the lost host to the live host?
    – What is the VMware-guaranteed total time for a VM to actually be powered on at the surviving host on the first attempt? (Only the VM being powered on, not the OS loading.)

    xzearik

    • Duncan Epping says

      16 September, 2013 at 12:45

      That is not an easy question to answer as it depends whether the host is a master or a slave. Even then it will depend on various variables so there is no hard guaranteed time. When a slave fails HA takes around 18-20 seconds to conclude it has failed and queue the power-on attempts.

      For a master this takes slightly longer, it would be about 35-40 seconds as a new master will need to be elected etc.

      Hope that helps.

  8. xzearik says

    17 September, 2013 at 12:06

    Thanks for the clarification. However, I would like to know if there is any parameter available in the HA config which defines the time, and whether an admin can change it to a defined value.

    The purpose of this inquiry is to determine the minimum and maximum time required to have a VM powered on in a node failure situation.
    I would appreciate it if you considered both cases while answering:
    Total time to wait before a VM is moved and powered on at the surviving node if a slave failed?
    Total time to wait before a VM is moved and powered on at the surviving node if a master failed?

    • Duncan Epping says

      17 September, 2013 at 12:23

      Not that I can share unfortunately. If 40 seconds before restarts is too long I think your best option is either vSphere FT or using an application level clustering service.

      • xzearik says

        17 September, 2013 at 13:54

        Alright.. Thanks for the useful tip.

  9. tushar says

    28 September, 2014 at 13:21

    Does HA restart (proper reboot) or reset (unexpected reboot) a VM during a host failure? If it performs a proper reboot, then in what scenarios is an unexpected reboot detected by the guest OS during HA?

About the author

Duncan Epping is a Chief Technologist in the Office of CTO of the Cloud Platform BU at VMware. He is a VCDX (# 007), the author of the "vSAN Deep Dive", the “vSphere Clustering Technical Deep Dive” series, and the host of the "Unexplored Territory" podcast.

Copyright Yellow-Bricks.com © 2022