
Yellow Bricks

by Duncan Epping


Server

HA without DRS?

Duncan Epping · Dec 8, 2010 ·

I received a question on my HA Deepdive which I thought was worth answering in an article:

How does the active primary node decide where to restart failed VMs? Does it use a round-robin algorithm for selecting a host to start the VMs in restart priority order? What happens if the remaining nodes are imbalanced, especially without DRS enabled; are the nodes that have no spare capacity skipped? Or, does the active primary node restart VMs on the least busy host first, then the next busy host, etc?

Also, if VMs have no reservation for CPU or memory set, how does HA decide the number of VMs to restart on any one node? Is it possible that HA will restart too many VMs on one node so that performance is extremely poor until DRS moves some VMs to other nodes?

In the past (pre-4.1), HA would consider the utilization of the hosts and run a check for every VM that needed to fail over. It would fail the VM over to the host with the most available resources. From a “latency” perspective that is not the best approach, as you can imagine, with latency meaning the time it takes to restart the VMs and the delay caused by hostd. What type of delay can be caused by hostd? Well, let's assume you have one host which is not doing a lot; this host would be the one selected for most failovers. Having 10 VMs (or more) starting in parallel will beat up hostd severely.

So does HA use DRS to select which host to use for the restart? No, it doesn't: DRS operates at the vCenter level and HA operates at the host level. But more things have changed. As of vSphere 4.1, virtual machines are evenly distributed across hosts to lighten the load on the hostd service and to get quicker power-on results. HA then relies on DRS to redistribute the load later if required. This improvement results in faster restarts of the virtual machines and less stress on the ESX hosts.
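To make the difference between the two behaviours concrete, here is a toy model in Python. It is only a sketch and not VMware's actual placement code: the host names, capacity numbers and VM demands are made up, and round-robin is just one simple way to model "evenly distributed", which the text above does not specify further.

```python
from collections import Counter


def place_most_free(vms, free_capacity):
    """Pre-4.1 style (simplified): each VM goes to the host that
    currently has the most free resources."""
    free = dict(free_capacity)  # copy so the caller's data is untouched
    placement = {}
    for vm, demand in vms:
        host = max(free, key=free.get)
        placement[vm] = host
        free[host] -= demand
    return placement


def place_evenly(vms, hosts):
    """4.1 style (simplified): spread the restarts across all hosts.
    Round-robin is an assumption used here for illustration."""
    placement = {}
    for index, (vm, _demand) in enumerate(vms):
        placement[vm] = hosts[index % len(hosts)]
    return placement


def restarts_per_host(placement):
    """Count how many VM power-ons each host has to handle."""
    return Counter(placement.values())


# Hypothetical cluster: esx1 is nearly idle, the other two are busy.
free_capacity = {"esx1": 100, "esx2": 30, "esx3": 30}
vms = [(f"vm{i}", 5) for i in range(10)]

print(restarts_per_host(place_most_free(vms, free_capacity)))
print(restarts_per_host(place_evenly(vms, list(free_capacity))))
```

With these made-up numbers the first strategy sends all ten power-ons to esx1, which is exactly the hostd pile-up described above, while the second splits them 4/3/3 and leaves any resource rebalancing to DRS afterwards.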

So what if you are not using DRS? To put it bluntly, make sure you manually balance your environment to ensure HA doesn’t “overload” a single host… that is the only thing you can do for now. (By the way, all of this is included in the HA and DRS tech deepdive :-))

Cool Tool: opvizor

Duncan Epping · Dec 7, 2010 ·

Recently Dennis Zimmer, whom most of you probably know from Icomasoft or from the books he authored, emailed me about a new tool his company was developing. I watched the video that is hosted on opvizor.com and must admit that it looks promising, especially as most solutions today are reactive or semi-proactive and opvizor is aiming to be proactive.

opvizor identifies in advance when the virtualized IT infrastructure is losing on performance or might crash. Issues in VMware environments can be analyzed and corrected before they become dangerous. In addition, opvizor provides optimized logfiles and makes it possible to share the infrastructure data with internal and external partners, thus allowing more efficient problem solving. “Our goal is, that opvizor anticipates 60 percent of issues from system behavior.”

Now the tool just entered the Beta stage and opvizor is looking for people willing to give it a testdrive and willing to provide feedback! Funnily enough the tool kind of reminds me of a great tool we use internally to take vm-support files apart and analyze them. I can assure you that with the right amount of work / commitment this can turn into a really powerful tool to monitor / healthcheck your environment on a regular basis.

vSphere 4.1 HA and DRS Technical Deepdive, the book!

Duncan Epping · Dec 6, 2010 ·

In August we announced that we were working on a secret project and let you guys in on it. The idea was to get it published through an official publisher, but due to several circumstances and a very tight deadline we decided to go the self-publishing route to make it available as soon as possible. So here it is, the moment both Frank Denneman and I have been waiting for… it is finally available, the HA and DRS technical deepdive.

As of today “vSphere 4.1 HA and DRS Technical Deepdive” is available on paper via CreateSpace and Amazon. We are also working on getting a digital copy up for sale but that will more than likely be early 2011.

There is something I want to make very clear here as I have heard multiple people referring to this book as “Duncan’s Book”. This book was very much a joint effort. Frank has invested at least as much time in this project as I have, and probably even more. I want to thank Frank for his hard work and hope everyone realizes that it is our book and not my book!

We want to take the opportunity to thank our Technical Reviewers for their very valuable feedback and for keeping us honest; fellow VCDX Panel Member Craig Risinger (VMware PSO), Marc Sevigny (VMware HA Engineering), Anne Holler (VMware DRS Engineering) and Bouke Groenescheij (Jume.nl). A very special thanks to Scott Herold for writing the foreword!

For those who can’t wait, order it via CreateSpace or Amazon now. (Please be so kind as to leave a review.)

This is the description of the book that is up on CreateSpace/Amazon:

About the authors:
Duncan Epping (VCDX 007) is a Consulting Architect working for VMware as part of the Cloud Practice. Duncan works primarily with Service Providers and large Enterprise customers. He is focussed on designing Public Cloud Infrastructures and specializes in bc-dr, vCloud Director and VMware HA. Duncan is the owner of Yellow-Bricks.com, the leading VMware blog.
Frank Denneman (VCDX 029) is a Consulting Architect working for VMware as part of the Professional Services Organization. Frank works primarily with large Enterprise customers and Service Providers. He specializes in Resource Management, DRS and storage. Frank is the owner of frankdenneman.nl which has recently been voted number 6 worldwide on vsphere-land.com

VMware vSphere 4.1 HA and DRS Technical Deepdive zooms in on two key components of every VMware based infrastructure and is by no means a “how to” guide. It covers the basic steps needed to create a VMware HA and DRS cluster, but even more important explains the concepts and mechanisms behind HA and DRS which will enable you to make well educated decisions. This book will take you in to the trenches of HA and DRS and will give you the tools to understand and implement e.g. HA admission control policies, DRS resource pools and resource allocation settings. On top of that each section contains basic design principles that can be used for designing, implementing or improving VMware infrastructures.
Coverage includes:

  • HA node types
  • HA isolation detection and response
  • HA admission control
  • VM Monitoring
  • HA and DRS integration
  • DRS imbalance algorithm
  • Resource Pools
  • Impact of reservations and limits
  • CPU Resource Scheduling
  • Memory Scheduler
  • DPM

We hope you will enjoy reading it as much as we did writing it. Thanks,

You don’t need any brains to listen to music pt III

Duncan Epping · Nov 28, 2010 ·

It is one of those days again… It is time to blog about something different than VMware, virtualization, cloud or storage… Music. Over the last couple of months there are a couple of songs which I just can’t stop listening to.

The first song is Parade Of the Dead by the almighty Black Label Society. I realize that many of you might not appreciate Black Label Society but this is one of those tracks that can get you pumped up for everything. Feeling down? Going for a 10K run? Need to lose some of that frustration? This song literally solves everything, well at least in my case it does. The vocals combined with bonecrushing riffs and those drums just keep on marching on… you will know what I mean when you listen to the track.

Again I fully understand that some might appreciate music with less dB violence. The following track, Munich, has literally been one of my favorite tracks over the past years and it is by one of my favorite bands as well. Editors made a huge impression on me when I saw them live, and still every album and every song reminds me of that concert. I actually don’t really know why I like this song so much; I guess the fact that it is uptempo helps… plus that it used to be one of my son’s favorite songs as well… a three-year-old (back then, he is now 8) Dutch kid singing a song in English and knowing every word does make an impression.

Configuring HA: Error while running health check script

Duncan Epping · Nov 24, 2010 ·

I was just cleaning up our Cloud Lab and noticed HA wasn’t enabled. I enabled it and immediately it threw the following error at me:

Error Message: Configuration Issues – HA agent on esx4.mgm.local in cluster ams-hadrs-01 in Lab2 has an error : Error while running health check script.

When experiencing HA configuration issues there are a couple of steps I usually take to try to fix the experienced issues:

  • Click “reconfigure for VMware HA” and see if the issue is still there, if so:
    • Is DNS configured and does it actually work? If not, fix and reconfigure for HA.
    • Is the gateway reachable? If not, fix and reconfigure for HA.
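The DNS part of that checklist is easy to script. Below is a minimal sketch in Python, assuming you run it from a machine that uses the same DNS servers as your hosts; the `resolves` helper is a hypothetical name introduced here for illustration, and esx4.mgm.local is simply the host from the error message above.

```python
import socket


def resolves(hostname, resolver=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if the lookup fails.

    The resolver argument is injectable so the function can be
    exercised without touching a real DNS server.
    """
    try:
        return resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


if __name__ == "__main__":
    for name in ["esx4.mgm.local"]:
        address = resolves(name)
        status = address if address else "DOES NOT RESOLVE"
        print(f"{name}: {status}")
```

If a host does not resolve, fix DNS first and then reconfigure for HA. Gateway reachability still needs a separate check, for instance a simple ping from the host.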

This usually solves 75% of the issues. If it hasn’t been fixed, the next step I usually take is unloading the agent and restarting the management services. Although it is pretty rigorous, it is the fastest way of fixing HA issues. In my case I am using ESXi and this is what I needed to do to clean up the host:

  • Disable HA on the cluster
  • /opt/vmware/aam/VMware-aam-ha-uninstall.sh
  • /sbin/services.sh restart
  • Enable HA on the cluster

This solved the issue I had with HA.



About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.


Copyright Yellow-Bricks.com © 2026