Yellow Bricks

by Duncan Epping

Guaranteeing availability through admission control, chip in!

Duncan Epping · Apr 9, 2013 ·

I have been having discussions with our engineering teams for the last year around guaranteed restarts of virtual machines in a cluster. In its current shape and form we use Admission Control to guarantee virtual machines can be restarted. Today Admission Control is all about guaranteeing virtual machine restarts by keeping track of memory and CPU resource reservations, but you can imagine that in the Software Defined Datacenter this could be expanded to include, for instance, storage or networking reservations.

Now why am I having these discussions, and what is the problem with Admission Control today? Well, first of all there is the perception many appear to have of Admission Control: many believe the Admission Control algorithm uses “used” resources. In reality, Admission Control is not that flexible; it uses resource reservations, and as you know those are static. So what is the result of using reservations?

By using reservations for “admission control”, vSphere HA has a simple way of guaranteeing a restart is possible at all times: it checks whether sufficient “unreserved resources” are available and, if so, allows the virtual machine to be powered on. If not, it won’t allow the power-on, just to ensure that all virtual machines can be restarted in case of a failure. But what is the problem? Although we guarantee a restart, we do not guarantee any type of performance after the restart! Unless, of course, you are setting your reservations equal to what you provisioned… but I don’t know anyone doing this, as it eliminates any form of overcommitment and results in an increase in cost and a decrease in flexibility.
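
To make that concrete, here is a minimal sketch in plain PowerShell, with made-up cluster numbers, of the kind of check HA effectively performs today. The real algorithm depends on which admission control policy you pick (slot sizes, percentage, failover host), so treat this purely as an illustration:

# Made-up cluster figures, purely for illustration
$clusterCpuMhz = 4 * 20000          # four hosts with 20 GHz each
$failoverCpuMhz = 20000             # capacity set aside for one host failure
$reservedCpuMhz = 28000             # sum of the existing CPU reservations
$newVmReservationMhz = 2000         # reservation of the VM being powered on

$unreservedMhz = $clusterCpuMhz - $failoverCpuMhz - $reservedCpuMhz
if ($newVmReservationMhz -le $unreservedMhz) {
  "Power-on admitted: all reservations can still be satisfied after a failure"
} else {
  "Power-on denied: restarting all virtual machines could no longer be guaranteed"
}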

So that is the problem. Question is – what should we do about it? We (the engineering teams and I) would like to hear from YOU.

  • What would you like admission control to be?
  • What guarantees do you want HA to provide?
  • After a failure, what criteria should HA apply in deciding which VMs to restart?

One idea we have been discussing is to have Admission Control use something like “used” resources… or for instance an “average of resources used” per virtual machine. What if you could say: I want to ensure that my virtual machines always get at least 80% of what they use on average? If so, what should HA do when there are not enough resources to meet the 80% demand of all VMs? Power on some of the VMs? Power on all with reduced share values?
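
As a thought experiment, such a usage-based check could look something like the sketch below. This is plain PowerShell with invented per-VM averages, not how HA works today:

# Invented average memory demand per VM, in MB
$avgDemandMB = @{ 'db01' = 6144; 'app01' = 3072; 'web01' = 1024 }
$guarantee = 0.8                    # guarantee 80% of average usage
$survivingCapacityMB = 2 * 65536    # memory left after losing one of three 64 GB hosts

$requiredMB = ($avgDemandMB.Values | Measure-Object -Sum).Sum * $guarantee
if ($requiredMB -le $survivingCapacityMB) {
  "All VMs can be restarted with at least 80% of their average demand"
} else {
  "Shortfall of $($requiredMB - $survivingCapacityMB) MB: restart a subset, or restart all with reduced shares?"
}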

Also, something we have discussed is having vCenter show how many resources are used on average, taking your high availability N-X setup into account, which should at least provide insight into how your VMs (and applications) will perform after a fail-over. Is that something you see value in?

What do you think? Be open and honest, tell us what you think… don’t be scared, we won’t bite, we are open to all suggestions.

Comments

  1. Neil says

    9 April, 2013 at 15:36

    Sure, we’d like to see HA factor in storage & network resources. TBH this should have happened years ago, with SIOC having been around for a while now…
    Other items on the wish list would be for HA to better understand, and in some ways move away from, a VM-centric view towards a vApp / service layer. I’m not suggesting using the “VM & Application Monitoring” option in today’s HA (this is still very VM centric) but moving to a collection of VMs (vApp?) to allow better failure planning scenarios in a cluster, or in a DC.
    Taking this one step further could be to invoke SRM for a site B failover (to restore service) if site A is deemed by HA to be completely hosed.

  2. Roger Mårtensson says

    9 April, 2013 at 15:44

    Nice article and very good question(s).
    From a strictly systems administrator's view, running a smaller deployment while keeping everything else up and running, I would like vCenter to actually help me decide. Help me in choosing admission control and, more importantly (if possible), show me what my decisions would do to my environment if a fail-over should happen.
    The easiest scenario, one that many of us have been in, is losing a node in a cluster. I would like vCenter to be able to show an as-close-to-reality description of what would happen to my VMs before it happens.
    I like your idea about setting a percentage of how much “performance” is acceptable when bad things happen and a node is lost.
    In the event that not enough resources are available, one option could be to use some sort of priority-based list. Maybe by building groups of VMs, folders and/or resource pools. The highest priority gets the resources first?
    Anyway. What I would like admission control to be is easy. I shouldn’t need a large KB/book and a spreadsheet to be able to set and maintain an acceptable level.
    Again, thanks for talking about this. Hope you will get some really good input from this.

  3. Travis Backs says

    9 April, 2013 at 16:30

    “Also, something we have discussed is having vCenter show how many resources are used on average taking your high availability N-X setup in to account, which should at least provide an insight around how your VMs (and applications) will perform after a fail-over. Is that something you see value in?”

    I find extreme value in this. Our environments are organic, and doing weekly maths to see how many resources I have in case of a single, double, or triple host failure is sometimes tedious. If vCenter could provide a quick snapshot of how my VMs are protected from “what-ifs”, taking current resources into account, that would be very valuable to me as an architect.

  4. Tim Patterson says

    9 April, 2013 at 17:01

    I would love to see a deeper integration between admission control and vCops… I mean, vCops has very intelligent historical data to use as a basis for its decision points, and its metrics cover all of the “core 4” areas.

  5. Phil says

    9 April, 2013 at 17:27

    I think you’re on the right track. Implement a way for Admission Control to evaluate the average mem/cpu usage for things that do not have a reservation and allow the users to say “I want to guarantee 50% of CPU, 100% memory will be available” or whichever combination of values. Some people may want the opposite for CPU intensive tasks that require little memory. Throwing a simulation tool in there would be ideal to give the admin a “What if?” scenario. This would help deliver a quality of service beyond what shares can provide.

    The term “Admission Control” is weird to a dumb end-user such as myself. What does that even mean? Should it maybe be called “HA Reserve”? Aren’t we reserving cluster resources to provide for a host failure?

    • Marc says

      9 April, 2013 at 19:11

      Thanks for the input, Phil. I have a follow-up question:

      ” Implement a way for Admission Control to evaluate the average mem/cpu usage for things that do not have a reservation ”

      So this implies that HA tracks resource usage over time for powered-on VMs. Since the VMs are already running (they have been “Admitted”; by the way, I agree and also dislike the term Admission Control!), what should HA do if it learns that it is no longer satisfying your “HA resource reservation” policy? (Isn’t that a much more intuitive term? 😉) Raise an alarm? Power off lower-priority VMs (probably not)…

      The failover modeling would be dynamic, but giving snapshot-in-time ideas of what could happen under various failover scenarios could help the admin visualize what HA would attempt to do.

      • Phil says

        11 April, 2013 at 15:11

        Correct, the VM would already be admitted, but I was thinking more along the lines of guaranteeing resources of that VM during a failover.

        For instance –

        Joe Admin has an ESXi cluster with three hosts. None of the virtual machines have reservations, but he wants to set an HA Resource Reservation policy to ensure the virtual machines will have 50% of their average CPU usage available. Joe wants vCenter (or vCOPS?) to collect this data over a period of his choosing (hours, days, weeks…) and then determine the amount of resources that should be reserved to satisfy the need.

        This would play into the realm that the current “Percentage of cluster resources…” option fills, but in a more educated manner. I’m not the brightest guy, so I could be way off base on how this is all working behind the scenes.

        Some would say “Why not just use reservations?” Because that requires setting static limits rather than something based on what the application is actually doing. Wouldn’t it make more sense to have dynamic or percentage-based resource reservations? I don’t want to reserve 4GB of RAM. Tell me how much the VM uses for active memory and let me decide if I want a static 4GB or the learned 1.7GB of actually needed RAM.
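
        Something close to that “learned” number can already be approximated with PowerCLI; here is a rough sketch (the one-week window and the mem.active.average counter are just example choices, not anything HA uses today):

        # Rough sketch: derive a "learned" memory figure from observed active memory
        $vm = Get-VM -Name 'app01'                          # example VM name
        $stats = Get-Stat -Entity $vm -Stat 'mem.active.average' -Start (Get-Date).AddDays(-7) -Finish (Get-Date)
        $avgActiveKB = ($stats | Measure-Object -Property Value -Average).Average
        $learnedMB = [Math]::Ceiling($avgActiveKB / 1KB)    # the counter is reported in KB

        "Provisioned: $($vm.MemoryMB) MB, learned active working set: $learnedMB MB"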

        In regards to what to do when the HA Resource Reservation policy can’t satisfy resources, that is a good question! An alarm may suit some, if not most, while disallowing virtual machines from powering on could also continue to be an option. As the resource requirement may be dynamic in this case, how to handle the reservations is quite an issue to ponder.

        • Travis Backs says

          11 April, 2013 at 15:23

          I agree 100% with Phil’s suggestions.

          Also, on the question of reservations: I agree that they may be useful in some scenarios, but if you constantly use reservations, doesn’t that defeat the purpose of an agile and flowing environment where compute resources can be shared? I do not use reservations in my environment and I use the percentage-based Admission Control policy.

  6. Josh says

    9 April, 2013 at 18:01

    I would like to see an option to set Admission Control to work off of Allocations instead of Reservations.

    Then for a three-host cluster I could cap memory usage at 66% per host, and you could specify the vCPU allocation threshold you desire, such as 3:1 (virtual : physical cores).
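
    As a back-of-the-envelope version of that check, with made-up host sizes (the 66% cap and the 3:1 ratio come from the example above):

    # Hypothetical three-host cluster, allocation-based admission check
    $hosts = 3
    $hostMemGB = 256
    $hostCores = 16
    $memCap = 0.66               # 66% per-host cap so one host failure can be absorbed
    $vcpuRatio = 3               # desired vCPU : physical core ratio

    $allowedMemGB = $hosts * $hostMemGB * $memCap
    $allowedVcpus = $hosts * $hostCores * $vcpuRatio
    "Admit new VMs while allocated memory <= $allowedMemGB GB and allocated vCPUs <= $allowedVcpus"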

  7. Pawel says

    9 April, 2013 at 20:25

    Powering off the lowest-priority VMs is not that bad an idea.
    It would be nice to have a priority list of VMs that, in case of a host failure, should have enough resources to perform at the same level; a second priority list of VMs that could tolerate performance degradation; and the lowest priority of VMs that might be powered off to release resources.

  8. Marko says

    9 April, 2013 at 22:44

    Coming from a niche called “industrial usage of VMware”, I would be happy to see AC as a rule-based tool which acts on user-defined scenarios (multi-stage restarts, restarts based on external conditions, …). There may be a difference in controlling a production plant if two hosts in two different datacenters are down versus two hosts in one datacenter; at the moment there is no difference for AC. Restarting based on trends (calculated using historical data) should be integrated.
    As an option, HA should provide an “always on” guarantee which will restart all VMs even if this results in a system at full load. That may sound strange, but it’s the same as sitting in a four-engined plane: it should still bring you to the next airport with only one engine.
    vCenter should provide a “what happens if” screen to visualize the effect of a host failure (or several host failures) – will there be enough resources to restart all systems? Storage and network usage should also be used for this calculation.
    As a plus on top, I could imagine a federation integration so that VMs can be started in another private or public cloud.

  9. larstr says

    9 April, 2013 at 22:52

    We already have a setting for HA restart priority, but having the ability to define some sort of failure SLA with n% degraded performance in case of a failure would be something many of my customers would make good use of.

    But the idea of basing HA on the actual used resources in an intelligent way is what I really miss. Calculating this correctly today is very much based on guesswork. An intelligent approach should also take into account that it often costs a VM more resources to get its services up and running than it uses in stable production.

    Lars

  10. NiTRo says

    10 April, 2013 at 00:03

    A settable percentage of used resources (i.e. 80% CPU and 90% memory) would be great. Network bandwidth and storage bandwidth/IOPS criteria would be awesome!

  11. Brett says

    10 April, 2013 at 01:59

    Yes please!!! Being able to take a VM, vApp, or even resource pool’s history into account when making decisions around reservations would be great. Generally we only have some system requirements book to go on when setting them, and thus way over-reserve; then having to monitor that and reduce it is a lot of work I would rather not have to do. What I think would be nice is saying commit, say, 80% of the last 2 weeks of average usage or 100% of the last 4 days; this way it becomes a sliding scale as applications grow. We would still need hard reservations for those applications, like source code build servers, that run at 100% day in, day out (I use them to make use of any idle capacity left around on the cluster).
    On another note, I would not be keen on any of it being set at the cluster level since, like affinity rules, that makes the VMs dependent on particular hardware; I think we need to keep boundaries at the DC or VM level, keeping things agile and flexible.

  12. Dries De Wachter says

    10 April, 2013 at 16:33

    I too think that taking performance history into account is the key. I think HA should be a lot easier to configure than it is nowadays, though. I’m convinced that, as indicated, admission control is oftentimes poorly configured; I think the options are not very clear in the interface. It should be more intuitive. I’m thinking of a “host failures the cluster tolerates” approach in a true sense. Taking all resource reservations into account yourself is way too complicated. By the way, don’t we all want to have at least one spare server in our protected cluster? Maybe HA could follow the standard-deviation path, as with DRS? The deviation could then take into account the maximum and minimum amount of memory and CPU per host and combine this with historic usage data to come up with a number that shows how well your cluster is protected. This number would take into account how many host failures your cluster should tolerate (you have the option to indicate this) and then calculate how good your current situation is. Furthermore, HA could be an option on a per-VM basis. Even though my VM is running in a protected cluster, it is not always mission-critical and can therefore be ignored in certain calculations… I know there’s restart priority, but that doesn’t cover it all…

  13. Keith says

    11 April, 2013 at 11:39

    Thank you all for your input on this issue. It is very helpful.

    A follow up question. We are discussing having HA base placement decisions on the historical resource usage of VMs rather than reservations. How should we summarize this data into one number? Brett suggested one approach — use the greater of 80% of the last 2 weeks of average usage or 100% of the last 4 days. What do you think of this approach? Are there others that are better suited to your workload?
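
    To make that concrete, Brett's suggestion boils down to something like the following (the daily figures are example data only; the 80% factor and the window lengths are taken straight from his comment):

    # Summarize a VM's CPU demand into one planning number (example daily averages, in MHz)
    $dailyAvgMhz = 500, 520, 480, 610, 700, 650, 640, 620, 900, 880, 860, 870, 910, 950

    $twoWeekAvg = ($dailyAvgMhz | Measure-Object -Average).Average
    $lastFourDayAvg = ($dailyAvgMhz | Select-Object -Last 4 | Measure-Object -Average).Average

    $planningMhz = [Math]::Max(0.8 * $twoWeekAvg, $lastFourDayAvg)
    "Plan restarts using $([Math]::Round($planningMhz)) MHz for this VM"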

    Another question. We are also discussing HA allocating VMs to hosts so that the restarted VMs could consume, say, 80% of their historical usage. How should HA enforce this? Is it fine to use today’s approach based on share values (all the VMs on a host compete for resources using their share values)? Or should HA, for example, configure the VMs with reservations? Or something else?

    And finally, after powering on the first set of VMs, when is it okay to power on any remaining VMs? E.g., should HA measure the demand of all powered on VMs over 10 minutes, recompute the historical averages for each VM, and then decide if more VMs can be powered on?

  14. Alastair Brown says

    11 April, 2013 at 18:00

    Once our virtual machines have been running for a while I run a script that sets the memory reservation for each VM to a percentage of the memory consumed:
    # the memory reservation becomes a fraction of the RAM the VM actually consumes
    $memoryAllocation.Reservation = $vmview.Summary.QuickStats.HostMemoryUsage * $ramfraction
    This, in my opinion, gives the machine all the memory it needs in general use. It also reduces the disk required for the VM swap file, so two birds with one stone.

    The script also sets the % reservation to N+1 (and a little bit):
    $Num = $clustinfo.Summary.NumHosts
    if ($Num -eq 0) { $Num = 4 }        # fall back to a default if the host count can't be read
    $Num = 1 / $Num                     # fraction of the cluster a single host represents
    $Num = $Num * 100 + 1               # express it as a percentage
    $Num = [Math]::Truncate($Num)       # drop the decimals
    $Num += 3                           # bit of leeway
    So in theory I should always have enough memory to run the machines. The reservation is normally a lot less than the memory allocated to the VM, so it is still overcommitted, but I have enough resources should a host fail.

    What do you reckon to this method?

  15. Andrew Fidel says

    11 April, 2013 at 19:36

    Keith, this is kind of tangential to your question, but one of the significant barriers to going 100% virtualized with HA is the idea of chained resources. I.e., my DNS/AD servers must be up first, then my database layer, and then everything else. You can accomplish this with vApps, but it’s hard and cumbersome and forces you to place all of your VMs into a single vApp, which is less than ideal for a variety of reasons. It would be nice if you could give HA a rule that says, for example, one of the highest-priority VMs must be up before the high-priority ones start, etc. Heck, if you want a decent model for this, look at host monitoring systems like WhatsUp Gold: you can easily set up dependency chains where monitoring comes back online in order of dependence.

    • Keith says

      12 April, 2013 at 21:54

      Andrew, I agree. Enforcing restart orders would be very valuable. We are working on a solution. I think it would be valuable to get your and others’ feedback on some design questions. Let’s do this as part of another blog thread. I’ll follow up with Duncan on this. Probably won’t happen for a couple of weeks.

  16. Marko says

    11 April, 2013 at 21:45

    Duncan, Keith, how about a “get an impression” tour by you and some guys from your teams? IMHO a lot of readers of this blog would be happy to show you what’s necessary to run their particular vSphere environment and what their requirements are. For my part, I would be happy to show you both our internal requirements and also those of our customers. As always, it’s not about your customer, it’s about your customer’s customer.

  17. RTunisi says

    12 April, 2013 at 15:15

    I don’t think the term “Admission Control” is that bad for describing the process behind evaluating HA restarts, just because the same AC that does this job is also responsible for choosing the host on which a VM will be powered on. This is the admission process the name implies 😉 so it is a multitask routine that does both, and that makes the name very suitable…
    About historical data as a source to be used in the HA resource evaluation process: this would be the most assertive way of working with AC imho, but leaving the technical field a little, it could tie a feature that is essentially vCenter’s (HA evaluation) to a product like vCOPS, which is great, but would raise the final product cost or leave a situation like “Want to use this beautiful shiny red Nitro Button that we put on your Brand New Ferrari? Just buy this Nitro-Using-Kit and you’ll be good to go!”… I know that a technical forum like YB isn’t the right place for an essentially commercial discussion like this, but I wouldn’t be very satisfied to have a great feature available and not be able to use it because it’s linked to a third tool.

    And sorry about the bad English too 🙂

  18. Ledao Cai says

    12 April, 2013 at 21:43

    It should be simple, easy to understand and calculate. How about allowing the customer to choose a percentage of total resources (CPU, memory, network) and controlling based on that percentage? From the customer’s point of view, the percentage would effectively be the overcommitment rate. And different percentage settings could apply to different levels of VM restart priority.

  19. Greg W. Stuart says

    13 April, 2013 at 18:15

    Great post Duncan, I like seeing the engineers at VMware reach out to the blogger community and readers to ask what we want. I agree with Ledao Cai, whatever the solution is, it should just be simple and easy to configure. The last thing anyone wants to see is VMware becoming so complex that you have to be a VCDX just to understand how it all works. I like the idea of being able to pick and choose at a more granular level the VMs that you would want to have a high restart priority. I also like the idea of more vCOPS integration with HA.

  20. Hearts Watson says

    18 April, 2013 at 07:35

    VMware should just use actual RAM usage divided across all running VMs. Same for CPU. Or some kind of normalized, conservative maximum of these values.
    We do the same manually now.

  21. Andrew Mauro says

    18 April, 2013 at 17:56

    There are already a lot of good ideas here. Good post and good comments!

    IMHO it would be nice to have a way to handle resources that considers what you can shut down (testing VMs, for example), what could run slowly, and what must run at its best.
    Here an integration with SIOC and NIOC could be interesting, to also give the right resources for storage and networking (not only for compute).

  22. Josh Odgers (VCDX#90) says

    18 April, 2013 at 23:40

    My opinion is that the “percentage of cluster resources reserved for HA” policy should do what it says, as opposed to using VM reservations. This is actually what the vast majority of the public believes HA does. So if I have 4 hosts in a cluster, each with 20 GHz / 512 GB RAM, and I set the percentage of cluster resources reserved for HA to 25%, then 20 GHz and 512 GB of RAM should be reserved for HA, regardless of the number of VMs and their reservations. This would ensure one host could be lost with minimal or no performance impact.
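
    For reference, with those numbers the arithmetic is simply the following; 25% of the four-host cluster works out to exactly one host’s worth of capacity, which is what makes it behave like an N+1 guarantee:

    $hosts = 4
    $hostCpuGhz = 20
    $hostRamGB = 512
    $pctReserved = 0.25

    $reservedCpuGhz = $hosts * $hostCpuGhz * $pctReserved   # 20 GHz, one full host
    $reservedRamGB = $hosts * $hostRamGB * $pctReserved     # 512 GB, one full host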

    I wrote a blog post on a similar topic to the above, which aimed to assist people getting a result similar to the above, to guarantee a minimum level of performance in a HA event with the current setting. The post is here for those who may be interested http://www.joshodgers.com/2013/03/09/example-architectural-decision-vmware-ha-percentage-of-cluster-resources-reserved-for-ha/

    The “Host failures the cluster tolerates” policy should also use a similar process: calculate the compute resources of the largest host in the cluster (in my earlier example 20 GHz / 512 GB RAM) and reserve that amount. If this were implemented, it would basically negate the need for the percentage-based HA admission control policy, as this policy would effectively do the same thing without the admin or architect having to calculate the required percentage and update it when hosts are added to or removed from the cluster. This would be the simpler approach: a vSphere admin can select the number of host failures based on the availability requirements, i.e. N+2 by selecting 2 as the number of “Host failures the cluster tolerates”, and the job is done.

    Although I rarely use a standby host, I have no problem with this setting as it stands, and I have some use cases, an example of which I have discussed here http://www.joshodgers.com/2013/02/07/example-architectural-decision-ha-admission-control-policy-with-software-licensing-constaints/

  23. Josh Odgers (VCDX#90) says

    18 April, 2013 at 23:45

    Oh, and on the question of “After a failure, what criteria should HA apply in deciding which VMs to restart?” – SRM handles this well for DR scenarios, so a similar solution for vSphere HA, with things like protection groups and priority groups with dependencies, would be superb.
