
Yellow Bricks

by Duncan Epping



DRS not taking CPU Ready Time into account? Need your help!

Duncan Epping · May 9, 2013 ·

For years, rumors have been floating around that DRS does not take CPU Ready Time (%RDY) into account when load balancing the virtual infrastructure. The fact is that %RDY has always been part of the DRS algorithm, not as a first-class citizen, but as part of CPU Demand, which is a combination of various metrics that includes %RDY. Still, one might ask why %RDY is not a first-class citizen.
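As a quick refresher, %RDY in esxtop is the percentage of a sample interval that a vCPU spent ready to run but not scheduled, while vCenter exposes the same data as a "CPU Ready" summation counter in milliseconds. The conversion between the two can be sketched as follows (the 20-second interval is the real-time chart default; the example values are illustrative):

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a vCenter 'CPU Ready' summation value (milliseconds of
    ready time accumulated per vCPU over one sample interval) into the
    %RDY percentage that esxtop shows. Real-time charts sample every 20 s.
    """
    return ready_ms / (interval_s * 1000.0) * 100.0

# 2000 ms of ready time inside a 20 s sample equals 10% ready time,
# a level commonly treated as a sign of CPU contention.
print(cpu_ready_percent(2000.0))  # 10.0
```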

There is a good reason, though. Just think about what DRS is and does, and how it actually goes about balancing out the environment while trying to please all virtual machines: there are a lot of possible ways to move virtual machines around in a cluster. So you can imagine that it is really complex (and expensive) to calculate the possible impact of migrating a virtual machine “from a host” or “to a host” for all of the first-class-citizen metrics.

Now, for a long time the DRS engineering team has been looking for situations in the field where a cluster is balanced according to DRS but there are still virtual machines experiencing performance problems due to high %RDY. The DRS team really wants to fix this problem or bust the myth – what they need is hard data. In other words, vc-support bundles from vCenter and vm-support bundles from all hosts with high ready times. So far, no one has been able to provide these logs / cold hard facts.

If you see this scenario in your environment regularly please let me know. I will personally get you in touch with our DRS engineering team and they will look at your environment and try to solve this problem once and for all. We need YOU!

Tested / Supported / Certified by VMware? (caching / dr solutions)

Duncan Epping · May 8, 2013 ·

Lately I have been receiving more and more questions about support for specific “hypervisor side” solutions; by that I mean how VMware deals with solutions that are installed within the hypervisor. I have always found it very difficult to dig up details around this, both externally and internally. I figured it was time to try to make things a bit clearer, if possible at all.

For VMware Technology Partners there are various programs they can join. Some of the programs include a rigid VMware test/certification process which results in being listed on the VMware Compatibility Guide (VCG). You can find those which are officially certified on our VMware Compatibility Guide here; just type the name of the solution in the search bar. For instance, when I type in “Atlantis” I get a link to the Atlantis ILIO page and can see which version of ILIO is supported today with which version of vSphere. Note that in this case only vSphere 4.x is listed, but Atlantis assured me that this will be updated to include vSphere 5.x soon.

Then there are the Partner Verified and Supported Product (PVSP) solutions. These are typically solutions that do not fit the VCG, for instance when it is a new type of solution and there is no certification process yet. Of course, there are still strict guidelines for these solutions to be listed. For instance, your solution will only be listed on the PVSP (and the VCG, for that matter) when you are using public APIs. An example is the Riverbed Steelhead appliance: it follows all of the guidelines and is listed on the PVSP as such. You can find all the solutions which are part of the PVSP program here.

Finally, there is the VMware Solutions Exchange section on vmware.com. This is where you will find most other solutions… solutions which are not officially tested/certified (part of the VCG) or part of the PVSP program for various reasons. Note that these solutions, although listed, are not supported by VMware in any way. Of course, VMware Support will typically do its best to help a customer out. However, it is not uncommon to be asked to reproduce the problem on an environment which does not have that solution installed, so that it can be determined what is causing the issue and who is best equipped to help solve it.

I am not saying that those solutions that are not listed on the VCG or PVSP should be avoided. They could very well solve that problem you have, or fulfill your business requirements and as such be the “must use” component in your stack. It should be noted, though, that introducing any 3rd party solution carries a “risk”. From an architectural and operational perspective it is highly recommended to validate what that risk exactly is. How can you minimize that risk? What will you need to do to get the right level of support? And ultimately, which company is responsible for which part? When push comes to shove, you don’t want to be the person spending hours on the phone just figuring out who is supporting what! You just want to be on the phone solving the problem, right?!

I hope this helps some of you out there who asked me this question.

** Note: the above is not an official VMware Support statement or a VMware Partner Alliances statement, these are my observations made while digging through the links on vmware.com **

What is static overhead memory?

Duncan Epping · May 6, 2013 ·

We had a discussion internally on static overhead memory. Coincidentally, I spoke with Aashish Parikh from the DRS team on this topic a couple of weeks ago when I was in Palo Alto. Aashish is working on improving the overhead memory estimation calculation so that both HA and DRS can be even more efficient when it comes to placing virtual machines. The question was about what determines the static overhead memory, and this is the answer that Aashish provided. I found it very useful, hence I asked Aashish if it was okay to share it with the world. I added some bits and pieces where I felt additional detail was needed.

First of all, what is static overhead and what is dynamic overhead:

  • When a VM is powered-off, the amount of overhead memory required to power it on is called static overhead memory.
  • Once a VM is powered-on, the amount of overhead memory required to keep it running is called dynamic or runtime overhead memory.

Static overhead memory of a VM depends upon various factors:

  1. Several virtual machine configuration parameters, like the number of vCPUs, amount of vRAM, number of devices, etc.
  2. The enabling/disabling of various VMware features (FT, CBRC, etc.)
  3. ESXi Build Number

Note that the static overhead memory estimation is calculated fairly conservatively, taking a worst-case scenario into account. This is the reason why engineering is exploring ways of improving it. One of the areas that can be improved is, for instance, including host configuration parameters. These parameters are things like CPU model, family & stepping, various CPUID bits, etc. This means that, as a result, two similar VMs residing on different hosts could have different overhead values.

But what about dynamic? Dynamic overhead seems to be more accurate today, right? Well, there is a good reason for it: with dynamic overhead it is known on which host the VM is running, and the cost of running the VM on that host can easily be calculated. It is no longer a matter of estimating, but a matter of doing the math. That is the big difference: dynamic = the VM is running and we know where, versus static = the VM is powered off and we don’t know where it might be powered on!

The same applies, for instance, to vMotion scenarios. Although the platform knows what the target destination will be, it still doesn’t know how the target will treat that virtual machine. As such, the vMotion process aims to be conservative and uses static overhead memory instead of dynamic. One of the things, for instance, that changes the amount of overhead memory needed is the “monitor mode” used (BT, HV or HWMMU).

So what is being explored to improve it? First of all, including the additional host-side parameters mentioned above. Secondly, and equally important, the overhead memory should be calculated based on the VM -> “target host” combination, or as engineering calls it, calculating the “static overhead of VM v on Host h”.

Now why is this important? When is static overhead memory used? Static overhead memory is used by both HA and DRS. HA, for instance, uses it with Admission Control when calculating how many VMs can be powered on before unreserved resources are depleted. When you power on a virtual machine, the host-side “admission control” will validate whether it has sufficient unreserved resources available for the “static memory overhead” to be guaranteed… But DRS and vMotion also use the static memory overhead metric, for instance to ensure a virtual machine can be placed on a target host during a vMotion process, as the static memory overhead needs to be guaranteed.
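Boiled down, the host-side power-on check described above can be sketched like this (a simplified illustration, not VMware's actual implementation; the function name and numbers are made up for the example):

```python
def host_admits_power_on(unreserved_mb: float,
                         vm_reservation_mb: float,
                         static_overhead_mb: float) -> bool:
    """Simplified host-side admission check: the power-on is admitted
    only if the host can guarantee the VM's memory reservation plus its
    static overhead estimate out of currently unreserved memory."""
    return unreserved_mb >= vm_reservation_mb + static_overhead_mb

# A host with 1024 MB unreserved admits a VM reserving 512 MB with a
# 150 MB static overhead estimate, but not one reserving 1000 MB.
print(host_admits_power_on(1024, 512, 150))   # True
print(host_admits_power_on(1024, 1000, 150))  # False
```

This is also why a conservative static overhead estimate matters: the more it overshoots the real cost, the fewer VMs a host will admit even though it could run them fine.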

As you can see, a fairly lengthy chunk of info on just a single simple metric in vCenter / ESXTOP… but very nice to know!

Guaranteeing availability through admission control, chip in!

Duncan Epping · Apr 9, 2013 ·

I have been having discussions with our engineering teams for the last year around guaranteed restarts of virtual machines in a cluster. In its current shape/form we use Admission Control to guarantee virtual machines are restarted. Today, Admission Control is all about guaranteeing virtual machine restarts by keeping track of memory and CPU resource reservations, but you can imagine that in the Software Defined Datacenter this could be expanded with, for instance, storage or networking reservations.

Now why am I having these discussions; what is the problem with Admission Control today? Well, first of all, it is the perception that many appear to have of Admission Control. Many believe the Admission Control algorithm uses “used” resources. The reality, however, is that Admission Control is not that flexible: it uses resource reservations, and as you know, those are static. So what is the result of using reservations?

By using reservations for “admission control”, vSphere HA has a simple way of guaranteeing a restart is possible at all times. It simply checks whether sufficient “unreserved resources” are available, and if so, it allows the virtual machine to be powered on. If not, it won’t allow the power-on, just to ensure that all virtual machines can be restarted in case of a failure. But what is the problem? Although we guarantee a restart, we do not guarantee any type of performance after the restart! Unless, of course, you set your reservations equal to what you provisioned… but I don’t know anyone doing this, as it eliminates any form of overcommitment and results in an increase in cost and a decrease in flexibility.
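To make the limitation concrete, here is a deliberately stripped-down sketch of reservation-based admission control (illustrative only, not the actual HA algorithm; names and capacities are invented). Note that actual usage never enters the decision:

```python
def ha_admits_power_on(existing_reservations_mb: list,
                       restart_capacity_mb: float,
                       new_vm_reservation_mb: float) -> bool:
    """Illustrative reservation-based admission control: allow a power-on
    only if all reservations, including the new VM's, still fit in the
    capacity set aside to guarantee restarts after a host failure.
    Actual resource usage plays no part in the decision."""
    total = sum(existing_reservations_mb) + new_vm_reservation_mb
    return total <= restart_capacity_mb

# VMs without reservations contribute 0 MB, which is exactly why a
# restart is guaranteed but post-restart performance is not.
print(ha_admits_power_on([0, 0, 512], 4096, 1024))     # True
print(ha_admits_power_on([2048, 2048], 4096, 512))     # False
```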

So that is the problem. The question is: what should we do about it? We (the engineering teams and I) would like to hear from YOU.

  • What would you like admission control to be?
  • What guarantees do you want HA to provide?
  • After a failure, what criteria should HA apply in deciding which VMs to restart?

One idea we have been discussing is to have Admission Control use something like “used” resources… or for instance an “average of resources used” per virtual machine. What if you could say: I want to ensure that my virtual machines always get at least 80% of what they use on average? If so, what should HA do when there are not enough resources to meet the 80% demand of all VMs? Power on some of the VMs? Power on all with reduced share values?

Also, something we have discussed is having vCenter show how many resources are used on average taking your high availability N-X setup in to account, which should at least provide an insight around how your VMs (and applications) will perform after a fail-over. Is that something you see value in?

What do you think? Be open and honest, tell us what you think… don’t be scared, we won’t bite, we are open to all suggestions.

vCenter Federation Survey

Duncan Epping · Apr 2, 2013 ·

One of our product managers asked me if I could share this survey with the world. The topic is vCenter Federation and APIs. It literally takes a couple of minutes to fill out. Your help/input is greatly appreciated, so if you have those two minutes to spare at the end of the day, please take the time:

http://tinyurl.com/VMwareFederator



About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.
