CPU/MEM Reservation Behavior

Duncan Epping · Mar 3, 2010

Another interesting discussion with some colleagues (thanks Frank, Andrew and Craig! Especially Craig, as most of the text below comes from The Resource Guru). The topic was CPU and memory reservations, and more specifically the difference in behavior between the two.

One would expect a CPU reservation and a memory reservation to behave the same way when it comes to claiming and releasing resources, but unfortunately this is not the case. Or should we say fortunately?

The following is taken from the resource management guide:

CPU Reservation:
Consider a virtual machine with reservation=2GHz that is totally idle. It has 2GHz reserved, but it is not using any of its reservation. Other virtual machines cannot reserve these 2GHz. Other virtual machines can use these 2GHz, that is, idle CPU reservations are not wasted.

Memory Reservation:
If a virtual machine has a memory reservation but has not yet accessed its full reservation, the unused memory can be reallocated to other virtual machines. After a virtual machine has accessed its full reservation, ESX Server allows the virtual machine to retain this much memory, and will not reclaim it, even if the virtual machine becomes idle and stops accessing memory.

The above paragraph is a bit misleading, as it seems to imply that a VM has to access its full reservation. What it should really say is “Memory which is protected by a reservation will not be reclaimed by ballooning or host-level swapping even if it becomes idle,” and “Physical machine memory will not be allocated to the VM until the VM accesses virtual RAM needing physical RAM backing.” Once allocated, that pRAM is protected by the reservation and won’t be reclaimed by ballooning or .vswp-file swapping. That is, if there is a .vswp file at all: no .vswp is created when the reservation is equal to the provisioned memory.

Note, however, that even if pRAM is not allocated to the VM to back vRAM because the VM hasn’t accessed the corresponding vRAM yet, the whole reservation is still reserved, although the unallocated pRAM can be used by other VMs. This gets really confusing, so I think of it thus:

Reservations can be defined at the VM level or the Resource Pool level.
Reservations at the RP level are activated or reserved immediately.
Reservations at the VM level are activated or reserved when the VM is powered on.
An activated reservation is removed from the total physical Resource “Unreserved” accounting.
Reserving and using a resource are distinct: memory or CPU can be reserved but not used or used but not reserved.
CPU reservations are friendly.
Memory reservations are greedy and hoard memory.
Memory reservations are activated at startup, yet pRAM is only allocated as needed. Unallocated pRAM may be used by others.
Once pRAM is protected by a memory reservation, it will never be reclaimed by ballooning or .vswp-swapping even if the corresponding vRAM is idle.

Example: A VM has 4 GB of vRAM installed and a 3 GB memory reservation defined. When the VM starts, 3 GB of pRAM are reserved. If the host had 32 GB of RAM installed and no reservations active, it now has 29 GB “unreserved”.

However, if the VM accesses only 500 MB of vRAM, only 500 MB of pRAM are allocated (or granted) to it. Other VMs could use 2500 MB of RAM that you would think is part of the reservation. They cannot reserve that 2500 MB however. As soon as the VM accesses 3 GB of vRAM and so has 3 GB of pRAM backing it, no other VMs can use that 3 GB of pRAM even if the VM never touches it again, because that pRAM is now protected by the 3 GB Reservation. If the VM uses 4 GB, it gets the 3 GB guaranteed never ballooned or swapped, but the remaining 1 GB is subject to ballooning or swapping.
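The arithmetic in this example can be replayed with a small Python sketch. It is only a toy model of the accounting described above, not of how the VMkernel actually works: the class and attribute names (Host, VM, backed_mb, usable_by_others_mb and so on) are made up for illustration, and 1 GB is rounded to 1000 MB to match the numbers in the example.

```python
from dataclasses import dataclass

GB = 1000  # the example rounds 1 GB to 1000 MB (3 GB - 500 MB = 2500 MB)


@dataclass
class VM:
    vram_mb: int
    reservation_mb: int
    backed_mb: int = 0  # pRAM granted so far; grows only when vRAM is first touched

    def touch(self, mb: int) -> None:
        """Guest accesses `mb` more MB of vRAM, so pRAM is allocated to back it."""
        self.backed_mb = min(self.vram_mb, self.backed_mb + mb)

    @property
    def protected_mb(self) -> int:
        """Backed pRAM covered by the reservation: never ballooned or swapped."""
        return min(self.backed_mb, self.reservation_mb)

    @property
    def reclaimable_mb(self) -> int:
        """Backed pRAM above the reservation: subject to ballooning or swapping."""
        return self.backed_mb - self.protected_mb

    @property
    def usable_by_others_mb(self) -> int:
        """Reserved but not yet backed: others may use it, but not reserve it."""
        return self.reservation_mb - self.protected_mb


@dataclass
class Host:
    pram_mb: int
    reserved_mb: int = 0

    @property
    def unreserved_mb(self) -> int:
        return self.pram_mb - self.reserved_mb

    def power_on(self, vm: VM) -> None:
        """The reservation is activated at power-on, before any pRAM is allocated."""
        if vm.reservation_mb > self.unreserved_mb:
            raise RuntimeError("not enough unreserved memory to power on")
        self.reserved_mb += vm.reservation_mb


host = Host(pram_mb=32 * GB)
vm = VM(vram_mb=4 * GB, reservation_mb=3 * GB)

host.power_on(vm)
print(host.unreserved_mb // GB)   # 29

vm.touch(500)
print(vm.usable_by_others_mb)     # 2500

vm.touch(3500)                    # the guest has now touched all 4 GB
print(vm.protected_mb, vm.reclaimable_mb)  # 3000 1000
```

Note that backed_mb never shrinks in this model. That mirrors the "greedy" behavior for the protected 3 GB, but it deliberately leaves out reclamation of the extra 1 GB by ballooning or swapping.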

Simple huh 😉

Comments

David_G says

3 March, 2010 at 16:37

Doesn’t Windows (not sure about Linux) typically touch all its memory during boot, thereby protecting its whole memory reservation?
Jason Boche says

3 March, 2010 at 16:42

My head is spinning. 🙂
Brandon says

3 March, 2010 at 17:07

How would a reservation at a resource pool level work then? You mention it briefly. Do all the VMs within that resource pool consume memory, thus hoarding memory, perhaps from VMs that are also within that same resource pool? I cannot tell from the way it is worded. You say that at the resource pool level it is reserved immediately, so I can infer that those resources are guaranteed to the VMs within it, but I would not want them hoarding memory from each other. If I do not have it confused, then it appears the main point of concern is applying reservations to VMs directly, because that works a little differently than expected. On a resource pool it just takes the memory and reserves it all immediately, keeping any other resource from taking it regardless. What about ballooning and swapping on the VMs in the resource pool? Does it dole out reserved memory to the VMs in the resource pool, and then once the amount of reserved memory is consumed, it is maintained on the VMs that claimed it originally (assuming they do not power off), leaving the remaining VMs to fight over the leftover unreserved memory? First come, first served? That would be very counterintuitive for sure.
Kenneth van Ditmarsch says

3 March, 2010 at 17:28

Had to read the post multiple times but now I get it 😉 Good stuff!

Kenneth
Jason Langer says

3 March, 2010 at 17:32

Great read. Thanks for posting that up Duncan.

-Jason
vmm386 says

3 March, 2010 at 20:13

Is it like reserving a seat at the cinema? If I don’t show up, the seat can be reallocated to somebody else. But if I do show up and claim my reserved seat, and then walk away in the middle of the movie, my seat doesn’t get reallocated for the duration of the movie?
Doug says

4 March, 2010 at 04:25

Excellent post. Thanks for the clarification.

For me, remembering these is enough to get me through most explanations:

# CPU reservations are friendly.
# Memory reservations are greedy and hoard memory.

🙂
Chris Huss says

4 March, 2010 at 16:21

How does the idle memory tax come into play here?

Here’s page 29 of the latest Resource Management guide:

Memory Tax for Idle Virtual Machines
If a virtual machine is not actively using all of its currently allocated memory, ESX/ESXi charges more for idle memory than for memory that is in use. This is done to help prevent virtual machines from hoarding idle memory.
The idle memory tax is applied in a progressive fashion. The effective tax rate increases as the ratio of idle memory to active memory for the virtual machine rises. (In earlier versions of ESX which did not support hierarchical resource pools, all idle memory for a virtual machine was taxed equally).
The Mem.IdleTax advanced setting allows you to modify the idle memory tax rate. Use this option, together with the Mem.SamplePeriod advanced attribute, to control how the system determines target memory allocations for virtual machines.

It’s my belief that the VMkernel can use the Mem.IdleTax to reclaim unused memory from a VM regardless of its used memory reservation. This technique reclaims memory where other techniques like the balloon driver can’t.
Duncan Epping says

4 March, 2010 at 18:26

IdleTax does not apply to memory which is reserved. So even when the memory is sitting idle it will still not be reclaimed.
Chris Huss says

5 March, 2010 at 12:50

Is there a document that states IdleTax won’t touch reserved memory?
DGo says

5 March, 2010 at 18:52

So to reclaim the pRAM of a VM that was reserved, we have to shut the machine off to flush the memory reservation? Does a reboot count, or does it have to be a power off?
Craig Risinger says

5 March, 2010 at 22:08

@David_G
Yes, typically a Windows VM touches all its installed vRAM at bootup. That in turn forces pRAM to back all its vRAM at least for a while. (I think this is not true with vSphere, where Transparent Page Sharing at startup happens immediately, as I understand it. But it’s true for ESX 3.) Whatever pRAM is protected by a Reservation will then never be reclaimed by ballooning or host-level swapping.

@Chris Huss:
I’ve not seen it in a document, but Scott Drummonds confirmed to me that memory protected by a reservation is never reclaimed by ballooning or host-level swapping. (I don’t know about transparent page sharing.)

@DGo:
Yes, the memory reservation would have to be deactivated. To do that, I don’t know if a reboot is enough or if you have to do a full power off and power on.

@Doug:
Indeed, those are the keys. CPU reservations are friendly; memory reservations are greedy.
Craig Risinger says

5 March, 2010 at 22:15

P.S.
It makes sense for CPU reservations to be friendly but memory reservations to be greedy. It’s easier to snatch back CPU at a moment’s notice; reclaiming memory someone else was using requires swapping (or ballooning).

Think what would happen if VM-A holds a CPU reservation and a memory reservation. VM-B wants to use those resources. At time t=0, VM-A isn’t using either CPU or memory. So VM-B gets to use the pCPU cycles that would otherwise be idle. At t=1, if VM-A wants to use CPU, no problem: VM-B gets descheduled almost immediately, and VM-A gets to use the cycles it has reserved.

But what if at t=0 VM-B got to steal some of the memory VM-A had reserved? Between t=0 and t=1, VM-B put data into that pRAM. Now at t=1, VM-A wants to use that pRAM. But now the hypervisor has to flush out VM-B’s data from that pRAM. It might be able to do this through ballooning (inside VM-B’s vRAM), and/or it might have to write to VM-B.vswp. Neither of those is going to be nearly as fast at reclaiming reserved memory as simply juggling vCPUs around pCPUs. So it’s OK for CPU reservations to be friendly, but memory reservations have to be greedy.

See, it does actually make some sense! 🙂
Chris Huss says

5 March, 2010 at 23:49

Craig,

Thanks for the input. The memory reclaiming technique I’m talking about has nothing to do with vmmemctl (the balloon driver) or the .vswp file. The memory idle tax should reclaim memory that a VM hasn’t used in a while…reserved or otherwise.

That is the $1M question right now.
Duncan says

6 March, 2010 at 08:34

Euuh Chris, I don’t know what you think reclaims the idle memory, but depending on how this is set up it’s either via .vswp or via the balloon driver! By default it’s the balloon driver which is being used.
wilson says

9 March, 2010 at 05:23

This is a good post. Many have trouble with the concept of reservation. The wording here is quite clear.
Chris Huss says

10 March, 2010 at 14:41

Duncan,

I wasn’t aware that the balloon driver was involved with the Mem.IdleTax. I haven’t seen any documentation stating this…and assumed that the VMkernel just stopped mapping idle memory for the VM without letting it know. If the VM needed the memory again, the VMkernel would just re-map it.

I can be totally wrong about this, but I have not seen any documentation to debunk this theory. It is my belief that the Mem.IdleTax is a totally separate memory saving/shaving technique from the balloon driver or the .vswp file.

If VMware engineering has published or would publish an official article on this…I think it would clear up a lot of things.

I never liked the idea that memory reservations allow a VM to keep unused or idle memory. That is different from how CPU reservations work…and I’m hoping that my Mem.IdleTax theory is correct.

Chris
Duncan says

10 March, 2010 at 21:37

So how would the VMkernel know it is unused / free? There’s no way from the outside to tell as the OS doesn’t zero out the mem page when it frees it. It just marks it free in its table.

Just think about it, what if I store something in memory and don’t use it for a day does that mean the VMkernel can just throw the page away and give it to someone else? Something inside the VM needs to tell the OS to do something with that page.

Ahhh what the heck, I will write an article on this when I have some spare time on my hands.
Chris Huss says

10 March, 2010 at 22:12

How would the VMkernel know it is unused/free? He’s the one mapping memory for the VM…so I would think he knows what pages are active for the VM…and which one(s) haven’t been accessed in a while. Thus, Mem.IdleTax.

I think it’s exactly what the VMkernel does with the tax…just stops mapping it physically. If the VM needs it…the VMkernel can remap it physically.

Just my theory…unless VMware engineering can produce a document to the contrary.

Chris
Duncan Epping says

11 March, 2010 at 00:03

Sorry Chris but I absolutely believe what you are stating here is incorrect.

If the VMkernel stopped mapping it physically, there would still need to be a process that triggers the GOS to either page it or commit it to disk or whatever. Memory pages can’t disappear magically and reappear magically.

Anyway, I will do some digging to give you the proof you obviously need to believe me.
Chris Huss says

11 March, 2010 at 06:09

Thanks man. I’m struggling to find any official VMware documentation that says the balloon driver or the .vswp file have anything to do with the Mem.IdleTax process.

Whatever you can find out will put this theory to rest one way or the other…and I greatly appreciate your efforts.

Chris
Seva says

19 March, 2010 at 09:46

Just to put the bottom line here:

1. Windows initialization
From some old explanations by Carl Waldspurger I know that after a Windows GOS zeroes its memory at startup, TPS kicks in and reduces the number of zero pages. That was prior to 4.0. What Craig meant when he wrote that in vSphere TPS “at startup happens immediately” needs clarification (you cannot predict which page will be zeroed and share it before the OS actually zeroes it).
2. Memory tax is designed to implement something like “social justice” in the real world 😉 It decreases the number of shares for memory pages which are not being used by “privileged” VMs (i.e. VMs with a large number of shares) and thus allows that memory to be redistributed to “needy” VMs (i.e. VMs with a low number of shares but high memory demand). The connection to ballooning and swapping, which Chris Huss is desperately seeking in our documentation, is that both of them (like the tax police, the Finanzamt :))) ) take part in memory redistribution. When Mem.BalancePeriod expires (or a memory shortage is detected some other way), the balloon driver is used to redistribute memory according to memory shares, or whatever is left of them after the idle tax deduction. When the balloon driver reaches Mem.CtlMaxPercent and there is still demand for memory redistribution, the swapper starts to reclaim the rest of the memory. Swapper activity indicates very serious memory issues.
Generally, shares are the preferred way to provide enough memory for critical (let’s say “platinum SLA”) VMs, because this method is more flexible and allows idle memory to be redistributed. A reservation just pins the memory.
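
For readers who want to see how a tax on idle memory can "decrease the number of shares", here is a small Python sketch. It is not taken from this thread or from the Resource Management guide: the formula is the flat (non-progressive) adjusted shares-per-page ratio from Carl Waldspurger's 2002 paper "Memory Resource Management in VMware ESX Server", and the function name and parameters are made up for illustration. It assumes a default tax rate of 75%, and it says nothing about how reservations interact with the tax, which is the open question in this discussion.

```python
def adjusted_shares_per_page(shares: float, pages: int,
                             active_fraction: float,
                             tax_rate: float = 0.75) -> float:
    """Shares-per-page ratio with idle pages charged at a higher price.

    rho = S / (P * (f + k * (1 - f))),  k = 1 / (1 - tax_rate)

    S = shares, P = allocated pages, f = fraction of pages that are active.
    When memory is short, the VM with the lowest rho gives up pages first.
    """
    if pages <= 0:
        raise ValueError("pages must be positive")
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    if not 0.0 <= tax_rate < 1.0:
        raise ValueError("tax_rate must be in [0, 1)")
    k = 1.0 / (1.0 - tax_rate)  # cost of an idle page relative to an active one
    return shares / (pages * (active_fraction + k * (1.0 - active_fraction)))


# Two VMs with identical shares and allocation, differing only in activity.
busy = adjusted_shares_per_page(shares=1000, pages=1000, active_fraction=1.0)
idle = adjusted_shares_per_page(shares=1000, pages=1000, active_fraction=0.0)
print(busy, idle)  # 1.0 0.25
```

With a 75% tax, a completely idle VM ends up with a quarter of the effective shares-per-page of an equally entitled busy VM, so it is the first candidate when memory has to be redistributed.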
Samuel Nunes says

18 December, 2012 at 21:27

I’m still a little confused. So when I reach the full reservation, will a .vswp file be created?

Thank You
Naseer says

7 November, 2013 at 06:14

Hi Frank, Andrew, Craig, Duncan

I’ve got a query and would appreciate your valuable input.

Assume there is a resource pool named “TEST” which has 4 VMs in it (nothing set at the per-VM level: no reservation, shares, etc.).

On resource pool TEST, under Memory Reservation, assume the memory reservation is set to 4 GB. Does this mean each VM will get a 4 GB memory reservation, or does the 4 GB of reserved memory get divided between the 4 VMs so that each gets 1 GB?

Also, do shares come into the calculation in the above scenario when ESX divides the reserved memory?
