We had a discussion internally around memory limits and what the use case would be for using them. I got some great feedback on my reply and comments so I decided to turn the whole thing into a blog article.
A comment made by one of our developers, whom I highly respect, is what triggered my reply. Please note that this is not VMware’s view or use case, but what some of our customers have fed back to our development team.
“An admin may impose a limit on VMs executing on an unloaded host to better reflect the actual service a VM will likely get once the system is loaded; I’ve heard this use case from several admins.”
From a memory performance perspective that is probably the worst thing an admin can do, in my humble opinion. If you are seriously overcommitting your hosts to the point where swapping or ballooning will occur, you need to rethink the way you are provisioning. I can understand, well not really, people doing it at the CPU level, as the impact there is much smaller.
Andrew Mitchell commented on the same email and his reply is key to understanding the impact of memory limits.
“When modern OS’s boot, one of the first things they do is check to see how much RAM they have available then tune their caching algorithms and memory management accordingly. Applications such as SQL, Oracle and JVMs do much the same thing.”
I guess the best way to explain it in one line is: the limit is not exposed to the OS itself, so the app will suffer and so will the service provided to the user.
The funny thing about this is that although the app might request everything it can get, it might not even need it. In that case, which is more common than we think, it is better to decrease the provisioned memory than to create an artificial boundary by applying a memory limit. The limit will more than likely impose an unneeded and unwanted performance impact. Simply lowering the amount of provisioned memory might impact performance, but most likely will not, as the OS will tune its caching algorithms and memory management accordingly.
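To make that concrete, here is a minimal Python sketch (my own illustration, Linux-only, run inside the guest) of the kind of sizing decision an OS or application makes at startup. The 25% cache fraction is an assumed heuristic; the point is that the calculation sees the configured vRAM, never the vSphere limit.

```python
import os

# Total physical memory as the guest OS sees it (Linux sysconf names).
# Inside a VM this reflects the *configured* vRAM, never the limit.
page_size = os.sysconf("SC_PAGE_SIZE")
phys_pages = os.sysconf("SC_PHYS_PAGES")
total_ram = page_size * phys_pages

# A typical application heuristic: size an internal cache as a fraction
# of visible RAM. With 8GB configured and a 4GB limit, this still sizes
# against 8GB, and the "extra" memory ends up ballooned or swapped.
cache_bytes = int(total_ram * 0.25)
print(f"Visible RAM: {total_ram / 2**30:.1f} GiB, "
      f"cache target: {cache_bytes / 2**30:.1f} GiB")
```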
Do the VMware Tools tell Windows what part of memory is swapped at the VMware level? Say a VM with 2GB RAM and a 1.5GB limit. Windows now needs the 2GB, so VMware will use 0.5GB of VM swap. Is just any memory placed in this slow swap space, or do the VMware Tools help decide which memory to place there?
And since VM swap won’t be undone until the next power-off of the VM, is ballooning able to first free up the memory that was placed in the slow swap space? In my opinion Windows doesn’t know/care and just decides for itself, without paying attention to whether the memory is coming from host RAM or swap. In that case, if memory requirements by Windows were to drop and memory could be returned to the host, you can’t be sure the swap is emptied first. Correct???
Gabrie
The only use case I see is to lower performance! 🙂
Most people just don’t realize that setting a memory limit will force all memory above the limit into the swap file.
Let’s say you are provisioning an IIS VM for a colleague within an organization. Your colleague is convinced that his server needs 8GB RAM. You know from performance monitoring that this server will never need more than 4GB, and you’re tired of arguing the point. What is the downside of allocating 8GB RAM to the VM, but using a LIMIT of 4GB? This allows me to “hot add” memory (up to the allocated 8GB) to the VM if in fact the VM gets memory starved. Bad idea?
I guess we just need to eliminate the underlying general purpose OSes and/or build smarter apps 😀
@Arnim van Lieshout,
we use limits on VMs running applications with known memory leaks. This helps keep them in check, separated from the rest of the clean-running apps.
@Sketch… if you are running an application with a known memory leak, why even run the application? Why wouldn’t you monitor the application and have it restarted? This is not a good use case, and in my opinion not a compelling reason to adopt this methodology.
@PimpF… it’s an old custom app. There’s not much I can do to get around that due to office politics (though it’s a “future” project to rebuild the app – and has been for 5 years now)… for the time being, it’s in a Java container on a Red Hat box, so the support team restarts that particular container…
@Gabrie
As far as I know, ESX will start ballooning first to free up unused pages in an effort to fit all active pages into the allowed 1.5GB of machine memory. If this is unsuccessful (because the VM needs 2GB), the kernel will use swapping to swap out pages (I guess the ones that haven’t been touched recently first) to fit everything within the allowed 1.5GB.
After memory needs drop below 1.5GB, pages that reside in the VM swap file will stay swapped out until they are accessed by the guest OS. Ballooning will not try to “clean up” memory in the swap file.
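For readers who like pseudocode, here is a toy Python sketch of that reclamation order as described in this thread: balloon first, vmkernel swap as a last resort. This is my own simplification, not VMware’s actual algorithm, and the numbers are made up.

```python
def reclaim_to_limit(consumed_mb, limit_mb, balloon_reclaimable_mb):
    """Toy model of the order described above: balloon what the guest can
    give up voluntarily, then let the vmkernel swap the remainder blindly.
    Swapped pages stay in the .vswp until the guest touches them again."""
    over = consumed_mb - limit_mb
    if over <= 0:
        return {"balloon_mb": 0, "swap_mb": 0}
    ballooned = min(over, balloon_reclaimable_mb)  # guest OS picks these pages
    swapped = over - ballooned                     # vmkernel picks these blindly
    return {"balloon_mb": ballooned, "swap_mb": swapped}

# Example: 2GB consumed against a 1.5GB limit, with 300MB balloonable.
print(reclaim_to_limit(2048, 1536, 300))  # {'balloon_mb': 300, 'swap_mb': 212}
```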
@Russ
In that situation, limit the VM to 1 or 2GB so performance will be bad. Then explain to the stubborn colleague that configuring it with 4GB will boost performance. Configure it with 4GB instead of 8GB and remove the limit. Your colleague will be convinced that 4GB is indeed better.
As I mentioned before, the use case is to lower performance. You just used it to your advantage 🙂
@Sketch
As Duncan already mentioned, performance will always drop.
Limit the VM by decreasing the configured memory to a reasonable value. If you suffer from memory leakage, make sure you restart your app/VM regularly and get someone to fix that app!
@Arnim van Lieshout
Thanks. That’s pretty much what we’re doing now. We’ve also isolated it from the pools. Aside from that, there’s not much else to do but wait until it’s rewritten. As for a performance drop, I don’t see much – but that could be because we’re restarting the container every so often. I don’t see how limits in and of themselves would degrade performance unless the limit is reached – and, as you mentioned, the balloon driver kicks in, swapping, etc…
@Sketch
I don’t believe the balloon driver will kick in just because a VM reaches its memory limit. As long as the host still has machine memory available, the balloon driver will remain deflated even though ESX is dipping the VM into swap.
Although you have a limit set, you shouldn’t see a performance penalty until the VM reaches that limit on memory usage.
@Baptista: Even if the host has enough RAM, once a limit has been set the kernel will first try to reclaim idle pages to free up memory (balloon driver), then it will try to page out as much as it can (balloon driver), and as a final resort it will use vmkernel swap.
@Gabrie: The VMware Tools do not have any clue which pages should be paged/swapped. When the balloon driver kicks in, the guest OS will decide what to page and what not. When the vmkernel kicks in, it is more or less random.
@Sketch: it WILL impact performance, as there is no way of telling the OS to use less than it thinks it has available. So at some point your OS will cross that artificial boundary, and that will have an impact.
@Russ: please read my post again; I think the article clearly states what the downsides are.
@Sketch: place a limit because of a memory leak? Why not provision less memory then? What’s the point?
To restate what I think the main point of the article is:
The Guest OS and apps tend to use whatever vRAM they see. They’re opportunistic. Limits are invisible to the guest OS. Therefore, a limit will hurt performance as the GOS tries to use vRAM that’s actually backed by .vswp. A VM with less vRAM but no limit will outperform a VM with more vRAM but a limit.
@Arnim:
Host-level swapping to .vswp is very unintelligent. It doesn’t pick the least-recently used pages. It would be impractical for ESX to track such detailed page-usage history (“this page of vRAM last accessed at _____” or similar). (How does it know Active pages then, you ask? It does statistical sampling, where a subset of pages is tracked for how often they’re accessed. But that’s not every page being tracked.)
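As an aside, the sampling idea is easy to demonstrate. Below is a toy Python sketch of my own, not ESX’s actual implementation: sample a small random subset of pages, check which of those were touched during the sample period, and extrapolate. The `touched` set is a hypothetical input standing in for the access tracking.

```python
import random

def estimate_active_mb(total_pages, touched, sample_size=100, page_kb=4):
    """Estimate the active working set by statistical sampling: track a
    random subset of pages and extrapolate from the fraction accessed.
    'touched' is the (hypothetical) set of page numbers the guest
    accessed during the sample period."""
    sample = random.sample(range(total_pages), sample_size)
    hits = sum(1 for page in sample if page in touched)
    return total_pages * (hits / sample_size) * page_kb / 1024  # MB

# Example: a 1GB VM (262,144 pages of 4KB) whose guest touched 25% recently.
total = 262144
touched = set(range(total // 4))
print(f"~{estimate_active_mb(total, touched):.0f} MB active")  # ≈256 MB
```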
@Duncan
Provisioning less memory would be an option if this were the only app on the server (multiple apps are integrated and ‘apparently’ cannot be separated from each other onto other VMs). The use of limits comes in as a stop-gap measure. As Russ pointed out above, it is useful to have the ‘hot add’ for RAM: in cases where the server admins can’t do their job (out sick, World Cup, long weekend, etc.), we can extend the viability of that particular server until the max RAM point is reached or the guys come back in and cycle the container. It’s only affecting one particular container, so we can restart the container without having to restart the entire server, thus keeping our theoretical uptime. That’s the best explanation I can give. If anyone has a better suggestion, I’m all ears.
@Baptista
That’s what I was thinking… another reason we have the limit in place – as otherwise, the app would soak up all the resources, inflate the balloon driver and slow the system down for everyone – at least, that’s what I thought would happen – apparently I was wrong on that angle…
@Duncan,
So, does that mean that when any guest OS reaches its RAM ‘limit’, pre-set or virtual, the guest OS can force the host to inflate the balloon driver?
Yes, I will be attending TA7750 at VMworld this year…
Don’t forget that the limit is truly from the hypervisor perspective, so the memory overhead is also maintained UNDER the limit. Even setting the limit equal to the provisioned size is a bad idea for this reason, especially with multiple vCPUs as there is even more overhead.
CPU limits make sense in certain situations, memory limits are a horrible idea unless you politically can’t do anything about it and you are a BOFH.
@Brandon
So, a RAM limit at any level is effectively reducing RAM levels within the host as a whole? So how does the hypervisor (not the guest OS) see a VM provisioned with 8GB RAM but a 4GB limit? Aren’t those ‘other’ 4GB still available for other VMs to be provisioned?
@Sketch: It still doesn’t make sense. Are you guys monitoring those VMs every minute and every day of the week?
@Sketch:
– A limit is a per-VM limit. So if you provision 8GB to a VM and set a 4GB limit, the OS will see 8GB but can only get 4GB worth of machine memory backing it (see the sketch below).
– Yes, those other 4GB will be available for other VMs to use… that has got nothing to do with limits at all, as that is on a different layer.
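Here is a toy Python sketch of my own to illustrate the two layers (overhead memory ignored, numbers made up):

```python
def backing_for(configured_mb, limit_mb, touched_mb):
    """Toy accounting for the two layers: the guest sees configured_mb
    regardless of the limit, but machine-memory backing is capped at
    limit_mb; anything the guest touches beyond that is ballooned out
    or lands in the .vswp file."""
    pram = min(touched_mb, limit_mb)
    ballooned_or_swapped = max(0, touched_mb - limit_mb)
    never_granted = configured_mb - pram  # host RAM left for other VMs
    return pram, ballooned_or_swapped, never_granted

# 8GB configured, 4GB limit, guest actively touching 6GB:
print(backing_for(8192, 4096, 6144))  # (4096, 2048, 4096)
```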
@Duncan Epping
Not quite THAT closely, but as that VM is a known risk, people keep an eye on it for sure, just as it was watched when it was on a physical machine.
I understand that the OS will still see the 8GB and only be able to use 4GB, but Brandon stated it was done at the hypervisor (or kernel) layer as opposed to the VM layer. That goes back to my question of whether a guest OS can force the host to inflate the balloon driver.
I apologize for the confusion I’m causing…
@Sketch
I think you transposed what you meant to say, since inflating the balloon driver would effectively take more memory away from the VM. If the VM needed more memory for its own use you would hope the balloon driver would deflate – not something that is likely to happen if there is pressure.
The hypervisor will force that VM to stay under the limit at all costs. It is a hard limit and the VM will never get more than what the limit is set to, period. So if the balloon driver can’t inflate enough to keep it below the limit, the next thing that will happen is that the vmkernel will swap the guest’s memory to the .vswp file. That is the worst thing that can happen performance-wise, because it is a blind swap – it has no idea what is “important” memory and what is “not important”. Check your performance graphs; I wouldn’t be surprised if that is happening. Check the memory swapped metric.
Memory overcommitment is a good thing when done correctly, but the best practice is to keep enough memory in the host server to satisfy the working sets of your VMs.
Limits are common at service providers and I guarantee you will see them heavily in the cloud (public or private). People want to pay for resources, and the admins need controls to lock them into those resources and control them. Limits are an easy way to make sure you only get what you pay for. VMware might not understand why people use them versus what was intended, perhaps, but people will continue to use them until other controls can be created or used in place of limits.
@Russ
Let’s say I create a VM with 8GB of RAM but I set a 2GB limit so the admin feels warm and fuzzy.
You boot the server and the OS starts to cache disk contents as well as start the application. He thinks he has 8GB of RAM and starts mapping pages to memory, none the wiser. Eventually he is going to hit his 2GB ceiling, which means some of these pages might end up on disk even though he may only require 1–1.5GB of RAM to run his application at full performance.
However, now the guest OS has to read out of the swap file on disk (completely unaware, since VMware hides this fact from the guest), so suddenly web server response times go from, say, 15ms up to 500ms. Why? Because it just so happened that maybe ‘default.htm’ was stored in a page that’s actually on disk. ESX could inflate the balloon driver instead (presuming you meet 100% of the requirements; many shops don’t realize you need to ensure there is sufficient guest swap, for example), but there is still a penalty associated with guest kernel swapping.
Basically it should be based on a service level agreement. If an application only needs a 500ms response time then maybe using limits is fine. If the application has much smaller tolerances then you’re running a big risk imposing a limit that could cause you to lose a customer/get an angry phone call.
If you have data showing a server only needs 2GB of RAM, then you should present this data to the application owner and provision the server with 2GB of RAM. If he pushes back, you need to institute some kind of resource-charging scheme, since virtualization isn’t a “magic compute generator.”
Thanks for clarifying gents. It’s definitely a topic worthy of discussion. Excellent article Duncan!
Duncan,
I like and agree with your stand on this matter. It makes lots of sense, especially when it comes down to applying memory limits on the VMs outside of the guest OS.
But what is the use case for the limit setting? Can anyone think of a proper situation?
@Thomas Bryant
But why not just right-size the vRAM in the VM? Bigger VMs cost more, smaller cost less, and none have limits defined on the VM.
Limits defined at the Resource Pool level are a different situation and IMO much more likely to be sensible, especially if the customer gets to create or destroy (or power off and on) VMs in the pool. If you limit a Resource Pool, the collective demands of the VMs could still be met (if there aren’t too many VMs with too much collective vRAM running at a time). If you limit a VM, the demand of the VM will almost certainly not be met.
When Windows boots it fills all available memory with zeros. Do the zeros count against the limit? Meaning, you’ve assigned 4GB, Windows fills the 4GB, but the physical limit you assigned is 2GB – does that mean the zeros above 2GB are now out in the page file?
@Fred Peterson:
Yes.
A future feature might make transparent page sharing happen immediately on those zero-pages as they’re written, which means that they wouldn’t need to go in the .vswp and they wouldn’t count toward the limit. Limits limit pRAM, not vRAM access.
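To illustrate why zero pages are such good sharing candidates, here is a toy Python sketch of the content-hashing idea behind transparent page sharing (my own simplification, not VMware’s implementation):

```python
import hashlib

PAGE_SIZE = 4096
ZERO_PAGE = bytes(PAGE_SIZE)

def shareable_pages(pages):
    """Bucket pages by a hash of their content; every bucket can be backed
    by a single machine page. Real implementations compare the full content
    on a hash match to rule out collisions before sharing."""
    buckets = {}
    for page in pages:
        digest = hashlib.sha1(page).digest()
        buckets.setdefault(digest, []).append(page)
    unique = len(buckets)
    return unique, len(pages) - unique  # (machine pages needed, pages shared away)

# 1000 zeroed pages plus 2 distinct pages -> only 3 machine pages needed,
# so zeroing at boot need not consume pRAM against the limit if shared first.
pages = [ZERO_PAGE] * 1000 + [b"\x01" * PAGE_SIZE, b"\x02" * PAGE_SIZE]
print(shareable_pages(pages))  # (3, 999)
```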
I can give you two use cases for imposing memory and CPU limits.
1) Even with admission control switched on, you can find that there isn’t a whole host’s worth of free RAM in the system. When you want to remediate a host, limits can be used to selectively squeeze VMs or resource pools to free up memory temporarily.
2) Some sets of services are literally as different as night and day. User-facing services like Exchange and file server VMs might have the right amount of RAM and CPU for the day, but at night are arguably overprovisioned.
Conversely, the VMs that run your overnight batch processing do nothing in the day but work hard at night.
Why not schedule a vCenter task to see-saw the resource pool limits for these two sets of services?
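For what it’s worth, here is a hedged sketch of that see-saw idea using the open-source pyVmomi SDK. The pool name, credentials and limit values are placeholders; run something like this from a scheduled task (e.g. cron) and verify it against your own environment before trusting it:

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def set_pool_memory_limit(si, pool_name, limit_mb):
    """Find a resource pool by name and set its memory limit (MB, -1 = unlimited)."""
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ResourcePool], True)
    pool = next(p for p in view.view if p.name == pool_name)
    view.Destroy()

    spec = vim.ResourceConfigSpec()
    spec.cpuAllocation = pool.config.cpuAllocation        # leave CPU as-is
    spec.memoryAllocation = pool.config.memoryAllocation
    spec.memoryAllocation.limit = limit_mb
    pool.UpdateConfig(name=None, config=spec)

si = SmartConnect(host="vcenter.example.com", user="admin", pwd="secret",
                  sslContext=ssl._create_unverified_context())
set_pool_memory_limit(si, "BatchProcessing", 4096)    # daytime: squeeze to 4GB
# set_pool_memory_limit(si, "BatchProcessing", -1)    # nighttime: lift the limit
Disconnect(si)
```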
I totally agree. I have seen firsthand the horror of memory limits and the impact they can have on a guest.
A customer asked me to help resolve a performance issue with a VM. The VM was Windows Server 2000 and it was allocated 4GB of RAM. I opened the VI Client performance tab and was shocked to see the memory balloon running at 3GB solid. Some bright spark had created a VM memory limit of 1GB on the Windows 2000 template, which was allocated 1GB of RAM. Unfortunately this VM on deployment was granted more memory, and no one thought to look for a hard-coded memory limit.
@Suttoi
1) What type of admission control are we talking about? HA or DRS?
2) If the service is not doing anything, why impose a limit and risk the chance of having OS-related pages swapped to disk? Seriously, I don’t get it.
Suttoi: your second use case would be handled automatically by the hypervisor. If resource demand shifts from one group of VMs to another, then DRS or ballooning can be used to reclaim inactive pages from idle virtual machines.
@Duncan
1) Is there an answer that doesn’t mean I’m wrong?
I was thinking of HA admission control. Just re-read your HA deep dive (safety first) and still think that admission control (slot mode, not % or named host mode) is mostly driven by reservations.
So, if there are no reservations, HA admission control will keep letting VMs start and consume memory with no consideration of the n+x cluster config. If this happens, the next time scheduled maintenance is needed on a host, you are stuck.
Having been in this situation, the best way I could think of to selectively free up pRAM without hurting performance on key VMs was to temporarily add memory limits to the least important VMs and force the balloon driver into action.
2) Control is my key argument. You want the batch-processing VMs to be paged out to disk in the day because they aren’t doing anything. By forcing them out of pRAM in the day, you make the biggest possible amount of pRAM available to your active workloads. This reduces the chance of one of these active workloads having to balloon or swap.
@Suttoi
1) You might want to read that HA article again, as both slot mode and % mode are based on reservations… anyway, I understand what you are trying to do.
2) It might incur a swap-in penalty; keep that in mind.
Hi all,
So does it make any sense to specify a limit that is equal to the amount of RAM configured for a VM, or is the maximum RAM a machine can access already given by its configured memory?
Thank you,
Dan