Scott Lowe pointed to KB Article 1020524 in his short take article. Although I agree with Scott that it is a useful article, it is actually technically incorrect. I wanted to point this out because when Scott points to something, you know many will pick up on it.
On Nehalem systems with Hardware assisted Memory Management Unit (MMU), TPS improves performance from 10 – 20% by using large pages.
Since TPS is done per small page, the VMkernel delays page sharing until host memory becomes over-committed. In the background, the ESX host gathers information about TPS, then uses it when it needs more memory.
It might be just a detail, but it is important to realize that it is not TPS that improves performance but Large Pages. TPS has absolutely nothing to do with it and does not impact performance anywhere near the mentioned 10-20%.
One thing to note is that TPS does identify the pages which can be shared; however, as 2MB pages are used, they are not actively shared. When a system gets overcommitted, those Large Pages (2MB) will be broken down into Small Pages (4KB) and the already identified pages will be shared.
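To make the mechanism concrete, here is a toy Python sketch of content-based page sharing: break a 2MB large page into 4KB small pages, hash each one as a cheap candidate test, and confirm duplicates with a full byte compare before collapsing them to a single resident copy. This is an illustration of the general technique, not VMkernel code; all function names are invented.

```python
import hashlib

PAGE_SMALL = 4 * 1024          # 4KB small page
PAGE_LARGE = 2 * 1024 * 1024   # 2MB large page

def split_large_page(large_page: bytes):
    """Break one 2MB large page into 512 small 4KB pages."""
    assert len(large_page) == PAGE_LARGE
    return [large_page[i:i + PAGE_SMALL]
            for i in range(0, PAGE_LARGE, PAGE_SMALL)]

def share_pages(small_pages):
    """Content-based sharing: identical pages collapse to one copy.

    Hash first (cheap candidate test), then confirm with a full byte
    compare before sharing, as any real implementation must.
    """
    store = {}   # hash -> canonical resident page content
    shared = 0
    for page in small_pages:
        key = hashlib.sha1(page).digest()
        if key in store and store[key] == page:
            shared += 1        # duplicate: back it by the existing copy
        else:
            store[key] = page  # first copy stays resident
    return shared, len(store)

# A 2MB page that is mostly zeroes, as freshly booted guest memory
# often is: 511 zero pages plus one page of distinct content.
large = bytes(PAGE_LARGE - PAGE_SMALL) + b"\x42" * PAGE_SMALL
shared, resident = share_pages(split_large_page(large))
print(shared, resident)  # → 510 2
```

Note how the mostly-zero large page collapses to just two resident 4KB copies once it is broken down, which is exactly why sharing only pays off after the 2MB pages are split.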
I just love TPS….
NiTRo says
Duncan, is there any way to force this page breaking for testing purposes?
Matt says
You can set Mem.AllocGuestLargePage to 0 and then reboot your VMs, and they won't use large pages. But I don't really see a need to do that except for testing purposes like you said.
Duncan – I bet the KB is just a typo and will be fixed. But I’m glad that there is finally a KB about this. This is so much better than pointing people to a thread on the VMTN forums from when ESX4 was in beta.
Mihai says
I set Mem.AllocGuestLargePage to 0 on my production servers and did not measure any degradation in performance on our workload (many smallish VMs).
On the other hand, memory usage dropped dramatically. In my experience, if left at 1, TPS would be delayed too much to be useful (swapping and ballooning would occur).
Matt Liebowitz says
Looks like VMware has updated the KB to fix the typo:
On Nehalem systems with Hardware assisted Memory Management Unit (MMU), performance improves from 10 – 20% by using large pages.
Koen Warson says
Hi,
This setting : Mem.AllocGuestLargePage is an ESX-host-wide setting.
Is there also a possibility to configure this VM per VM ?
Maybe I would like to control this: leave Large Pages on for the big VMs (4GB or more per VM) and have small pages for the smaller VMs (1GB to 4GB). Is there a clear statement of the order of memory optimizations? Does it still stay the same: first TPS, then ballooning and then swapping?
If this is the case, we could create a VM with a Memory Limit slightly below the VM's available memory and the memory would be split up into small pages. Is this a per-VM turn-key thing, or can part of the memory pages of a VM be on large pages and part of them on small pages?
I know, a lot of questions, but I would like to know the details 😉
Thanks for all your blogging!
Kind regards,
K.
Jason Boche says
Mem.AllocGuestLargePage is indeed an ESX-host-wide setting, but one clarification to be made is that I don't believe a host or VM reboot is required for a change to this advanced setting to take effect. Make the VMkernel change and then VMotion VMs off the host and back on to the host.
Jas
orz says
Yes, VMotioning VMs off and back onto the host is sufficient after you make that setting.
We too set Mem.AllocGuestLargePage to 0 and can’t see any degradation in performance. A 10-20% performance increase seems quite exaggerated for most workloads.
On the other hand, we achieve a permanent low memory usage on our hosts with lots and lots of GB memory saved.
http://www.vmware.com/files/pdf/large_pg_performance.pdf
Note that this whitepaper doesn't base the performance improvements solely on large pages on the ESX side, but also on the guest OS and application making use of large pages (which can be done independently of the ESX side of things).
Duncan Epping says
@Matt : Yes, I reported it and discussed what the KB article should state with the KB Team and they updated and published it yesterday.
@Koen : Setting a limit is not the way to go. We are talking about HOST memory pressure and not VM. If you set a limit you will end up with swapping and TPS will not be triggered as it is the VM which has the limit and not the host itself.
@orz : I have tested it myself and when memory is actively used a 10-20% difference can be noticed. I witnessed it with a XenApp workload, and it did provide us a substantial performance gain.
Matt Liebowitz says
Duncan – when you say a 10-20% difference, how did that actually manifest itself? Did you see a 10-20% reduction in overall CPU usage (since the CPU doesn’t have to scan through lots and lots of small pages), or some other measurable improvement like user density?
duncan says
Yes, I noticed a 15 – 20% increase in user density. I also noticed that there were fewer spikes in the CPU chart because of the large pages.
Matt Liebowitz says
I wish I could force large pages on my older generation Intel procs and get a 15% user density increase. 🙂
I agree and have been trying to convince people that nothing is broken and not to disable large pages. I guess many people are used to the way it used to work and so they think something is broken.
At the end of the day TPS will kick in when you need it. And although there may be a 15-20% reduction in performance for some workloads, the reduction would be far greater if the server had to swap/balloon instead of break up the pages and use TPS.
Valentin says
@Koen
You can set whether to use large pages on a per VM basis via the .vmx option: monitor_control.disable_mmu_largepages = TRUE
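For reference, the per-VM override Valentin mentions goes into the VM's .vmx configuration file while the VM is powered off. A minimal fragment might look like the following; note that .vmx values are normally quoted, so verify the exact syntax against your ESX version:

```
monitor_control.disable_mmu_largepages = "TRUE"
```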
Cheers
Valentin
Scott Springer says
It all seems well and good that VMware says things will start to be broken into small page tables and that page sharing will occur as it is needed, but when is that point? I have never seen it reached, and my hosts will go up near 100% memory usage. If I do disable LPT I recover about 60% of the used host memory (partially due to overcommitted memory on VMs, but that is what you are supposed to be able to do!).
Askar Kopbayev says
Generally speaking, it seems to me as though there is a choice for the VMware admin between a CPU or memory usage hit. In our vSphere farms I can rarely see higher than 20-30% CPU usage; however, the memory usage is quite often around 60-70%. So I assume we might consider disabling LP.
I also have a similar question to Scott's: what is the threshold for TPS to start? Is it adjustable?
Just checked one of our ESXi 4.1 servers (runs on Nehalem): there is about 50% free RAM (so it is not overcommitted), however I can already see in ESXTOP that it has shared about 3GB. This conflicts with the statement from KB Article 1020524 that “TPS is only used when the host is over-committed on Nehalem based systems.”
I would be very glad to hear any explanation to it.
Duncan Epping says
Zero pages?
Askar Kopbayev says
Yes, I just checked it now; you are absolutely right about it. However, while checking this, I played a bit with a test VM running Windows 2008 R2 and found out an interesting thing: ESXi will mark zero pages as shared only when the VM is powered up. Then the Shared/Zero values constantly decrease, and even after I rebooted the VM several times the Shared/Zero values were not reset in esxtop and continued to decrease.
It looks like ESXi doesn't reclaim zero pages of a guest VM after the VM's reboot, but only after it is powered on.
Askar Kopbayev says
Duncan, sorry to bother you, but I couldn't find any information on when TPS kicks in on pre-Nehalem systems.
At which percentage of memory usage do pre-Nehalem systems start to break LP into small pages?
Duncan says
If and when large pages are used, breaking them down always happens at the same point!