On the internal mailing list there was a discussion today about how disabling TPS (Transparent Page Sharing) could negatively impact performance. It is something I hadn't thought about yet, but when you do think about it, it actually makes sense and is definitely something to keep in mind.
Most new servers have some sort of NUMA architecture today. As hopefully all of you know, TPS does not cross a NUMA node boundary. This basically means that pages will not be shared between NUMA nodes. Another thing, which Frank Denneman already described in his article here, is that when memory pages are allocated remotely there is a memory penalty associated with it. (Did you know there is an "esxtop" metric, N%L, which shows the percentage of local pages?) Remote pages are accessed across an interconnect bus, which is always slower than so-called local memory.
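To make the penalty concrete, here is a minimal sketch of how average memory latency degrades as locality drops. The nanosecond figures are illustrative round numbers I picked for the example, not measurements from any particular platform:

```python
# Illustrative sketch: average memory latency as a function of the esxtop
# N%L metric (percentage of a VM's pages that are node-local). The latency
# figures below are made-up round numbers, not measurements.
LOCAL_NS = 100    # assumed local-node access latency
REMOTE_NS = 160   # assumed remote-node access latency (interconnect hop)

def effective_latency(pct_local):
    """Weighted average latency for a given N%L value (0-100)."""
    return (pct_local * LOCAL_NS + (100 - pct_local) * REMOTE_NS) / 100

for n_l in (100, 90, 80, 50):
    print(f"N%L={n_l:3d} -> {effective_latency(n_l):.0f} ns average")
```

Even with these conservative assumed numbers, dropping from 100% to 50% locality adds a 30% latency penalty on every average access.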
Now you might ask: what is the link between NUMA, TPS, and degraded performance? Think about it for a second… TPS decreases the amount of physical pages needed. If TPS is disabled there is no sharing, the chances of going across NUMA nodes increase, and as stated this will definitely impact performance. Funny how disabling a mechanism (TPS) which is often associated with "CPU overhead" can have a negative impact on memory latency.
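The sharing mechanism itself can be sketched in a few lines. This is a toy model only (real TPS uses page-content hashes as a hint, verifies candidate matches bit by bit, and backs shared pages copy-on-write); the page contents here are synthetic stand-ins:

```python
import hashlib

# Toy model of transparent page sharing: identical guest pages collapse
# to a single backing physical page. Contents are synthetic examples.
def shared_footprint(pages):
    """Number of physical pages needed when identical pages are shared."""
    return len({hashlib.sha256(p).digest() for p in pages})

zero_page = bytes(4096)                       # zero pages share especially well
vm_a = [zero_page] * 10 + [b"code" + bytes(4092)]
vm_b = [zero_page] * 10 + [b"data" + bytes(4092)]
all_pages = vm_a + vm_b                       # 22 guest pages across two VMs

print(shared_footprint(all_pages))            # 3 physical pages instead of 22
```

With sharing off, all 22 guest pages need physical backing; with it on, the host needs only the 3 distinct ones, which is why the physical footprint (and the odds of spilling across a NUMA node) grows when TPS is disabled.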
daniel says
Good to know, but are there any occasions where disabling TPS on a modern architecture is actually recommended?
Duncan says
Not that I am aware of. My recommendation is to always keep it turned on.
Duncan
AFidel says
Too bad we will all have this penalty in the next generation or two of CPUs due to huge MMU pages making TPS completely ineffective =(
NiTRo says
I enabled TPS across NUMA nodes to get more memory savings on a cluster, without noticeable performance impact. How much hurt are we talking about, Duncan?
James Shelton says
I don’t know… There are so many performance variables that come into play, and so many contrary opinions and supposed facts floating around about best practices. I guess the question really is: how much would losing TPS negatively affect performance? Would the loss be more than the gain from forcing the utilization of the virtualized MMU as proposed by Mr. Drummonds here: http://vpivot.com/2010/04/20/a-performance-tip-for-esx-3-0-and-esx-3-5/ ? Or am I missing something? Wouldn’t that force the use of large page tables, which negatively impacts TPS?
I just know that regardless of how nice it might be…I’ve yet to work in an environment where management could stomach the overcommitment required to really appreciate TPS in all of its glory. Perhaps it’s a holy grail for some…but I think there are plenty who’ve hardly scratched the surface of memory overcommitment…
Anton Zhbankov says
AFidel, you can force ESX to use small pages and compare performance if there is negative impact.
I have done it myself on my Nehalem X5570s. I noticed no negative impact, and afterwards I saw amazing numbers in one experiment. You can see an esxtop screenshot here: http://blog.vadmin.ru/2010/02/transparent-page-sharing.html
I VMotioned enough production VMs (37, actually) to one host that the sum of VM memory became equal to physical memory, 64 GB. After one hour the memory usage graph stabilized at ~32 GB, i.e. TPS saved me half of the configured memory. CPU load was 10-15% max at that moment. The VMs I’m talking about ran various OSes: RHEL, Windows XP, 2003, 2008, 32 and 64 bit.
Fred Peterson says
If TPS was actually saving you 50%, you have way more memory assigned to those VMs than is actually necessary, because the majority of savings with TPS comes from the zero pages.
AFidel says
Anton,
Yes, I typically see ~50% memory savings from TPS myself, which is why I’m slightly worried about the way the hardware MMUs are headed (1 GB+ page tables, which would all but make TPS worthless with hardware acceleration).
p.s.
64 GB is a really odd configuration for an X5570; it should typically be 72 GB. 64 GB gives you an unbalanced configuration, which can have a significant negative performance impact (~20-30%). Throw in a couple more DIMMs and you should see a nice pickup in performance.
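AFidel’s channel-balance point boils down to simple arithmetic. A rough sketch, assuming a dual-socket, triple-channel platform; the DIMM sizes chosen below are illustrative, not a survey of real configurations:

```python
# Sketch of the channel-balance argument: on a triple-channel memory
# controller, capacity per socket should spread evenly across channels.
# Socket count, channel count, and DIMM sizes are assumed for illustration.
CHANNELS_PER_SOCKET = 3   # e.g. a Nehalem-EP triple-channel controller
SOCKETS = 2

def is_balanced(dimms_per_socket):
    """Balanced when each memory channel carries the same number of DIMMs."""
    return dimms_per_socket % CHANNELS_PER_SOCKET == 0

for total_gb, dimm_gb in ((64, 8), (72, 4), (96, 8)):
    per_socket = total_gb // dimm_gb // SOCKETS
    label = "balanced" if is_balanced(per_socket) else "unbalanced"
    print(f"{total_gb} GB as {dimm_gb} GB DIMMs: {per_socket}/socket, {label}")
```

64 GB as eight 8 GB DIMMs lands at 4 per socket, which cannot divide evenly across 3 channels; 72 GB and 96 GB in the sketched configurations can.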
Anton Zhbankov says
Fred, I have about 20 VMs in the production cluster with 3-4 GB of RAM that should never go swapping. So it’s not a case of my giving too much memory to these VMs. In the case of Hyper-V I would have to buy all this memory, and it’s not ‘cheap as garbage’.
AFidel, thanks for the recommendation. If you’re referring to 3-channel memory then it would be a problem for me, because I have an 8*8 GB configuration -> 4 modules per CPU. So I would have to install either 48 GB or 96 GB (3 or 6 modules per CPU). And I have no performance problem currently; 37 VMs load my CPUs to only about 15% total.
Scot Grabowski says
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004901 So this KB from VMware is actually telling you to do this? We have many multi-proc boxes now going into DEV on VMs, and this slow reboot is an issue during the testing phase. Are they telling us to steal from Peter to pay Paul, basically? For now we are testing with 2 VMs on which we turned pshare off; I guess we will see how it impacts the RAM. It scares me to think of what may happen as we start to create a high-performance cluster for VMs with over 4 vCPUs: if we turn all the page sharing off on them, what is going to happen to the memory? Once they go to prod and are not rebooting we can change it back, but with a very large environment like ours, it can be an admin nightmare.
invisible says
What are the settings controlling page sizes for TPS?
I’ve also noticed that most TPS savings come from Windows machines. For Linux, page sharing usually does not exceed 10-15%.
P.S. >200 running VMs, cluster of 10 BL460c G6 with 2 X5570 and 96GB of RAM on each.
Chad King says
We just installed a slew of HP blades in our production environments, and in our DCs we are using the DL580s. So do we need to look at turning that off, or what exactly? Maybe I am having trouble fully understanding this? I would love to understand more about MMU and NUMA; TPS I know pretty well.
ebenjamin says
In some cases we see improved performance by using memory large pages (OS huge pages), and using these will minimize the effect/activation of TPS, since large pages are 2 MB, which TPS cannot deal with. Does anyone have any numbers on the pros and cons of TPS vs. using large pages (which assumes TPS won’t take effect)?
Perhaps you get the best of both worlds, because on a Linux OS you can have a percentage of hugepages and some small pages, whereas on Windows I believe you either have all large pages or none, which is potentially not optimal.
Duncan Epping says
Yes you do. Large pages will in many cases improve performance. I have witnessed this myself with Citrix workloads. Nothing wrong with using them as long as you understand the possible implications.
ebenjamin says
Thanks Duncan.
How about weighing the two options: how much TPS performance loss do I get vs. how much large-pages performance gain?
Duncan Epping says
That depends on the size of the VMs and whether they will exceed the local memory of a NUMA node.
You can actually verify with esxtop whether memory is being fetched locally by looking at the metric “N%L”. It should be 100%, but anything less than 80% will probably impact your performance.
ebenjamin says
The assumption in my question is that NUMA locality is preserved, so let’s say a 4 vCPU + memory VM that fits within the NUMA boundary. So I was really asking: what TPS performance loss do I get vs. the large-pages performance gain, all within one NUMA node, when using large pages?
Duncan Epping says
I can’t tell you, to be honest; it would be a matter of a couple of percentage points, I guess. I have never tested it to that level.