We’ve all probably seen the announcement around inter-VM(!!) TPS (transparent page sharing) being disabled by default in future releases of vSphere, and the recommendation to disable it in current versions. The reason for this is the fact there was a research paper published which demonstrates how it is possible to get access to data under certain highly controlled conditions. As the KB article describes:
Published academic papers have demonstrated that by forcing a flush and reload of cache memory, it is possible to measure memory timings to determine an AES encryption key in use on another virtual machine running on the same physical processor of the host server if Transparent Page Sharing is enabled. This technique works only in a highly controlled environment using a non-standard configuration.
There were many people who blogged about what the potential impact is on your environment or designs. Typically in the past people would take a 20 to 30% memory sharing in to account when sizing their environment. With inter-VM TPS disabled of course this goes out of the window. Frank described this nicely in this post. However, as Frank also described and I mentioned in previous articles when large pages are being used (usually the case) then TPS is not used by default and only under pressure…
The under pressure part is important if you ask me as TPS is the first memory reclaiming technique used when a host is under pressure. If TPS cannot sufficiently reduce the memory pressure then ballooning is leveraged, followed by compression and swapping ultimately. Personally I would like to avoid swapping at all costs and preferably compression as well. Ballooning typically doesn’t result in a huge performance degradation so it could be acceptable, but TPS is something I prefer as it just breaks up large pages in to small pages and collapses those when possible. Performance loss is hardly measurable in that case. Of course TPS would be way more effective when pages between VMs can be shared rather then just within the VM.
Anyway, the question remains should you have (inter-VM) TPS disabled or not? When you assess the risk you need to ask yourself first who has access to your virtual machines as the technique requires you to login to a virtual machine. Before we look at the scenarios, not that I mentioned “inter-VM” a couple of times now, TPS is not completely disabled in future versions. It will be disabled for inter-VM sharing by default, but can be enabled. More to be found on that in this article on the vSphere blog.
Lets explore 3 scenarios:
- Server virtualisation (private)
- Public cloud
- Virtual Desktops
In the case of “Server virtualisation”, in most scenarios I would expect that only the system administrators and/or application owners have access to the virtual machines. The question then is, why would they go to this level when they have access to the virtual machines anyway? So in the scenario where Server Virtualization is your use case, and access to your virtual machines is restricted to a limited number of people, I would definitely reconsider enabling inter-VM TPS.
In a public cloud environment this however is different of course. You can imagine that a hacker could buy a virtual machine and try to retrieve the AES encryption key. What he (the hacker) does with it next of course is even then still the question. Hopefully the cloud provider ensures that that the tenants are isolated from each other from a security/networking point of view. If that is the case there shouldn’t be much they could do with it. Then again, it could be just one of the many steps they have to take to break in to a system so I would probably not want to take the risk, although the risk is low. This is one of the scenarios where I would leave inter-VM TPS disabled.
Third and last scenario is Virtual Desktops. In the case of a virtual desktop many different users have access to virtual machines… The question though is if you are running any applications or accessing applications which are leveraging AES encryption or not. I cannot answer that for you, so I will leave that up in the air… you will need to assess that risk.
I guess the answer to whether you should or should not disable (inter-VM) TPS is as always: it depends. I understand why inter-VM TPS was disabled, but if the risk is low I would definitely consider enabling it.
Dusan Tekeljak says
Hi, If you’re interested what you can expect from this change in your environment, you can find some thoughts here http://www.thevirtualist.org/disabling-inter-vm-tps-can-affect-current-environment/
Khal Damak says
One other thing to note… With DRS it is more difficult to stage a targeted attack since the workloads move around when VMs are under memory load and TPS is being utilized.
is reserved guest memory still liable to TPS if the host is under memory contention?
I think this is a horrible change to default behaviour. This is not a easy attack vector. VMware should consider having a ‘Secure Install’ switch which provisions a hardened instance (i.e. automate the recommendations from the hardening guides). This will hurt customers, as TPS is not something the average VMware guy is aware of really, but definitely capitalise on it – without knowing,
The feature lets win many markets face HyperV because it increases the number of virtual machines per server
This argument was a strong technical element when choosing a technology
Thanks for another great posting! By future releases I am guessing you mean 6+, correct? Or at least is that’s the intent? If TPS is disabled by default in 6, will an upgrade from 5.x to 6 automatically disable Inter-VM TPS, and leave Intra-VM TPS enabled?
Pretty sure it’s disabled as of the latest patch release of 5.5, I’m on build 2456374 and it appears to be off….