Last week VMware released an official paper on Deployment Best Practices for HA. I was one of the authors of the document. Together with several people from the Technical Marketing Team, we gathered all the best practices we could find, then validated and simplified them to make the paper rock solid. I think it is a good read. It is short and sweet, and I hope you will enjoy it.
Latest Revision:
Dec 9, 2010
Download:
http://www.vmware.com/files/pdf/techpaper/VMW-Server-WP-BestPractices.pdf
Description:
This paper describes best practices and guidance for properly deploying VMware HA in VMware vSphere 4.1. These include discussions on proper network and storage design, and recommendations on settings for host isolation response and admission control.
JamesK says
On page 5 it says VLAN trunking is required? Why is that?
For example, if you have a dedicated vSwitch0 for kernel and console (using the same subnet) and use another, say vSwitch1, for guest traffic, you would not NEED to use a trunk (at least not on the console / kernel vSwitch adapters).
It also states “In this example, the management network runs on vSwitch0 as active on vmnic0 and standby on vmnic2. The vMotion network runs on vSwitch0 as active on vmnic2 and standby on vmnic0.” Do you have to specifically define this in your failover policy? (The doc doesn’t show this, it only briefly mentions setting up active/standby.) Also, on ESXi this would only work if you have separate IPs (possible, but not “required”) for management and the kernel, so is this also the best practice on vSphere, versus running everything on a single IP/network? I ask because in the past I’ve seen “best, best” recommendations to separate everything, like using 3 vSwitches for management, kernel, and guests.
Other than that, great document, always love best practices guides. *8)
Duncan says
Why? Because typically vMotion and Management traffic are isolated from each other.
KJT says
What is the official line on the use of CPU and memory reservations? Does VMware still recommend using reservations sparingly, as needed based on SLA? Or is reserving resources not as much of an issue as it was in the past?
Thanks!
Duncan Epping says
Reserving is not an issue, not understanding the impact is. I only recommend reserving resources when there is an absolute need from an SLA perspective. As it increases complexity and reduces consolidation ratio I always try to avoid it if and when possible.
Bret says
On the “percentage of cluster resources” reserved for HA admission control, we are moving to this choice instead of the default. However, we are using memory reservations, and have to be careful that the reservations are equal (or as close as possible) across VMs. Otherwise, the 1/N formula to determine percentage can leave you with fragmented resources and VMs that cannot fail over. It would be nice to mention that HA failover is not guaranteed when the simple 1/N formula is used. Unless I am misunderstanding something, which is usually the case.
Duncan says
HA can leverage DRS these days to defragment resources if and when needed. This, however, would mean you are either using a lot of reservations or are seriously overcommitting.
Bret says
Correct, we have some clusters with many VMs, all with 8GB reservations. Hopefully when we hit 4.1 the DRS will help resolve this. We’re still on 4.0.
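The fragmentation concern raised in this thread can be illustrated with a small, simplified sketch. It is not how HA itself computes admission control: it ignores CPU and memory overhead, and the host names, capacities, and the first-fit placement are assumptions made purely for illustration. It shows that a cluster can pass an aggregate percentage check (here 1/N with N = 3) while no single surviving host has room for a VM with a large reservation.

```python
from typing import Dict, List


def percentage_check(capacity_gb: Dict[str, int],
                     reservations: Dict[str, List[int]],
                     failover_pct: float) -> bool:
    """Aggregate check: is the unreserved share of the cluster >= failover_pct?"""
    total_capacity = sum(capacity_gb.values())
    total_reserved = sum(sum(vms) for vms in reservations.values())
    free_fraction = (total_capacity - total_reserved) / total_capacity
    return free_fraction >= failover_pct


def can_restart_after_failure(capacity_gb: Dict[str, int],
                              reservations: Dict[str, List[int]],
                              failed_host: str) -> bool:
    """Try to place each VM from the failed host on a surviving host.

    Uses a simple first-fit over VMs sorted largest first (an assumed
    heuristic, with no DRS-style defragmentation).
    """
    free = {host: capacity_gb[host] - sum(reservations[host])
            for host in capacity_gb if host != failed_host}
    for vm in sorted(reservations[failed_host], reverse=True):
        target = next((h for h, f in free.items() if f >= vm), None)
        if target is None:
            return False  # enough free memory in total, but not on one host
        free[target] -= vm
    return True


# Hypothetical three-host cluster, 32 GB per host.
capacity = {"esx1": 32, "esx2": 32, "esx3": 32}

# Unequal reservations: one 16 GB VM, the other hosts have 12 GB free each.
unequal = {"esx1": [16], "esx2": [8, 8, 4], "esx3": [8, 8, 4]}

# Equal reservations: every VM reserves 8 GB.
equal = {"esx1": [8, 8], "esx2": [8, 8], "esx3": [8, 8]}

for label, layout in (("unequal", unequal), ("equal", equal)):
    passes = percentage_check(capacity, layout, 1 / 3)
    restarts = can_restart_after_failure(capacity, layout, "esx1")
    print(f"{label}: percentage check={passes}, esx1 VMs restartable={restarts}")
```

With the unequal layout the percentage check passes, yet the 16 GB VM cannot be placed because the 24 GB of free memory is split 12/12 across the two surviving hosts. With equal 8 GB reservations the same failure is absorbed.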
kamlesh singh says
What is the channel no. in the runtime name? Can you please elaborate on it through pics?
Duncan Epping says
Not sure I understand your comment?
kamlesh singh says
vmhba33:C0:T0:L2 is a typical runtime name used in the storage naming convention. What is C0 referring to here? Is it a physical port no. on the HBA card or something logical?
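For reference, the vSphere storage documentation describes the runtime name format as vmhbaAdapter:CChannel:TTarget:LLUN, where C is the storage channel number, T the target number, and L the LUN number. A minimal sketch that splits a runtime name into those fields is shown below; the `RuntimeName` type and `parse_runtime_name` helper are hypothetical names made up for this illustration, not part of any VMware tool or API.

```python
import re
from typing import NamedTuple


class RuntimeName(NamedTuple):
    adapter: str   # e.g. "vmhba33"
    channel: int   # the C field (storage channel number)
    target: int    # the T field (target number)
    lun: int       # the L field (LUN number)


# Assumed layout: vmhba<N>:C<channel>:T<target>:L<lun>
_RUNTIME_NAME = re.compile(r"^(vmhba\d+):C(\d+):T(\d+):L(\d+)$")


def parse_runtime_name(name: str) -> RuntimeName:
    """Split a runtime name such as 'vmhba33:C0:T0:L2' into its fields."""
    match = _RUNTIME_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised runtime name: {name!r}")
    adapter, channel, target, lun = match.groups()
    return RuntimeName(adapter, int(channel), int(target), int(lun))


parsed = parse_runtime_name("vmhba33:C0:T0:L2")
print(parsed)
```

For the name in the question, this gives adapter vmhba33, channel 0, target 0, and LUN 2.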
kamlesh singh says
Hi Duncan, could you please let me know what study material I should read to get VCI status? Please refer me to whitepaper links and books.