In 2013 I wrote an article about the minimum number of hosts for Virtual SAN. Since then that post has started living a life of its own. Somehow people have misunderstood it and used/abused it in many shapes and forms. When I look at a traditional (non-VSAN) cluster, the minimum size is 2. From an availability perspective I ask myself what risk I am willing to take. What does that mean?
In a previous life I did many projects for SMB customers. My SMB customers typically had somewhere in the range of 2-5 hosts, with the majority having 2-3. In many cases those with 2 hosts and those with 3 hosts were running roughly the same number of virtual machines. The difference between the two situations, “2 hosts” versus “3 hosts”, was whether, during maintenance (upgrading / updating) or after a failure, they still had the ability to restart the virtual machines after a secondary failure. Many customers decided to go with 2-node clusters, the key reason being price versus risk. During normal operations the risk is low, but the price of an additional host was relatively high.
Now compare this to Virtual SAN and you will see the same applies. With Virtual SAN we have a minimum of 3 hosts, although in a ROBO configuration you can have 2 with an external witness. This means that from a support perspective the bare minimum of dedicated physical hosts required for VSAN is 2. There you go, 2 is the bare minimum for ROBO. For non-ROBO 3 is the minimum. That is fully supported and offers all functionality, similar to 4 hosts.
Is having an extra host a good plan? Yes of course it is. HA / DRS / VSAN (and any other scale-out storage solution for that matter) will benefit from more hosts. You as a customer need to ask yourself what the risk is, and if the cost is justifiable.
PS1: A question just came in, and I want to make sure this is clear. Even in a 2-host ROBO configuration you can do maintenance! A single copy of the data and the witness remain available and will have quorum.
PS2: No, you cannot host your “witness” VM on the VSAN cluster itself. This is not supported, as the witness provides the quorum for the cluster and it should be outside of the cluster to provide certainty of the state in the case of a failure.
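The quorum reasoning in PS1 and PS2 can be sketched with a small majority-vote model. This is a deliberately simplified illustration, not how VSAN is actually implemented: it assumes one object with two data replicas and one witness component, one vote each, and it treats the object as accessible only when more than half of the votes and at least one data copy are available. The names `Component`, `object_accessible`, `host-1`, `host-2` and `witness-appliance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    location: str   # hypothetical label for the host or appliance holding it
    is_data: bool   # True for a data replica, False for the witness
    votes: int = 1  # assumption: every component carries a single vote


def object_accessible(components, unavailable_locations):
    """True if a strict majority of votes and at least one data copy remain."""
    alive = [c for c in components if c.location not in unavailable_locations]
    total_votes = sum(c.votes for c in components)
    alive_votes = sum(c.votes for c in alive)
    has_quorum = alive_votes * 2 > total_votes
    has_data = any(c.is_data for c in alive)
    return has_quorum and has_data


# 2-host ROBO layout with the witness outside the cluster (PS1).
two_node = [
    Component("replica-a", "host-1", True),
    Component("replica-b", "host-2", True),
    Component("witness", "witness-appliance", False),
]

# Same layout, but with the witness (unsupported) placed on host-2 (PS2).
witness_inside = [
    Component("replica-a", "host-1", True),
    Component("replica-b", "host-2", True),
    Component("witness", "host-2", False),
]

# host-1 in maintenance: replica-b plus the witness hold 2 of 3 votes.
print(object_accessible(two_node, {"host-1"}))                       # True
# A secondary failure during maintenance leaves 1 of 3 votes.
print(object_accessible(two_node, {"host-1", "witness-appliance"}))  # False
# Witness on host-2: losing host-2 alone removes 2 of 3 votes.
print(object_accessible(witness_inside, {"host-2"}))                 # False
```

In this model the external witness is what lets a single surviving data copy keep a majority, which is why maintenance on one host is possible, and why a witness that shares a host with a replica no longer provides that guarantee.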
Christopher Etou says
As you rightly mentioned, where I’m coming from SMEs don’t have the budget for three servers, so a 2-host ROBO configuration could do the trick. I believe the 2-host ROBO was introduced during this year’s VMworld, and the presentation on YouTube showed the witness as a VM back in the HQ infrastructure. Now, in the case of an SME where I am considering a 2-host ROBO configuration, where do I host the witness?
Sholom Brody says
Chris, for a scenario where you don’t have an HQ infrastructure you would be looking at deploying the witness in a hosted cloud. I imagine this will be a great use case for vCloud Air in the future, as it has not been fully tested yet. As long as you can guarantee the latency and bandwidth requirements, you’d be good to go. See some more details about the requirements of the witness from Cormac:
http://cormachogan.com/2015/09/11/a-closer-look-at-the-vsan-witness-appliance/
shahed says
Nice write up. Thanks a lot sir.
JayST says
I’m curious about the migration path from a 2-host ROBO to a 3-host (non-ROBO then?) cluster. I think there could be license issues to take care of to begin with, but is it technically possible to migrate to non-ROBO, thereby phasing out the dedicated witness appliance holding all witness components?
Johannes says
…and also worth mentioning: now you can deliver a two-rack server redundancy concept with VSAN (with the witness at another location), which is a requirement of many of our customers 🙂
Stefano says
For small companies that have no HQ site to place the witness in, is it possible to place the witness on a small server running VMware Player or the free ESXi?
Duncan Epping says
ESXi is supported, player or workstation isn’t at the moment.
Mado says
Great read. What is your advice on the RAID level of each host? Usually I go with RAID 1+0 for the OS and RAID 5 for the rest. Is this good enough?
Duncan says
I advise reading up on the early VSAN posts. VSAN expects each disk to be either in RAID-0 or in passthrough mode. There is NO form of host-side RAID used for protection; protection is part of the built-in VSAN logic.
David Carter says
Duncan, I was wondering if you could help answer a question for me. We’re thinking of deploying several 2-node ROBO 6.1 VSANs to some of our remote offices, with a fault domain per host and a witness VM located in one of our data centres. I can’t find any documentation anywhere on what happens if a host fails (> 60 minutes downtime) and requires a complete replacement. How do you replace the host in the cluster? Can you at all? I can’t really see any docs regarding any form of host replacement!
Steve says
David, did you ever find any documentation on replacing a failed host in a 2-node cluster? I haven’t found anything.
Joe says
Has anyone experimented with using SMP FT on the witness ESXi VM (on the vSAN datastore) to better provide an all in one 2 node solution? Sounds possible…