I have had this question multiple times by now, I wanted to answer it in the Virtual SAN FAQ but I figured I would need some diagrams and probably more than 2 or 3 sentences to explain this. How are host or disk failures in a Virtual SAN cluster handled? I guess lets start with the beginning, and I am going to try to keep it simple.
I explained some of the basics in my VSAN intro post a couple of weeks back, but it never hurts to repeat this. I think it is good to explain the IO path first before talking about the failures. Lets look at a 4 host cluster with a single VM deployed. This VM is deployed with the default policy, meaning “stripe width” of 1 and “failures to tolerate” to 1 as well. When deployed in this fashion the following is the result:
In this case you can see: 2 mirrors of the VMDKs and a witness. These VMDKs by the way are the same, they are an exact copy. What else did we learn from this (hopefully) simple diagram?
- A VM does not necessarily have to run on the same host as where its storage objects are sitting
- The witness lives on a different host than the components it is associated with in order to create an odd number of hosts involved for tiebreaking under a network partition
- The VSAN network is used for communication / IO etc
Okay, so now that we know these facts it is also worth knowing that VSAN will never place the mirror on the same host for availability reasons. When a VM writes the IO is mirrored by VSAN and will not be acknowledged back to the VM until all have completed. Meaning that in the example above both the acknowledgement from “esxi-02″ and “esxi-03″ will need to have been received before the write is acknowledge to the VM. The great thing here is though that all writes will go to flash/ssd, this is where the write-buffer comes in to play. At some point in time VSAN will then destage the data to your magnetic disks, but this will happen without the guest VM knowing about it… [Read more…]