The question that keeps coming up at VMUG events, on my blog and on the various forums is: what happens in a VSAN cluster in the case of an SSD failure? I answered the question a while back in one of my blog posts about failure scenarios, but figured I would write it down in a separate post considering people keep asking for it. It makes it a bit easier to point people to the answer and also makes it a bit easier to find the answer on Google. Let's sketch a situation first: what does (or will) the average VSAN environment look like?
In this case what you are looking at is:
- 4 host cluster
- Each host with 1 disk group
- Each disk group has 1 SSD and 3 HDDs
- Virtual machine running with a “failures to tolerate” of 1
As you hopefully know by now, a VSAN Disk Group can hold 7 HDDs and requires an SSD on top of that. The SSD is used as a Read Cache (70%) and a Write Buffer (30%) for the components stored in that disk group. The SSD is literally the first location IO is stored, as depicted in the diagram above. So what happens when the SSD fails?
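To make the 70/30 split concrete, here is a minimal sketch that works out how much of a given SSD ends up as read cache and how much as write buffer. The function name `split_cache` and the 100 GB example size are just for illustration, and the calculation ignores any formatting or metadata overhead.

```python
def split_cache(ssd_gb, read_percent=70):
    """Return (read_cache_gb, write_buffer_gb) for the SSD of one disk group.

    Hypothetical helper: applies the 70% read cache / 30% write buffer
    split described above and ignores any on-disk overhead.
    """
    if ssd_gb <= 0:
        raise ValueError("SSD size must be positive")
    read_gb = ssd_gb * read_percent / 100
    write_gb = ssd_gb - read_gb
    return read_gb, write_gb


read_gb, write_gb = split_cache(100)
print(f"read cache: {read_gb:.0f} GB, write buffer: {write_gb:.0f} GB")
```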
When the SSD fails, the whole Disk Group and all of the components on it will be reported as degraded or absent. The state (degraded vs absent) depends on the type of failure. Typically though, when an SSD fails VSAN will recognize this, mark the components as degraded, and as such instantly create new copies of your objects (disks, vmx files etc.) as depicted in the diagram above.
From a design perspective it is good to realize the following (for the current release):
- A disk group can only hold 1 SSD
- A disk group can be seen as a failure domain
- E.g. there could be a benefit in creating 2 disk groups of 3 HDDs + 1 SSD each versus a single disk group of 6 HDDs + 1 SSD
- SSD availability is critical, select a reliable SSD! Yes, some consumer grade SSDs do deliver great performance, but they typically also burn out fast.
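The failure domain point can be illustrated with a bit of arithmetic. The sketch below compares the two layouts mentioned above for a single host with 6 HDDs, and shows how many of them go offline when one SSD fails. The function name and layout labels are made up for this example.

```python
def fraction_lost_on_ssd_failure(hdds_per_group, groups_per_host):
    """Fraction of a host's HDDs taken offline when a single SSD fails.

    Hypothetical helper: since a disk group is a failure domain, losing
    its SSD takes every HDD in that disk group offline with it.
    """
    total_hdds = hdds_per_group * groups_per_host
    return hdds_per_group / total_hdds


# Both layouts give the host 6 HDDs in total.
layouts = {
    "1 x (6 HDD + 1 SSD)": (6, 1),
    "2 x (3 HDD + 1 SSD)": (3, 2),
}

for name, (hdds, groups) in layouts.items():
    frac = fraction_lost_on_ssd_failure(hdds, groups)
    print(f"{name}: one SSD failure takes {hdds} HDDs offline ({frac:.0%} of the host)")
```

With the single large disk group one SSD failure takes out all of the host's capacity, with two smaller disk groups it takes out half, at the cost of an extra SSD per host.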
Let it be clear that if you run with the default storage policies you are protecting yourself against 1 component failure. This means that 1 SSD can fail, or 1 host can fail, or 1 disk group can fail, without loss of data. As mentioned, VSAN will typically also quickly recreate the impacted objects on top of that.
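As a rough illustration of why a single failure does not cause data loss with "failures to tolerate" set to 1, the sketch below models an object with two replicas placed on different hosts and checks what survives each type of failure. This is a deliberately simplified model: the host and disk group names are hypothetical, and it ignores witness components and how VSAN actually decides on availability.

```python
# Hypothetical placement of one object's two replicas: name -> (host, disk group).
replicas = {
    "replica-1": ("esx01", "dg1"),
    "replica-2": ("esx02", "dg1"),
}


def surviving_replicas(replicas, failed_host=None, failed_disk_group=None):
    """Return the names of replicas still available after a single failure.

    failed_disk_group is a (host, disk_group) tuple. An SSD failure is
    modelled as a disk group failure, since it takes the whole group down.
    """
    alive = []
    for name, (host, disk_group) in replicas.items():
        if host == failed_host:
            continue
        if (host, disk_group) == failed_disk_group:
            continue
        alive.append(name)
    return alive


failures = [
    ("host esx01 fails", {"failed_host": "esx01"}),
    ("SSD in esx02/dg1 fails", {"failed_disk_group": ("esx02", "dg1")}),
]

for label, kwargs in failures:
    alive = surviving_replicas(replicas, **kwargs)
    status = "available" if alive else "lost"
    print(f"{label}: {len(alive)} replica(s) left, data {status}")
```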
That doesn't mean you should try to save money on reliability, if you ask me. If you are wondering which SSD to select for your VSAN environment I recommend reading this post by Wade Holmes on the VMware vSphere Blog. Especially take note of the Endurance Requirements section! If I had to give a recommendation though, the Intel S3700 still seems to be the sweet spot when it comes to price / endurance / performance!
Mike Foley says
$250 for 100GB S3700’s
Not cheap, but you will get what you paid for.
I do believe the question everyone is asking about what happens when an SSD fails is related to the cache and not the VMDKs, and that is still unanswered. Cache is usually mirrored for availability, so when it fails there is a copy of the cache just like with the VMDKs. If the SSDs are also mirrored between the hosts, then the cache is also protected at host level and a new copy is created just like with VMDKs. Is this the case?
The VM will keep on running when protected N+1, as the write cache is indeed mirrored. The read cache is not mirrored however, so when you are running N+1 you temporarily lose 50% of the read cache.
I have a silly question but I can’t yet find the answer.. in the event of an SSD failure, what is the correct approach for replacing the failed unit? If I pull the SSD labeled with ‘permanent failure’, and replace it with a new one, the new one does not show up as being available to add to the disk group. The original disk is still visible in the disk group with a red X on it. In order for me to be able to assign the new SSD, it seems I must first remove the original SSD from the group — at this point it warns that the entire disk group will be removed.
Duncan Epping says
Right now you need to blast away the whole disk group and recreate it, afaik.
Bryan Pizzuti says
Just reading this to get more info on vSAN (loving all these write-ups, extremely helpful) and I’m wondering if anyone has considered/tried simply setting up SSDs in RAID1 as “the SSD” to mitigate the failure domain? The servers I use I’d be stuck using the “RAID0” bit to get individual disks to be visible to vSAN anyway, so even with a single SSD I’d have to create a small RAID and tag it as an SSD. Seems to me that theoretically creating a RAID1 and tagging it should work, and should keep the associated disk group going if one of the SSDs fails?