I’ve been talking to a lot of customers over the past 12-18 months, and if one thing stood out it is that roughly 98% of them used Failures To Tolerate = 1 (FTT=1). This means that one host or one disk can die or disappear without data loss. When we discussed availability, most customers indicated they would prefer to use FTT=2, but the cost was simply too high.
With VSAN 6.2 all of this changes. Today, a 100GB disk with FTT=1 results in 200GB of required disk capacity. With FTT=2 you need 300GB of disk capacity for the same virtual machine, an extra 50% compared to FTT=1. For most people the extra protection did not appear to be worth that cost. With RAID-5 and RAID-6 the math changes, and so does the cost of extra availability.
Take that same 100GB disk with FTT=1 and the Failure Tolerance Method (FTM) set to “RAID-5/6” (only available for all-flash): it now requires 133GB of capacity, a saving of 67GB compared to “RAID-1”. The saving is even bigger when going to FTT=2, where the 100GB disk requires 150GB of disk capacity. That is less than FTT=1 with RAID-1 today, and literally half of FTT=2 with FTM=RAID-1. On top of that, the delta between FTT=1 and FTT=2 is tiny: for an additional 17GB of disk space you can now tolerate two failures. Let’s put that into a table, so it is a bit easier to digest.
| FTT | FTM | Overhead | VM size | Capacity required |
|---|---|---|---|---|
| 1 | RAID-1 | 2x | 100GB | 200GB |
| 1 | RAID-5/6 | 1.33x | 100GB | 133GB |
| 2 | RAID-1 | 3x | 100GB | 300GB |
| 2 | RAID-5/6 | 1.5x | 100GB | 150GB |
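The arithmetic behind the table is simple: RAID-1 stores FTT+1 full copies, while RAID-5/6 stripes the data across data-plus-parity components, with FTT=1 mapping to 3 data + 1 parity and FTT=2 mapping to 4 data + 2 parity. Here is a minimal sketch of that math (illustrative only; the function below is not a VSAN API and ignores witness/metadata overhead):

```python
# Back-of-envelope VSAN capacity overhead per FTT/FTM combination.
# Illustrative only: names are made up, this is not a VSAN API.

def required_capacity(vm_size_gb, ftt, ftm):
    """Return the raw capacity (GB) a single object consumes."""
    if ftm == "RAID-1":
        # Mirroring: FTT+1 full copies (witness components ignored).
        return vm_size_gb * (ftt + 1)
    if ftm == "RAID-5/6":
        # Erasure coding: FTT=1 -> 3 data + 1 parity, FTT=2 -> 4 data + 2 parity.
        data, parity = (3, 1) if ftt == 1 else (4, 2)
        return vm_size_gb * (data + parity) / data
    raise ValueError("unknown failure tolerance method")

for ftt, ftm in [(1, "RAID-1"), (1, "RAID-5/6"), (2, "RAID-1"), (2, "RAID-5/6")]:
    print(f"FTT={ftt} {ftm:<9} -> {required_capacity(100, ftt, ftm):.0f}GB")
# FTT=1 RAID-1    -> 200GB
# FTT=1 RAID-5/6  -> 133GB
# FTT=2 RAID-1    -> 300GB
# FTT=2 RAID-5/6  -> 150GB
```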
Of course you need to ask yourself whether your workload requires it. Does it make sense for desktops? For most desktops it probably doesn’t… But for your Exchange environment maybe it does; for your databases, your file servers, your print servers, even your web farm it can make a difference. That is why I feel the commonly used FTT setting is slowly going to change, and will (should) become FTT=2 in combination with FTM set to “RAID-5/6”. Let it be clear though: there is a performance difference between FTT=2 with FTM=RAID-1 and FTT=2 with FTM=RAID-6 (the same applies to FTT=1), and there is a CPU resource cost as well. Make sure to benchmark what that cost is for your environment and make an educated decision based on it. I believe that in the majority of cases the extra availability will outweigh the cost/overhead, but that is up to you to determine and decide. What is great about VSAN, in my opinion, is that we offer you the flexibility to decide per workload what makes sense.
Dimos Gorogias says
Why are all the 6.2 “goodies” available on all-flash VSAN only?
Duncan Epping says
See this post: https://blogs.vmware.com/virtualblocks/2016/02/23/deduplication-and-erasure-coding-in-virtual-san-6-2/
Steve Smith says
+1 on Dimos’ comment. Is it because of rebuild times?
Duncan Epping says
that is explained here:
https://blogs.vmware.com/virtualblocks/2016/02/23/deduplication-and-erasure-coding-in-virtual-san-6-2/
Duncan Epping says
And after that, read about the “IO amplification” here. It wouldn’t make much sense on hybrid, as you would need more spindles to provide the performance while trying to save capacity:
https://blogs.vmware.com/virtualblocks/2016/02/12/the-use-of-erasure-coding-in-virtual-san-6-2/
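To give a rough feel for that IO amplification, here is the classic read-modify-write arithmetic for a single small front-end write (a simplified sketch; the linked post describes the actual VSAN behaviour):

```python
# Classic read-modify-write arithmetic for one small front-end write.
# Rough illustration only, not a VSAN API.

def backend_ios_per_write(ftm, ftt):
    """Back-end IOs generated by one small front-end write."""
    if ftm == "RAID-1":
        # Mirroring: write every copy, no reads needed.
        return {"reads": 0, "writes": ftt + 1}
    if ftm == "RAID-5/6":
        # Erasure coding: read old data + old parity, then write new data + new parity.
        parity = 1 if ftt == 1 else 2   # RAID-5 has 1 parity block, RAID-6 has 2
        return {"reads": 1 + parity, "writes": 1 + parity}
    raise ValueError("unknown failure tolerance method")

print(backend_ios_per_write("RAID-1", 1))    # {'reads': 0, 'writes': 2}
print(backend_ios_per_write("RAID-5/6", 1))  # {'reads': 2, 'writes': 2} -> 4 back-end IOs
print(backend_ios_per_write("RAID-5/6", 2))  # {'reads': 3, 'writes': 3} -> 6 back-end IOs
```

On flash that amplification is largely absorbed by the devices; on hybrid you would need extra spindles to hide it, which defeats the purpose of saving capacity.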
Johannes says
What I am missing at the moment is some kind of sub-FTT protection per host when using a lot of disks. We have about fifteen 3-4 host VSAN clusters in the field running FTT=1.
Let’s take one disk failure and an immediate rebuild of its data to another disk on the same host (permanent disk failure). With a lot of disks per host, there is a real probability of losing a second disk on another host (holding the same data objects) within the rebuild time of the first failed disk. Therefore a “local” FTT would be a nice feature, especially for stretched clusters, where one half of the system could fail and leave the other half without protection against additional disk failures.
P.S. Same “unhappiness” on my side because the new VSAN 6.2 goodies are not available for hybrid… I assume disk performance is the reason for that.
Duncan Epping says
Talk to your local VMware rep about the stretched situation Johannes, I can’t comment publicly on roadmap as you can understand.
Geoff says
I’m thinking about future expandability right now, with a six-host cluster where each host has one equally sized disk group and room for a second.
If running FTT=2 with erasure coding on all VMs, it seems like adding capacity via a second disk group on just one host wouldn’t help, should the compression/deduplication savings for the cluster not be quite as great as anticipated: if each VM’s storage plus RAID overhead is spread evenly over 6 hosts, then additional capacity on only one of them can’t be used by any VM. Is that right? And if so, is there any recommendation to run 4+2 erasure coding on clusters larger than 6 for this reason? If it turns out I need more capacity, would the best thing be to just add an extra SSD to each host’s disk group?
Additionally, what’s the upper limit for the number of hosts using FTT=2 w/ erasure coding?
Duncan Epping says
Typically people will fill up disk groups evenly and then start adding disk groups. Either way, you can do a “capacity rebalance” from the UI if the imbalance gets too big.
David Chung says
Love the new features, but I really do think the SSD capability should be part of standard VSAN licensing.
Duncan Epping says
Understood, and I will provide that info back to the product marketing team.
Anthony Mintoff says
Hi great blog, I am a fan!
Two questions maybe for your next post.
1) I am currently on VSAN 6.0 all-flash. How will the upgrade work? Just patch the hosts to the latest version? Will I have to destroy my VSAN? I need deduplication, so I guess I will need to destroy my VSAN and start from scratch.
2) When a disk fails, cache or capacity, all-flash or hybrid, if someone were to plug the failed drive into a server, would they manage to retrieve any data from it?
Duncan Epping says
1. It is a rolling upgrade; we upgrade disk group by disk group, so you don’t need to start from scratch.
2. That depends on how the device failed, but in most cases you should assume someone would be capable of doing that. Is it realistic that it will happen? Chances are slim, but that is why there are whole companies built around encryption. We support encryption at rest through hardware, and more is coming soon!
aptones says
From what I understand, FTT=2 with RAID-6 requires a minimum of 6 hosts (4+2) to protect against a node failure… I guess it’s only applicable if an end user has a large VSAN cluster; after all, surely the cost of 2 or 3 extra nodes outweighs the cost of just adding more disks and reverting back to FTT=1 with RAID-1?
Duncan Epping says
What is large? I have customers running 30-node clusters, and 64 is the limit. Anything above 16 is what most would consider large; 6 isn’t really large.
But keep in mind that you could also do single-CPU-socket hosts, and then the math changes from a software point of view as well. It may be beneficial to go that route if you need the extra availability.
Justin says
How does VSAN handle unbalanced configurations? I see the recommendation here is a 6-node cluster running RAID-6 to get FTT=2. However, what happens to the RAID layout if I add a 7th or 8th node?
I am also curious how RAID-6 works with stretched clustering and fault domains. Can you configure RAID-6 within one site and mirror the data (RAID-10 style) to the other site of the stretched cluster to provide resiliency across both sites?
Duncan Epping says
RAID-6 is always 4 data blocks and 2 parity blocks. Note that we don’t do RAID-6 on a host basis but on a per-VM basis. It will always be 4+2 for a given VM, whether you have 6 or 64 hosts or anything in between.
Also, today RAID-6 does not work with a stretched cluster. For stretched clusters, only RAID-1 across sites is possible. Yes I know, local site protection would be great, which is why I have requested this feature 🙂
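A quick back-of-envelope picture of what that 4+2 layout means for a single 100GB object (simplified, ignoring witnesses, stripe width and metadata; not a VSAN API):

```python
# Rough picture of one 100GB object with FTT=2 / FTM=RAID-5/6 (i.e. RAID-6, 4+2).
object_size_gb = 100
data_components, parity_components = 4, 2    # RAID-6 is always 4 + 2 per object

component_size_gb = object_size_gb / data_components        # 25GB each
total_components = data_components + parity_components      # 6, each on a different host
raw_capacity_gb = component_size_gb * total_components      # 150GB

print(f"{total_components} components of {component_size_gb:.0f}GB "
      f"across {total_components} hosts = {raw_capacity_gb:.0f}GB raw")
# -> 6 components of 25GB across 6 hosts = 150GB raw
# Adding a 7th or 8th host does not change the 4+2 layout of any one object;
# it simply gives VSAN more hosts to choose from when placing the components.
```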
stewdapew says
Duncan,
Kudos on advocating for data protection schemes (FTT=2 and RAID-6 erasure coding). While it is unlikely a customer will experience a concurrent dual disk failure, other errors, including unrecoverable bit errors, controller failures/errors, host failures, or simple human error (like rebooting a host), can introduce a second ‘failure event’ that results in data loss within a disk group.
Obviously there’s a trade-off with any configuration option, but ideally data protection is the number one priority of any storage platform.
Cheers,
v
#PaintItOrange
John says
Duncan,
If we are running a 4-node cluster with RAID-5, will components rebuild if there is a node failure? Or will we be running in a degraded state, since 3+1 is the minimum spec for RAID-5?
Thanks,
John
Todd says
Sounds like it is subject to standard RAID degradation and that the equivalent of a “hot spare” is recommended – from https://blogs.vmware.com/virtualblocks/2016/02/23/deduplication-and-erasure-coding-in-virtual-san-6-2/ – “While the minimum number of nodes required for RAID-5 is 4, in order to account for node failures, it is a good practice to have 5 nodes.”
Suraj says
Hi Duncan,
For a hybrid vSAN, are PCIe-based flash drives supported?
Gr, Suraj
Roy Badami says
Suraj, the answer would seem to be yes.
If you go to the VSAN Hardware Compatibility List and search for SSDs of type PCI-E suitable for use in the caching tier of an all-flash VSAN, you get significantly more than 0 results 🙂
Roy Badami says
Damn, misread your question, but ditto if you search for hybrid. Of course, AIUI erasure coding (aka RAID-5/6) isn’t supported in hybrid – which is what this article is about – although I’m not sure if that’s relevant to your question.