I received two questions on the same topic this week, so I figured it would make sense to write something up quickly. The questions were around an architectural decision for VSAN: one versus multiple VSAN disk groups per host. I have explained the concept of disk groups already in various posts, but in short this is what a disk group is and what the requirements are:
A disk group is a logical container for disks used by VSAN. Each disk group needs at a minimum 1 magnetic disk and can hold a maximum of 7 magnetic disks. Each disk group also requires exactly 1 flash device.
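To make that rule concrete, here is a minimal sketch (Python, purely illustrative and not VSAN code; the function name is made up) that checks a proposed disk group layout against the requirements above:

```python
def valid_disk_group(flash_devices, magnetic_disks):
    """Check a proposed disk group against the rules above:
    exactly 1 flash device and between 1 and 7 magnetic disks."""
    return flash_devices == 1 and 1 <= magnetic_disks <= 7

# Examples using the configurations discussed in this post:
print(valid_disk_group(flash_devices=1, magnetic_disks=2))  # True:  1 x 400GB flash + 2 x 2TB NL-SAS
print(valid_disk_group(flash_devices=1, magnetic_disks=8))  # False: too many magnetic disks
```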
Now, when designing your VSAN cluster, at some point the question will arise: should I have one or multiple disk groups per host? Can and will it impact performance? Can it impact availability?
There are a couple of things to keep in mind when it comes to VSAN if you ask me. The flash device which is part of each disk group is the caching/buffering layer for the disks in that group; without the flash device those disks will also be unavailable. As such, a disk group can be seen as a “failure domain”, because if the flash device fails the whole disk group is unavailable for that period of time. (Don’t worry, VSAN will automatically rebuild all components that are impacted.) Another thing to keep in mind is performance. Each flash device provides a certain number of IOPS, and a higher total number of IOPS could (and probably will) change performance drastically. It should be noted, however, that capacity could still be a constraint. If this all sounds a bit fluffy, let’s run through an example!
- Total capacity required: 20TB
- Total flash capacity: 2TB
- Total number of hosts: 5
This means that per host we will require:
- 4TB of disk capacity (20TB/5 hosts)
- 400GB of flash capacity (2TB/5 hosts)
This could simply result in each host having 2 x 2TB NL-SAS drives and 1 x 400GB flash device. Let’s assume your flash device is capable of delivering 36000 IOPS… You can see where I am going, right? What if I had 2 x 200GB flash devices and 4 x 1TB magnetic disks instead? Typically the lower-capacity drives will do fewer write IOPS; for the Intel S3700, for instance, the difference is about 4000. So instead of 1 x 36000 IOPS it would result in 2 x 32000 IOPS. Yes, that could have a nice impact indeed…
But that is not all: we also get more disk groups, and as a result smaller fault domains. On top of that we end up with more magnetic disks, which means more IOPS per GB of capacity in general. (If a 2TB NL-SAS drive does 80 IOPS, then two 1TB NL-SAS drives will do 160 IOPS: the same capacity in TB, but twice the IOPS if you need it.)
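Pulling the numbers from this example together in a quick back-of-the-envelope sketch (the IOPS figures are the illustrative ones used above, not vendor specifications):

```python
# Back-of-the-envelope sizing for the example above (all figures illustrative).
total_capacity_tb = 20
total_flash_tb = 2
hosts = 5

per_host_capacity_tb = total_capacity_tb / hosts   # 4 TB of magnetic capacity per host
per_host_flash_gb = total_flash_tb * 1000 / hosts  # 400 GB of flash per host

# Option A: 1 disk group -> 1 x 400GB flash + 2 x 2TB NL-SAS
option_a_flash_iops = 1 * 36000
option_a_magnetic_iops = 2 * 80          # ~80 IOPS per NL-SAS drive

# Option B: 2 disk groups -> 2 x 200GB flash + 4 x 1TB NL-SAS
option_b_flash_iops = 2 * 32000          # lower-capacity flash does ~4000 fewer write IOPS each
option_b_magnetic_iops = 4 * 80

print(per_host_capacity_tb, per_host_flash_gb)         # 4.0 400.0
print(option_a_flash_iops, option_b_flash_iops)        # 36000 vs 64000 flash IOPS per host
print(option_a_magnetic_iops, option_b_magnetic_iops)  # 160 vs 320 magnetic IOPS per host
```

Same capacity per host in both cases, but option B roughly doubles the available IOPS at both the flash and the magnetic layer.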
In summary: yes, there is a benefit to having more disk groups per host and, as such, more flash devices…
ThinkBriK says
Hi, and thanks for this interesting point of view. I’m surprised by your 10% flash requirement!
That is not the recommended best practice for sizing flash devices (https://www.vmware.com/files/pdf/products/vsan/VSAN_Design_and_Sizing_Guide.pdf).
You’re probably oversizing your flash devices… In other words: extra bucks => extra disks => extra VSAN disk groups => extra performance 🙂
Duncan says
It is just an example… Nothing more than that.
John Nicholson. says
VMware has the VIP tool for actually assessing how much flash you need. I’ve been in the beta and was pretty impressed with its ability to tell you exactly how much flash per VM you need. A key thing is that you can resize this (add another disk group).
Christian Hansen says
What about fault tolerance?
If FTT is set to 1 for a VM, could there be a risk of losing data if the host containing the two disk groups crashed? Or would the replicas never be placed in two disk groups on the same host?
Duncan Epping says
VSAN is smart enough to ensure that components of the same object that are part of the RAID-1 tree will not reside on the same host, so that is not an issue.
Jon says
What about hypervisor overhead? Would more disk groups increase processing or memory overhead?
Duncan Epping says
There is a small overhead, but it is negligible if you ask me compared to the benefit of doubling your IOPS per GB.
Jon says
Do you have any specific numbers on that overhead? I’m sure you’re right but would be curious to see the stats.
John Nicholson. says
My understanding is that VSAN will throttle CPU and keep it from ever using more than 10% of host CPU resources. Even with multiple IOMeter workers saturating a host, I haven’t personally seen over 3%. Considering all the money you’re saving on expensive arrays, or more hosts, buying a CPU with some extra cores is a small investment (and ECC memory is ~$10 a GB), so I’d consider any serious overhead to be less than a wash.
Duncan says
I don’t have any specific numbers. Cormac has documented the expected available memory, but this doesn’t equal the amount of used resources. http://cormachogan.com/2014/01/15/vsan-part-14-host-memory-requirements/
Frank Lempitsky says
Could I do 4 x 200GB SSDs and 4 x 1.2TB 10K drives in each of 3 hosts? I think that would be a sweet spot for performance. Your thoughts would be appreciated.
Duncan Epping says
Sure, that is an option indeed…
Frank Lempitsky says
Does vSAN still limit you to 1 SSD per disk group? Seems like that is limiting if you have smaller SSD drives. Thanks.
Duncan Epping says
Yes, 1 SSD per disk group, and between 1 and 7 HDDs.
Frank Lempitsky says
OK thanks, one more point of interest. If I have 200GB SSDs and 1.2TB 10K drives I have a couple of options. Which makes more sense?
1. 1 x 200GB SSD + 1 x 1.2TB 10K HDD per disk group, with 4 disk groups per ESX host (three ESX hosts).
2. 1 x 200GB SSD + 2 x 1.2TB 10K HDDs per disk group, with only 2 disk groups per ESX host (three ESX hosts here as well).
I would like to know your thoughts, thanks.
Duncan Epping says
More disk groups and more flash are always better if you ask me! More cache + more failure domains (disk groups).
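As a rough illustration of that trade-off using the drive sizes from Frank’s question (a sketch only: these are raw capacities, ignoring VSAN overhead and the usual practice of sizing cache against consumed rather than raw capacity):

```python
# Rough comparison of the two layouts per ESX host (3 hosts either way).
ssd_gb = 200
hdd_tb = 1.2

# Option 1: 4 disk groups of 1 SSD + 1 HDD each
opt1_disk_groups = 4
opt1_flash_gb = opt1_disk_groups * ssd_gb      # 800 GB of cache per host
opt1_raw_tb = opt1_disk_groups * 1 * hdd_tb    # 4.8 TB raw magnetic capacity per host

# Option 2: 2 disk groups of 1 SSD + 2 HDDs each (two SSDs left unused)
opt2_disk_groups = 2
opt2_flash_gb = opt2_disk_groups * ssd_gb      # 400 GB of cache per host
opt2_raw_tb = opt2_disk_groups * 2 * hdd_tb    # 4.8 TB raw magnetic capacity per host

print(opt1_flash_gb / (opt1_raw_tb * 1000))  # ~0.17 -> ~17% flash-to-raw-capacity ratio
print(opt2_flash_gb / (opt2_raw_tb * 1000))  # ~0.08 -> ~8% flash-to-raw-capacity ratio
```

Same raw capacity either way, but option 1 gives twice the cache and twice the number of failure domains per host.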
Jon says
Let’s say one had two VSAN disk groups in a four-node cluster, but wanted the second disk group to serve as fail-over, which is obviously not possible yet… Would it be interesting to have a VM appliance running from a “vsan-1-failure” datastore disk group, serving as a witness, replicating data (or other duties), maybe even adding an additional storage policy layer for VSAN, and an operations client plugin? Granted there would be overhead, but some resource control could mediate its impact… Thinking aloud… Even if I have the bits wrong, the idea could be neat.
pete says
I’m quite interested in hearing more about caching to RAM (similar to Pernix FVP 2.5 / Atlantis USX) and/or flash in a future release of VSAN, to help eliminate cache degradation (mainly) and obviously provide a performance increase.