I received two questions on the same topic this week, so I figured it would make sense to write something up quickly. The questions were around an architectural decision for VSAN: one versus multiple VSAN disk groups per host. I have explained the concept of disk groups already in various posts, but in short this is what a disk group is and what the requirements are:
A disk group is a logical container for disks used by VSAN. Each disk group needs at a minimum 1 magnetic disk and can hold a maximum of 7 magnetic disks. Each disk group also requires exactly 1 flash device.
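To make that rule concrete, here is a minimal sketch (Python, purely illustrative and not VSAN code; the function name is made up) that checks a proposed disk group layout against the requirements above:

```python
def valid_disk_group(flash_devices, magnetic_disks):
    """Check a proposed disk group against the rules above:
    exactly 1 flash device and between 1 and 7 magnetic disks."""
    return flash_devices == 1 and 1 <= magnetic_disks <= 7

# Examples using the configurations discussed in this post:
print(valid_disk_group(flash_devices=1, magnetic_disks=2))  # True:  1 x 400GB flash + 2 x 2TB NL-SAS
print(valid_disk_group(flash_devices=1, magnetic_disks=8))  # False: too many magnetic disks
```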
Now, when designing your VSAN cluster, at some point the question will arise: should I have one or multiple disk groups per host? Can and will it impact performance? Can it impact availability?
There are a couple of things to keep in mind when it comes to VSAN if you ask me. The flash device which is part of each disk group is the caching/buffering layer for the disks in that group; without the flash device those disks will also be unavailable. As such, a disk group can be seen as a “failure domain”, because if the flash device fails the whole disk group is unavailable for that period of time. (Don’t worry, VSAN will automatically rebuild all components that are impacted.) Another thing to keep in mind is performance. Each flash device provides a certain number of IOPS, and a higher total number of IOPS could (and probably will) change performance drastically. It should be noted, however, that capacity could still be a constraint. If this all sounds a bit fluffy, let’s run through an example!
- Total capacity required: 20TB
- Total flash capacity: 2TB
- Total number of hosts: 5
This means that per host we will require:
- 4TB of disk capacity (20TB/5 hosts)
- 400GB of flash capacity (2TB/5 hosts)
This could simply result in each host having 2 x 2TB NL-SAS drives and 1 x 400GB flash device. Let’s assume your flash device is capable of delivering 36000 IOPS… You can see where I am going, right? What if I had 2 x 200GB flash devices and 4 x 1TB magnetic disks instead? Typically the lower-capacity drives will do fewer write IOPS; for the Intel S3700, for instance, the difference is about 4000. So instead of 1 x 36000 IOPS it would result in 2 x 32000 IOPS. Yes, that could have a nice impact indeed…
But that is not all: we also get more disk groups, and as a result smaller fault domains. On top of that we end up with more magnetic disks, which means more IOPS per GB of capacity in general. (If a 2TB NL-SAS drive does 80 IOPS, then two 1TB NL-SAS drives will do 160 IOPS: the same capacity in TB, but twice the IOPS if you need it.)
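Pulling the numbers from this example together in a quick back-of-the-envelope sketch (the IOPS figures are the illustrative ones used above, not vendor specifications):

```python
# Back-of-the-envelope sizing for the example above (all figures illustrative).
total_capacity_tb = 20
total_flash_tb = 2
hosts = 5

per_host_capacity_tb = total_capacity_tb / hosts   # 4 TB of magnetic capacity per host
per_host_flash_gb = total_flash_tb * 1000 / hosts  # 400 GB of flash per host

# Option A: 1 disk group -> 1 x 400GB flash + 2 x 2TB NL-SAS
option_a_flash_iops = 1 * 36000
option_a_magnetic_iops = 2 * 80          # ~80 IOPS per NL-SAS drive

# Option B: 2 disk groups -> 2 x 200GB flash + 4 x 1TB NL-SAS
option_b_flash_iops = 2 * 32000          # lower-capacity flash does ~4000 fewer write IOPS each
option_b_magnetic_iops = 4 * 80

print(per_host_capacity_tb, per_host_flash_gb)         # 4.0 400.0
print(option_a_flash_iops, option_b_flash_iops)        # 36000 vs 64000 flash IOPS per host
print(option_a_magnetic_iops, option_b_magnetic_iops)  # 160 vs 320 magnetic IOPS per host
```

Same capacity per host in both cases, but option B roughly doubles the available IOPS at both the flash and the magnetic layer.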
In summary: yes, there is a benefit to having more disk groups per host and, as such, more flash devices…
ThinkBriK says
Hi, and thanks for this interesting point of view. I’m surprised by your 10% flash requirement!
That is not the recommended best practice for sizing flash devices (https://www.vmware.com/files/pdf/products/vsan/VSAN_Design_and_Sizing_Guide.pdf).
You’re probably oversizing your flash devices… In other words: extra bucks => extra disks => extra VSAN disk groups => extra performance 🙂
Duncan says
It is just an example… Nothing more than that.
John Nicholson. says
VMware has the VIP tool for actually assessing how much flash you need. I’ve been in the beta and was pretty impressed with its ability to tell you exactly how much flash per VM you need. A key thing is that you can resize this (add another disk group).
Christian Hansen says
What about fault tolerance?
If FTT is set to 1 for a VM, could there be a risk of losing data if the host containing the two disk groups crashed? Or would the replicas never be placed in two disk groups on the same host?
Duncan Epping says
VSAN is smart enough to ensure that components of the same object that are part of the RAID-1 tree will not reside on the same host, so that is not an issue.
Jon says
What about hypervisor overhead? Would more disk groups increase processing or memory overhead?
Duncan Epping says
There is a small overhead, but it is negligible if you ask me compared to the benefit of doubling your IOPS per GB.
Jon says
Do you have any specific numbers on that overhead? I’m sure you’re right but would be curious to see the stats.
John Nicholson. says
My understanding is that VSAN will throttle CPU and keep it from ever using more than 10% of host CPU resources. Even with multiple IOMeter workers saturating a host, I haven’t personally seen over 3%. Considering all the money you’re saving on expensive arrays, or more hosts, buying a CPU with some extra cores is a small investment (and ECC memory is ~$10 a GB), so I’d consider any serious overhead to be less than a wash.
Duncan says
I don’t have any specific numbers. Cormac has documented the expected available memory, but this doesn’t equal the amount of used resources. http://cormachogan.com/2014/01/15/vsan-part-14-host-memory-requirements/
Frank Lempitsky says
Could I do 4 x 200GB SSDs and 4 x 1.2TB 10K drives in each of 3 hosts? I think that would be a sweet spot for performance. Your thoughts would be appreciated.
Duncan Epping says
Sure, that is an option indeed…
Frank Lempitsky says
Does vSAN still limit you to 1 SSD per disk group? Seems like that is limiting if you have smaller SSD drives. Thanks.
Duncan Epping says
Yes, 1 SSD per disk group, and between 1 and 7 HDDs.
Frank Lempitsky says
OK thanks, one more point of interest. If I have 200GB SSDs and 1.2TB 10K drives I have a couple of options. Which makes more sense?
1. 1 x 200GB SSD + 1 x 1.2TB 10K HDD per disk group, with 4 disk groups per ESX host (three ESX hosts).
2. 1 x 200GB SSD + 2 x 1.2TB 10K HDDs per disk group, with only 2 disk groups per ESX host (three ESX hosts here as well).
I would like to know your thoughts, thanks.
Duncan Epping says
More disk groups and more flash are always better if you ask me! More cache + more failure domains (disk groups).
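As a rough illustration of that trade-off using the drive sizes from Frank’s question (a sketch only: these are raw capacities, ignoring VSAN overhead and the usual practice of sizing cache against consumed rather than raw capacity):

```python
# Rough comparison of the two layouts per ESX host (3 hosts either way).
ssd_gb = 200
hdd_tb = 1.2

# Option 1: 4 disk groups of 1 SSD + 1 HDD each
opt1_disk_groups = 4
opt1_flash_gb = opt1_disk_groups * ssd_gb      # 800 GB of cache per host
opt1_raw_tb = opt1_disk_groups * 1 * hdd_tb    # 4.8 TB raw magnetic capacity per host

# Option 2: 2 disk groups of 1 SSD + 2 HDDs each (two SSDs left unused)
opt2_disk_groups = 2
opt2_flash_gb = opt2_disk_groups * ssd_gb      # 400 GB of cache per host
opt2_raw_tb = opt2_disk_groups * 2 * hdd_tb    # 4.8 TB raw magnetic capacity per host

print(opt1_flash_gb / (opt1_raw_tb * 1000))  # ~0.17 -> ~17% flash-to-raw-capacity ratio
print(opt2_flash_gb / (opt2_raw_tb * 1000))  # ~0.08 -> ~8% flash-to-raw-capacity ratio
```

Same raw capacity either way, but option 1 gives twice the cache and twice the number of failure domains per host.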
Jon says
Let’s say one had two VSAN disk groups in a four-node cluster, but wanted the second disk group to serve as fail-over, which is obviously not possible yet… Would it be interesting to have a VM appliance running from a “vsan-1-failure” datastore disk group, serving as a witness, replicating data (or other duties), maybe even adding an additional storage policy layer for VSAN, and an operations client plugin? Granted there would be overhead, but some resource control could mediate its impact… Thinking aloud… Even if I have the bits wrong, the idea could be neat.
pete says
I’m quite interested in hearing more about caching to RAM (similar to Pernix FVP 2.5 / Atlantis USX) and/or flash in a future release of VSAN, to help eliminate cache degradation (mainly) and obviously provide a performance increase.