I have had this question so many times that I figured I would write an article about it: how do you calculate what your Virtual SAN datastore size should be? Ultimately this determines which kind of server hardware you can use, which disk controller you need and which disks… So it is important that you get it right. I know the VMware Technical Marketing team is developing collateral around this topic; when that has been published I will add a link here. Let's start with a quote by Christian Dickmann, one of our engineers, as it is the foundation of this article:
In Virtual SAN your whole cluster acts as a hot-spare
Personally I like to work top-down, meaning that I start with an average for virtual machines or a total combined number. Let's go through the exercise with an example, which makes it a bit easier to digest.
Let's assume the average VM disk size is 50GB. On average the VMs have 4GB of memory provisioned. And we have 100 virtual machines in total that we want to run on a 4-host cluster. Based on that info the formula would look something like this:
(total number of VMs * average VM size) + (total number of VMs * average VM memory size) = total capacity required
In our case that would be:
(100 * 50GB) + (100 * 4GB) = 5400 GB
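For those who prefer to script this, the first step could be sketched in Python roughly as follows. The function and parameter names are made up for illustration and are not part of any VMware tooling; the sketch assumes all sizes are given in GB.

```python
def raw_capacity_gb(num_vms, avg_disk_gb, avg_mem_gb):
    """Capacity needed for one copy of every VM, before replicas and overhead.

    Hypothetical helper: disk capacity plus memory capacity, summed over
    all VMs, exactly as in the formula above.
    """
    return (num_vms * avg_disk_gb) + (num_vms * avg_mem_gb)


# Example from the article: 100 VMs, 50GB disk and 4GB memory on average.
print(raw_capacity_gb(100, 50, 4))  # 5400
```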
So that is it? Well, not really. Like every storage / file system there is some overhead, and we will need to take the “failures to tolerate” into account. If I set my “failures to tolerate” to 1 then I would have 2 copies of my VMs, which means I need 5400 GB * 2 = 10800 GB. Personally I also add an additional 10% in disk capacity to ensure we have room for things like metadata, log files, vmx files and some small snapshots when required. Note that VSAN by default provisions all VMDKs as thin objects (note that swap files are thick, Cormac explained that here), so there should be room available regardless. Better safe than sorry though. This means that 10800 GB actually becomes 11880 GB. I prefer to round this up to 12TB. The formula I have been using thus looks as follows:
(((Number of VMs * Avg VM size) + (Number of VMs * Avg mem size)) * (FTT + 1)) + 10%
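As a minimal sketch, the complete formula could be written in Python like this. The function name and the `ftt` and `overhead_pct` parameters are my own naming for illustration, with defaults matching the example in this article (failures to tolerate of 1, 10% extra). Sizes are assumed to be in GB.

```python
def datastore_capacity_gb(num_vms, avg_disk_gb, avg_mem_gb, ftt=1, overhead_pct=10):
    """Total Virtual SAN capacity in GB, following the article's formula.

    Hypothetical helper: (disk + memory for all VMs) * (FTT + 1) copies,
    plus a percentage on top for metadata, logs, vmx files and snapshots.
    """
    base = (num_vms * avg_disk_gb) + (num_vms * avg_mem_gb)
    replicated = base * (ftt + 1)  # FTT=1 means 2 copies of every object
    return replicated * (100 + overhead_pct) / 100


# Example from the article: 100 VMs, 50GB disk, 4GB memory, FTT=1, 10% extra.
print(datastore_capacity_gb(100, 50, 4))  # 11880.0
```

Rounding the result up to a convenient number, such as 12TB in the example, is left as a manual step on purpose.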
Now the next step is to see how you divide that across your hosts. I mentioned we would have 4 hosts in our cluster. We have two options: we create a cluster that can re-protect itself after a full host failure, or we create a cluster that cannot. Just to clarify, in order to have 1 host of spare capacity available we will need to divide the total capacity by 3 instead of 4. Let's look at those two options, and what the impact is:
- 12TB / 3 hosts = 4TB per host (for each of the 4 hosts)
  - Allows you to re-protect (sync/mirror) all virtual machine objects even when you lose a full host
  - All virtual machines will maintain availability levels when doing maintenance
  - Requires an additional 1TB per host!
- 12TB / 4 hosts = 3TB per host (for each of the 4 hosts)
  - If all disk space is consumed, when a host fails virtual machines cannot be “re-protected” as there would be no capacity to sync/mirror the objects again
  - When entering maintenance mode data availability cannot be maintained as there would be no room to sync/mirror the objects to another disk
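The two options above can be compared with a small Python sketch. The function name and the `spare_hosts` parameter are hypothetical names chosen for this illustration; the idea is simply to divide the total capacity by the number of hosts minus the hosts you want to be able to lose.

```python
def per_host_capacity_tb(total_tb, num_hosts, spare_hosts=0):
    """Capacity each host needs, in TB.

    Hypothetical helper: with spare_hosts=1 the total is divided over one
    host fewer, so the cluster can re-protect after a full host failure.
    """
    usable_hosts = num_hosts - spare_hosts
    if usable_hosts < 1:
        raise ValueError("need at least one host after removing spare hosts")
    return total_tb / usable_hosts


with_spare = per_host_capacity_tb(12, 4, spare_hosts=1)  # 12TB / 3 hosts
without_spare = per_host_capacity_tb(12, 4)              # 12TB / 4 hosts
print(with_spare, without_spare, with_spare - without_spare)  # 4.0 3.0 1.0
```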
Now if you look at the numbers, we are talking about an additional 1TB per host. With 4 hosts, and let's assume we are using 2.5″ SAS 900GB Hitachi drives, that would be 4 additional drives, at a cost of around 1000 per drive. When using 3.5″ SATA drives the cost would be even lower. Although this is just a number I found on the internet, it does illustrate that the cost of providing additional availability could be small. Prices could differ though, depending on the server brand used. But even at double the cost, I would go for the additional drive and as such additional “hot spare capacity”.
To make life a bit easier I created a calculator. I hope this helps everyone who is looking at configuring hosts for their Virtual SAN based infrastructure.
fvanrooye says
Duncan in your first example:
6TB / 3 hosts = 2TB per host should it be:
6TB / 4 hosts = 2TB instead?
Duncan Epping says
Depends on how you look at it indeed, I took 1 host out of the equation hence I say divide by 3 instead of 4.
fvanrooye says
Looking at it long enough I realized that was what you were doing. I wrote up this quick formula from yours above (it can probably be improved on) that I can work forward and backward to determine the best fit for a cluster.
X = (total number of VMs * average VM size) + (total number of VMs * average VM memory size) + 10%
Where X is total capacity required
The Per Host calculation:
Option #1:
Storage per Host Size = (x + (x / (total number of hosts))) / (total number of hosts)
Option #2:
Storage per Host Size = x / (total number of hosts)
Duncan Epping says
Please note I made a change to the article. Somehow during editing I forgot to add a section about “failures to tolerate”, which means the equation was incorrect.
Doug Baer says
Unless I am missing something, this calculation does not seem to take into account any redundancy (VSAN replicas). So, if the average size of my machine is 54 GB (including the average swap file), and I want to maintain 1 replica — if I need VSAN to “re-protect,” I need the blocks somewhere so that that can happen — I will need 108 GB space per VM, right?
Duncan Epping says
Yes, I noticed, Doug. I left out a whole paragraph when I copied it over… Stupid of me. I have now incorporated it into the article.