In part 1 of “Building a hyper-converged platform using VMware technology” I went through the sizing and scaling exercise. To quickly recap: in order to run 100 VMs we would need the following resources (a quick sketch of the math follows the list):
- 100 x 1.5 vCPUs = 150 vCPUs ≈ 30 cores (at a 5:1 vCPU-to-core consolidation ratio)
- 100 x 5 GB = 500GB of memory
- 100 x 50 GB (plus FTT etc) = 11.8 TB of disk space
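For those who prefer to see the math spelled out, here is a minimal sketch of how those totals come together. The 5:1 vCPU-to-core ratio and the FTT=1 mirroring factor are my assumptions based on the numbers above; the 11.8 TB figure also includes some slack/overhead on top of the raw mirrored capacity.

```python
# Minimal sizing sketch for the 100-VM example recapped above.
# The 5:1 vCPU-to-core consolidation ratio and FTT=1 (two copies of every
# object) are assumptions inferred from the numbers in the post.

VMS = 100
VCPU_PER_VM = 1.5
VCPUS_PER_CORE = 5          # assumed: 150 vCPUs -> ~30 cores
MEM_PER_VM_GB = 5
DISK_PER_VM_GB = 50
FTT_COPIES = 2              # FTT=1 keeps two copies of every object

cores = VMS * VCPU_PER_VM / VCPUS_PER_CORE
memory_gb = VMS * MEM_PER_VM_GB
disk_tb = VMS * DISK_PER_VM_GB * FTT_COPIES / 1000

print(f"cores:  ~{cores:.0f}")                # ~30 cores
print(f"memory: {memory_gb} GB")              # 500 GB
print(f"disk:   {disk_tb:.1f} TB mirrored")   # 10.0 TB; the 11.8 TB above adds slack/overhead
```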
From a storage perspective 11.8 TB is not a huge amount, 500 GB of memory can easily fit in a single host today, and 30 cores… well, maybe not easily in a single host, but it is not a huge requirement either. What are our options? Let’s give an example of some server models that fall into the category we are discussing:
- SuperMicro Twin Pro – 2U chassis with 4 nodes. Per node: capable of handling 6 * 2.5″ drives and on-board 10GbE. Supports the Intel Xeon E5-2600 family and up to 1TB of memory
- SuperMicro is often used by startups, especially in the hyper-converged space, but hybrid storage vendors like Tintri use their hardware as well. Hey SuperMicro Marketing Team, this is something to be proud of… SuperMicro probably powers more infrastructure startups than anyone else!
- Note that you can select from 3 different disk controller types: the LSI 3108, the LSI 3008 and the Intel C600. I highly recommend the LSI controllers!
- HP SL2500t – 2U chassis with 4 nodes. Per node: capable of handling 6 * 2.5″ or 3 * 3.5″ drives, and FlexibleLOM 10GbE can be included. Supports the Intel Xeon E5-2600 family and up to 512GB of memory
- You can select from the various disk controllers HP offers, but do note that today only a limited number of controllers are certified.
- Many probably don’t care, but the HP kit just looks awesome 🙂
- Dell C6000 series – 2U chassis with 4 nodes. Per node: capable of handling 6 * 2.5″ or 3 * 3.5″ drives. Supports the Intel Xeon E5-2600 family and up to 512GB of memory
- Note there is no on-board 10GbE or “LOM” type of solution, you will need to add a 10GbE PCIe card.
- Dell offers 3 different disk controllers, including the LSI 2008 series. Make sure to check the HCL.
First thing to note here is that all of the configurations above come with 4 nodes by default; yes, you can order them with fewer, but personally I wouldn’t recommend that. The strange thing is that in order to get configuration details for the Dell and HP you need to phone them up, so let’s take a look at the SuperMicro Twin Pro as there are details to be found online. What are our configuration options? Plenty, I can tell you that. CPUs ranging from low-end quad-core 1.8 GHz parts up to twelve-core 2.7 GHz Intel CPUs. Memory configurations ranging from 2GB DIMMs to 32GB DIMMs, including the various speeds. Physical disks ranging from 250GB 7200 RPM SATA Seagate drives to 1.2TB 10k RPM SAS Hitachi drives. Unlimited possibilities, and that is probably where it tends to get more complicated.
One thing I do want to stress though is the drive type used. If you look at the various drive types it is very tempting to buy the cheap, high-capacity SATA drives. I mean, you can buy 4TB drives these days for a couple of bucks. One thing to realize though is that there could be a performance impact. Just look at the following (a small sketch after the list shows the same comparison):
- 2 x 4TB 7200 RPM SATA drive = 8TB Capacity / ~150 IOps
- 16 x 500GB 10K RPM SAS drive = 8TB Capacity / ~2400 IOps
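If you want to play with these numbers yourself, here is a small sketch. The per-drive IOps values (~75 for a 7200 RPM SATA drive, ~150 for a 10K RPM SAS drive) are the usual ballpark rules of thumb, not vendor specifications.

```python
# Rough capacity vs. IOps comparison of the two drive options listed above.
# Per-drive IOps figures are ballpark rules of thumb, not vendor specs.

drive_options = [
    {"label": "2 x 4TB 7200 RPM SATA", "count": 2, "size_tb": 4.0, "iops_per_drive": 75},
    {"label": "16 x 500GB 10K RPM SAS", "count": 16, "size_tb": 0.5, "iops_per_drive": 150},
]

for option in drive_options:
    capacity_tb = option["count"] * option["size_tb"]
    total_iops = option["count"] * option["iops_per_drive"]
    print(f"{option['label']}: {capacity_tb:.0f} TB capacity, ~{total_iops} IOps")
```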
It is something I wrote about before and it was a big point of discussion during various Twitter conversations and VMUGs: which type of drive you should be using fully depends on your use case. I know for my configurations I am being conservative and am taking the SAS route. Although more expensive, it will help destage data from SSD faster, and more importantly it will help read performance when the read has to come from HDD because the requested block is not in SSD/cache. For this configuration I am assuming SAS drives are used, and we know we need the following:
- 30 Cores
- 500 GB of memory
- 11.8 TB of disk space
I want to take a 1 node failure into account and make sure all my VMs get the resources they need even after a failure, except for memory, where I will allow some overcommitment in the case of a failure. To ensure sufficient resources I will divide the total amount of resources by 3 and equip all 4 hosts with the same config. This means I need 10 cores per host, 166GB of memory and 4TB of disk capacity. As stated, I will slightly overcommit on memory in the case of a failure, and instead of going for 166+ GB I will go with 128GB, as I am counting on vSphere’s smart memory reclamation techniques to do their job. A quick sketch of this per-host math follows below.
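A minimal sketch of that per-host math, assuming 4 hosts and a single host failure to tolerate (the 10% flash-to-capacity guideline applied at the end is the rule of thumb covered in the next paragraph):

```python
# Per-host sizing sketch: 4 hosts, tolerate 1 host failure, so the remaining
# 3 hosts must carry the full load. Cluster totals are the numbers from above.

HOSTS = 4
HOST_FAILURES_TO_TOLERATE = 1
usable_hosts = HOSTS - HOST_FAILURES_TO_TOLERATE   # 3

total_cores = 30
total_memory_gb = 500
total_disk_tb = 11.8

cores_per_host = total_cores / usable_hosts            # 10 cores
memory_per_host_gb = total_memory_gb / usable_hosts    # ~166 GB (rounded down to 128 GB in the post)
disk_per_host_tb = total_disk_tb / usable_hosts        # ~4 TB
flash_per_host_gb = disk_per_host_tb * 1000 * 0.10     # ~400 GB of SSD, i.e. 10% of capacity

print(f"per host: {cores_per_host:.0f} cores, ~{memory_per_host_gb:.0f} GB RAM, "
      f"~{disk_per_host_tb:.1f} TB disk, ~{flash_per_host_gb:.0f} GB flash")
```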
Let’s not forget the flash resources, which are recommended to be 10% of the disk capacity; this results in 400GB of flash resources per host. Now let me spec that config out using a nice online configuration tool (there are various out there, just google it). This is the result:
- Supermicro SuperServer 2027PR-HC0TR – 2U Twin2 (4 nodes)
- 24x SATA/SAS – Dual 10-Gigabit Ethernet – LSI 3008 12G SAS – 2000W Redundant
- 8 *
- 32 *
- 20 *
- 4 * Intel S3700 400GB SSD
- ~ $28,500
In this case, as you can see, I even have more CPU power than needed in terms of cores, but I slightly under-provisioned from a memory perspective. As stated, I am counting on TPS to do its job in failure scenarios; of course I could have gone with a different memory configuration like 192GB, but the config tool didn’t provide me that option. For an additional $6,000 you could double the memory in this configuration though. The SSDs were manually added and selected because the ones listed were not what I hoped for; I added the price of the SSDs to the total. There you have it: a nice building block for your datacenter or company.
I know this is a lot to read and digest and in some cases it may sound complicated. It should be noted that VMware’s HCL and the concept of VSAN Ready Nodes should simplify this exercise over time. Nevertheless, for many this would probably be a one-time exercise and just a matter of repeating what was decided on. Figuring out what your requirements are and what your estate looks like today is probably the most complex part for the majority of people out there, and this is something you will need to do both for a “pre-packaged” solution and when you are building your own solution. All in all, I think it is nice to have a choice.
Fabio Brizzolla says
Hey Duncan: what about Dell’s VRTX?
Duncan Epping says
I don’t know if Dell will test/certify them
John Nicholson. says
VRTX has a built-in SAN. It’s not really what I’d see you build VSAN off of. (Maybe use Virsto to accelerate?)
David Pasek says
To be more accurate, the VRTX has a shared RAID controller that allows a single LUN to be shared among several physical server sleds inside the VRTX chassis. Therefore the VRTX can be used for a vSphere cluster without VSAN. It’s also the reason the VRTX is not the best choice for VSAN. Dell is coming with another product excellently suited for VSAN. Just wait approx. 2 months. I cannot tell more 😉
Yuri Semenikhin says
Hi Duncan, what about SSD redundancy per node? As I understand it you assume 1 SSD per node without RAID, so if one SSD fails, will the whole node be lost? And the same for the 900GB SAS drives: no hot spare (HP), only RAID 5?
duncan says
If the SSD fails then the disk group will be gone for the time the SSD is gone. There is no solution for that today. Not sure I understand your SAS drive question.
Yuri Semenikhin says
Are the SAS drives without a hot spare (HP), only RAID redundancy? And if the SSD disk group is gone, will the node continue to operate without SSD, using only the SAS disk group?
Duncan Epping says
You completely lost me. I have no clue what RAID has got to do with any of this. Like I described in other posts, VSAN takes care of resiliency for spindles by mirroring objects to multiple locations. The SSD is the write buffer and read cache for a disk group. When your SSD dies, your disk group will be unavailable for that host until the SSD is replaced.
Yuri Semenikhin says
OK, now it’s clear.
Thx
Mario says
Hi Duncan,
let’s assume I would use 3 of the HP blocks (each with 4 nodes).
If I don’t want a single node to be a single point of failure, I guess the best would be to build a dedicated disk group per node in each block (let’s say 4x 1 SSD + 5 HDD). So now I can lose a single node without a problem, but what about the whole block itself? Will there be any feature that lets us make sure the mirrored object will be placed on a node in another block? I think that’s quite the same functionality that a vMSC setup would require.
Without this kind of feature/function I can’t imagine using the Dell or HP solution you listed above, because if a whole block is down there is a big chance that many VMs can’t be restarted, since the offline block holds the original + the mirrored object.
Cheers,
Mario
duncan says
That could happen indeed, although these systems typically have redundant power supplies and no shared backplane like a blade chassis has. The chances of those failing are limited.
I agree this type of functionality would be welcome, being able to specify failure domains. I have requested this feature and our engineering team is looking to include some form (probably API driven) for the GA release. I will validate it this week for you as I am in Palo Alto anyway.
Mario says
Cool, that would be awesome!
Doug Youd says
Looking at building out the supermicro option for my lab, but the 2027PR-HC0TR is not currently on the vSphere HCL?
http://www.vmware.com/resources/compatibility/vcl/result.php?search=2027PR&searchCategory=all
Assuming it will probably work… but worth noting it’s not specifically supported?