Building a hyper-converged platform using VMware technology part 2

In part 1 of “Building a hyper-converged platform using VMware technology” I went through the sizing and scaling exercise. To recap briefly: in order to run 100 VMs we would need the following resources:

  • 100 x 1.5 vCPUs = ~30 cores
  • 100 x 5 GB = 500GB of memory
  • 100 x 50 GB (plus FTT etc) = 11.8 TB of disk space

From a storage perspective 11.8 TB is not a huge amount, 500 GB of memory can easily fit in a single host today, and 30 cores… well, maybe not easily in a single host, but it is not a huge requirement either. What are our options? Let’s look at some server models that fall into the category we are discussing:
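The arithmetic above can be sketched in a few lines of Python. Note that the function and parameter names are mine, the 5:1 vCPU-to-core consolidation ratio is an assumption chosen to land on ~30 cores, and the simple FTT-plus-slack disk formula gets close to, but not exactly on, the 11.8 TB figure from part 1 (which also accounts for extras such as swap space):

```python
def size_cluster(vms, vcpus_per_vm, mem_gb_per_vm, disk_gb_per_vm,
                 vcpus_per_core=5, ftt=1, slack=0.10):
    """Rough cluster sizing: returns (cores, memory in GB, disk in TB)."""
    cores = vms * vcpus_per_vm / vcpus_per_core        # consolidation ratio (assumed 5:1)
    mem_gb = vms * mem_gb_per_vm
    # With FTT=1 every object is stored twice; add ~10% slack on top.
    disk_tb = vms * disk_gb_per_vm * (ftt + 1) * (1 + slack) / 1000
    return cores, mem_gb, disk_tb

cores, mem, disk = size_cluster(100, 1.5, 5, 50)
print(cores, mem, disk)  # 30 cores, 500 GB, ~11 TB
```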

  • SuperMicro Twin Pro – 2U chassis with 4 nodes. Per node: capable of handling 6 * 2.5″ drives, with on-board 10GbE. Supports the Intel Xeon E5-2600 family and up to 1TB of memory
    • SuperMicro is often used by startups, especially in the hyper-converged space, but hybrid storage vendors like Tintri also use their hardware. Hey SuperMicro marketing team, this is something to be proud of… SuperMicro probably powers more infrastructure startups than anyone else!
    • Note that you can select from 3 different disk controller types: the LSI 3108, the LSI 3008 and the Intel C600. I highly recommend the LSI controllers!
  • HP SL2500t – 2U chassis with 4 nodes. Per node: capable of handling 6 * 2.5″ or 3 * 3.5″ drives, and FlexibleLOM 10GbE can be included. Supports the Intel Xeon E5-2600 family and up to 512GB of memory
    • You can select from the various disk controllers HP offers, but do note that only a limited number of controllers are certified today.
    • Many probably don’t care, but the HP kit just looks awesome :)
  • Dell C6000 series – 2U chassis with 4 nodes. Per node: capable of handling 6 * 2.5″ or 3 * 3.5″ drives. Supports the Intel Xeon E5-2600 family and up to 512GB of memory
    • Note there is no on-board 10GbE or “LOM” type of solution, you will need to add a 10GbE PCIe card.
    • Dell offers 3 different disk controllers, including the LSI 2008 series. Make sure to check the HCL.

First thing to note here is that all of the configurations above come with 4 nodes by default. Yes, you can order them with fewer, but personally I wouldn’t recommend that. The strange thing is that in order to get configuration details for the Dell and HP you need to phone them up, so let’s take a look at the SuperMicro Twin Pro, as details can be found online. What are our configuration options? Plenty, I can tell you that. CPUs ranging from low-end quad-core 1.8GHz up to twelve-core 2.7GHz Intel CPUs. Memory configurations ranging from 2GB DIMMs to 32GB DIMMs, including the various speeds. Physical disks ranging from 250GB 7200 RPM SATA Seagate to 1.2TB 10k RPM SAS Hitachi drives. Unlimited possibilities, and that is probably where it tends to get more complicated. [Read more...]

Building a hyper-converged platform using VMware technology part 1

I have been working on a slidedeck lately that explains how to build a hyper-converged platform using VMware technology. Of course it is heavily focused on Virtual SAN, as that is one of the core components in the stack. I created the slidedeck based on discussions I have had with various customers and partners who were looking to architect a platform for their datacenters that they could easily repeat. A platform which had a nice form factor and allowed them to scale out and up. Something that could be used in a full-size datacenter, but also in smaller SMB-type environments or even ROBO deployments.

I guess it makes sense to start with explaining what hyper-converged means to me, although I already wrote various articles on this topic a while back. I explicitly say “to me” as I am sure many folks will have a different opinion on this. A hyper-converged platform is an appliance type of solution where a single box provides a platform for virtual machines. This box typically contains multiple generic x86 hosts (trying to avoid using the word commodity) on which a hypervisor is installed, local storage which is aggregated into a large shared pool, and network ports. Note that typically network switching is not included in the box itself, well except for virtual switches. In order to aggregate storage into a large shared pool an additional piece of software is required. Typical examples of hyper-converged platforms out there today are Nutanix, SimpliVity and Pivot3.

The question then arises: if these are “just” x86 boxes with hypervisors installed and storage software, what are the benefits over a regular environment? Those benefits, in my opinion, are:

  • Time to market is short, < 4hrs to install / deploy (probably much faster for the majority)
  • Ease of management and integration
  • Scale-out, both capacity- and performance-wise
  • Typically more affordable (results will vary)

Sounds like a done deal, right? Easier, cheaper and faster… It is fair to say that these are great solutions for many companies, as they provide you with one throat to choke. By that I mean it is a single-SKU offering which, in most cases, includes a single point of contact for support. The only downside I have heard from some partners and customers is that these solutions are typically tied to hardware and specific configurations, which is not always the same as your preferred supplier and probably not the exact configuration you prefer. This could lead to operational challenges when it comes to updating / patching, which probably doesn’t make the operational team happy. On top of that there is the “trust” issue. Some people swear by HP and would never ever want to touch any other brand, while others won’t come close to it. That is a matter of experience and personal taste, I guess. Where is all of this leading? Well, this is, in my opinion, where Virtual SAN / VSAN comes in. [Read more...]

Startup News Flash part 12

The first edition of the Startup News Flash for 2014. I expect this year to be full of announcements, new rounds of funding, new products, new features and new companies. There are various startups planning to come out of stealth this year, all playing in the storage / flash space, so make sure to follow this series!

On Tuesday the 14th of January Nutanix announced a new round of funding. The Series D financing is co-led by Riverwood Capital and SAP Ventures, and the total amount is $101 million. The company has now raised a total of $172.2 million in four rounds of funding and has been valued at close to $1 billion. Yes, that is huge. Probably one of the most successful startups of the last couple of years. Congrats to everyone involved!

Tintri announced a rather aggressive program. The Register reported it here, and it is all about replacing NetApp systems with Tintri systems. In short: “The “Virtualize More with 50% Less” Program offers 50% storage capacity and rack space savings versus currently installed NetApp FAS storage to support deployed virtualization workloads”. I guess it is clear what kind of customers they are going after and who their primary competition is. Of course there is a list of requirements and constraints, which the Register already outlined nicely. If you are looking to replace your current NetApp storage infrastructure I guess this could be a nice offer, or a nice way to get a bigger discount. Either way, you win.

SSD and PCIe flash devices are king these days, but SanDisk is looking to change that with the announcement of the availability of the ULLtraDIMM. The ULLtraDIMM is a combination of Diablo’s DDR3 translation protocol and SanDisk’s flash and controllers on top of a nice DIMM. Indeed, it doesn’t get closer to your CPU than straight on your memory bus. By the looks of it IBM is one of the first vendors to offer it, as they recently announced that the eXFlash DIMM is an option for its System x3850 and x3950 X6 servers, providing up to 12.8TB of flash capacity. Early benchmarks showed write latency around 5-10 microseconds! I bet half the blogosphere just raised their hands to give this a go in their labs!


Virtual SAN Datastore Calculator

Over a week ago I wrote an article around how to size your Virtual SAN Datastore. I included an equation to help everyone who is going through the same exercise. I figured I should be able to make life even easier by creating a simple Virtual SAN Datastore calculator based on this equation.

Please note that I’ve filled out the form with some default values; please change these as required. Although various people have tested the VSAN datastore calculator for me, I provide no guarantees. If anything is broken or you have questions, please let me know by leaving a comment.

Some short explanation may be useful:

  • I have taken the best practice of 10% flash-to-disk capacity into account and provide both the old and the new rule of thumb (“full capacity” vs “10% of anticipated used capacity before FTT”) for educational purposes.
  • I have taken 10% slack space (my recommendation) into account
  • Per host capacity is provided, both with the option to rebuild after failure and without that option. More details around that can be found here.
  • I have added some flash performance details; these are taken from the 2 Million IOPS blog post and are just an indication.
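As a rough sketch of the logic described above (simplified: the function and parameter names are mine, and I treat slack as a simple 10% uplift rather than the exact formula behind the calculator):

```python
def vsan_datastore(used_tb, ftt=1, slack=0.10):
    """Simplified VSAN datastore sizing: replicas, slack and flash rules of thumb."""
    raw_tb = used_tb * (ftt + 1)          # each object is stored FTT+1 times
    datastore_tb = raw_tb * (1 + slack)   # keep ~10% slack space free
    flash_old_tb = 0.10 * raw_tb          # old rule: 10% of full capacity
    flash_new_tb = 0.10 * used_tb         # new rule: 10% of used capacity, before FTT
    return datastore_tb, flash_old_tb, flash_new_tb

datastore, flash_old, flash_new = vsan_datastore(5.0)
```

For 5 TB of anticipated used capacity with FTT=1 this works out to roughly an 11 TB datastore, with the old flash rule asking for 1 TB of flash and the new rule for 0.5 TB.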

Have fun, Go VSAN!

** updated for GA numbers **


Virtual SAN Read IO – cache / buffer / spindles

I had a question around how Virtual SAN read IO is handled when data can be anywhere: read cache, write buffer, or disk. On VMTN one of the engineers recently explained this, so I figured I would create a quick diagram to illustrate it. Basically, how it works is that VSAN will check the read cache; if the block that needs to be read is not available in the read cache, it will check whether the block is in the write buffer or on disk. Simple, right?
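In (hypothetical) code, the lookup order described above looks something like this; the names are mine and this is of course a sketch, not the actual VSAN implementation:

```python
def read_block(block_id, read_cache, write_buffer, disk):
    """Sketch of the VSAN read path: read cache first, then write buffer, then disk."""
    if block_id in read_cache:
        return read_cache[block_id], "read cache"
    if block_id in write_buffer:
        return write_buffer[block_id], "write buffer"
    return disk[block_id], "magnetic disk"

# Mirroring the diagram: block 1 sits in the read cache, block 2 only on disk.
read_cache = {1: "data-1"}
write_buffer = {}
disk = {1: "data-1", 2: "data-2"}
print(read_block(1, read_cache, write_buffer, disk))  # served from the read cache
print(read_block(2, read_cache, write_buffer, disk))  # served from magnetic disk
```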

In the scenario I drew below, two blocks need to be read. Block 1 is actively served by ESXi-01 and block 2 is actively served by ESXi-03. In the case of ESXi-01 the block resides in the read cache, so it is read from the cache. In the case of ESXi-03 it is neither in the read cache nor in the write buffer, hence it is read from the magnetic disks. Do note that this is a single virtual machine, so reads are being served from 2 hosts, and depending on which host is actively serving IO for a given block, the block can reside in that host’s read cache. The host which is not actively serving IO for that block will not place the block in its read cache! (Of course, if the host which is actively serving IO for a block fails, the other host will take over.)

I hope that helps.