
Yellow Bricks

by Duncan Epping


virtual san

Which disk controller to use for vSAN

Duncan Epping · Sep 28, 2017 ·

I have many customers going through the plan and design phase for implementing a vSAN based infrastructure. Many of them have conversations with OEMs, and this typically results in a set of recommendations in terms of which hardware to purchase. One recurring theme is the question of which disk controller a customer should buy. The typical recommendation seems to be the beefiest disk controller on the list. I wrote about this a while ago as well, and want to re-emphasize my thinking. Before I do, I understand why these recommendations are being made. Traditionally, with local storage devices, selecting the high-end disk controller made sense: it provided many of the options you needed for decent performance and for availability of your data. With vSAN, however, this is not needed, as all of this is provided by our software layer.

When it comes to disk controllers my recommendation is simple: go for the simplest device on the list that has a good queue depth. Just to give an example, the Dell H730 disk controller is often recommended, but if you look at the vSAN Compatibility Guide you will also see the HBA330. The big difference between these two is the RAID functionality and the cache offered on the H730. Again, this functionality is not needed for vSAN, so by going for the HBA330 you will save money. (For HP I would recommend the H240 disk controller.)

Having said that, I would at the same time recommend customers to consider NVMe for the caching tier instead of SAS or SATA connected flash. Why? Well, for the caching layer it makes sense to avoid the disk controller altogether and place the flash as close to the CPU as you can get, for low latency and high throughput. In other words, invest the money you save on the more expensive disk controller in NVMe connected flash for the caching layer.

#STO1490BU: vSAN Vision – The Future of HCI

Duncan Epping · Aug 28, 2017 ·

I am not able to attend many sessions, but I wanted to add this one to my schedule. This session was hosted by Lee Caswell (VP Products, Storage and Availability, VMware), Christos Karamanolis (CTO, Storage and Availability, VMware) and Dean Perrine, VP of Technical Solutions at Fox.

Lee Caswell kicked off by talking about customer adoption and some of the fundamentals of hyper-convergence and storage management. I think the numbers Lee spoke about speak for themselves: an expected market size of 10.7 billion USD for HCI in 2021, growing 3x faster than traditional storage systems. The key reason for this is the simplification of operations and architecture, and that is also the reason we just hit 10k customers.

vSAN had 2 major releases in the past year, with some “firsts” during the past 12 months, like being the first in HCI to support Intel Optane NVMe SSDs and adding HPE Synergy support. Today Lenovo released a new integrated HCI system called ThinkAgile VX. ServeTheHome has a great article on it, so make sure to read it. What is the benefit over a ReadyNode? For instance, all support comes from Lenovo and it is “appliance style”.

Jeff Hunter was pulled up on stage to demo vSAN and its power in terms of performance and ease of use. Jeff moved a VM from NFS to vSAN and showed what the performance increase was. More important in my opinion was how easy it was to move the VM by leveraging Policy Based Management. (Change the policy and the correct datastore is presented, vSAN in this case.)

Next was an announcement around “licensing”: a single socket solution, 3 servers including all licenses, for less than 25k. A great price point from which hopefully many SMBs and enterprises that need a ROBO/Edge solution can benefit.

Dean Perrine was up next and discussed what Fox is using vSAN for and why. They replaced their NetApp FlexPod environment with Cisco UCS C220 servers running vSAN. The main reason for replacing the FlexPod was the need to reduce complexity and lower power consumption while increasing performance. Fox had problems around database performance, and it was critical to address this. The environment is used to run mission-critical video applications like their broadcasting system and channels like Fox Sports. In other words, events like the Super Bowl are hosted on vSAN as the infrastructure. It currently runs across multiple sites, leveraging vSAN, NSX and SRM to provide the resiliency and flexibility needed to run important workloads like this in a highly resilient fashion.

Up next was Christos Karamanolis, so it was going to get technical. First up was performance: the most critical requirement for enterprise customers is predictable performance. Christos showed a benchmark that the vSAN team ran, a Stock Brokerage (TPC-E like) workload. It showed consistent latency below a millisecond across tests and database sizes. As Christos emphasized, this is very important for customers like Fox, as large latency spikes could easily disrupt the broadcasting of large events.

Christos next demoed the Performance Diagnostics feature that is part of vSAN 6.6.1. If you haven’t seen it yet, watch my demo on YouTube. In short, the feature analyzes benchmarks and provides tips and hints on how to improve your benchmark or your vSAN configuration from a policy, software or hardware point of view.

Next Christos discussed vSAN usage for VMware Cloud on AWS. What was interesting here is what Christos mentioned around scaling: independently scaling vSAN / storage from compute. He didn’t add much detail, so I cannot share much more than this; I will leave it up to your imagination.

Storage for Cloud Native Apps was up last. What if you have Kubernetes or are using Docker? Can you leverage vSAN or something else? Meet Project Hatchway, which is a combination of the Docker Volume Service and the vSphere Cloud Provider for Kubernetes: drivers for storage, natively integrated for instance with Kubo. You leverage Policy Based Management to control, manage and monitor the persistent storage for your containerized workloads. There’s a ton of detail to be found on StorageHub. Something else that will be introduced to optimize vSAN for workloads like Cassandra / MongoDB is the ability to have “storage affinity” for your workloads. In other words, data is co-located with the VM and we will not move the VM around, as locality is key for these workloads. You can also imagine these workloads being deployed with a low “failures to tolerate” level, as availability is provided by the app.

And that was it. Great session, thanks for that. Looking forward to sitting in on some of the other “vision/futures” vSAN related sessions.

vSphere 6.5 U1 is out… and it comes with vSAN 6.6.1

Duncan Epping · Jul 28, 2017 ·

vSphere 6.5 U1 was released last night. It has some cool new functionality in there as part of vSAN 6.6.1 (I can’t wait for vSAN 6.6.6 to ship ;-)). There are a lot of fixes of course in 6.6.1 and U1, but as stated, there’s also new functionality:

  • VMware vSphere Update Manager (VUM) integration
  • Storage Device Serviceability enhancement
  • Performance Diagnostics in vSAN

So for those using vSAN 6.2 who upgraded to vSphere 6.0 U3, here’s your chance to upgrade and get all of the vSAN 6.5 functionality, and more!

The VUM integration is pretty cool if you ask me. First of all, when there’s a new release the Health Check will call it out from now on. On top of that, when you go to VUM, things like async drivers will also be taken into consideration. Where you would normally have to slipstream drivers into the image and make that image available through VUM, we now ensure that the image used is vSAN Ready! In other words, as of vSphere 6.5 U1, Update Manager is fully aware of vSAN and integrated with it as well. We are working hard to bring all vSAN Ready Node vendors on board. (With Dell, Supermicro, Fujitsu and Lenovo leading the pack.)

Then there’s this feature called “Storage Device Serviceability enhancement”. Well this is the ability to blink the LEDs on specific devices. As far as I know, in this release we added support for HP Gen 9 controllers.

And last but not least: Performance Diagnostics in vSAN. I really like this feature. Note that this is all about analyzing benchmarks; it is not (yet?) about analyzing steady state. So in this case you run your benchmark, preferably using HCIBench, and then analyze it by selecting a specific goal. Performance will be analyzed using “cloud analytics”, and at the end you will get various recommendations and/or explanations for the results you’ve witnessed. These will point back to KBs, which in certain cases will give you hints on how to solve your bottleneck (if there is one).

Note that in order to use this functionality you need to join the CEIP (Customer Experience Improvement Program), which means that you will upload info to VMware. This by itself is very valuable, as it allows our developers to solve bugs and user experience issues and get a better understanding of how you use vSAN. I spoke with Christian Dickmann on this topic yesterday, as he tweeted the below, and he was really excited: he said he had various fixes going into the next vSAN release based on the current data set. So join the program!

A huge THANK YOU to anyone participating in Customer Experience Improvement Program (CEIP)! Eye-opening data that we are acting on.

— Christian Dickmann (@cdickmann) July 27, 2017

For those who can’t wait, here are the release notes and download links:

  • vCenter Server 6.5 U1 release notes: https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-vcenter-server-651-release-notes.html
  • vCenter Server 6.5 U1 downloads: https://my.vmware.com/web/vmware/details?downloadGroup=VC65U1&productId=614&rPId=17343
  • ESXi 6.5 U1 release notes: https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-esxi-651-release-notes.html
  • ESXi 6.5 U1 download: https://my.vmware.com/web/vmware/details?downloadGroup=ESXI65U1&productId=614&rPId=17342
  • vSAN 6.6.1 release notes: https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vmware-vsan-661-release-notes.html

Oh, and before I forget, there’s new functionality in the H5 Client for vCenter in 6.5 U1. As mentioned on the vSphere blog: “Virtual Distributed Switch (VDS) management, datastore management, and host configuration are areas that have seen a big increase in functionality”. Some of the “max config” items also made a big jump; 50k powered-on VMs for instance is huge:

  • Maximum vCenter Servers per vSphere Domain: 15 (increased from 10)
  • Maximum ESXi Hosts per vSphere Domain: 5000 (increased from 4000)
  • Maximum Powered On VMs per vSphere Domain: 50,000 (increased from 30,000)
  • Maximum Registered VMs per vSphere Domain: 70,000 (increased from 50,000)

That was it for now!

Sizing a vSAN Stretched Cluster

Duncan Epping · May 30, 2017 ·

I have had this question a couple of times already: how many hosts do I need per site when the Primary FTT is set to 1, the Secondary FTT is set to 1 and RAID-5 is used as the Failure Tolerance Method? The answer is straightforward: you have a RAID-5 set locally in each site. RAID-5 is a 3+1 configuration, meaning 3 data blocks and 1 parity block. As such, each site will need 4 hosts at a minimum. So if the requirement is PFTT=1 and SFTT=1 with the Failure Tolerance Method (FTM) set to RAID-5, then the vSAN Stretched Cluster configuration will be 4+4+1. Note that when you use RAID-1 you will need at minimum 3 hosts per site, because locally you will have 2 “data” components and 1 witness component.

From a capacity standpoint, if you have a 100GB VM and use PFTT=1, SFTT=1 and FTM set to RAID-1, then you have a local RAID-1 set in each site. This means the 100GB VM requires 200GB in each location: 200% of the VM size as local capacity per site, and 400% for the total cluster. Using the table below you can easily see the overhead. Note that RAID-5 and RAID-6 are only available when using all-flash.

I created a quick table to help those going through this exercise. I did not include FTT=3, as in practice it is not used often in stretched configurations.

Description | PFTT | SFTT | FTM | Hosts per site | Stretched Config | Single site capacity | Total cluster capacity
Standard Stretched across locations with local protection | 1 | 1 | RAID-1 | 3 | 3+3+1 | 200% of VM | 400% of VM
Standard Stretched across locations with local RAID-5 | 1 | 1 | RAID-5 | 4 | 4+4+1 | 133% of VM | 266% of VM
Standard Stretched across locations with local RAID-6 | 1 | 2 | RAID-6 | 6 | 6+6+1 | 150% of VM | 300% of VM
Standard Stretched across locations, no local protection | 1 | 0 | RAID-1 | 1 | 1+1+1 | 100% of VM | 200% of VM
Not stretched, only local RAID-1 | 0 | 1 | RAID-1 | 3 | n/a | 200% of VM | n/a
Not stretched, only local RAID-5 | 0 | 1 | RAID-5 | 4 | n/a | 133% of VM | n/a
Not stretched, only local RAID-6 | 0 | 2 | RAID-6 | 6 | n/a | 150% of VM | n/a
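
If you want to repeat the math for your own VM sizes, here is a quick Python sketch of my own. It is purely illustrative (not a VMware tool), and the overhead factors, host counts and the function name are simply my encoding of the values in the table above:

```python
# Reproduce the hosts-per-site and capacity numbers from the table above.
# Illustration only; the factors below are taken straight from the table.

FTM_RULES = {
    # ftm: (per-site capacity overhead factor, minimum hosts per site)
    "RAID-1 (SFTT=1)": (2.00, 3),   # 2 data copies + 1 witness component
    "RAID-5 (SFTT=1)": (4 / 3, 4),  # 3 data + 1 parity -> ~133%
    "RAID-6 (SFTT=2)": (1.50, 6),   # 4 data + 2 parity -> 150%
    "None (SFTT=0)":   (1.00, 1),   # no local protection
}

def stretched_sizing(vm_gb: float, ftm: str, pftt: int = 1):
    """Return (GB per site, GB total, stretched config) for a single VM."""
    factor, hosts = FTM_RULES[ftm]
    per_site_gb = vm_gb * factor
    if pftt == 1:                    # data mirrored across both sites + witness site
        total_gb = per_site_gb * 2
        config = f"{hosts}+{hosts}+1"
    else:                            # not stretched, single site only
        total_gb = per_site_gb
        config = "n/a"
    return per_site_gb, total_gb, config

# The example from the post: 100GB VM, PFTT=1, SFTT=1, FTM=RAID-1
print(stretched_sizing(100, "RAID-1 (SFTT=1)"))   # (200.0, 400.0, '3+3+1')
```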

Hope this helps!

vSAN 6.6 Stretched Cluster Demo

Duncan Epping · May 19, 2017 ·

I had one more demo to finish and share, and that is the vSAN 6.6 stretched cluster demo. I already did a stretched clustering demo when we initially released the functionality, but with the enhanced functionality around local protection I figured I would re-record it. In this demo (~12 minutes) I show you how to configure vSAN 6.6 with dedupe/compression enabled in a Stretched Cluster configuration. I also create 3 VM Storage Policies, assign those to VMs, and show you that vSAN has placed the data across locations. I hope you find it useful.

