
Yellow Bricks

by Duncan Epping


vSAN

2 is the minimum number of hosts for VSAN if you ask me

Duncan Epping · Oct 1, 2015 ·

In 2013 I wrote an article about the minimum number of hosts for Virtual SAN. Since then that post has taken on a life of its own; somehow people have misunderstood it and used/abused it in many shapes and forms. When I look at the size of a traditional (non-VSAN) cluster, the minimum size is 2 hosts. From an availability perspective, I ask myself what risk I am willing to take. What does that mean?

In a previous life I did many projects for SMB customers, who typically had somewhere in the range of 2-5 hosts, with the majority having 2-3. In many cases those with 2-3 hosts were running roughly the same number of virtual machines. The difference between "2 hosts" and "3 hosts" was whether, during maintenance (upgrading/updating) or after a host failure, you retained the ability to restart virtual machines after a secondary failure. Many customers decided to go with 2-node clusters, the key reason being price versus risk: during normal operations the risk is low, while the price of an additional host was relatively high.

Now compare this to Virtual SAN and you will see the same applies. With Virtual SAN we have a minimum of 3 hosts; well, in a ROBO configuration you can have 2 with an external witness. This means that from a support perspective the bare minimum of dedicated physical hosts required for VSAN is 2. There you go: 2 is the bare minimum for ROBO. For non-ROBO, 3 is the minimum. Both are fully supported and offer the same functionality as a 4-host cluster.

Is having an extra host a good plan? Yes of course it is. HA / DRS / VSAN (and any other scale-out storage solution for that matter) will benefit from more hosts. You as a customer need to ask yourself what the risk is, and if the cost is justifiable.

PS1: A question just came in, so I want to make sure this is clear: even in a 2-host ROBO configuration you can do maintenance! A single copy of the data plus the witness remains available, which is enough to retain quorum.

PS2: No, you cannot host the "witness" VM on the VSAN cluster itself. This is not supported, as the witness provides quorum for the cluster and must sit outside of it to provide certainty about the cluster's state in the case of a failure.

VSAN made storage management a non issue for the 1st time

Duncan Epping · Sep 28, 2015 ·

Whenever I talk to customers about Virtual SAN, the question that usually comes up is: why Virtual SAN? Some of you may expect the answer to be performance, the scale-out aspect, or the resiliency… None of those is the biggest differentiator in my opinion; management truly is. Or should I say, the fact that you can literally forget about it after you have configured it? Yes, of course that is something you expect every vendor to say about their own product. I think the reply of one of the users during the VSAN Chat held last week is the biggest testimony I can provide: "VSAN made storage management a non-issue for the first time for the vSphere cluster admin" (see tweet below).

@vmwarevsan VSAN made storage management a non-issue for this first time vSphere cluster admin! #vsanchat http://t.co/5arKbzCdjz

— Aaron Kay (@num1k) September 22, 2015

When we released the first version of Virtual SAN I strongly believed we had a winner on our hands. It is so simple to configure that you don't need to be a VCP to enable VSAN: it is two clicks. Of course, VSAN is a bit more than just that tick box at the cluster level that says "enable". You want to make sure it performs well, that all driver/firmware combinations are certified, that the network is correctly configured, etc. Fortunately we also have a solution for that, and it isn't a manual process.

No, you simply go to the VSAN health check section on your VSAN cluster object and validate that everything is green. Besides looking at those green checks, you can also run proactive tests that allow you to test, for instance, multicast performance, VM creation, and VSAN performance. It all comes as part of vCenter Server as of the 6.0 U1 release. On top of that, there is more planned: at VMworld we already hinted at advanced performance management inside vCenter, based on a distributed and decentralized model. You can expect that at some point in the near future, and of course we have the vROps pack for Virtual SAN if you prefer that!

No, if you ask me, the biggest differentiator definitely is management… simplicity is the key theme, and I guarantee that things will only improve with each release.

Designing a Virtual SAN stretched cluster

Duncan Epping · Sep 23, 2015 ·

There is a lot of material on stretched clusters out there already, but somehow it hasn't reached everyone yet. The last couple of weeks I have spent a lot of time on the phone with customers in various countries/regions talking about designing a Virtual SAN stretched cluster. In this post I want to collect some design considerations for your Virtual SAN stretched clusters and provide pointers to different articles and white papers that can help you get a better understanding of the solution. If any additional considerations come up in the conversations I still have planned, I will add them to this article, so it will very much be a "work in progress".

First and foremost, a stretched cluster isn't something you implement "just because you can". It is a solution that is usually implemented when there is a strong desire to avoid a disaster, or to recover extremely fast from one. Customers I talk with are usually mid-size and up, and typically provide some form of 24×7 service. As an example, I have a customer who runs (mission-)critical workloads on their stretched cluster, and another who hosts websites with high uptime requirements (government services). In both cases downtime is not acceptable from a business perspective, and unfortunately not all applications provide the level of availability required, which means it needs to be solved at a different layer.

The first thing that needs to be looked at is the network. From a Virtual SAN perspective there are clear requirements:

  • 5ms RTT latency max between data sites
  • 200ms RTT latency max between data and witness site
  • Both L3 and L2 are supported between the data sites
    • 10Gbps bandwidth is recommended, dependent on the number of VMs this could be lower or higher, more guidance will be provided soon around this!
  • L3 is expected between data and the witness sites
    • 100Mbps bandwidth is recommended, dependent on the number of VMs this could be lower or higher, more guidance will be provided soon around this!

First thing I need to call out: if you run L3 between the data sites, note that you will need some form of multicast routing. L3 from the data sites to the witness site doesn't need this; that traffic doesn't use multicast, as this requirement has been removed, which simplifies the design. The latency requirements are strict: 5ms (data/data) and 200ms (data/witness) maximum. It is important to realize that network latency will impact your storage performance. Each write needs to be replicated to the other site, which means each write could take 5ms if that is your RTT. Yes, this can be a killer, but keep in mind that all writes always go to SSD first, so the network is going to be the challenge; the lower the latency, the better. Also note that this only applies to writes: reads are served locally, as the Stretched Cluster functionality in VSAN also introduced "site locality" to avoid those network hops for reads. Some may say: who in their right mind is going to incur 3ms or 5ms of network latency for every IO? Well, that fully depends on your business requirements.
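
The write-latency point above is simple arithmetic. As a rough back-of-the-envelope sketch (my own illustration, not a VMware tool), assuming every write is synchronously replicated to the second site before being acknowledged:

```python
# Hypothetical illustration: lower bound on synchronous write latency in a
# stretched cluster. Every write must be committed on the remote site too,
# so the inter-site RTT is added on top of the local SSD write latency.

def write_latency_floor_ms(local_ssd_ms: float, inter_site_rtt_ms: float) -> float:
    """Return a rough floor for write latency: local media + inter-site RTT."""
    return local_ssd_ms + inter_site_rtt_ms

# Example numbers: 0.5 ms local SSD latency, 5 ms RTT (the supported maximum)
print(write_latency_floor_ms(0.5, 5.0))  # 5.5
```

This is exactly why lower inter-site latency is better: the RTT becomes part of every single write.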

I already mentioned the witness requirements, but what about the witness itself? I was talking to a customer last week who was planning on deploying a brand-new physical host to serve as the witness. There is no need for that: you can simply deploy the VSAN witness appliance on an existing host. The witness appliance comes with all licenses included, so there is no additional vSphere/VSAN cost. If you have no third site, then it is good to know that we are working on certifying the use of the witness in vCloud Air: an easy and cost-effective way of having a witness in a third site without needing to build and manage one.

Then there is, of course, compute and storage. What do you need from that point of view? First of all, you will have to buy more hardware than you would normally need when running in a single location. You will need extra CPU and memory resources to ensure VMs can be restarted when a full site has failed. Yes, HA Admission Control will help with that, but you also need to plan for it, which is something not everyone realizes. It is a discussion you will need to have with the business: does performance need to be the same before and after a failure? If yes, then make sure you have sufficient capacity to tolerate a 50% loss.

From a storage perspective, the VSAN Stretched Cluster is based on FTT=1. This means that if you have a 10GB VMDK, 10GB is stored in the first site and another 10GB in the second site, for a total of 20GB. On top of that there is the VM's swap file and some overhead, but that is relatively simple to calculate. Just remember: (average VM disk capacity + swap) * 2. I usually add 10% slack space and another 10-20% for snapshots depending on usage, and I would recommend adding room for growth. Another thing to remember is the limit of 200 VMs per host in this version of VSAN. Keep in mind that you want to tolerate a full site failure, so make sure all VMs can run on the remaining site in a supported manner.
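
The sizing rule above, (average VM disk capacity + swap) * 2 plus slack and snapshot space, can be sketched as a small calculator. This is my own illustration with example inputs, not an official sizing tool:

```python
# Hedged sketch of the capacity rule of thumb from the text: with FTT=1 in a
# stretched cluster, every VM's disk + swap is stored once per site (x2),
# plus ~10% slack space and 10-20% for snapshots.

def stretched_cluster_capacity_gb(avg_disk_gb, swap_gb, vm_count,
                                  slack=0.10, snapshots=0.10):
    per_vm_gb = (avg_disk_gb + swap_gb) * 2        # one full copy per site
    raw_gb = per_vm_gb * vm_count
    return raw_gb * (1 + slack + snapshots)        # add slack + snapshot room

# Example: 100 VMs, 50 GB average disk, 4 GB swap, 10% slack, 10% snapshots
print(round(stretched_cluster_capacity_gb(50, 4, 100)))  # 12960
```

Remember to also leave room for growth on top of this number.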

When it comes to HA and DRS, the configuration is pretty straightforward and has been described in depth by both Cormac and myself. There are a couple of things I want to point out in this article, as they are configuration details that are easy to forget:

  • Make sure to specify additional isolation addresses, one in each site (das.isolationAddress0 – 1).
  • Disable the default isolation address if it can't be used to validate the state of the environment during a partition (i.e., if the gateway isn't available on both sides).
  • Disable datastore heartbeating; without traditional external storage there is no reason to have it.
  • Enable HA Admission Control and make sure it is set to 50% for both CPU and memory.
  • Keep VMs local by creating "VM/Host" should rules.
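
The bullets above map to a handful of HA advanced options and cluster settings. As a plain illustration (the isolation addresses are placeholders for a pingable address in each site; apply the values through the vSphere client or your automation tooling):

```python
# Hypothetical example values for the HA settings described above.
# The IP addresses are placeholders, one per data site.

ha_advanced_options = {
    "das.isolationAddress0": "192.168.1.1",     # pingable address in site A (example)
    "das.isolationAddress1": "192.168.2.1",     # pingable address in site B (example)
    "das.useDefaultIsolationAddress": "false",  # don't rely on the default gateway
}

# Admission Control reservations sized to tolerate a full site failure (50%)
admission_control = {"cpu_failover_pct": 50, "memory_failover_pct": 50}

for option, value in ha_advanced_options.items():
    print(f"{option} = {value}")
```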

And I think that covers most of it, summarized relatively briefly compared to the excellent document Cormac developed, which contains all the details you could wish for. Make sure to read that if you want to know every aspect.

vSAN licensing / packaging

Duncan Epping · Sep 14, 2015 ·

I've seen many questions about vSAN packaging over the last months, so I figured I would share a table that shows what is possible with which license. A lot of the confusion is around the "ROBO" use case, and I want to make it crystal clear that you can deploy a 2-node ROBO configuration using Standard, Advanced, or the special "vSAN for ROBO" 25-VM pack that will be made available. Anyway, when it comes to functionality, the table below should make crystal clear what is included with what.

Before anyone asks, “stretched clusters” refers to the vSAN stretched cluster workflow / feature. Two data center rooms in the same building leveraging external witness capabilities through the stretched cluster workflow requires “Advanced”. Three datacenters stretched across campus distance using “fault domains” does not require Advanced, but can use Standard.

Also note that "vSAN Advanced" is included in the "Horizon Advanced" and "Horizon Enterprise" suites. If you have either of those, I highly recommend testing vSAN. I am seeing more and more customers take advantage of it: a great storage platform that performs extremely well and is really simple to manage is included in your suite, so why not use it?!

The below table shows what the current licensing/packaging looks like for vSAN 6.6. Note that for vSAN 6.5 “all-flash” is now available in all licensing levels. In vSAN 6.6 “QoS” has been dropped down to Standard, and “Local Site Protection for Stretched Clusters” and “vSAN Encryption” have been added to Enterprise. For pricing, please contact your partner or a VMware sales rep.

Feature                                      | Standard | Advanced | Enterprise | ROBO Standard | ROBO Advanced
SPBM                                         | X | X | X | X | X
Read/Write SSD Caching                       | X | X | X | X | X
Distributed RAID                             | X | X | X | X | X
Distributed Switch                           | X | X | X | X | X
Snapshots / Clones                           | X | X | X | X | X
Rack Awareness                               | X | X | X | X | X
Health Monitoring                            | X | X | X | X | X
vSphere Replication *                        | X | X | X | X | X
Two-Node ROBO Configuration                  | X | X | X | X | X
Two-Node Direct Connect                      | X | X | X | X | X
All-Flash                                    | X | X | X | X | X
Quality of Service                           | X | X | X | X | X
Dedupe and Compression                       |   | X | X |   | X
RAID-5/6                                     |   | X | X |   | X
Stretched Cluster                            |   |   | X |   |
Local Site Protection for Stretched Clusters |   |   | X |   |
vSAN Encryption                              |   |   | X |   |

* vSphere Replication is included with a new 5-minute RPO, which was exclusively certified for vSAN. In some material you will see this referred to as vSAN Replication.

The full licensing white paper can be found here.

Virtual SAN 6.1 available today!

Duncan Epping · Sep 10, 2015 ·

What more do I need to say? vSphere 6.0 U1 was released today and it ships with Virtual SAN 6.1. By now you’ve all seen my posts on what’s new for VSAN 6.1 and you’ve hopefully seen the demo we created for stretched clustering. If you want to play with 6.1 yourself then you can find it here:

  • VSAN 6.1 Product download page
  • VSAN 6.1 Release Notes
  • VSAN 6.1 Administration Guide


Copyright Yellow-Bricks.com © 2026