I had a question earlier from a customer who wanted to test the vSAN Stretched Cluster functionality introduced in 9.0 called Site Maintenance. Yes, it is indeed what you would expect it to be: a new feature that allows you to place a whole site into maintenance mode at once. Very useful, but this customer was unable to find the button in the UI. Which, by the way, is not strange, as this capability (along with Manual Site Takeover) is only available through an RPQ request at the moment, and it is also only available for vSAN OSA for now, so keep that in mind when filing an RPQ through your Broadcom/VMware contact! When you get approved, you will be informed on how to get this functionality enabled, and it will then be added to the UI at the fault domain level, as shown in the screenshot below, taken from my 9.0 lab!
[Screenshot: the Site Maintenance option on a vSAN stretched cluster fault domain]
vSAN Component vote recalculation with Witness Resilience, the follow-up!
I wrote about the Witness Resilience feature a few years ago and had a question on this topic today. I did some tests and then realized I already had an article describing how it works, but as I also tested a different scenario, I figured I would write a follow-up. In this case we are particularly talking about a 2-node configuration, but this would also apply to a stretched cluster.
In a stretched cluster or 2-node configuration, when a data site goes down (or is placed into maintenance mode), a vote recalculation is automatically done for each object/component. This ensures that if the witness subsequently fails, the objects/VMs remain accessible. How that works I’ve explained here, and demonstrated for a 2-node cluster here.
But what if the Witness fails first? Well, I can explain it fairly easily: if the Witness goes down first, the votes will not be recalculated, so a subsequent host failure will leave the VMs inaccessible. Of course, I tested this, and the screenshots below demonstrate it.
This screenshot shows the witness as Absent, while both “data” components have 1 vote each. This means that if we fail one of those hosts, the object (and thus the VM) will become inaccessible. Let’s do that next and then check the UI for more details.
As you can see below, the VM is now inaccessible. This is because there is no longer quorum: 2 out of 3 votes are gone.
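To make the quorum math a bit more tangible, here is a minimal Python sketch that models the vote counting described above. To be clear, this is a simplified illustration and not vSAN code, and the vote numbers after recalculation are just example values.

```python
# Simplified model of vSAN quorum: an object stays accessible only while the
# surviving components hold a strict majority (> 50%) of the total votes.
# This is an illustration of the behavior described above, not actual vSAN code.

def is_accessible(components):
    """components: dict mapping component name -> (votes, alive)"""
    total_votes = sum(votes for votes, _ in components.values())
    alive_votes = sum(votes for votes, alive in components.values() if alive)
    return alive_votes * 2 > total_votes  # strict majority required

# Scenario 1: a data host fails first, vSAN recalculates the votes
# (example values below), and only then does the witness fail.
recalculated = {
    "data-host1": (1, False),  # failed / in maintenance mode
    "data-host2": (3, True),   # votes bumped by the recalculation (example value)
    "witness":    (1, False),  # witness fails afterwards
}
print(is_accessible(recalculated))      # True  -> VM remains accessible

# Scenario 2: the witness fails first, so no recalculation takes place,
# and then one of the data hosts fails as well.
not_recalculated = {
    "data-host1": (1, True),
    "data-host2": (1, False),  # second failure
    "witness":    (1, False),  # failed first, votes never recalculated
}
print(is_accessible(not_recalculated))  # False -> 2 of 3 votes gone, no quorum
```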
I hope that explains how this works.
Doing site maintenance in a vSAN Stretched Cluster configuration
I thought I wrote an article about this years ago, but it appears I wrote an article about doing maintenance mode with a 2-node configuration instead. As I’ve received some questions on this topic, I figured I would write a quick article that describes the concept of site maintenance. Note that in a future version of vSAN, we will have an option in the UI that helps with this, as described here.
First and foremost, you will need to validate that all data is replicated. In some cases we see customers pinning data (VMs) to a single location without replication, and those VMs will be directly impacted when a whole site is placed into maintenance mode. Those VMs will either need to be powered off, or, if they need to stay running, moved to the location that remains running. Do note that if you flip “Preferred / Secondary” and there are many VMs that are site local, this could lead to a huge amount of resync traffic. And if those VMs need to stay running, you may also want to reconsider your decision not to replicate them!
These are the steps I would take when placing a site into maintenance mode (a scripted sketch of some of these steps follows the list):
- Verify the vSAN Witness is up and running and healthy (see health checks)
- Check compliance of VMs that are replicated
- Configure DRS to “partially automated” or “Manual” instead of “Fully automated”
- Manually vMotion all VMs from Site X to Site Y
- Place each ESXi host in Site X into maintenance mode with the option “no data migration”
- Power Off all the ESXi hosts in Site X
- Enable DRS again in “fully automated” mode so that within Site Y the environment stays balanced
- Do whatever needs to be done in terms of maintenance
- Power On all the ESXi hosts in Site X
- Exit maintenance mode for each host
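For those who like to script (part of) this, below is a rough pyVmomi sketch of two of the steps above: switching DRS to partially automated and placing the Site X hosts into maintenance mode with the vSAN “no data migration” option. The vCenter address, credentials, cluster name, and the site-x-esx* host names are placeholders for illustration, so treat this as a sketch under those assumptions rather than a polished script; the manual vMotion and power off steps are left out.

```python
# Rough sketch (pyVmomi) of the DRS and maintenance mode steps from the list above.
# vCenter address, credentials, cluster name, and host names are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

SITE_X_HOSTS = ["site-x-esx01.lab.local", "site-x-esx02.lab.local"]  # placeholder names
CLUSTER_NAME = "Stretched-Cluster"                                   # placeholder name

ctx = ssl._create_unverified_context()  # lab only; use proper certificates in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

def find_objects(vimtype):
    """Return all inventory objects of the given type."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    return list(view.view)

# Step: switch DRS to "partially automated" so it does not move VMs around mid-maintenance
cluster = next(c for c in find_objects(vim.ClusterComputeResource) if c.name == CLUSTER_NAME)
drs_spec = vim.cluster.ConfigSpecEx(
    drsConfig=vim.cluster.DrsConfigInfo(enabled=True, defaultVmBehavior="partiallyAutomated"))
WaitForTask(cluster.ReconfigureComputeResource_Task(drs_spec, True))

# Step: place each Site X host into maintenance mode with "no data migration"
no_data_migration = vim.host.MaintenanceSpec(
    vsanMode=vim.vsan.host.DecommissionMode(objectAction="noAction"))
for host in find_objects(vim.HostSystem):
    if host.name in SITE_X_HOSTS:
        WaitForTask(host.EnterMaintenanceMode_Task(
            timeout=0, evacuatePoweredOffVms=False, maintenanceSpec=no_data_migration))

Disconnect(si)
```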
Do note that VMs will not automatically migrate back until the resync for that given VM has fully completed. DRS and vSAN are aware of the replication state! Additionally, if VMs are actively doing IO when the hosts in Site X go into maintenance mode, the data stored on the hosts within Site X will be out of date and will need to be resynced afterwards. This concern will be resolved in the future by the “site maintenance” feature discussed at the start of this article.
VCF-9 Vision for a Federated Storage View and vSAN (stretched cluster) visualizations!
As mentioned last week, the sessions at Explore Barcelona were not recorded. I still wanted to share with you what we are working on, so I decided to record and share a few demos, along with some of the slides we presented. In this video, I show our vision for a Federated Storage View for both vSAN and more traditional storage systems. This federated view will not only provide insights in terms of capacity and performance, it will also provide you with a visualization of a stretched cluster configuration. This is something I have been asking for for a while now, and it looks like it will become a reality in VCF 9 at some point. As this all revolves around visualization, I would urge you to watch the video below. And as always, if you have feedback, please leave a comment!
Doing network/ISL maintenance in a vSAN stretched cluster configuration!
I got a question earlier about maintenance of an ISL in a vSAN Stretched Cluster configuration, which had me thinking for a while. The question was what you would do with your workload during the maintenance. The easiest option, of course, is to power off all VMs and simply shut down the cluster, for which vSAN has a UI option, and there’s a KB you can follow. Now, of course, there could also be a situation where the VMs need to remain running. But how does this work when you end up losing the connection between all three locations? Normally this would lead to a situation where all VMs become “inaccessible”, as you end up losing quorum.
As said, this had me thinking: you could take advantage of the “vSAN Witness Resilience” mechanism, which was introduced in vSAN 7.0 U3. How would this work?
Well, it is actually pretty straightforward: if all hosts of one site are in maintenance mode, failed, or powered off, the votes of the witness object for each VM/object will be recalculated within roughly 3 minutes. When this recalculation has completed, the witness can go down without any impact on the VMs. We introduced this capability to increase resiliency in a double-failure scenario, but we can also (ab)use this functionality during maintenance. Of course I had to test this, so the first step I took was placing all hosts in one location into maintenance mode (no data evacuation). This resulted in all my VMs being vMotioned to the other site.
Next I checked with RVC whether the votes had been recalculated. As stated, depending on the number of VMs this can take around 3 minutes in total, but it will usually be quicker. After the recalculation had completed, I powered off the Witness, and as shown below, all VMs were still running.
Of course, I had to double-check on the command line using RVC (you can use the command “vsan.vm_object_info” to check a particular object, for instance) to ensure that the components of those VMs were indeed still “ACTIVE” instead of “ABSENT”, and there you go!
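On top of the RVC validation, here is a quick pyVmomi check you could run as well: it simply lists any VMs that are no longer in a “connected” state after the witness has been powered off. The vCenter address and credentials are placeholders, and this only checks VM accessibility at the vCenter level, it does not replace looking at the actual component state.

```python
# Quick sanity check (pyVmomi): list VMs that are not in a "connected" state,
# e.g. "inaccessible" or "orphaned". vCenter address and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)
problem_vms = [vm.name for vm in view.view if vm.runtime.connectionState != "connected"]
print("VMs not in a connected state:", problem_vms if problem_vms else "none")

Disconnect(si)
```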
Now, when maintenance has been completed, you simply do the reverse: you power on the witness, and then you power on the hosts in the location that was taken down for maintenance. After the resync has completed, the VMs will be rebalanced again by DRS. Note that DRS rebalancing (or “should” rules being applied) will only happen once the resync for the VM has completed.
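And when the ISL maintenance is done, the exit path can be scripted in the same fashion. A minimal sketch, again with placeholder names and credentials, that takes the hosts out of maintenance mode once the witness is back online:

```python
# Minimal sketch (pyVmomi): exit maintenance mode on the hosts of the site that
# was taken down, once the witness is back. Names and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

SITE_HOSTS = ["site-x-esx01.lab.local", "site-x-esx02.lab.local"]  # placeholder names

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="VMware1!", sslContext=ctx)
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
for host in view.view:
    if host.name in SITE_HOSTS and host.runtime.inMaintenanceMode:
        WaitForTask(host.ExitMaintenanceMode_Task(timeout=0))

Disconnect(si)
```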