For the last episode of 2024, I invited Deji Akomolafe to discuss running Microsoft workloads on top of VCF. I’ve known Deji for a long time, and if anyone is passionate about VMware and Microsoft technology, it is him. Deji went over the many caveats and best practices when it comes to running, for instance, SQL Server on top of VCF (or vSphere, for that matter). NUMA, CPU scheduling, latency-sensitive settings, power settings, virtual disk controllers: just some of the things you can expect in this episode. You can listen to the episode on Spotify, Apple, or via the embedded player below.
Unexplored Territory Episode 086 – VCF 9 and vSAN 9 storage and data protection vision with Pete Koehler
I just rebooted the Unexplored Territory Podcast after a break of 2 months. In this episode I discuss the VCF 9 and vSAN 9 storage and data protection vision with my colleague and good friend Pete Koehler. I hope you enjoy the show!
VCF 9 announcements at Explore Barcelona – vSAN Site Takeover and vSAN Site Maintenance
At Explore in Barcelona we made several announcements and showed a number of roadmap items which we did not reveal in Las Vegas. As the sessions were not recorded in Barcelona, I wanted to share with you the features I spoke about at Explore which are currently planned for VCF 9. Please note, I don’t know when these features will be generally available, and there’s always a chance they are not released at all.
I created a video of the features we discussed, as I also wanted to share the demos with you. For those who don’t watch videos, here is a brief description of the functionality we are working on for VCF 9. I am keeping it brief on purpose, as we have not made a full public announcement about this, and I don’t want to get into trouble.
vSAN Site Maintenance
In a vSAN stretched cluster environment, when you want to do site maintenance, today you need to place every host into maintenance mode one by one. Not only is this an administrative/operational burden, it also increases the chance of placing the wrong hosts into maintenance mode. On top of that, because you need to do this sequentially, the data stored on host-1 in site A could differ from the data on host-2 in site A, meaning there is an inconsistent set of data within the site. Normally this is not a problem, as the environment will resync when it comes back online, but if the other data site fails, that existing (inconsistent) data set cannot be used to recover. With Site Maintenance we not only make it easier to place a full site into maintenance mode, we also remove that risk of data inconsistency, as vSAN coordinates the maintenance and ensures that the data set is consistent within the site. Fantastic, right?! The sketch below shows the per-host workflow this replaces.
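To make the operational burden concrete, here is a minimal pyVmomi sketch of today’s sequential per-host workflow, under a couple of assumptions: the vCenter address and credentials are made-up placeholders, the “esx-a” prefix for Site A hosts is an invented naming convention, and error handling is omitted. Treat it as an illustration, not a production script.

```python
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

# Hypothetical vCenter and credentials.
si = SmartConnect(host="vcenter.lab.local",
                  user="administrator@vsphere.local",
                  pwd="...")
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], True)

# Assumption: Site A hosts follow an "esx-a*" naming convention.
site_a_hosts = [h for h in view.view if h.name.startswith("esx-a")]

# vSAN-aware maintenance spec: keep objects accessible while the host is down.
spec = vim.host.MaintenanceSpec(
    vsanMode=vim.vsan.host.DecommissionMode(
        objectAction="ensureObjectAccessibility"))

for host in site_a_hosts:
    # Strictly one host at a time -- exactly the burden Site Maintenance removes.
    WaitForTask(host.EnterMaintenanceMode_Task(
        timeout=0, evacuatePoweredOffVms=True, maintenanceSpec=spec))

view.Destroy()
Disconnect(si)
```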
vSAN Site Takeover
One of the features I felt we were lacking for the longest time is the ability to promote a site when two out of the three sites have failed simultaneously. This is where Site Takeover comes into play. If you end up in a situation where both the Witness Site and a data site go down at the same time, you want to be able to still recover, especially as it is very likely that you will have healthy objects for each VM in that second site. This is what vSAN Site Takeover will help you with. It will allow you to manually (through the UI or a script) inform vSAN that, even though quorum is lost, it should make the local RAID set for each of the impacted VMs accessible again. After which, of course, vSphere HA will instruct the hosts to power on those VMs.
If you have any feedback on the demos, and the planned functionality, feel free to leave a comment!
vSAN Data Protection, what can you do with it today?
Last week I posted a blog about how to get vSAN Data Protection up and running, but I never explained why you may want to do this in the first place. Before we dive into it, it is probably smart to also mention that vSAN Data Protection leverages the new snapshotting capability which is part of the vSAN Express Storage Architecture (vSAN ESA). vSAN ESA has a snapshotting capability that scales almost infinitely; it is easily 100x better than the snapshotting mechanism that is part of vSAN OSA. The reason for this is simple: with vSAN OSA (and VMFS, for that matter), when you create a snapshot, a new object is created and IO is redirected. With vSAN ESA we basically copy the metadata structure and keep using the existing objects for IO, which means we don’t need to traverse multiple files for reads, for instance, and when we delete a snapshot we don’t need to copy data, as it is all about metadata changes/tracking. The toy sketch below contrasts the two approaches.
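Here is a deliberately simplified Python sketch, my own toy model and not vSAN code: the chain-based class mimics redirect-style deltas (vSAN OSA/VMFS), where a read may have to walk every snapshot level and deleting a snapshot means copying data, while the metadata-based class mimics vSAN ESA, where a snapshot is just a copy of the block-pointer map, reads are a single lookup, and deleting a snapshot only drops metadata.

```python
class ChainSnapshots:
    """Redirect-style deltas (vSAN OSA / VMFS-like): one level per snapshot."""
    def __init__(self):
        self.levels = [{}]                   # base disk plus one delta per snapshot

    def snapshot(self):
        self.levels.append({})               # new delta; writes now land here

    def write(self, block, data):
        self.levels[-1][block] = data

    def read(self, block):
        for level in reversed(self.levels):  # may traverse the entire chain
            if block in level:
                return level[block]
        return None

    def delete_oldest_snapshot(self):
        merged = self.levels.pop(0)          # deleting means copying data forward
        merged.update(self.levels[0])
        self.levels[0] = merged


class MetadataSnapshots:
    """Metadata-copy snapshots (vSAN ESA-like): data objects never move."""
    def __init__(self):
        self.pointers = {}                   # live block -> data map
        self.snapshots = []                  # frozen copies of the map

    def snapshot(self):
        self.snapshots.append(dict(self.pointers))  # copy metadata only

    def write(self, block, data):
        self.pointers[block] = data          # existing objects keep serving IO

    def read(self, block):
        return self.pointers.get(block)      # always a single lookup

    def delete_oldest_snapshot(self):
        self.snapshots.pop(0)                # drop metadata; no data copied
```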
Now, in vSphere / vSAN 8.0 Update 3 we introduced a new capability called vSAN Data Protection. This provides you with the ability to schedule snapshots, create immutable snapshots, clone snapshots, restore snapshots, and so on. All of this through the vSphere Client interface, of which you can see an example below.
Now, in the above screenshot you can see that half of the VMs are protected and the other half are not. Why is that? Well, simply because I decided to only protect half of my VMs when I created my snapshot schedule. This is an “opt-in” mechanism: you create a schedule, and you decide which VMs are part of that schedule and which are not! Let’s take a look at the scheduling mechanism. You can find it at the cluster level under “Configuration -> vSAN -> Data Protection”. When you go there you will see the above screen; click “Protection groups” and then “Create protection group”.
This will then present you with a screen where you can define the name of the protection group, enable “immutability mode”, and select the “Membership” of the protection group. Personally, I like the Dynamic VM Patterns option; it just makes things easy. In this example I used “*de*” as a pattern, which means that all VMs with “de” in the name will be snapshotted whenever the schedule triggers a snapshot. The snippet below shows how such a pattern matches.
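As a quick illustration of those pattern semantics, here is a small Python snippet using fnmatch as a stand-in for the Dynamic VM Patterns matching. The VM names are invented; the point is simply that a substring pattern like “*de*” can catch more VMs than you might expect.

```python
from fnmatch import fnmatch

# Invented VM names, purely for illustration.
vms = ["web-demo-01", "sql-prod-01", "vdesktop-03", "app-frontend"]

# "*de*" matches any VM name containing "de" anywhere.
protected = [vm for vm in vms if fnmatch(vm, "*de*")]
print(protected)  # ['web-demo-01', 'vdesktop-03']
```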
As mentioned, vSAN Data Protection includes all the virtual machines which match the pattern at the moment the snapshot is taken. So if you create a schedule like, for instance, the following, it could take close to an hour before a VM is snapshotted for the very first time. If you feel this is too long, there is of course also an option to manually snapshot the VM.
The schedules you can create can be really complex (up to 10 schedules). I just created an example of what is possible, but there are many different options and combinations you can create. Note that a VM can be part of three different protection groups at most, and I also want to point out that a protection group is not a consistency group. Also note that you can have at most 200 snapshots per object/VM, so that is also something to take into consideration; the quick calculation below shows how fast snapshots can add up.
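To see how quickly that 200-snapshot cap can come into play, here is a back-of-the-envelope check in Python. The schedule mix and retention counts are hypothetical numbers of my own, not recommendations from the documentation.

```python
# Hypothetical retention counts per schedule for a single VM.
retained = {
    "every 30 minutes, keep 1 day": 48,
    "hourly, keep 3 days":          72,
    "daily, keep 2 months":         60,
    "weekly, keep 6 months":        26,
}

total = sum(retained.values())
print(f"{total} snapshots retained")                         # 206 snapshots retained
print("OK" if total <= 200 else "over the 200 per-VM cap")   # over the 200 per-VM cap
```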
Now, when the VMs are snapshotted, you get a few extra options: you have snapshot management at the individual VM level, and of course you can also manage things at the protection group level. You will see options like “restore” and “clone”, and this is where it gets interesting, as this allows you to go back to a particular point in time if needed from a particular snapshot, and also to clone the VM from a particular snapshot. Why would you clone it? Well, if you want to test something against a specific dataset, for instance, or if you want to do some type of digital forensics and analyze a VM for whatever reason.
One thing that is also good to mention is that with vSAN Data Protection you can also recover VMs which have been deleted from vCenter Server. This is one of those must-have features in my opinion, as this is one of those things that does occasionally happen, unfortunately. Power off VM, delete … oops, wrong one! When it comes to the recovery of a VM, the process is straightforward. You select the snapshot and click “restore VM” or “clone VM”, depending on what you expect as the outcome. Restore means the current instance is powered off, and the snapshot is the restore point when you power it on again. With a clone, of course, you simply get a second instance of that VM. Note that when you create a clone it is a linked clone, so it is extremely fast to instantiate.
The last thing I want to discuss is immutability mode. I want to caution people first: this is exactly what you think it is… it is IMMUTABLE! This means that you cannot delete the snapshots or change the schedule, etc. Let me quote the documentation:
Enable immutability mode on a protection group for additional security. You cannot edit or delete this protection group, change the VM membership, edit or delete snapshots. An immutable snapshot is a read-only copy of data that cannot be modified or deleted, even by an attacker with administrative privileges.
That is why I wanted to caution people: if you create an immutable protection group with hundreds of VMs by accident… Yes, the snapshots will be immutable; you cannot delete the protection group or the snapshots, or change the snapshot schedule. No, not even as an administrator. Do note, you can still delete the VM if you want…
Anyway, I hope this gave a brief overview of what you can do with vSAN Data Protection at the moment. I’ll play around with it some more, and report back when there’s more to share 🙂
vSAN Stretched: Why is the witness not part of the cluster when the link between a data site and the witness fails?
Last week I received a question about vSAN stretched clusters which had me wondering for a while what on earth was going on. The person who asked this question was running through several failure scenarios, some of which I have also documented in the past here. The question was: what is supposed to happen in the scenario shown in the diagram below, when the link between the preferred site (Site A) and the witness fails:
The answer, or at least what I thought, was simple: all VMs will remain running, or said differently, there is no impact on vSAN. When doing the test, the outcome was indeed the same as what I had documented, which is also documented in the Stretched Clustering Guide and the PoC Guide: the VMs remain running. However, one thing we noticed is that when this situation occurs and the connection between Site A and the Witness is lost, the witness is somehow no longer part of the cluster, which is not what I would expect. The reason I would not expect this to happen is that if a second failure were to occur, for instance the ISL between Site A and Site B going down, it would directly impact all VMs. At least, that is what I assumed.
However, when I triggered that second failure and disconnected the ISL between Site A and Site B, I saw the witness re-appear immediately, I saw the witness objects going from “absent” to “active”, and, more importantly, all VMs remained running. The reason this happens is fairly straightforward: when running a configuration like this, vSAN has a “leader” and a “backup”, and they each run in a separate fault domain. Both the leader and the backup need to be able to communicate with the Witness for it to function correctly. If the connection between Site A and the Witness is gone, then either the leader or the backup can no longer communicate with the Witness, and the Witness is taken out of the cluster.
So why does the Witness return for duty when the second failure is triggered? Well, when the second failure is triggered, the leader is restarted in Site B (as Site A is deemed lost), and the backup is already running in Site B. As both the leader and the backup can communicate with the witness again, the witness returns for duty, and so do all of the witness components, automatically and instantly. This means that even though the ISL between Site A and B failed after the witness was taken out of the cluster, all VMs remain accessible, as the witness is reintroduced instantly to ensure availability of the workload. Pretty cool! (Thanks to vSAN engineering for providing these insights on why this happens!) For those who like to see the rule spelled out, the toy model below walks through the same two failures.
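This is my own simplification of the behavior described above, not vSAN internals: the witness participates only while both the leader and the backup can reach it.

```python
def witness_in_cluster(links, leader_site, backup_site):
    """The witness participates only if BOTH leader and backup can reach it."""
    reachable = lambda a, b: (a, b) in links or (b, a) in links
    return reachable(leader_site, "W") and reachable(backup_site, "W")

# Healthy stretched cluster: Site A (preferred), Site B, Witness site "W".
links = {("A", "B"), ("A", "W"), ("B", "W")}
leader, backup = "A", "B"

links.discard(("A", "W"))   # failure 1: Site A <-> Witness link down
print(witness_in_cluster(links, leader, backup))  # False: witness leaves the cluster

links.discard(("A", "B"))   # failure 2: ISL down, Site A is deemed lost
leader = "B"                # leader restarts in Site B, next to the backup
print(witness_in_cluster(links, leader, backup))  # True: witness returns for duty
```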