• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar

Yellow Bricks

by Duncan Epping

  • Home
  • Unexplored Territory Podcast
  • HA Deepdive
  • ESXTOP
  • Stickers/Shirts
  • Privacy Policy
  • About
  • Show Search
Hide Search

BC-DR

Does vSAN Data Protection work with vSAN Stretched Clusters and can snapshots be stretched?

Duncan Epping · Oct 18, 2024 · Leave a Comment

I have written a few articles about vSAN Data Protection now, and my last article featured a nice vSAN DP demo. A very good question was asked in the comment section, and it was about vSAN Stretched Clusters. Basically, the question was whether Snapshots are also stretched across locations. This is a great question, as there are a couple of things which are probably worth explaining again.

vSAN Data Protection relies on the snapshot capability which was introduced with vSAN ESA. This snapshot capability in vSAN ESA is significantly different than with vSAN OSA or with VMFS. With vSAN OSA and VMFS when you create a snapshot a new object (vSAN) or file (VMFS) is created. With vSAN ESA this is no longer the case as we don’t create additional files or objects, but we create a copy of the metadata structure instead. This is why vSAN ESA snapshots perform much better than vSAN OSA or VMFS snapshots do, as we no longer need to traverse multiple files or objects to read data. We can simply use the same object, and leverage the metadata structure to keep track of what has changed.

Now, with vSAN, as most of you hopefully know, object (and it’s components) are placed across the cluster based on what is specified within the storage policy that is associated with the object or VM. In other words, if the policy states FTT=1 and RAID-1, then you will see 2 copies of the data. If the policy states the data needs to be stretched across locations, and within each location be protected with RAID-5, then you will see a RAID-1 configuration across sites and a RAID-5 configuration within each site. As vSAN ESA snapshots are an integral part of the object, the snapshots automatically follow all requirements as defined within the policy. In other words, if the policy says stretched then the snapshot will also automatically be stretched.

There is one caveat I want to call out, and for that I want to show a diagram. The diagram below shows the Data Protection Appliance, aka the snapshot manager appliance. As you can see, it states “metadata decoupled from appliance” and it links somehow to a global namespace object. This global namespace object is where all the details of the protected VMs (and more) is being stored. As you can imagine, both the Snapshot Manager, as well as the Global Namespace object should also be stretched. For the global namespace object this means that you need to ensure that the default datastore policy is set to “stretched”, and of course for the snapshot manager appliance you can simply select the correct policy when provisioning the appliance. Either way, make sure the default datastore policy aligns with the disaster recovery and data protection policy.

Does vSAN Data Protection work with vSAN Stretched Clusters and can snapshots be stretched?

I hope this helps those exploring vSAN Data Protection in a stretched cluster configuration!

vSAN Data Protection Demo!

Duncan Epping · Oct 16, 2024 · Leave a Comment

I created a vSAN Data Protection Demo for VMware Explore in Las Vegas and Barcelona, and as I got some questions from people about vSAN Data Protection I figured I would share the demo on Youtube and post it on my blog. I have written a few articles on vSAN Data Protection already, and I am starting to notice that some customers are starting to test it and even use it in production where possible. As mentioned before, vSAN Data Protection is only available with vSAN ESA as it leverages the new snapshotting capabilities. The brand new UI is introduced through the Snap Manager Appliance. Make sure to download, deploy, and configure that first.

New vCLS architecture with vSphere 8.0 Update 3

Duncan Epping · Sep 24, 2024 · 1 Comment

Some of you may have seen this, others may have not, as I had a question today around vCLS retreat mode with 8.0U3 I figured I would write something on the topic quickly. Starting with vSphere 8.0 Update we introduced a new architecture for vCLS aka vSphere Cluster Services. Pre-vSphere 8.0 Update the vCLS architecture was based on virtual machines with Photon OS. These VMs were there to assist in enabling and disabling primarily DRS. If something was wrong with these VMs then DRS would also be unable to function normally. In the past many of you have probably experienced situations where you had to kill and delete the vCLS VMs to restore functionality of DRS, for that VMware introduced a feature called “retreat mode” which basically killed and deleted the VMs for you. There were some other challenges with the vCLS VMs and as a result the team decided to create a new design for vCLS.

Starting with vSphere 8.0 Update 3 vCLS is now implemented as what I would call a container runtime, sometimes referred to as a Pod VM or PodCRX. In other words, when you upgrade to vSphere 8.0 Update 3 you will see your current vCLS VMs be deleted, and these new shiny vCLS VMs will pop up. How do you know if these VMs are created using a different mechanism? Well you can simply just see that in the UI as demonstrated below. See the “CRX” mention in the UI?

New vCLS architecture with vSphere 8.0 Update 3

So you may ask yourself, why should I even care? Well the thing is, you should not indeed. The new vCLS architecture uses less resources per VM, there are less vCLS VMs deployed to begin with (two instead of three), and they are more resilient. Also, when a host is for instance placed into maintenance mode while it has a vCLS VM running, that vCLS instance is deleted and recreated elsewhere. Considering the VMs are stateless and tiny, that is much more efficient than trying to vMotion it. Note, vMotion and SvMotion of these new (Embedded as they call them) type of vCLS VMs isn’t even supported to begin with.

Normally, vCLS retreat mode shouldn’t be needed anymore, but if you do end up in a situation where you need to clean up these instances, Retreat Mode is still fully supported with 8.0 U3 as well. You can find the Retreat Mode option in the same place as before, on your cluster object under “Configure –> vSphere Cluster Services –> General –> Edit vCLS Mode”. Simply select “Retreat Mode” and the clean up should happen automatically. When you want the VMs to be recreated, simply go back to the same UI and select “System managed”. This should then lead to the vCLS VMs being recreated.

I hope this helps,

 

vSAN Data Protection, what can you do with it today?

Duncan Epping · Jul 2, 2024 · 13 Comments

Last week I posted a blog about how to get vSAN Data Protection up and running, but I never explained why you may want to do this in the first place? Before we dive into it, it probably is smart to also mention that vSAN Data Protection leverages the new snapshotting capability which is part of the vSAN Express Storage Architecture (vSAN ESA). vSAN ESA has a snapshotting capability that infinitely scales. It literally is 100x better than the snapshotting mechanism that is part of vSAN OSA. The reason for this is simple, with vSAN OSA (and VMFS for that matter) when you create a snapshot a new object is created and IO is redirected. With vSAN ESA we basically copy the meta data structure and keep using the existing objects for IO, which means we don’t need to traverse multiple files for reads for instance, and when we delete a snapshot we don’t need to copy data… As it is all about metadata changes / tracking.

Now in vSphere / vSAN 8.0 Update 3 we introduce a new capability called vSAN Data Protection. This provides you the ability to schedule snapshots, create immutable snapshots, clone snapshots, restore snapshots etc. All this through the vSphere Client interface, of which you can see an example below.

Now, in the above screenshot you see that half of the VMs are protected, and the other half is not. Why is that? Well, simply because I decided to only protect half of my VMs when I created my snapshot schedule. This is an “opt-in” mechanism, so you create a schedule, you decide which VMs are part of a schedule and which are not! Let’s take a look at the scheduling mechanism. You can find the mechanism at the cluster level under “Configuration -> vSAN -> Data Protection”. When you go there you see the above screen and you can click on “protection groups” and then “Create protection group”.

This will then present you a screen where you can define the name of the protection group, enable “immutability mode” and select the “Membership” of the protection group. Personally I like the Dynamic VM Patterns option. Just makes things easy. Like in this example as a pattern I used “*de*”, which means that all VMs with “de” in the name will be snapshotted whenever the schedule triggers a snaphot.

As mentioned, vSAN Data Protection includes ALL the virtual machines which match the pattern when the snapshot is taken. So if you create a schedule like for instance the following, it could take close to an hour before the VM is snapshotted for the very first time. If you feel this is too long then there’s of course also an option to manually snapshot the VM.

The schedules you can create can be really complex (up to 10 schedules), I just created an example of what is possible, but there’s many different options and combinations you can create. Note, a VM can be part of three different protection groups at most, and I also want to point out that the protection group is not a consistency group. Also note, you can have at most 200 snapshots per object/VM, so that is also something to take into consideration.

Now when the VMs are snapshotted you get a few extra options available, you have “snapshot management” at the VM level individually, and of course you can also manage things at the protection group level. You will see options like “restore” and “clone” and this is where it gets interesting, as this will allow you to go back to a particular point in time if needed from a particular snapshot, and also clone the VM from a particular snapshot. Why would you clone it? Well if you would want to test something against a specific dataset for instance, or if you want to do some type of digital forensics and analyze a VM for whatever reason.

One thing that is also good to mention is that with vSAN Data Protection, you can also recover VMs which have been deleted from vCenter Server. This is one of those must have features in my opinion, as this is one of those things that does occasionally happen, unfortunately. Power Off VM, delete … oops, wrong one! When it comes to the recovery of a VM, the process is just straight forward. You select the snapshot and click “restore VM”, or “clone VM”, depending on what you would expect as the outcome. Restore means the current instance is powered off, and the snapshot is the restore point when you power it on again. Of course, when you do a clone you simply get a second instance of that VM. Note that when  you create a clone, it is a linked clone, so it is extremely fast to instantiate.

The last thing I want to discuss is the immutability mode. I want to first caution people, this is what you think it is… it is IMMUTABLE! This means that you cannot delete the snapshots or change the schedule etc, let me quote the documentation:

Enable immutability mode on a protection group for additional security. You cannot edit or delete this protection group, change the VM membership, edit or delete snapshots. An immutable snapshot is a read-only copy of data that cannot be modified or deleted, even by an attacker with administrative privileges.

That is why I wanted to caution people, because if you create an immutable protection group with hundreds of VMs by accident… Yes, snapshots will be immutable, you cannot delete the protection group or the snapshots, or change the snapshot schedule. No, not even as an administrator. Do note, you can delete the VM if you want…

Anyway, I hope this gave a brief overview of what you can do with vSAN Data Protection at the moment. I’ll play around with it some more, and report back when there’s more to share 🙂

 

Unexplored Territory Podcast Episode 067 – Introducing Rubrik!

Duncan Epping · Feb 6, 2024 ·

You may have noticed it already, but the Unexplored Territory Podcast kind of pivoted. Instead of having two co-hosts we recently decided to only have 1 host and a guest. We noticed that during our conversations with our guests things became a bit more challenging when there were multiple hosts, and we hope that this small tweak makes the content easier to digest for you (the listener) as well. I recently sat down with Rubrik’s Jerry Rijnbeek to discuss not only introduce Rubrik, but more importantly discuss acronyms like “DSPM” (Data Security Posture Management), and how this helps customers in this age of ransomware. I hope you enjoy listening to the show as much as I did recording it. Thanks Jerry for your time and sharing your knowledge.

  • « Go to Previous Page
  • Page 1
  • Page 2
  • Page 3
  • Page 4
  • Interim pages omitted …
  • Page 63
  • Go to Next Page »

Primary Sidebar

About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.

Follow Us

  • X
  • Spotify
  • RSS Feed
  • LinkedIn

Recommended Book(s)

Advertisements




Copyright Yellow-Bricks.com © 2025 · Log in