This session was hosted by Michael Ng and Shobhan Lakkapragada and is all about Data Protection in the world of vSAN. Note that this was also a tech preview and features may or may not ever make it into a future release. The session started with Shobhan explaining the basics of vSAN and the current solutions that are available for vSAN data resiliency. I am not going to rehash that, as I am going to assume that you have read most of my articles on those topics already.
Vision: Native Data Protection for vSAN. Provide the ability to specify in policy how many snapshots you would like per VM, how often they should be taken, and what the retention should be. These snapshots will be stored locally. However, it will also be possible to specify in policy if data needs to be moved outside of the primary datacenter, for instance once every 4 hours to the DR site or the archival site. These two modes are referred to as “local protection” and “remote protection”. By the way, “remote protection” does not only target vSAN, but also NFS, Data Domain and even S3 based storage. This is the overall vision of what we are trying to achieve with the native data protection feature.
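To make the policy idea a bit more concrete, here is a minimal Python sketch of what "how often, how many, and when to copy off-site" could look like as data plus two small rules. This is purely illustrative: the class, the field names and the functions are my own assumptions and not actual SPBM capability names or a vSAN API, and the date used is arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ProtectionPolicy:
    # Hypothetical field names, not real SPBM capabilities.
    local_interval: timedelta   # how often a local snapshot is taken
    local_retention: int        # how many local snapshots to keep per VM
    remote_interval: timedelta  # how often data is copied to the remote site


def snapshots_to_prune(taken_at: list[datetime], policy: ProtectionPolicy) -> list[datetime]:
    """Return the oldest snapshots that exceed the retention count."""
    ordered = sorted(taken_at)
    excess = len(ordered) - policy.local_retention
    return ordered[:excess] if excess > 0 else []


def remote_copy_due(last_copy: datetime, now: datetime, policy: ProtectionPolicy) -> bool:
    """True once the remote interval has elapsed since the last off-site copy."""
    return now - last_copy >= policy.remote_interval


# Example: hourly local snapshots, keep 3, copy to the remote site every 4 hours.
policy = ProtectionPolicy(
    local_interval=timedelta(hours=1),
    local_retention=3,
    remote_interval=timedelta(hours=4),
)
start = datetime(2024, 1, 1, 8, 0)  # arbitrary starting point
snaps = [start + i * policy.local_interval for i in range(5)]

expired = snapshots_to_prune(snaps, policy)  # the two oldest snapshots
```

With five hourly snapshots and a retention of three, the two oldest ones are candidates for removal, and a remote copy becomes due four hours after the previous one.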
The first problem we will need to solve is snapshotting. The current vSphere/vSAN snapshotting mechanism will not scale to the extent it will need to scale. A new snapshotting mechanism is being worked on which will give far better performance and scale. The design goal is to support up to 100 snapshots per VM with a low (minimal) performance impact. The technology is developed on vSAN but is not tied to vSAN, so this may be expanded to vSphere overall.
Michael now took over and started diving deeper into the functionality that we are aiming to provide. First of all, “native local data protection”. This is where the snapshots which are created through a schedule in a policy are stored locally on the datastore. This is a “first line of defense” mechanism where we can recover VMs really fast by simply going to a previous snapshot. Snapshots can be created in an application consistent state, even leveraging VSS providers. What is critical, if you ask me, is that all of this uses the familiar SPBM policies. If you know how to create a policy then you can configure data protection!
In the demo Michael showed the H5 interface for vSAN Data Protection next. A policy is created with the new capabilities that are there as part of vSAN Data Protection. It is shown how you can specify RPO, RTO, application consistency etc. The policy is created and then attached to VMs. Next the snapshot catalog view was demoed. The H5 UI shows the catalog on a per VM basis, but of course there are various views. In this case the per VM view shows all the snapshots, whether they are stored locally or remotely, and it provides you the option to move back and forth in time. Very useful when you need to restore an older snapshot. When you click a snapshot you will then see all the details of that snapshot.
In the next demo Michael shows how to restore a snapshot. Not the most spectacular demo. Why not? Well, because it is dead simple. First he simulates a data file corruption and then goes to the H5 UI, right-clicks the VM and goes to the restore option. Next he selects the snapshot he wants to restore and even restores it as a “new VM”, which is a linked clone, but it can also be restored as a fully independent VM. In the case you want to restore a fully independent VM, a linked clone (sort of) will be created and in the back-end the instance will be migrated to being independent. So the recovery is instant, and over time the task of making the VM independent will complete. During the recovery, by the way, there’s even the option to have the VM recovered without networking, or you can customize the VM as well to avoid conflicts.
When the recovery finished Michael showed how the “corrupted file” was successfully restored. Or actually I should say, the VM was restored to the ‘last known good state’, as this is not a file level restore but a VM level restore.
Besides snapshotting / restoring it is of course also possible to closely monitor the state of your protected VMs. Creating snapshots is important, but being able to restore them is even more important. Custom health checks are being developed for vSAN Data Protection which show you the current state of data protection in your environment. Is the service running, are VM snapshots created, are they crash consistent?
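As a rough illustration of the kind of questions such a health check answers, here is a small Python sketch. It is not how the vSAN health checks are implemented; the class, the field names and the issue strings are all hypothetical, and the only inputs assumed are the RPO from the policy and some per VM snapshot metadata.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class VmProtectionState:
    # Hypothetical per VM view; these are not fields of any real vSAN API.
    name: str
    rpo: timedelta                      # RPO taken from the attached policy
    last_snapshot: Optional[datetime]   # None when no snapshot exists yet
    app_consistent: bool                # False means crash consistent only


def check_vm(vm: VmProtectionState, now: datetime) -> list[str]:
    """Return a list of findings for one VM; an empty list means healthy."""
    if vm.last_snapshot is None:
        return ["no snapshot created"]
    issues = []
    if now - vm.last_snapshot > vm.rpo:
        issues.append("RPO exceeded")
    if not vm.app_consistent:
        issues.append("crash consistent only")
    return issues


now = datetime(2024, 1, 1, 12, 0)  # arbitrary point in time
healthy = VmProtectionState("vm-a", timedelta(hours=1), now - timedelta(minutes=30), True)
stale = VmProtectionState("vm-b", timedelta(hours=1), now - timedelta(hours=3), False)
unprotected = VmProtectionState("vm-c", timedelta(hours=1), None, False)

report = {vm.name: check_vm(vm, now) for vm in (healthy, stale, unprotected)}
```

A VM whose last snapshot is older than its RPO, or which only has crash consistent snapshots, shows up with findings, while a healthy VM returns an empty list.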
And with that the session ended. Very impressive demos and an interesting feature, I cannot wait to see this being released! Again, when the session is published, I will share the link. Thanks Michael and Shobhan.
Stefan Gourguis says
Hello Duncan
Great Post!
Can we expect the Integrated Data Protection for vSAN feature only for the newest vSAN branch on release, or will it also be available for “old” releases like vSAN 6.2?
Br Stefan
Matt says
This sounds promising! We moved away from traditional backups to storage-based volume snaps with replication years ago, and that’s one major thing that’s kept us from really getting into vSAN. We do not want to go back to the old method of either agent-based backups or integrated backups, simply because of the strain it puts on the environment when trying to back up 20TB every night (oh, and not having to pay a dime to a backup vendor). This type of tech may bring us a reality in vSAN that’s similar to how we operate today.
John Nicholson says
If you’re using CBT (Changed Block Tracking) for VADP based backups with vSAN you do NOT need to install agents, and you don’t need to back up 20TB every night, only the changed blocks.
Note that some backup software that does source side dedupe/compression (Avamar, Commvault) will back up even less, even if your change rate is 20TB.
Another thing to watch is that 90% of the cause of high change environments is DBAs who don’t trust the storage admin’s backups doing full database dumps a few times a day. Once you wrangle this in, you can generally do backups fairly quickly using CBT in most environments.
As far as strain on the environment goes, snapshot merges no longer use helper snapshots (vSphere 6.0 introduced this), and read amplification from snapshots was largely mitigated in vSAN 6.0. Snapshots still have overhead but it has been reduced significantly.
Michael S Ng says
Great post Duncan! Shobhan and I are super excited about the prospects of native Data Protection features in vSAN.
Brett says
Awesome to see. All the use SPBM is getting leads me to believe we need to allow policies to be stacked on a VM rather than having to create a policy for every scenario, something I have noticed becomes an issue while testing vVols.
virtuallysensei says
Can you share the link for the VMworld tech talk video for STO1770BU? I searched YouTube and the VMworld website and did not find it.
Duncan Epping says
hootjr29 says
I too would enjoy watching STO1770BU. Has anyone found it yet? I too searched around a bit.
Duncan Epping says