
Yellow Bricks

by Duncan Epping


BC-DR

Does vSAN Enhanced Durability work when you have a limited number of hosts?

Duncan Epping · Apr 19, 2021 ·

Last week I had a question about how vSAN Enhanced Durability works when you have a limited number of hosts. In this case, the customer had a 3+3+1 stretched cluster configuration, and they wondered what would happen when they placed a host into maintenance mode. Although I was pretty sure I knew what would happen, I figured I would test it in the lab anyway. Let’s start with a high-level diagram of what the environment looks like. Note that I use a single VM as an example, just to keep the scenario easy to follow.

In the diagram, we see a virtual disk that is configured to be stretched across locations and protected by RAID-1 within each location. As a result, you will have two RAID-1 trees, each with two components and a witness, and of course you would have a witness component in the witness location. Now the question is, what happens when you place esxi-host-1 into maintenance mode? In this scenario, vSAN Enhanced Durability will want to create a “durability component”. All new write I/O is committed to this durability component. This allows vSAN to resync quickly after maintenance mode ends, and it enhances durability, as we would still have two copies of the (new) data.

However, in the scenario above we only have three hosts per location. The question then is, where is this durability component created? Normally, with maintenance mode, you would need a fourth host to move data to. Well, it is simple: in this case, vSAN creates the “durability component” on the host where the witness resides, within the same location of course. Let me show you in a diagram, as that makes it instantly clear.
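To make the placement logic a bit more concrete, here is a minimal Python sketch of the scenario. It is not a vSAN API; the class and function names are hypothetical and only model the per-site RAID-1 tree from the diagram and where the durability component ends up when esxi-host-1 enters maintenance mode.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Component:
    kind: str   # "data", "witness", or "durability"
    host: str

@dataclass
class SiteRaid1Tree:
    site: str
    components: List[Component] = field(default_factory=list)

def place_durability_component(tree: SiteRaid1Tree, host_in_maintenance: str) -> Component:
    # Sanity check: the host going into maintenance mode holds a data component.
    assert any(c.host == host_in_maintenance and c.kind == "data" for c in tree.components)
    # With only three hosts in the site there is no spare host, so the durability
    # component is placed on the host that holds the site-local witness.
    witness = next(c for c in tree.components if c.kind == "witness")
    durability = Component(kind="durability", host=witness.host)
    tree.components.append(durability)
    return durability

# The preferred site from the diagram: two data components and a witness.
site_a = SiteRaid1Tree("Preferred", [
    Component("data", "esxi-host-1"),
    Component("data", "esxi-host-2"),
    Component("witness", "esxi-host-3"),
])

print(place_durability_component(site_a, host_in_maintenance="esxi-host-1"))
# Component(kind='durability', host='esxi-host-3')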

By adding the durability component next to the witness on esxi-host-3, vSAN enhances durability even in this stretched cluster situation, as it provides an additional local copy of new writes. Now, of course, I tested this in my lab, so for those who prefer to see a demo, check out the YouTube video below.

How long does it take before a host is declared failed?

Duncan Epping · Jan 26, 2021 ·

I had a question this week about the failure of a host. The question was how long it takes before a host is declared failed. Now let’s be clear, failed means “dead” in this case, not isolated or partitioned. It could be that the power has failed, the host has gone completely unresponsive, or anything else where there’s absolutely no response from the host whatsoever. In that scenario, how long does it take before HA has declared the host dead? Note that the timeline below applies to a traditional infrastructure, and that it is theoretical, assuming everything is optimal.

  • T0 – Secondary Host failure.
  • T3s – The Primary Host begins monitoring datastore heartbeats for 15 seconds.
  • T10s – The host is declared unreachable and the Primary will ping the management network of the failed host.
    • This is a continuous ping for 5 seconds.
  • T15s – If no heartbeat datastores are configured, the host will be declared dead.
  • T18s – If heartbeat datastores are configured and there have been no heartbeats, the host will be declared dead and restarts will be initiated.

Now, when a Primary Host fails the timeline looks a bit different. This is mainly because first, a new Primary Host will need to be elected. Also, we need to ensure that the new primary has received the latest state of all secondary hosts.

  • T0 – Primary Host failure.
  • T10s – Primary election process initiated.
  • T25s – New primary elected and reads the protectedlist.
    • New primary waits for secondary hosts to report running VMs
  • T35s – Old primary declared unreachable.
  • T50s – Old primary declared dead, new primary initiates restarts for all VMs on the protectedlist which are not running.
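To summarize both timelines in one place, here is a tiny sketch that simply encodes the theoretical numbers above; the values come straight from the lists in this post and are not queried from vSphere.

def restart_initiation_time(primary_failed: bool, heartbeat_datastores: bool) -> int:
    """Seconds after the failure at which the host is declared dead and
    vSphere HA starts initiating restarts (theoretical, optimal conditions)."""
    if primary_failed:
        # T10s election starts, T25s new primary elected and reads the
        # protectedlist, T35s old primary unreachable, T50s declared dead.
        return 50
    # Secondary host failure: T15s without heartbeat datastores,
    # T18s when heartbeat datastores are configured but silent.
    return 18 if heartbeat_datastores else 15

print(restart_initiation_time(primary_failed=False, heartbeat_datastores=True))  # 18
print(restart_initiation_time(primary_failed=True, heartbeat_datastores=True))   # 50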

Keep in mind, this does not mean that VMs will be restarted within 18 seconds, or 35 seconds for that matter. When the host is declared dead, or a new primary is elected, the restart process starts. The VMs that need to be restarted will first need to be placed, and once placed, they will need to be powered on. All of these steps take time. On top of that, depending on the operating system and the applications running within the VM, the time it takes before the restart is fully completed can vary a lot between VMs. In other words, although the state is declared rather quickly, the actual total time it takes to restart can vary, and it is definitely not an exact science.

VMs which are not stretched in a stretched cluster, which policy to use?

Duncan Epping · Dec 14, 2020 ·

I’ve seen this question popping up regularly. Which policy settings (“site disaster tolerance” and “failures to tolerate”) should I use when I do not want to stretch my VMs? Well, that is actually pretty straightforward. In my opinion, there are really only two options you should ever use:

  • None – Keep data on preferred (stretched cluster)
  • None – Keep data on non-preferred (stretched cluster)

Yes, there are two other options: “None – Stretched Cluster” and “None – Standard Cluster”. Why should you not use these? Let’s start with “None – Stretched Cluster”. In this case, vSAN decides per object where to place it. As you hopefully know, a VM consists of multiple objects, so this is not optimal from a performance point of view: you could end up with one VMDK placed in Site A and another VMDK placed in Site B, which means the VM would read from and write to both locations from a storage point of view, while sitting in a single location from a compute point of view. It is also not optimal from an availability stance, as it means that when the inter-site link is unavailable, some objects of the VM become inaccessible. Not a great situation. What would it look like? Potentially something like the diagram below!

Then there’s “None – Standard Cluster”; what happens in this case? When you use “None – Standard Cluster” with “RAID-1”, the VM is configured with FTT=1 and RAID-1, but in a stretched cluster “FTT” does not exist and automatically becomes PFTT. This means the VM is mirrored across locations with SFTT=0, so there is no local resiliency. It is the same as “Dual Site Mirroring” + “No Data Redundancy”!
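For clarity, here is a small Python snippet that summarizes how the four “None” options behave according to the explanation above. It is purely illustrative (not an SPBM API), and the plain-hyphen keys simply mirror the policy names shown in the UI.

NONE_POLICY_BEHAVIOR = {
    "None - Keep data on Preferred": "all objects placed in the preferred site",
    "None - Keep data on Non-Preferred": "all objects placed in the non-preferred site",
    "None - Stretched Cluster": "placement decided per object; VMDKs may land in different sites",
    "None - Standard Cluster (RAID-1)": "FTT=1 becomes PFTT=1: mirrored across sites, SFTT=0 so no local resiliency",
}

for policy, behavior in NONE_POLICY_BEHAVIOR.items():
    print(f"{policy}: {behavior}")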

In summary, if you ask me, “None – Standard Cluster” and “None – Stretched Cluster” should not be used in a stretched cluster.

vSphere HA reporting not enough failover resources fault with stretched cluster failure scenario

Duncan Epping · Nov 20, 2020 ·

Over the last few months I had a couple of customers asking why vSphere HA was reporting a “not enough failover resources” fault in a stretched cluster failure scenario for virtual machines that were still up and running. Before I explain why, let’s paint a picture first to make clear what is happening here. When you run a stretched cluster, you can have a scenario where a particular VM (or multiple VMs) is not mirrored/replicated across locations. Note that with vSAN you can specify, per VM, how and whether the VM should be available across locations. Typically you would see a VM with RAID-1 across locations, and then RAID-1/5/6 within a location. However, you can also have a scenario where a VM is not replicated across locations but is, from a storage point of view, only available within one location. This is depicted in the diagram below.

Now in this scenario, when Site A is somehow partitioned from Site B, you will see alarms/errors indicating that vSphere HA has tried to restart the VM that is located in Site B in Site A, and that it has failed as a result of not having enough failover resources.

This, of course, is not the result of insufficient failover resources, but the result of the fact that Site A does not have access to the storage components required to restart the VM. Basically, what HA is reporting is that it does not have any resources capable of restarting the impacted VM(s).

Now, if you have paid attention, you will probably wonder why HA tries to restart the VM in the first place, as the VM will still be running in this scenario. Why is it still running? Well, the VM isn’t stretched, and this is a partition and not an isolation, which means the isolation response doesn’t kill the VM. So why restart it? As Site A is partitioned from Site B, Site A does not know what the status of Site B is. Site A only knows that Site B is not responding at all, and the only thing it can do is assume the full site has failed. As a result, it will attempt a failover for all VMs that were running in Site B and were protected by vSphere HA.
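If it helps, here is a minimal sketch of that decision chain. It is not FDM/HA code; the function and its fields are hypothetical and only illustrate why the restart attempt is made and why it surfaces as a resources fault.

def attempt_failover(vm_storage_sites, local_site):
    # The partitioned site cannot see the other site, so it assumes a full site
    # failure and attempts a restart for every protected VM it can no longer see.
    if local_site in vm_storage_sites:
        return "restart possible in " + local_site
    # The VM's storage components live only in the unreachable site, so
    # placement fails and is reported as "not enough failover resources".
    return "restart failed: not enough failover resources"

# A non-stretched VM whose storage lives only in Site B, as seen from Site A:
print(attempt_failover(vm_storage_sites={"Site B"}, local_site="Site A"))
# restart failed: not enough failover resources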

Hope that explains why this happens. If you are not sure you understand the full scenario, I recorded a quick five-minute video walking through the scenario and explaining what happens. You can watch it below, or simply go to YouTube.

Announcing VMware Cloud Disaster Recovery! (VCDR)

Duncan Epping · Sep 30, 2020 ·

Most of you probably saw the announcements around the acquisition of Datrium not too long ago. One of the major drivers for that acquisition was the disaster recovery solution which Datrium developed. This week at VMworld this service was announced as a new VMware disaster recovery option. The service is named VMware Cloud Disaster Recovery, and it provides the ability to replicate workloads from on-prem into cloud storage, and to recover from cloud storage into VMware Cloud on AWS! The three key pillars of the service are ease of use, fast recovery, and cloud economics.

The solution is extensively covered in three VMworld sessions (HCI2876, HCI2886, HCI2865). I have watched all three and will provide a short summary here. What capabilities does VMware Cloud DR (VCDR) provide and why is VMware heading into this space?

The why was well explained by Mark Chuang in HCI2876; customers are saying that:

  • “DR is very complex and expensive to manage, and I can’t add IT Headcount”
  • “Our data grows 10-15% every year, with physical DR it is hard to accommodate the growth in the datacenter to meet the needs”
  • “We only test full DR once a year because it is disruptive. Any time there is a major change, how can we know it still works? It is a huge issue!”

I guess that makes it clear why VMware is interested in this space: it is a huge problem for customers, and the solution typically comes at a high cost. VMware has always been in the business of solving complex problems, preferably in a simple way, and that is exactly what VMware Cloud Disaster Recovery delivers: a simple solution at a relatively low cost.

So what does it bring from a feature/functionality stance?

It all starts with cloud economics, to which ease of use also contributes, in my opinion. VMware Cloud Disaster Recovery is super simple to configure, and it replicates data to “cheap and deep” cloud storage. This ensures that the cost can be kept low, and note that all of the typical costs that come with cloud storage (network etc.) are included in the service offering by VMware. The typical challenge with cloud storage, however, is that it is relatively slow when it comes to restores, but this is where the “on-demand” capabilities come into play. VMware Cloud DR provides the ability to instantly power on workloads through a live mount option, without the need to convert the stored data back to a VM format.

When configuring the VMware Cloud DR solution you will need to install/configure a DRaaS Connector on-prem. This on-prem connector connects you to the SaaS platform and provides the required details to the SaaS Orchestrator. Note that you can have multiple DRaaS Connectors for resiliency and performance reasons. When the connection is configured, you will then be able to create “Protection Groups” and “DR Plans”. Those who have worked with Site Recovery Manager will recognize these terms. For those who haven’t:

  • Protection Groups – These groups list the workloads which will be protected by VMware Cloud DR. Of course, you can define the protection schedule, basically how many snapshots need to be shipped to the remote cloud storage per day/week/month.
  • DR Plans – These plans list the workloads that need to be failed over when the plan is triggered and, for instance, include the order in which the workloads need to be powered on. Also, if workloads need to get a different IP address in the cloud, you can specify that here as well.
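To make the two constructs a bit more tangible, here is a hypothetical sketch of what they capture. These are not VMware Cloud DR API objects; the class names and fields are illustrative only.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ProtectionGroup:
    name: str
    protected_vms: List[str]
    snapshot_schedule: str  # how often snapshots are shipped to cloud storage

@dataclass
class DRPlan:
    name: str
    protection_groups: List[ProtectionGroup]
    power_on_order: List[str]                                   # restart order on failover
    ip_mappings: Dict[str, str] = field(default_factory=dict)   # new IPs in the cloud, if needed

web_tier = ProtectionGroup("web-tier", ["web01", "web02"], snapshot_schedule="4x per day")
plan = DRPlan(
    name="failover-to-vmc",
    protection_groups=[web_tier],
    power_on_order=["web01", "web02"],
    ip_mappings={"web01": "10.20.0.11"},
)
print(plan.name, [pg.name for pg in plan.protection_groups])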

Of course, besides creating protection groups and DR plans, you have the ability to test and fail over the workloads in those plans, again very similar to what Site Recovery Manager offers. Before I forget, you will of course have the option to select the snapshot you want to recover from, so you can go back to any point in time. What is unique here is that VMs are powered on without (initially) moving data from cloud storage to your VMware Cloud on AWS environment. It basically mounts an NFS share from the SaaS platform, and the scale-out file system ensures that the VMs can instantly be powered on. After you have tested the recovery, you can decide to migrate the VMs to your SDDC, or you can of course discard the workloads if that is what you desire. Last but not least, you also have the ability to replicate back to on-prem, so that you can bring your workloads back whenever you have recovered your environment from the disaster that occurred and are ready to run those workloads on-prem again.

Now there are many more details, but I am not going to share those in this post; I may do a couple of additional blog posts at a future time. I hope the above gives a good overview of what the offering will provide. For more details, I would recommend watching the VMworld sessions on this topic (HCI2876, HCI2886, HCI2865). The last thing I want to share is where the solution will be available, or at least what is being planned. As shown below, the offering should be available in multiple regions soon.

