
Yellow Bricks

by Duncan Epping


BC-DR

vSphere HA internals: restart placement changes in vSphere 7!

Duncan Epping · May 13, 2020 ·

Frank and I are looking to update the vSphere Clustering Deep Dive to vSphere 7. While scoping the work I stumbled onto something interesting: a change that was introduced in the vSphere HA restart mechanism, specifically in the placement of VMs, in vSphere 7. In previous releases vSphere HA had a straightforward way of doing placement for VMs that need to be restarted as a result of a failure. In vSphere 7.0 this mechanism was completely overhauled.

So how did it work pre-vSphere 7?

  • HA uses the cluster configuration
  • HA uses the latest compatibility list it received from vCenter
  • HA leverages a local copy of the DRS algorithm with a basic (fake) set of stats and runs the VMs through the algorithm
  • HA receives a placement recommendation from the local algorithm and restarts the VM on the suggested host
  • Within 5 minutes DRS runs within vCenter, and will very likely move the VM to a different host based on actual load
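Conceptually, the old flow boils down to something like the following Python sketch. To be clear, this is my own toy illustration of the behavior described above, with made-up names; it is not actual FDM code:

# Toy illustration of pre-vSphere 7 HA restart placement: a local copy
# of the placement algorithm ran with a basic (fake) set of stats, the
# VM was restarted on the suggested host, and the real DRS very likely
# moved it to a different host within minutes based on actual load.

def local_drs_placement(vm, hosts, compatibility):
    # Every compatible host gets the same fake stats, so actual load
    # is ignored and the pick is effectively arbitrary.
    candidates = [h for h in hosts if h in compatibility.get(vm, [])]
    return candidates[0] if candidates else None

def restart_failed_vms(failed_vms, hosts, compatibility):
    for vm in failed_vms:
        host = local_drs_placement(vm, hosts, compatibility)
        if host is not None:
            print(f"Restarting {vm} on {host}")  # DRS likely moves it later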

As you can imagine, this is far from optimal. So what has been introduced in vSphere 7? Well, there are now two different ways of doing placement for restarts:

  1. Remote Placement Engine
  2. Simple Placement Engine

The Remote Placement Engine, in short, is the ability for vSphere HA to make a call to DRS for a placement recommendation for a VM. This takes the current load of the cluster, VM happiness, and all configured affinity/anti-affinity/VM-host affinity rules into consideration! Will this result in a much slower restart? The great thing is that the DRS algorithm has been optimized over the past years, and it is so fast that there will not be a noticeable difference between the old mechanism and the new one. An added benefit for the engineering team, of course, is that they can remove the local DRS module, which means there’s less code to maintain. How this works is that the FDM Master communicates with the FDM Manager, which runs in vCenter Server. The FDM Manager in turn communicates with the DRS service to request a placement recommendation.

Now some of you will probably wonder what happens when vCenter Server is unavailable; well, this is where the Simple Placement Engine comes into play. The team has developed a new placement engine that basically takes a round-robin approach, but it does of course consider “must rules” (VM-to-host) and the compatibility list. Note that affinity and anti-affinity rules are not considered when SPE is used instead of RPE! This is a known limitation, which is expected to be addressed in a future release. If a host, for instance, is not connected to the datastore of the VM that needs to be restarted, then that host is excluded from the list of potential placement targets. By the way, before I forget, version 7 also introduces a vCenter heartbeat mechanism as a result: HA heartbeats the vCenter Server instance to understand when it needs to resort to the Simple Placement Engine instead of the Remote Placement Engine.
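What happens at restart time can then be summarized in a few lines of Python. Again, this is a simplified illustration of the behavior described above with hypothetical names, not how FDM actually implements it:

# Illustrative sketch (hypothetical names, simplified logic): HA
# heartbeats vCenter; if it is reachable, DRS is asked for a
# recommendation (RPE), otherwise a local round-robin is used (SPE).

def place_vm(vm, hosts, compatibility, must_rules, rr_state,
             vcenter_reachable, ask_drs_for_placement):
    if vcenter_reachable():
        # RPE: DRS considers load, VM happiness, and all rule types.
        return ask_drs_for_placement(vm)
    # SPE: honor "must" (VM-to-host) rules and the compatibility list,
    # e.g. hosts without access to the VM's datastore are excluded.
    # Affinity/anti-affinity rules are NOT considered here.
    candidates = [h for h in hosts
                  if h in compatibility[vm] and h in must_rules.get(vm, hosts)]
    if not candidates:
        return None
    rr_state["i"] = (rr_state.get("i", -1) + 1) % len(candidates)
    return candidates[rr_state["i"]]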

I dug through the FDM log (/var/log/fdm.log) to find some proof of these new mechanisms, and found an entry that shows there are indeed two placement engines:

Invoking the RPE + SPE Placement Engine

RPE stands for “remote placement engine” and SPE for “simple placement engine”, where “remote” of course refers to DRS. You may ask yourself, how do you know whether DRS is being called? Well, that is something you can see in the DRS log files: when a placement request is received, the below entry shows up in the log file:

FdmWaitForUpdates-vim.ClusterComputeResource:domain-c8-26307464

This even happens when DRS is disabled, and also when you use a license edition which does not include DRS, which is really cool if you ask me. If for whatever reason vCenter Server is unavailable, and as a result DRS can’t be called, you will see this mentioned in the FDM log and, as shown below, the Simple Placement Engine’s recommendation will be used for the placement of the VM:

Invoke the placement service to process the placement update from SPE
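If you want to hunt for these entries in your own environment, something as simple as the following Python snippet, run against a copy of the FDM log, will do. The marker strings are just the log lines quoted above:

# Scan the FDM log for the placement-engine entries quoted above.
# Run on a host where /var/log/fdm.log (or a copy of it) is present.

markers = (
    "Invoking the RPE + SPE Placement Engine",
    "Invoke the placement service to process the placement update from SPE",
)

with open("/var/log/fdm.log") as log:
    for line in log:
        if any(marker in line for marker in markers):
            print(line.rstrip())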

A cool and very useful small HA enhancement in vSphere 7.0, if you ask me!


** Disclaimer: This article contains references to the words master and/or slave. I recognize these as exclusionary words. The words are used in this article for consistency because it’s currently the words that appear in the software, in the UI, and in the log files. When the software is updated to remove the words, this article will be updated to be in alignment. **

Startup intro: SaaS-based backup solution Clumio

Duncan Epping · Apr 6, 2020 ·

Last week I saw an update from one of the Clumio founders on Twitter. It reminded me that I had promised to take a look at their product. This week I had a meeting set up with Clumio and we briefly went over their product and how to configure it. Clumio is a SaaS-based backup solution that was founded in 2017 by former PernixData, Nutanix, and EMC folks. The three founders are Poojan Kumar, Kaustubh Patil, and Woon Jung, whom you may remember from PernixData. One thing to point out is that they have had 3 rounds of funding (~$190 million) so far, and they came out of stealth around VMworld 2019. Coincidentally they won the Gold award for Best of VMworld in the data protection category, and Best of Show for the entire show; not bad for a first VMworld. I guess I have to point out that although I would classify them as backup/recovery today, they are adding new functionality weekly, and “backup/recovery” is probably not a fair category; data protection is more appropriate, and it would not surprise me if that evolves to data management and protection over time. If you are not a fan of reading, simply head over to my YouTube video on Clumio; otherwise, just continue below.

So how does it work conceptually? Well, they basically have a SaaS solution, but you will need to install an OVA (they call it a cloud connector) in your environment to connect to the SaaS platform for VMware on-premises and VMware Cloud on AWS. When you connect AWS EBS they use a CloudFormation template. This cloud connector is a 4 vCPU / 8GB virtual machine that then needs the ability to connect to “the outside world” of course. The Cloud Connector is stateless and requires no updates. You can run this Cloud Connector appliance in multiple clusters, on-prem, or in VMware Cloud on AWS, and once they are registered you will see those data sources in your portal. This is nice, as you can see all your data sources across public and private clouds in one single pane of glass. You will have the ability to define “backup schemes” by creating policies. These policies can of course then be associated with objects. These objects can be VMs, clusters, and even vCenter Server instances. This means that if you assign a policy to a vCenter Server instance, every new VM created will inherit the policy automatically. You may wonder, where is your data stored? Your data is stored in S3 buckets that are part of the Clumio SaaS-based platform. Customers are isolated from each other, they will have their own dedicated S3 buckets, and these buckets are created and maintained by Clumio; you as a customer only interact with Clumio!
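To illustrate the inheritance behavior described above, here is a tiny Python model of the concept. This is purely my own sketch, not Clumio’s actual API or data model:

# Toy model of policy inheritance (not Clumio's actual API): a policy
# attached at the vCenter or cluster level flows down to the VMs below.

class InventoryObject:
    def __init__(self, name, parent=None, policy=None):
        self.name, self.parent, self.policy = name, parent, policy

    def effective_policy(self):
        # Walk up the hierarchy until an attached policy is found.
        node = self
        while node is not None:
            if node.policy is not None:
                return node.policy
            node = node.parent
        return None

vcenter = InventoryObject("vcenter-01", policy="gold-backup-policy")
cluster = InventoryObject("cluster-a", parent=vcenter)
new_vm = InventoryObject("vm-042", parent=cluster)  # newly created VM
print(new_vm.effective_policy())  # "gold-backup-policy", inherited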

Install and Configure vSphere Replication with SRM 8.3 on vSphere 7

Duncan Epping · Apr 2, 2020 ·

I haven’t really done much with vSphere Replication and Site Recovery Manager (SRM) in the past years, as my main focus has been vSAN. I figured I would get two clusters up and running and install and configure both vSphere Replication and SRM on top of vSphere 7. Although the installation and configuration are pretty straightforward, there are a few steps which are important. Below are the steps I took to get things up and running.

  • Deploy the vSphere Replication appliance in both clusters
  • Deploy the Site Recovery Manager appliance in both clusters
  • Go to “https://<ip of vSphere Replication appliance>:5480” in the first cluster (see the reachability check sketched after this list)
    • username: root
    • password: whatever you specified!
  • Click on “Configuration Page” link
  • Specify the vCenter Server password in the Password field
  • Click “Apply Network Settings”
  • Click “Save and Restart Service”
  • Accept the SSL Certificate
  • Repeat the above for the second cluster!
  • Now go to your vSphere H5 Client and wait until the vSphere Replication tasks are completed
  • Log out of the vSphere H5 Client and log back in for both clusters
  • Now go to the first cluster / vCenter server
  • Now click on “Menu” and then “Site Recovery”
  • Click “Open Site Recovery”
  • Click “New Site Pair”
  • Fill out the details of the second vCenter Server
  • Click Next and Connect if you get a security alert and are certain this is the correct vCenter instance
  • Select the correct listed vCenter instance and vSphere Replication appliance
  • Click Next and Finish, now you will see a task within vCenter that states “Connect vSphere Replication Sites”
  • Now you have vSphere Replication running and you can replicate VMs from one location to the other manually if and when desired.
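As a small aside: before walking through the steps above, it can be handy to script a quick check that the appliance VAMI on port 5480 actually responds. Below is a minimal Python sketch; the address is a placeholder you would replace with your appliance IP, and certificate verification is disabled because the appliance ships with a self-signed certificate:

# Check that the vSphere Replication appliance VAMI answers on 5480
# before starting the configuration steps.

import requests

VAMI_URL = "https://192.0.2.10:5480"  # placeholder, replace with your appliance IP

try:
    response = requests.get(VAMI_URL, verify=False, timeout=10)
    print(f"VAMI reachable, HTTP status {response.status_code}")
except requests.exceptions.RequestException as err:
    print(f"VAMI not reachable: {err}")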


Creating a vSAN 7.0 Stretched Cluster

Duncan Epping · Mar 31, 2020 ·

A while ago I wrote this article and created this demo showing the creation of a vSAN Stretched Cluster. I bumped into the article/video today when I was looking for something, and figured it was time to recreate the vSAN Stretched Cluster demo. I recorded it today, using vSphere / vSAN 7, and just wanted to share it with you here. Hope you enjoy it.

Can I still provision VMs when a vSAN Stretched Cluster site has failed? Part II

Duncan Epping · Dec 18, 2019 ·

Three years ago I wrote the following post: Can I still provision VMs when a VSAN Stretched Cluster site has failed? Last week I received a question on this subject, and although officially I am not supposed to work on vSAN in the upcoming three months, I figured I could easily test this in the evening within 30 minutes. The question was simple: in my blog I described the failure of the Witness Host, but what if a single host fails in one of the two “data” fault domains? What if I want to create a snapshot, for instance; will this still work?

So here’s what I tested:

  • vSAN Stretched Cluster
  • 4+4+1 configuration
    • Meaning, 4 hosts in each “data site” plus a witness host, so 8 hosts in the vSAN cluster itself
  • Create a VM with cross-site protection and RAID-5 within the location

So I first failed a host in one of the two data sites. With that host failed, this is what happens when I create a VM with RAID-1 across sites and RAID-5 within a site:

  • Without “Force Provisioning” enabled the creation of the VM fails
  • When “Force Provisioning” is enabled the creation of the VM succeeds, the VM is created with a RAID-0 within 1 location

Okay, so this sounds similar to the scenario originally described in my 2016 blog post, where I failed the witness: vSAN will create a RAID-0 configuration for the VM. When the host returns for duty, the RAID-1 across locations and RAID-5 within each location are then automatically created. On top of that, you can snapshot VMs in this scenario; the snapshots will also be created as RAID-0. One thing to keep in mind: I would recommend removing “force provisioning” from the policy after the failure has been resolved! Below is a screenshot of the component layout of this scenario, by the way.

I also retried the witness-host-down scenario, and in that case you do not need to use the “force provisioning” option. One more thing to note: the above will only happen when you request a RAID configuration which is impossible to create as a result of the failure. If 1 host fails in a 4+4+1 stretched cluster and you would like to create a RAID-1 across sites and a RAID-1 within sites, then the VM would be created with the requested RAID configuration, which is demonstrated in the screenshot below.
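The pattern behind all of these results comes down to simple host counting per site: RAID-1 needs 3 hosts within a site (two mirrors plus a witness component) and RAID-5 needs 4 (3+1 erasure coding). The little Python sketch below is my own illustration of that arithmetic, reproducing the outcomes described in this post:

# Per-site feasibility of the in-site RAID level in a stretched cluster.
# RAID-1 needs 3 hosts per site (2 mirrors + witness component),
# RAID-5 needs 4 (3+1 erasure coding).

HOSTS_REQUIRED = {"RAID-1": 3, "RAID-5": 4}

def provision(raid_level, hosts_per_site, failed_hosts, force=False):
    remaining = hosts_per_site - failed_hosts
    if remaining >= HOSTS_REQUIRED[raid_level]:
        return "created as requested"
    # Force provisioning falls back to an unprotected RAID-0 object;
    # vSAN rebuilds the requested layout once the host returns.
    return "created as RAID-0" if force else "provisioning fails"

# The 4+4+1 scenarios from this post, with one host down in a data site:
print(provision("RAID-5", 4, 1))              # provisioning fails
print(provision("RAID-5", 4, 1, force=True))  # created as RAID-0
print(provision("RAID-1", 4, 1))              # created as requested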

