DRS rules still active when DRS disabled?

I just received a question about DRS rules and why they are still active when DRS is disabled. I was under the impression I had already blogged about this, but I cannot find it. I know some others did, but they reported this behaviour as a bug… which it actually isn’t.

Below is a screenshot of the VM/Host Rules screen for vSphere 6.0, which allows you to create rules for clusters… Note that I said “clusters” and not DRS specifically. In 6.0 the wording in the UI has changed to align with the functionality vSphere offers. These are not DRS rules but rather cluster rules, and they can be used when either HA or DRS (or both) is configured.

Note that not all types of rules will automatically be respected by vSphere HA. One thing you can now also do in the UI is specify whether HA should ignore or respect rules, which is very useful if you ask me and makes life a bit easier.
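If you prefer to drive this outside the UI, the same behaviour can be set through the vSphere API. Below is a minimal pyVmomi sketch, not a definitive recipe: the vCenter address, credentials and cluster name are placeholders, and the two das.* advanced options used are, to the best of my knowledge, the options that correspond to HA respecting VM/VM anti-affinity and VM/Host “should” rules in 6.0.

# Minimal pyVmomi sketch: ask vSphere HA to respect VM/VM anti-affinity rules
# and VM/Host "should" rules on a cluster. Connection details and the cluster
# name below are placeholders for this example.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.local",
                  user="administrator@vsphere.local",
                  pwd="VMware1!",
                  sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()

    # Find the cluster by name (cluster name is an assumption for this example)
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == "Cluster01")

    # HA advanced options that (as far as I am aware) control rule enforcement
    spec = vim.cluster.ConfigSpecEx()
    spec.dasConfig = vim.cluster.DasConfigInfo(option=[
        vim.option.OptionValue(key="das.respectVmVmAntiAffinityRules", value="true"),
        vim.option.OptionValue(key="das.respectVmHostSoftAffinityRules", value="true"),
    ])

    # modify=True applies the spec incrementally to the existing cluster config
    WaitForTask(cluster.ReconfigureComputeResource_Task(spec, modify=True))
finally:
    Disconnect(si)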

What does support for vMotion with active/active (a)sync mean?

Having seen so many cool features released by VMware over the last 10 years, you sometimes wonder what more they can do. It is amazing to see the level of integration we’ve seen between the different datacenter components. Many of you will have seen the announcements around Long Distance vMotion support by now.

When I saw this slide something stood out to me instantly and that is this part:

  • Replication Support
    • Active/Active only
      • Synchronous
      • Asynchronous

What does this mean? Well, first of all “active/active” refers to “stretched storage”, aka a vSphere Metro Storage Cluster. So when it comes to long distance vMotion, some changes have been introduced for synchronously replicated stretched storage. (Note that “active/active” storage is not required for long distance vMotion.) With stretched storage, writes to a volume can come in on both sides at any time and will be replicated synchronously. Some optimizations have been made to the vMotion process to avoid writes during the switchover, so that replication traffic does not delay the process.

For active/active asynchronous the story is a bit different. Here again we are talking about “stretched storage”, but in this case the asynchronous flavour. One important aspect which was not mentioned in the deck is that async requires Virtual Volumes. Now, at the time of writing there is no vendor yet with a VVol-capable solution that offers active/active async. But more importantly, is this process any different from the sync process? Yes it is!

During the migration of a virtual machine which uses virtual volumes, with an “active/active async” configuration backing it, the array is informed that a migration of the virtual machine is taking place and is requested to switch from asynchronous to synchronous replication. This is to ensure that the destination is in sync with the source when the VM is switched over from side A to side B. Besides switching from async to sync, the array is also informed when the migration has completed. This allows the array to, for instance, switch the “bias” of the VM; especially in a stretched environment this is important to ensure availability.

I can’t wait for the first vendor to announce support for this awesome feature!

Another way to fix your non-compliant host profile

I found out there is another way to fix your non-compliant host profile problems with vSphere 6.0 when you have SAS drives which are detected as shared storage while in fact they are not. This method is a bit more complicated though, and there is a command-line script that you will need to use: /bin/sharedStorageHostProfile.sh. It works as follows:

  • Run the following to dump all the local storage details to a folder on your first host
    /bin/sharedStorageHostProfile.sh local /folder/youcreated1/
  • Run the following to dump the details of your second host to a folder; you can do this from your first host if you have SSH enabled
    /bin/sharedStorageHostProfile.sh remote /folder/youcreated2/ <name or ip of remote host>
  • Copy the outcome of the second host to the folder where the outcome of your first host is stored. You will need to copy the file “remote-shared-profile.txt”.
  • Now you can compare the outcomes by running:
    /bin/sharedStorageHostProfile.sh compare /folder/youcreated1/
  • After comparing, you can apply the configuration as follows:
    /bin/sharedStorageHostProfile.sh configure /folder/youcreated1/
  • Now the disks which are listed as cluster-wide resources but are not shared between the hosts will be configured as non-shared resources. If you want to check what will be changed before running the command, you can simply run “more” on the file the info is stored in:
    more esxcli-sharing-reconfiguration-commands.txt
    esxcli storage core device setconfig -d naa.600508b1001c2ee9a6446e708105054b --shared-clusterwide=false
    esxcli storage core device setconfig -d naa.600508b1001c3ea7838c0436dbe6d7a2 --shared-clusterwide=false

You may wonder by now if there isn’t an easier way… well, yes there is. You can do all of the above by running the following simple command. I preferred to go over the steps so that at least you know what is happening.

/bin/sharedStorageHostProfile.sh automatic <name-or-ip-of-remote-host>

After you have done this (using either the first or the second method) you can create the host profile of your first host. Although the other methods I described in yesterday’s post are a bit simpler, I figured I would share this as well, as you never know when it may come in handy!

Get your download engines running, vSphere 6.0 is here!

Yes, the day is finally here: vSphere 6.0 / SRM / VSAN (and more) are now available. So where do you find it? Well that is simple… here:

Have fun!

All Flash VSAN – One versus multiple disk groups

A while ago I wrote this article on the topic of “one versus multiple disk groups”. The summary was that you can start with a single disk group, but that from a failure domain perspective having multiple disk groups is definitely preferred. There could also be a benefit from a performance point of view.

So the question now is: what about all-flash VSAN? First of all, the same rules apply: a maximum of 5 disk groups, each with 1 SSD for caching and up to 7 devices for capacity. There is something extra to consider though. It isn’t something I was aware of until I read the excellent Design and Sizing Guide by Cormac. It states the following:

In version 6.0 of Virtual SAN, if the flash device used for the caching layer in all-flash configurations is less than 600GB, then 100% of the flash device is used for cache. However, if the flash cache device is larger than 600GB, then only 600GB of the device is used for caching. This is on a per-disk group basis.

Now for the majority of environments this won’t really be an issue as they typically don’t hit the above limit, but it is good to know when doing your design/sizing exercise. The recommendation of a 10% cache-to-capacity ratio still stands, and this is based on used capacity before FTT. If you have a requirement for a total of 100TB, then with FTT=1 that is roughly 50TB of usable capacity, which means you will need roughly 5TB of flash cache in total. With 10 hosts that would be 500GB per host, which is below the limit. But with 5 hosts that would be 1TB per host, which is above the 600GB mark and, with a single disk group, would result in 400GB per host going unused.

When there is a requirement for more than 600GB of write cache capacity per host in an all-flash configuration, you will need to create multiple disk groups. Personally I would always recommend this anyway. So when you do the sizing, make sure to take this into consideration and create multiple disk groups when you have the chance!
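To make the sizing exercise above easy to repeat with your own numbers, here is a minimal Python sketch. It only encodes the rules of thumb from this post (the 10% cache-to-capacity ratio and the 600GB per-disk-group cache limit); the 100TB / FTT=1 / host-count figures are the example from the text, and the decimal TB-to-GB conversion is an assumption to match the round numbers used above.

import math

# Rules of thumb from the post above (VSAN 6.0 all-flash):
CACHE_RATIO = 0.10            # ~10% flash cache to used capacity (before FTT)
CACHE_LIMIT_PER_DG_GB = 600   # only 600GB of cache is used per disk group
GB_PER_TB = 1000              # decimal conversion, matching the round numbers above

def cache_per_host_gb(raw_capacity_tb, ftt, hosts):
    """Flash cache needed per host, in GB."""
    usable_tb = raw_capacity_tb / (ftt + 1)               # FTT=1 -> half of raw is usable
    total_cache_gb = usable_tb * GB_PER_TB * CACHE_RATIO  # 10% of used capacity before FTT
    return total_cache_gb / hosts

def disk_groups_needed(cache_gb):
    """Minimum disk groups per host so no cache sits above the 600GB cap."""
    return math.ceil(cache_gb / CACHE_LIMIT_PER_DG_GB)

# The example from the post: 100TB raw, FTT=1, with 10 hosts versus 5 hosts
for hosts in (10, 5):
    per_host = cache_per_host_gb(raw_capacity_tb=100, ftt=1, hosts=hosts)
    print(f"{hosts} hosts: {per_host:.0f}GB cache per host, "
          f"{disk_groups_needed(per_host)} disk group(s) needed")

For 10 hosts this works out to 500GB of cache per host and a single disk group; for 5 hosts it is 1TB per host, which only fits under the 600GB limit if you split it across two disk groups.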