vCloud Director and Storage DRS default Affinity

I have had multiple discussions about Storage DRS default affinity internally and now saw this question on the VMTN Community Forums. It is a long question, but basically it comes down to this:

What are the best practices for Storage DRS Datastore Clusters when using vCloud Director?

Seems like a fair question to me. It comes from the fact that when you create thin-provisioned VMs with multiple disks and those disks grow, it can be difficult to move VMs around once the disk space threshold is reached. Let me rephrase that: by default it can be difficult to move VMs around. The reason is that the default affinity rule for disks is “Keep VMDKs together by default”. This is depicted in the screenshot below, which I stole from Frank’s article on this exact topic.

Default VMDK affinity rule

As a best practice I would definitely recommend unticking “Keep VMDKs together by default” when creating a Datastore Cluster in a vCloud Director environment. (Do note that this is NOT supported when using “linked clones / fast provisioning”.) By changing the default, Storage DRS has far more options to move virtual disks around, as it can make decisions per disk instead of per VM and find the optimal placement for each disk (for more in-depth info read Frank’s article). Of course you could stick to the default and then change the “VMDK affinity” per virtual machine, but that means you need to go to vCenter, and that kind of defeats the whole purpose of vCloud Director, right?
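To make the "per disk instead of per VM" argument concrete, here is a minimal toy sketch in Python. It is not Storage DRS's actual algorithm (which also weighs I/O load, thresholds and migration cost); the datastore names, sizes and the two helper functions are made up for illustration. It only shows that a VM whose disks do not fit together on any single datastore can still be placed when each disk is considered on its own.

```python
from typing import Dict, List, Optional


def place_together(disks_gb: List[int], free_gb: Dict[str, int]) -> Optional[Dict[int, str]]:
    """Toy model of 'Keep VMDKs together': all disks must land on one datastore."""
    total = sum(disks_gb)
    # Only datastores that can hold the whole VM are candidates.
    candidates = [ds for ds, free in free_gb.items() if free >= total]
    if not candidates:
        return None  # no single datastore has room for the entire VM
    best = max(candidates, key=lambda ds: free_gb[ds])
    return {index: best for index in range(len(disks_gb))}


def place_per_disk(disks_gb: List[int], free_gb: Dict[str, int]) -> Optional[Dict[int, str]]:
    """Toy model with the rule unticked: each disk is placed independently.

    Greedy heuristic (an assumption, not the real logic): largest disk first,
    always onto the datastore with the most free space left.
    """
    remaining = dict(free_gb)  # work on a copy so the caller's data is untouched
    placement: Dict[int, str] = {}
    for index in sorted(range(len(disks_gb)), key=lambda i: -disks_gb[i]):
        target = max(remaining, key=lambda ds: remaining[ds])
        if remaining[target] < disks_gb[index]:
            return None  # even the emptiest datastore cannot take this disk
        placement[index] = target
        remaining[target] -= disks_gb[index]
    return placement


# Hypothetical datastore cluster: free space in GB per datastore.
free = {"ds01": 300, "ds02": 250, "ds03": 200}
# Hypothetical VM with three disks that have grown to 530 GB in total.
vm_disks = [200, 180, 150]

print("together:", place_together(vm_disks, free))
print("per disk:", place_per_disk(vm_disks, free))
```

In this made-up cluster the "together" variant finds no home for the VM, while the per-disk variant spreads the three disks over the three datastores. Real environments are obviously messier, but the difference in available moves is the point.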

Do note that it can make command-line based troubleshooting slightly more difficult, as the disks of a single VM can end up on different datastores, but the fact that Storage DRS will have way more options to balance and avoid things like “out-of-space conditions” seems like a big win to me. I guess this doesn’t only apply to vCloud Director based infrastructures… This applies to every infrastructure: VMDK anti-affinity can make life better!

    Comments

    1. Hi Duncan, thanks for this article!
      After I read lazyllama_uk’s post on VMTN I still have a question, would you mind explaining?
      Q: How would this setting impact Fast-Provisioned vApps on top of these Datastore Clusters?
      Q: How would it impact Storage DRS’s default “maximum space savings” strategy for placing clones?
      Q: Would it anyhow impact Support Terms?
      Q: Does Storage DRS distinguish OrgVDC alignment of VMDKs of VMs?
      If I get this right, change of this parameter would _not_ significantly improve balancing of Fast-Provisioned vApp’s disks deployed from templates, as they depend on availability of shadow VMs on each DS (as per “Whats-New-VMware-vSphere-51-Storage-Technical-Whitepaper.pdf” and “3a Architecting a VMware vCloud.pdf”). And, in vCD5.1, Fast-Provisioning is turned on by default on all OrgVDCs.
      Unticking the above-said option would have the effect on Storage DRS initial placement logic when creating a _new_ vApp (as in lazyllama_uk case). Once this new vApp is Saved as Template, this option wouldn’t supersede default “maximum space savings” logic of StorageDRS 2.0 and vCD5.1, to avoid creating new shadow VMs all over, or cause too many “foreign” evacuations strategically, because base disks of multi-VMDK VMs are now spread over DS cluster (evacuation is in higher order than creation of shadow VMs when it comes to Fast-Provisioned, – space savings first, performance sacrifices).
      Hence, (to some sense) it would be even right thing to stick both disks together for a new vApp in an OrgVDC with FP enabled, even with the setting unticked, right? Keep together -> less problems.
      Even if they are separated when New vApp is created, per-VM, – once this vApp is saved as Template, going forward clones would be on the same DS as base disks, until “dead end capacity-wise” – Storage DRS would be able to balance off “Fat” or “Noisy” Neighbors from other non-Fast-Provisioned OrgVDCs, but that’s it.
      Another possible effect I can imagine – in a multi-Org environment, if Storage DRS doesn’t know what OrgVDC a VMDK belongs to, enabling this policy on a whole DS Cluster could kick in more unnecessary evacuations of VMDKs of those OrgVDCs that use non-Fast-Provisioned setting, every time a new Fast-Provisioned vApp with multi-VMDK VMs is deployed. Just because evacuation takes priority over creation of a new shadow VM… and now there are many more datastores where Fast-Provisioned vApps reside (as initial placement spreads them by unticking the rule above)…
      Maybe it’s all misunderstanding )), – hope you wouldn’t mind sharing your thoughts.
      But it’s all soooo dynamic now. Good old days are gone.

    2. Nigel Hardy says:

      Thanks for the response to my Community Forums post, Duncan. Much appreciated.

      We did have a moment of strangeness during testing where a Shadow VM ended up being split across multiple datastores, which prevented it being used for cloning. As there were no clones associated with that Shadow VM we were able to clean it up, and have since set the Shadow VMs to have “keep VMDKs together” turned on.

    3. Nigel Hardy says:

      Hmm… just got an error message in vCD saying:
      “Virtual Machine spans datastores. This is not supported in vCloud Director. Migrate the virtual machine so that the virtual machine and all its disks are on a single datastore.”

      So I guess that we DO need to keep VMDKs together after all.
      Or someone needs to remove the error message :)

    4. Hi Nigel, – so it seems like it does make sense to follow Duncan’s suggestion for ProvvDC’s DSCluster, but only if those OrgvDCs that would be using it aren’t enabled for Fast Provisioning.
      As per vCAT 3.1, document “3a Architecting a VMware vCloud.pdf”, page 31 – “Linked clone configurations that span across datastores are not supported in vCloud Director 5.1.”
      I can imagine how CPU-intensive and evacuation-aggressive it would be to calculate Storage DRS decisions should this be allowed.
