Automating vCloud Director Resiliency whitepaper released

About a year ago I wrote a white paper about vCloud Director resiliency, or better said, I developed a disaster recovery solution for vCloud Director. This solution allows you to fail over vCloud Director workloads between sites in the case of a failure. Immediately after it was published, various projects started to implement this solution. As part of our internal project, our PowerCLI gurus Aidan Dalgleish and Alan Renouf started looking into automating the solution. Those who read the initial case study have probably seen the manual steps required for a fail-over; those who haven’t should read that white paper first.

The manual steps in the vCloud Director Resiliency white paper are exactly what Alan and Aidan addressed. So if you are interested in implementing this solution, it is useful to read the new white paper about Automating vCloud Director Resiliency as well. Nice work Alan and Aidan!

Stretched vCloud Director infrastructure

A while back I wrote about design considerations when designing or building a stretched vCloud Director infrastructure. Since then I have been working on a document in collaboration with Lee Dilworth, and this document should hopefully be out soon. As various people have asked for the document, I decided to throw it into this blog post so that the details are already out there.

** Disclaimer: this article has not been reviewed by the technical marketing team yet; this is a preview of what will possibly be published. When the official document is published I will add a link to this article. **


VMware vCloud® Director™ 5.1 (vCloud Director) gives enterprise organizations the ability to build secure private clouds that dramatically increase datacenter efficiency and business agility. Coupled with VMware vSphere® (vSphere), vCloud Director delivers cloud computing for existing datacenters by pooling vSphere virtual resources and delivering them to users as catalog-based services. vCloud Director helps you build agile infrastructure-as-a-service (IaaS) cloud environments that greatly accelerate the time-to-market for applications and responsiveness of IT organizations.

Resiliency is a key aspect of any infrastructure but is even more important in “Infrastructure as a Service” (IaaS) solutions. This solution overview was developed to provide additional insight into how to architect and implement a vCloud Director-based solution on a vSphere Metro Storage Cluster infrastructure.

Architecture Introduction

This architecture consists of two major components. The first component is the geographically separated vSphere infrastructure based on a stretched storage solution, hereafter referred to as the vSphere Metro Storage Cluster (vMSC) infrastructure. The second component is vCloud Director.

Note – Before we dive into the details of the solution, we would like to call out the fact that vCloud Director is not site aware. If it is incorrectly configured, availability could be negatively impacted in certain failure scenarios.


Upgrade vCloud Director 1.5 on vSphere 5.0 to vCD 5.1.1?

One of my colleagues, Matthew Meyer, posted a list of cool videos he produced. These videos show how to upgrade vCloud Director 1.5 on vSphere 5.0 to vCloud Director 5.1.1 running on vSphere 5.1.  Nice right?! Thanks Matt for creating these, awesome work. (I normally don’t use the “read more” option, but as there are 8 videos in total I will only show two on the front page. Hit “Continue Reading” if you want to see the rest!)

VMware vCenter Server 5.0 to 5.1 Upgrade

VMware vCenter Single Sign-On Installation


Thinking about a stretched vCloud Director deployment

Lately I have been thinking about what it would take to deploy a stretched vCloud Director (vCD) infrastructure. “The problem” with a vCloud Director infrastructure is that there are so many moving components, which makes it difficult to figure out how to protect each one. Let me point out that I do not have all the definitive answers to this yet; I am writing this article to get a better understanding of the problem myself. If you do not agree with my reasoning, please feel free to comment, as I need YOUR help defining the recommended practices around vCD on a stretched infrastructure.

I listed the components I used in my lab:

  • vCenter Server Management
  • vCenter Server Cloud Resources
  • vCloud Director Cells
  • vShield Manager
  • Database Server

That would be 5 moving components, but in reality we are talking about something closer to 8. The thing here is that vCenter Server itself consists of multiple components:

  • Single Sign On
  • Inventory Service
  • Web Client
  • vCenter Server

How do I protect these 8 components? The first 5 listed will be individual VMs, and vCloud Director itself will even consist of multiple cells. What would this look like?

As you can see there are multiple vCenter Servers: one manages the Management Cluster and its components, while the other manages the “Cloud Resource Cluster”. Let’s start listing all the components and discuss what the options are and whether we can protect them in a special way or not.

vCenter Server (cloud resources and management)

vCenter Server can be protected through various methods. There is vCenter Heartbeat, and of course we have vSphere HA (including VM Monitoring). First of all, it is key to realize that neither of these solutions is fully “non-disruptive”. Both vSphere HA and vCenter Heartbeat will cause a slight disruption. vSphere HA will simply restart your VM when a host has failed, and vSphere HA – VM Monitoring can restart the Guest OS when the VM has failed. vCenter Heartbeat is a more intelligent solution: it can detect outages using a heartbeat mechanism and respond to them.

I guess the question is availability vs operational simplicity. How important is vCenter Server availability in your environment? Setting up vSphere HA and VM Monitoring is a matter of seconds. Installing and configuring vCenter Heartbeat is probably hours… And think about upgrade processes etc. I personally prefer not using vCenter Heartbeat but going for vSphere HA and VM Monitoring in this scenario, how about you?

What about vCenter services like SSO, Inventory Service and the Web Client? From a scalability/performance perspective it might make sense to split things out, but it also makes your environment more vulnerable to failures. What if one VM in your “vCenter service chain” is down? That might render your whole solution unusable. I would personally prefer to have vCenter Server, Inventory Service and the Web Client installed in a single VM. I can imagine that you would like to split out SSO, so that when you have multiple vCenter Server instances you can link them to the same SSO instance.

As mentioned SSO potentially could be deployed in an HA fashion. HA with regards to SSO is an active/standby solution, but I have been told there are other ways of deploying it and more info would be released soon.

Recommended Practice: I am a big fan of keeping things simple. Keep vCenter and at a minimum the Inventory Service together, and potentially the Web Client. Although Heartbeat has the potential of decreasing vCenter Server downtime, in many cloud environments SLAs are around vCloud workload availability and not about vCenter itself. One component that I would recommend configuring in an HA fashion is SSO. Without SSO you cannot log in, and that is critical for operations.

vCloud Director

Hopefully all of you are aware that vCloud Director can easily scale by deploying new “cells”, as we call them. Simply put, a cell is a virtual machine running the vCD software. These cells are all connected to the same database and share the load. Not only can they share the load, but they can also continue where another cell stopped. So from an availability perspective this is ideal. I already depicted this in the diagram above, by the way.

Recommended Practice: Deploy multiple vCloud Director cells in your management cluster. Ensure that at a minimum two cells reside on each of the “sites” of your stretched cluster. In order to achieve this vSphere DRS VM-Host affinity groups should be used!
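
To make the “at a minimum two cells per site” rule concrete, here is a minimal sketch of a placement check. It is only an illustration: the cell names, the site labels and the helper function are hypothetical, and the mapping of cells to sites is assumed to be something you maintain yourself (for instance derived from your DRS VM-Host affinity groups). It does not talk to vCenter or vCloud Director.

```python
from collections import Counter

# Minimum number of cells per site, taken from the recommended practice above.
MIN_CELLS_PER_SITE = 2


def sites_short_of_cells(cell_sites, sites, minimum=MIN_CELLS_PER_SITE):
    """Return {site: shortfall} for every site with fewer than `minimum` cells.

    cell_sites maps a cell VM name to the site it is pinned to.
    """
    counts = Counter(cell_sites.values())
    return {site: minimum - counts[site] for site in sites if counts[site] < minimum}


# Hypothetical inventory: three cells, only one of them in Site-B.
placement = {
    "vcd-cell-01": "Site-A",
    "vcd-cell-02": "Site-A",
    "vcd-cell-03": "Site-B",
}

print(sites_short_of_cells(placement, ["Site-A", "Site-B"]))  # {'Site-B': 1}
```

In this made-up example Site-B is one cell short, so a site failure of Site-A would leave a single cell running.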

vShield Manager

vShield Manager is one of the difficult components. It is a single virtual machine. You can protect it using vSphere HA, but that is about it, as the VM has multiple vCPUs, which rules out FT. So what would make sense in this case? I would try to ensure that the vShield Manager is in the same site as vCenter Server. If there is a network failure between sites, at least the vShield Manager and vCenter Server can still communicate when needed.

Recommended Practice: The vShield Manager virtual appliance resides in the same site as the vCenter Server, in other words it is a recommended practice to have both be part of the same vSphere DRS VM-Host affinity group. It is also recommended to leverage vSphere HA – VM Monitoring to allow for automatic restarts to occur in the case of a host or guest failure.
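
The co-location rule above can be expressed as a simple membership check. The sketch below assumes you keep a plain mapping of affinity group names to the VMs in them; the group and VM names are hypothetical and the function is an illustration, not part of any VMware API.

```python
def share_affinity_group(groups, vm_a, vm_b):
    """Return True if both VMs are members of at least one common group.

    groups maps a (hypothetical) VM-Host affinity group name to a set of VM names.
    """
    return any(vm_a in members and vm_b in members for members in groups.values())


# Hypothetical VM groups for a two-site management cluster.
vm_groups = {
    "site-a-vms": {"vcenter-cloud", "vshield-manager", "vcd-cell-01"},
    "site-b-vms": {"vcd-cell-02", "vcd-cell-03"},
}

print(share_affinity_group(vm_groups, "vcenter-cloud", "vshield-manager"))  # True
print(share_affinity_group(vm_groups, "vcenter-cloud", "vcd-cell-02"))      # False
```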


Database Server

This is the challenging one… As of vCloud Director 5.1, clustering your database is supported, so you could potentially cluster the vCD database. However, this Database Server will host more than just vCD; it will probably also host the vCenter Server database and potentially other bits and pieces like Chargeback / Orchestrator etc. Unfortunately, not all of these support a clustered database solution today. It is difficult to define a recommended practice in this case. Although database clustering will theoretically increase availability, it will also complicate operations. From an operational perspective the difficult part is how to manage site isolations. Just imagine the network between Site-A and Site-B is down but all components are still running. What will you do with the database?

This is definitely one where I am not sure what to do…


As you can see, this is not a fully worked out set of recommended practices yet; there is still stuff to be figured out and I am going through the exercise as we speak. If you have an opinion about this, and I am sure many do, don’t hesitate to leave a comment!

VMFS File Sharing Limits increased to 32

I was reading this white paper about VMware View 5.1 and VMFS File Locking today. It mentions the 8 host cluster limitation for VMware View with regards to linked clones and points to VMFS file sharing limits as the cause for this. While this is true in a way, as VMware View 5.1 is limited to 8 host clusters for linked clones on VMFS Datastores, the explanation doesn’t cover all details or reflect the current state of vSphere / VMFS. (Although there is a fair bit of detail in there about VMFS prior to vSphere 5.1.)

What the paper doesn’t mention is that in vSphere 5.1 this “file sharing limit” has been increased from 8 to 32 for VMFS Datastores. Cormac Hogan wrote about this a while ago. So to be clear, VMFS is fully capable today of sharing a file with 32 hosts in a cluster. VMware View doesn’t support that yet unfortunately, but for instance VMware vCloud Director 5.1 does support it today.
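
Put differently, the effective cluster size is the lower of what VMFS allows and what the product on top supports. A minimal sketch using only the numbers mentioned above (the dictionary and function names are my own, purely for illustration):

```python
# Hosts that can share a single file on a VMFS Datastore in vSphere 5.1.
VMFS_51_FILE_SHARING_LIMIT = 32

# What each product supports today, per the text above.
PRODUCT_SUPPORTED_HOSTS = {
    "VMware View 5.1 (linked clones)": 8,
    "vCloud Director 5.1": 32,
}


def max_cluster_hosts(product):
    """Effective limit: the lower of the VMFS limit and the product limit."""
    return min(VMFS_51_FILE_SHARING_LIMIT, PRODUCT_SUPPORTED_HOSTS[product])


def cluster_size_ok(product, hosts):
    """Return True if a cluster of `hosts` hosts stays within the effective limit."""
    return hosts <= max_cluster_hosts(product)


print(max_cluster_hosts("VMware View 5.1 (linked clones)"))  # 8
print(cluster_size_ok("vCloud Director 5.1", 16))            # True
```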

I still suggest reading the white paper, as it does help in getting a better understanding of VMFS and View internals!

Database clustering support for vCloud Director added in version 5.1!

Those who have been architecting vCloud Director environments from the early days know that this has always been a pain point. I personally have had many discussions with product management and engineering to get support for database clustering like Oracle RAC or Microsoft clustering services for MS SQL. Unfortunately neither 1.0 nor 1.5 supported it. So the big question always was: when will database clustering support for vCloud Director be added?

I had a couple of discussions around this again last week and noticed it was still not listed until someone pointed me to the vCAT 3.0 documents. Hidden on page 110 of document “3a Architecting a VMware vCloud.pdf” I found the following statement:

VMware vCloud component database resiliency is provided through database clustering. Microsoft Cluster Service for SQL and Oracle RAC are supported.

Yes I do realize that this is not a KB article, or even mentioned in the vCloud Director documentation. I have requested the docs to be revised and a KB to be created. Hopefully those will follow soon, for now this statement is all we needed! When the docs are revised or a KB is published I will add the references to this article.

<update – 18/Oct/2012> KB just got added – </update>

vCloud Suite 5.1 available

No, I didn’t set my alarm clock like Eric Sloof just to be one of the first to post it… hence the reason this is “late”. But I have some more lined up for you in the upcoming days. Now that the vCloud Suite 5.1 is available, make sure to start your download engines and prep to upgrade. Before you start downloading, make sure to hit the launch page. I created a nice short URL for it

VMware NOW – Get the Latest Info on VMware Product Launches:

Download links:

What’s new docs: