I just posted the slide decks that I presented at VMworld on Virtual SAN to SlideShare. The recording and the slides will probably also show up on vmware.com at some point, but as I had many requests from people to share the material I figured I would do that straight after the event. If you have any questions, don’t hesitate to ask.
I received a couple of questions last week about HA restarts in the scenario where a full site failure has occurred or a part of the storage system has failed and needs to be taken over by another datacenter. Yes indeed this is related to stretched clusters and HA restarts in a DR/DA event.
The questions were straightforward: how does the restart time-out work, and what happens after the last retry? I wrote about HA restarts and the sequence last year, so let’s just copy and paste that here:
- Initial restart attempt
- If the initial attempt failed, a restart will be retried 2 minutes after the previous attempt
- If the previous attempt failed, a restart will be retried 4 minutes after the previous attempt
- If the previous attempt failed, a restart will be retried 8 minutes after the previous attempt
- If the previous attempt failed, a restart will be retried 16 minutes after the previous attempt
You can extend the number of restart retries by increasing the value of “das.maxvmrestartcount”; a new restart will then be attempted roughly every 15/16 minutes. The question this triggered, though, is: why would it even take 4 retries? The answer I got was: we don’t know if we will be able to fail over the storage within 30 minutes and if we will have sufficient compute resources…
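To make the timing concrete, here is a minimal sketch that adds up the retry intervals listed above. The function name `restart_schedule` and its `retries` parameter are my own, purely for illustration, and I am assuming every retry beyond the fourth waits another 16 minutes (the text above says 15/16).

```python
def restart_schedule(retries=4):
    """Return, for each retry, the minutes elapsed since the initial attempt.

    Illustration only: the first four intervals come from the list above,
    and any further retries are assumed to wait 16 minutes each.
    """
    intervals = [2, 4, 8, 16]
    elapsed = 0
    schedule = []
    for i in range(retries):
        # Use the documented interval, or 16 minutes once the list runs out.
        elapsed += intervals[i] if i < len(intervals) else 16
        schedule.append(elapsed)
    return schedule


print(restart_schedule())   # [2, 6, 14, 30]
print(restart_schedule(6))  # [2, 6, 14, 30, 46, 62]
```

In other words, the fourth retry lands about 30 minutes after the initial attempt, which is where the "30 minutes" in the answer I got comes from.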
Here comes the sweet part about vSphere HA: it actually is a pretty smart solution, and it knows whether VMs can be restarted or not. In this case, as the datastore is not available, there is absolutely no point in even trying, and HA as such will not even bother. As soon as the storage becomes available, though, the restart attempts will start. The same applies to compute resources: if for whatever reason there are insufficient unreserved compute resources to restart your VMs, then HA will wait for them to become available… nice, right!?! Do note I emphasized the word “unreserved”, as that is what HA cares about, not the resources that are actually in use.
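A rough way to picture that decision is the sketch below. It is a simplified model and not how HA is actually implemented: the `Host` class, its fields and `can_attempt_restart` are hypothetical names, and real placement takes more into account than these two checks. The point it illustrates is that the check is against unreserved capacity, so how busy a host is plays no role.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical, simplified view of a host's resources.
    cpu_capacity_mhz: int
    cpu_reserved_mhz: int
    mem_capacity_mb: int
    mem_reserved_mb: int


def can_attempt_restart(datastore_accessible, vm_cpu_reservation_mhz,
                        vm_mem_reservation_mb, hosts):
    """Return True if a restart attempt makes sense in this simplified model."""
    # No storage, no point in even trying.
    if not datastore_accessible:
        return False
    # Only unreserved capacity counts; current usage is deliberately ignored.
    for host in hosts:
        free_cpu = host.cpu_capacity_mhz - host.cpu_reserved_mhz
        free_mem = host.mem_capacity_mb - host.mem_reserved_mb
        if free_cpu >= vm_cpu_reservation_mhz and free_mem >= vm_mem_reservation_mb:
            return True
    return False


hosts = [Host(20000, 18000, 65536, 60000)]
print(can_attempt_restart(True, 1000, 4096, hosts))   # True
print(can_attempt_restart(False, 1000, 4096, hosts))  # False
print(can_attempt_restart(True, 4000, 4096, hosts))   # False
```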
A while back I wrote about design considerations when designing or building a stretched vCloud Director infrastructure. Since then I have been working on a document in collaboration with Lee Dilworth, and this document should hopefully be out soon. As various people have asked for the document, I decided to throw it into this blog post so that the details are already out there.
** Disclaimer: this article has not been reviewed by the technical marketing team yet, this is a preview of what will possibly be published. When the official document is published I will add a link to this article **
VMware vCloud® Director™ 5.1 (vCloud Director) gives enterprise organizations the ability to build secure private clouds that dramatically increase datacenter efficiency and business agility. Coupled with VMware vSphere® (vSphere), vCloud Director delivers cloud computing for existing datacenters by pooling vSphere virtual resources and delivering them to users as catalog-based services. vCloud Director helps you build agile infrastructure-as-a-service (IaaS) cloud environments that greatly accelerate the time-to-market for applications and responsiveness of IT organizations.
Resiliency is a key aspect of any infrastructure but is even more important in “Infrastructure as a Service” (IaaS) solutions. This solution overview was developed to provide additional insight and information on how to architect and implement a vCloud Director-based solution on a vSphere Metro Storage Cluster infrastructure.
This architecture consists of two major components. The first component is the geographically separated vSphere infrastructure based on a stretched storage solution, hereafter referred to as the vSphere Metro Storage Cluster (vMSC) infrastructure. The second component is vCloud Director.
Note – Before we dive into the details of the solution we would like to call out the fact that vCloud Director is not site aware. If incorrectly configured, availability could be negatively impacted in certain failure scenarios.
Today I received an email about the vSphere Metro Storage Cluster paper I wrote, or better said, about stretched clusters in general. I figured I would answer the questions in a blog post so that everyone can chip in / read etc. So let’s show the environment first so that the questions are clear. Below is an image of the scenario.
Below are the questions I received:
If a power outage occurs at Frimley, the 2 hosts get a message from the UPS that there is a power outage. After 5 minutes (or any other configured value) the next action should start. But what will be the next action? If a scripted migration to a host at Bluefin starts, will DRS move some VMs back to Frimley? Or could the VMs get a mark to stick at Bluefin? Should the hosts at Frimley be placed into maintenance mode so the migration will be done automatically? And what happens if there is a total power outage at both Frimley and Bluefin? How could a controlled shutdown across hosts be arranged?
Let’s start breaking it down and answer where possible. The main question is how we handle power outages. As in any datacenter, this is fairly complex. Well, the powering-off part is easy; powering everything on in the right order isn’t. So where do we start? First of all:
- If you have a stretched cluster environment and, in this case, the Frimley data center has a power outage, it is recommended to place the hosts in maintenance mode. This way all VMs will be migrated to the Bluefin data center without disruption. Also, when power returns, it allows you to check the hosts before introducing them to the cluster again.
- If maintenance mode is not used and a scripted migration is done, virtual machines will probably be migrated back by DRS. DRS is triggered every 5 minutes (at a minimum). Avoid this, use maintenance mode!
- If there is an expected power outage and the environment is brought down it will need to be manually powered on in the right order. You can also script this, but a stretched cluster solution doesn’t cater for this type of failure unfortunately.
- If there is an unexpected power outage and the environment is not brought down then vSphere HA will start restarting virtual machines when the hosts come back up again. This will be done using the “restart priority” that you can set with vSphere HA. It should be noted that the “restart priority” is only about the completion of the power-on task, not about the full boot of the virtual machine itself.
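If you do script the power-on after a planned shutdown, the skeleton could look something like the sketch below. It is only an outline under my own assumptions: `power_on` and `is_ready` are hypothetical callables you would have to implement against your own tooling, and the VM names and tiers are made up. The reason for the explicit readiness check is the point made above: a completed power-on task does not mean the guest has fully booted.

```python
import time


def power_on_in_order(tiers, power_on, is_ready, poll_seconds=10):
    """Power on VMs tier by tier, waiting for each tier before starting the next.

    tiers:    list of lists of VM names, in the order they must come up.
    power_on: hypothetical callable that starts a single VM.
    is_ready: hypothetical callable returning True once a VM's services are up.
    """
    started = []
    for tier in tiers:
        for vm in tier:
            power_on(vm)
            started.append(vm)
        # A finished power-on task is not a finished boot, so poll for readiness.
        for vm in tier:
            while not is_ready(vm):
                time.sleep(poll_seconds)
    return started


# Dry run with stand-in callables instead of real infrastructure calls.
log = []
order = power_on_in_order(
    [["dc01"], ["db01", "db02"], ["app01"]],
    power_on=log.append,
    is_ready=lambda vm: True,
    poll_seconds=0,
)
print(order)  # ['dc01', 'db01', 'db02', 'app01']
```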
I hope that clarifies things.
I had a question today about what the vSphere HA advanced setting das.maskCleanShutdownEnabled is for. I described why it was introduced for stretched clusters but will give a short summary here:
Two advanced settings were introduced in vSphere 5.0 Update 1 to enable HA to fail over virtual machines located on datastores that are in a Permanent Device Loss (PDL) state. This is very specific to stretched cluster environments. The first setting is configured on a host level and is “disk.terminateVMOnPDLDefault”. This setting can be configured in /etc/vmware/settings and should be set to “True”. This setting ensures that a virtual machine is killed when the datastore it resides on is in a PDL state.
The second setting is a vSphere HA advanced setting called “das.maskCleanShutdownEnabled”. This setting is also not enabled by default and will need to be set to “True”. This setting allows HA to trigger a restart response for a virtual machine which has been killed automatically due to a PDL condition. It allows HA to differentiate between a virtual machine which was killed due to the PDL state and a virtual machine which has been powered off by an administrator.
But why is “das.maskCleanShutdownEnabled” needed for HA? From a vSphere HA perspective there are two different types of “operations”. The first is a user initiated power-off (clean) and the other is a kill. When a virtual machine is powered off by a user, part of the process is setting the property “runtime.cleanPowerOff” to true.
Remember that when “disk.terminateVMOnPDLDefault” is configured your VMs will be killed when they issue I/O. This is where the problem arises: in a PDL scenario it is impossible to set “runtime.cleanPowerOff” as the datastore, and as such the vmx, is unreachable. As the property defaults to “true”, vSphere HA will assume the VMs were cleanly powered off. This would result in vSphere HA not taking any action in a PDL scenario. By setting “das.maskCleanShutdownEnabled” to true, a scenario where all VMs are killed but never restarted can be avoided, as you are telling vSphere HA to assume that VMs were not shut down in a clean manner. In that case vSphere HA will assume VMs were killed UNLESS the property is set.
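The logic described above can be summarized in a few lines. This is just a model of the behavior as I described it, not actual HA code: the function `ha_should_restart` is hypothetical, and `None` stands in for "the property could not be written because the vmx was unreachable".

```python
def ha_should_restart(clean_power_off, mask_clean_shutdown_enabled):
    """Model of the restart decision for a VM that is no longer running.

    clean_power_off: True if runtime.cleanPowerOff was explicitly set,
                     None if it could not be set (for example during a PDL).
    mask_clean_shutdown_enabled: value of das.maskCleanShutdownEnabled.
    """
    if clean_power_off is None:
        # Unset property: defaults to "clean" unless the mask setting is enabled.
        was_clean = not mask_clean_shutdown_enabled
    else:
        was_clean = clean_power_off
    # HA only restarts VMs it believes were not cleanly powered off.
    return not was_clean


print(ha_should_restart(None, False))  # False: PDL kill mistaken for a clean power-off
print(ha_should_restart(None, True))   # True: PDL kill is restarted
print(ha_should_restart(True, True))   # False: admin power-off is respected
```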
If you have a stretched cluster environment, make sure to configure these settings accordingly!