
Win a free VMworld ticket by sharing your vC Ops story update…

Duncan Epping · Jun 7, 2012 ·

A few weeks back I wrote a blog entry on a “Cool Contest: Tell your vC Ops Story”, and since then we have had some really great submissions, but of course we can always use more. As a member of the judging panel I thought I’d share a few thoughts for those who may be interested in entering but are hesitant for one reason or another.

“What if my story is not good enough?” Don’t let this deter you from entering. You’d be surprised how success stories that seem routine in your own day-to-day work can, in the grander scheme of things, prove invaluable to others who may not have the same insight as you.
“I want the ticket to VMworld, but I am not a seasoned presenter.” This is your opportunity to shine! The panelists (me included) and the VMware team will help you prepare (slideware, visuals and dry runs) and will present side by side with you, making the experience all the more enjoyable and fruitful!

There is also a good post on the VMware blog site that provides more guidance on how to enter and advice on what the judges will be looking for. One last piece of advice: do not put off your entry any longer. The contest runs until July 11th, and the anticipated “mad rush” at the end will only lump your entry in with those trying to make the deadline, so just enter now!

vSphere 5.0 HA restarting of VMs with no access to storage?

Duncan Epping · Jun 6, 2012 ·

I had a question today around HA restarting VMs that have no access to storage. The question was whether HA would try to restart the VM and give up after 5 attempts. The follow-up question was whether HA would try again once the storage returned for duty.

By default HA will try to restart a VM up to 5 times in roughly 30 minutes. If the master does not succeed, it will stop trying. On top of that, HA manages a “compatibility list”. This list contains the details around which VM can be restarted and where. In other words: which hosts have access to the datastores and network portgroup required for this VM to successfully power on. Now if for whatever reason there are no compatible hosts available for the restart, then HA will not try to restart the VM.

But what if the problem is resolved? As soon as the problem is resolved, and reported as such, the compatibility list will be updated. When the list is updated, HA will resume the restarts.

It might also be good to know that if for whatever reason the master fails, a new master will continue trying to restart the VM. It will start with 5 new attempts and not take the number of restart attempts that the previous master did into account.
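To make the behavior above a bit more tangible, here is a minimal Python sketch of the logic as I described it. This is not HA code and not a vSphere API: the `Master` class, `try_restart` and the `power_on` callback are hypothetical names made up for illustration. It also assumes that a skipped restart (empty compatibility list) does not count as an attempt, and it ignores the timing between attempts.

```python
from dataclasses import dataclass, field

MAX_RESTART_ATTEMPTS = 5  # default number of restart attempts, per the text above


@dataclass
class Master:
    """Hypothetical stand-in for the HA master; not a real vSphere API."""
    # VM name -> number of restart attempts made by THIS master
    attempts: dict = field(default_factory=dict)

    def try_restart(self, vm, compatible_hosts, power_on):
        """Make at most one restart attempt; return the host used or None."""
        if not compatible_hosts:
            # Empty compatibility list: no restart is attempted.
            # Assumption: this does not consume one of the attempts.
            return None
        if self.attempts.get(vm, 0) >= MAX_RESTART_ATTEMPTS:
            # This master has used up its attempts and stops trying.
            return None
        self.attempts[vm] = self.attempts.get(vm, 0) + 1
        host = compatible_hosts[0]
        return host if power_on(vm, host) else None


# Usage: the first master fails 5 times, then a new master starts from zero.
old_master = Master()
for _ in range(7):
    old_master.try_restart("vm1", ["esx02"], lambda vm, host: False)

new_master = Master()  # the counter of the previous master is not inherited
restarted_on = new_master.try_restart("vm1", ["esx02"], lambda vm, host: True)
```

The per-master counter is the point of the sketch: because each `Master` object keeps its own `attempts`, a failover naturally results in 5 fresh attempts.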

** Disclaimer: This article contains references to the words master and/or slave. I recognize these as exclusionary words. The words are used in this article for consistency because it’s currently the words that appear in the software, in the UI, and in the log files. When the software is updated to remove the words, this article will be updated to be in alignment. **

VM Monitoring only using VMware Tools heartbeat?

Duncan Epping · Jun 5, 2012 ·

I had this question twice this week. A quick search on my blog showed that I wrote an article about it a while back, but I figured it wouldn’t hurt to repeat some of that and expand on it. I copied and pasted this part from our book as I think it is spot on!

VM/App monitoring uses a heartbeat mechanism kind of similar to HA. If heartbeats, and, in this case, VMware Tools heartbeats, are not received for a specific (and configurable) amount of time, the virtual machine will be restarted. These heartbeats are monitored by the HA agent and are not sent over a network, but stay local to the host.

Although the heartbeat produced by VMware Tools is reliable, VMware added a further verification mechanism. To avoid false positives, VM Monitoring also monitors I/O activity of the virtual machine. When heartbeats are not received AND no disk or network activity has occurred over the last 120 seconds (the default), the virtual machine will be reset. This 120-second interval can be modified by changing the advanced setting “das.iostatsInterval”.
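The decision described above boils down to a simple AND of two conditions. The sketch below is only an illustration of that logic, not how the HA agent is implemented: the function name and its parameters are hypothetical, and the heartbeat failure interval is passed in explicitly because it depends on the sensitivity you configured.

```python
DEFAULT_IOSTATS_INTERVAL = 120  # seconds; default of das.iostatsInterval per the text


def should_reset_vm(seconds_since_heartbeat, seconds_since_io, failure_interval,
                    iostats_interval=DEFAULT_IOSTATS_INTERVAL):
    """Return True only when heartbeats are missing AND the VM shows no I/O.

    All parameter names are hypothetical; values are in seconds.
    """
    # Condition 1: no VMware Tools heartbeat within the configured interval.
    heartbeat_lost = seconds_since_heartbeat >= failure_interval
    # Condition 2: no disk or network activity within the I/O stats interval.
    io_idle = seconds_since_io >= iostats_interval
    return heartbeat_lost and io_idle


# Usage: heartbeats stopped 60s ago, but the VM did I/O 10s ago, so no reset.
reset_needed = should_reset_vm(seconds_since_heartbeat=60,
                               seconds_since_io=10,
                               failure_interval=30)
```

The I/O check is what prevents a false positive when VMware Tools hangs but the guest itself is still doing work.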

Which isolation response should I use?

Duncan Epping · May 31, 2012 ·

I wrote this article about split brain scenarios for the vSphere Blog. Based on this article I received some questions around which “isolation response” to use. This is not something that can be answered by a simple “recommended practice” and applied to all scenarios out there. Note that the right choice has everything to do with your infrastructure. Are you using IP-based storage? Do you have a converged network? All of these impact the decision around the isolation response.

The following table, however, can help you make a decision:

Likelihood that host will retain access to VM datastores	Likelihood that host will retain access to VM network	Recommended Isolation policy	Explanation
Likely	Likely	Leave Powered On	VM is running fine so why power it off?
Likely	Unlikely	Either Leave Powered On or Shutdown	Choose shutdown to allow HA to restart VMs on hosts that are not isolated and hence are likely to have access to storage
Unlikely	Likely	Power Off	Use Power Off to avoid having two instances of the same VM on the VM network
Unlikely	Unlikely	Leave Powered On or Power Off	Leave Powered On if the VM can recover from the network/datastore outage without being restarted because of the isolation, and Power Off if it likely can’t.
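If you want to keep this table next to your design documentation or scripts, it can be written down as a simple lookup. The sketch below just encodes the four rows above; the names `ISOLATION_RESPONSE` and `recommend_isolation_response` are made up for this example and have nothing to do with the vSphere API.

```python
# Key: (host likely retains datastore access, host likely retains VM network access)
# Value: recommended isolation response, copied from the table above.
ISOLATION_RESPONSE = {
    (True, True): "Leave Powered On",
    (True, False): "Leave Powered On or Shutdown",
    (False, True): "Power Off",
    (False, False): "Leave Powered On or Power Off",
}


def recommend_isolation_response(keeps_datastore_access, keeps_network_access):
    """Look up the recommended isolation response for an isolated host."""
    key = (bool(keeps_datastore_access), bool(keeps_network_access))
    return ISOLATION_RESPONSE[key]


# Usage: storage is likely lost on isolation (for example IP-based storage on
# the same network), while the VM network is likely to stay up.
recommendation = recommend_isolation_response(False, True)
```

Two of the four rows still return a choice between two responses, so the lookup narrows things down but does not replace thinking about your own infrastructure.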

vFabric Application Director

Duncan Epping · May 29, 2012 ·

I am not going to pretend I am the devops expert here, but I was playing around with vFabric Application Director last week and I thought it was a really cool solution. I can really see the value for development teams, but also for large support services teams and even education orgs. Being able to create deployment / configuration plans for application stacks by simply dragging and dropping is something I would have loved to have when I supported various development and support teams in a previous life.

I could have saved a lot of time… during physical machine and virtual machine provisioning, installation of software (automated if I was lucky), configuration and figuring out how it all worked with this weird database flavor or new operating system. With App Director, yes, you will need to figure all of it out once, but after that you can easily repeat the same steps over and over again. You can select different guest OSes, different databases, different apps, etc.

If you are in the same boat as I once was, I would suggest watching this video and giving App Director a test run, just to figure out if it can simplify your life!