
Yellow Bricks

by Duncan Epping


vSphere HA fail-over in action – aka reading the log files

Duncan Epping · Oct 17, 2012 ·

I had a discussion with Benjamin Ulsamer at VMworld and he had a question about the state of a host when both the management network and the storage network are isolated. My answer was that in that case the host will be reported as “dead”, as there is no network heartbeat and no datastore heartbeat. (More info about heartbeating here.) The funny thing is that when you look at the log files you see “isolated” instead of “dead”. Why is that? Before we answer that, let’s go through the log files and paint the picture:

Two hosts (esx01 and esx02) with a management network and an iSCSI storage network. vSphere 5.0 is used and Datastore Heartbeating is configured. For whatever reason the network of esx02 is isolated (both storage and management, as it is a converged environment). So what can you see in the log files?

Let’s look at “esx02” first:

  • 16:08:07.478Z [36C19B90 info ‘Election’ opID=SWI-6aace9e6] [ClusterElection::ChangeState] Slave => Startup : Lost master
    • At 16:08:07 the network is isolated
  • 16:08:07.479Z [FFFE0B90 verbose ‘Cluster’ opID=SWI-5185dec9] [ClusterManagerImpl::CheckElectionState] Transitioned from Slave to Startup
    • The host has lost contact with the master and drops from Slave to “Startup” so that it can elect itself as master and take action
  • 16:08:22.480Z [36C19B90 info ‘Election’ opID=SWI-6aace9e6] [ClusterElection::ChangeState] Candidate => Master : Master selected
    • The host has elected itself as master
  • 16:08:22.485Z [FFFE0B90 verbose ‘Cluster’ opID=SWI-5185dec9] [ClusterManagerImpl::CheckHostNetworkIsolation] Waited 5 seconds for isolation icmp ping reply. Isolated
    • Can I ping the isolation address?
  • 16:08:22.488Z [FFFE0B90 info ‘Policy’ opID=SWI-5185dec9] [LocalIsolationPolicy::Handle(IsolationNotification)] host isolated is true
    • No I cannot, and as such I am isolated!
  • 16:08:22.488Z [FFFE0B90 info ‘Policy’ opID=SWI-5185dec9] [LocalIsolationPolicy::Handle(IsolationNotification)] Disabling execution of isolation policy by 30 seconds.
    • Hold off for 30 seconds as “das.config.fdm.isolationPolicyDelaySec” was configured
  • 16:08:52.489Z [36B15B90 verbose ‘Policy’] [LocalIsolationPolicy::GetIsolationResponseInfo] Isolation response for VM /vmfs/volumes/a67cdaa8-9a2fcd02/VMWareDataRecovery/VMWareDataRecovery.vmx is powerOff
    • There is a VM with an Isolation Response configured to “power off”
  • 16:10:17.507Z [36B15B90 verbose ‘Policy’] [LocalIsolationPolicy::DoVmTerminate] Terminating /vmfs/volumes/a67cdaa8-9a2fcd02/VMWareDataRecovery/VMWareDataRecovery.vmx
    • Let’s kill that VM!
  • 16:10:17.508Z [36B15B90 info ‘Policy’] [LocalIsolationPolicy::HandleNetworkIsolation] Done with isolation handling
    • And it is gone, done with handling the isolation
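
To make the sequence a bit easier to follow, here is a minimal Python sketch of the decision flow as I read it from the esx02 log lines above. The function names, callbacks and structure are mine for illustration only; they are not FDM internals.

```python
from time import sleep

ISOLATION_PING_WAIT_S = 5       # "Waited 5 seconds for isolation icmp ping reply"
ISOLATION_POLICY_DELAY_S = 30   # das.config.fdm.isolationPolicyDelaySec in this environment


def handle_lost_master(ping_isolation_address, isolation_response_for,
                       terminate_vm, protected_vms):
    """Rough flow of a slave that lost its master, per the esx02 log above.

    Every callable passed in is a placeholder for what the real FDM agent
    does internally.
    """
    # Slave => Startup => Candidate => Master: elect ourselves so we can act.
    # (The ClusterElection::ChangeState entries in the log.)

    # Can we reach the isolation address? The log shows a 5 second wait for a reply.
    if ping_isolation_address(timeout=ISOLATION_PING_WAIT_S):
        return "not isolated"

    # "host isolated is true" -- but hold off, as the isolation policy delay was configured.
    sleep(ISOLATION_POLICY_DELAY_S)

    # Apply the configured isolation response per VM; in this example it is "power off".
    for vmx_path in protected_vms:
        if isolation_response_for(vmx_path) == "powerOff":
            terminate_vm(vmx_path)  # LocalIsolationPolicy::DoVmTerminate
    return "done with isolation handling"
```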

Let’s take a closer look at “esx01”. What does this host see with regard to the management and storage network isolation of “esx02”?

  • 16:08:05.018Z [FFFA4B90 error ‘Cluster’ opID=SWI-e4e80530] [ClusterSlave::LiveCheck] Timeout for slave @ host-34
    • The host is not reporting itself any longer, the heartbeats are gone…
  • 16:08:05.018Z [FFFA4B90 verbose ‘Cluster’ opID=SWI-e4e80530] [ClusterSlave::UnreachableCheck] Beginning ICMP pings every 1000000 microseconds to host-34
    • Let’s ping the host itself; it could be that just the FDM agent is dead.
  • 16:08:05.019Z [FFFA4B90 verbose ‘Cluster’ opID=SWI-e4e80530] Reporting Slave host-34 as FDMUnreachable
  • 16:08:05.019Z [FFD5BB90 verbose ‘Cluster’] ICMP reply for non-existent pinger 3 (id=isolationAddress)
    • As it is just a two-node cluster, let’s make sure I am not isolated myself. I got a reply, so I am not isolated!
  • 16:08:10.028Z [FFFA4B90 verbose ‘Cluster’ opID=SWI-e4e80530] [ClusterSlave::UnreachableCheck] Waited 5 seconds for icmp ping reply for host host-34
  • 16:08:14.035Z [FFFA4B90 verbose ‘Cluster’ opID=SWI-e4e80530] [ClusterSlave::PartitionCheck] Waited 15 seconds for disk heartbeat for host host-34 – declaring dead
    • There is also no datastore heartbeat, so the host must be dead. (Note that the master cannot tell the difference between a fully isolated host and a dead host when IP-based storage runs over the same network.)
  • 16:08:14.035Z [FFFA4B90 verbose ‘Cluster’ opID=SWI-e4e80530] Reporting Slave host-34 as Dead
    • It is officially dead!
  • 16:08:14.036Z [FFE5FB90 verbose ‘Invt’ opID=SWI-42ca799] [InventoryManagerImpl::RemoveVmLocked] marking protected vm /vmfs/volumes/a67cdaa8-9a2fcd02/VMWareDataRecovery/VMWareDataRecovery.vmx as in unknown power state
    • We don’t know what is up with this VM, power state unknown…
  • 16:08:14.037Z [FFE5FB90 info ‘Policy’ opID=SWI-27099141] [VmOperationsManager::PerformPlacements] Sending a list of 1 VMs to the placement manager for placement.
    • We will need to restart one VM; let’s provide its details to the Placement Manager
  • 16:08:14.037Z [FFE5FB90 verbose ‘Placement’ opID=SWI-27099141] [PlacementManagerImpl::IssuePlacementStartCompleteEventLocked] Issue failover start event
    • Issue a failover event to the placement manager.
  • 16:08:14.042Z [FFE5FB90 verbose ‘Placement’ opID=SWI-e430b59a] [DrmPE::GenerateFailoverRecommendation] 1 Vms are to be powered on
    • Let’s generate a recommendation on where to place the VM
  • 16:08:14.044Z [FFE5FB90 verbose ‘Execution’ opID=SWI-898d80c3] [ExecutionManagerImpl::ConstructAndDispatchCommands] Place /vmfs/volumes/a67cdaa8-9a2fcd02/VMWareDataRecovery/VMWareDataRecovery.vmx on __localhost__ (cmd ID host-28:0)
    • We know where to place it!
  • 16:08:14.687Z [FFFE5B90 verbose ‘Invt’] [HalVmMonitor::Notify] Adding new vm: vmPath=/vmfs/volumes/a67cdaa8-9a2fcd02/VMWareDataRecovery/VMWareDataRecovery.vmx, moId=12
    • Let’s register the VM so we can power it on
  • 16:08:14.714Z [FFDDDB90 verbose ‘Execution’ opID=host-28:0-0] [FailoverAction::ReconfigureCompletionCallback] Powering on vm
    • Power on the impacted VM

That is it, nice right? And this is just a short version of what is actually in the log files; they contain a massive amount of detail! Anyway, back to the question… if not already answered: the remaining host in the cluster sees the isolated host as dead, as there is no:

  • network heartbeat
  • response to a ping to the host
  • datastore heartbeat

The only thing the master can do at that point is to assume the “isolated” host is dead.
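
If it helps, this is how I would summarize the master’s checks in a small Python sketch. Again, the names and structure are mine; they only mirror the ordering visible in the esx01 log above, not actual FDM code.

```python
ICMP_WAIT_S = 5           # "Waited 5 seconds for icmp ping reply for host host-34"
DATASTORE_HB_WAIT_S = 15  # "Waited 15 seconds for disk heartbeat ... declaring dead"


def classify_unresponsive_slave(ping_slave, datastore_heartbeat_seen):
    """How the master appears to classify a slave after network heartbeats stop."""
    # 1. Network heartbeats timed out (ClusterSlave::LiveCheck).
    # 2. Ping the slave's management address; no reply means FDMUnreachable at best.
    if ping_slave(timeout=ICMP_WAIT_S):
        return "host alive, FDM agent not responding"
    # 3. Check the heartbeat datastores.
    if datastore_heartbeat_seen(wait=DATASTORE_HB_WAIT_S):
        return "isolated or partitioned"  # the host lives on, only the network is gone
    # No network heartbeat, no ping reply, no datastore heartbeat: declare the host
    # dead and hand its protected VMs to the placement manager for restart.
    return "dead"
```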

 

** Disclaimer: This article contains references to the words master and/or slave. I recognize these as exclusionary words. The words are used in this article for consistency because it’s currently the words that appear in the software, in the UI, and in the log files. When the software is updated to remove the words, this article will be updated to be in alignment. **

Can I get your book for free?

Duncan Epping · Oct 11, 2012 ·

Well not from me, but CloudPhysics has a nice book give-away going on at the moment for the VMworld Barcelona attendees! So what do you need to do?

How To Win

  • Email us at info@cloudphysics.com with a subject of “Book”. No message is needed.
  • Register at http://www.cloudphysics.com/ by clicking “SIGN UP”.
  • Install the CloudPhysics Observer vApp to activate your dashboard.

Eligibility

  • You are attending VMworld Barcelona.
  • You are a new CloudPhysics user.
  • You fully install the CloudPhysics ‘Observer’ vApp in your vSphere environment.

That is an easy way of getting the book for free, right? So I suggest you head over and sign up to make sure you are among the first 150 users that get a free book!

VXLAN requirements

Duncan Epping · Oct 4, 2012 ·

When I was writing my “Configuring VXLAN” post I was trying to dig up some details around VXLAN requirements and recommendations to run a full “VMware” implementation. Unfortunately I couldn’t find much, or at least not a single place with all the details. I figured I would gather all I could find and throw it into a single post to make it easier for everyone.

Virtual:

  • vSphere 5.1
  • vShield Manager 5.1
  • vSphere Distributed Switch 5.1.0
  • Portgroups will be configured by vShield Manager; it is recommended to use either “LACP Active Mode”, “LACP Passive Mode” or “Static Etherchannel”
    • When “LACP” or “Static Etherchannel” (Cisco only) is configured, note that a port channel / EtherChannel will need to be created on the physical side
    • “Fail Over” is supported, but not recommended
    • You cannot configure the portgroup with “Virtual Port ID” or “Load Based Teaming”, these are not supported
  • Requirement for MTU size of 1600 (Kamau explains why here)

Physical:

  • It is recommended to have DHCP available on the VXLAN transport VLANs, although fixed IP addresses also work!
  • VXLAN port (UDP 8472) is opened on firewalls (if applicable)
  • Port 80 is opened from vShield Manager to the hosts (used to download the VIB / agent)
  • For Link Aggregation Control Protocol (LACP), 5-tuple hash distribution is highly recommended but not a hard requirement
  • MTU size requirement is 1600
  • It is strongly recommended to have IGMP snooping enabled on the L2 switches to which VXLAN-participating hosts are attached. When IGMP snooping is enabled, an IGMP querier must be enabled on the router or L3 switch with connectivity to the multicast-enabled networks.
  • If VXLAN traffic is traversing routers, multicast routing must be enabled
    • The recommended multicast protocol to deploy for this scenario is Bidirectional Protocol Independent Multicast (PIM-BIDIR), since the hosts act as both multicast speakers and receivers at the same time.

That should capture most requirements and recommendations. If anyone has any additions please leave a comment and I will add it.
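
For completeness, one way to sanity-check the 1600 byte MTU requirement on the virtual side is through the vSphere API. Below is a small pyVmomi sketch; the vCenter address and credentials are placeholders for your own environment, and it only validates the distributed switch configuration, not the physical network path.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Hypothetical connection details -- replace with your own vCenter and credentials.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    # Walk all VMware distributed switches in the inventory.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.VmwareDistributedVirtualSwitch], True)
    for dvs in view.view:
        mtu = dvs.config.maxMtu
        status = "OK" if mtu >= 1600 else "TOO SMALL for VXLAN"
        print(f"{dvs.name}: MTU {mtu} -> {status}")
finally:
    Disconnect(si)
```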

** Please note: proxy ARP is not a requirement for a VXLAN / VDS implementation; it is only a requirement when the Cisco Nexus 1000v is used. **

References:
VXLAN Primer by Kamau
vShield Administration Guide
Internal training ppt
KB 2050697 (note my article was used as the basis for this KB)

Limit the amount of eggs in a single basket through vSphere 5.1 DRS

Duncan Epping · Oct 1, 2012 ·

A while back I had a discussion with someone who asked me if it was possible to limit the number of eggs in a single basket, in other words limit the number of VMs per host. The reason this customer wanted to do this was to limit the impact of a failure. They had roughly 1500 VMs in their cluster, and some hosts carried 50 VMs while others had 20 or 80. This is the nature of DRS though, and totally expected.

If one of these hosts were to fail, and let’s say it carried 80 VMs, the impact would be substantial. To minimize the risk they wanted to limit the number of VMs per host. I had thought about this before and had already asked the HA and DRS team if they could do anything around this. The DRS team started looking into it and, to my surprise, they managed to get it in quickly.

In the VMworld 2012 session “VSP2825: DRS: Advanced Concepts, Best Practices and Future Directions” by Ajay Gulati and Aashish Parikh a solution is presented. (You can watch this session for free on YouTube, highly recommended!) The solution is a new vSphere DRS advanced setting introduced in vSphere 5.1.

 LimitVMsPerESXHost

Note that when you configure this setting it might impact the performance of your virtual machines, as it could limit the load balancing mechanism of your cluster. If you have no requirement to limit the number of VMs per ESXi host, don’t do it. When this setting is configured, vSphere DRS will not allow migrations to a host which has reached the threshold, and it will also not admit new VMs to a host which has reached the threshold.
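
For those who prefer to script it, the advanced setting can also be applied through the vSphere API as a DRS advanced option. Below is a rough pyVmomi sketch; the vCenter address, credentials, cluster name and the value of 40 are placeholders for your own environment, so treat it as an illustration rather than a polished tool.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Hypothetical connection and cluster details -- adjust for your environment.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    # Pick the cluster by name; raises StopIteration if it does not exist.
    cluster = next(c for c in view.view if c.name == "Cluster01")

    # Add the DRS advanced option discussed above.
    spec = vim.cluster.ConfigSpecEx()
    spec.drsConfig = vim.cluster.DrsConfigInfo()
    spec.drsConfig.option = [
        vim.option.OptionValue(key="LimitVMsPerESXHost", value="40")
    ]
    # modify=True merges this change into the existing cluster configuration.
    task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
    print("Reconfigure task started:", task.info.key)
finally:
    Disconnect(si)
```

The same option can of course simply be added in the vSphere Client under the DRS advanced options of the cluster; the script is just a convenience when you need to roll it out to multiple clusters.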

Out on iBooks finally – vSphere 5.1 Clustering Deepdive

Duncan Epping · Oct 1, 2012 ·

It took about a month to get this published, but here it finally is: vSphere 5.1 Clustering Deepdive on iBooks.

Yeah yeah, we know… you also want it on Nook; lulu.com says it is pending, so it will probably take a couple of days before it is up on Barnes & Noble as well.
