Yellow Bricks

by Duncan Epping


Seeing unexpected error messages during ISL failure with Stretched Cluster for secondary site

Duncan Epping · Jun 22, 2023 ·

I had a question this week from one of our field specialists. He ran into a situation where he saw lots of error messages stating that vSphere HA could not restart a certain workload during an ISL failure. Let me first explain the scenario, and also explain what vSAN does and doesn’t do. Let’s take the following situation.

Let’s assume Datacenter A is the “preferred site”, and Datacenter B is the “secondary site”. If the ISL (inter-site link) between Datacenter A and Datacenter B fails, the Witness (in a third location) will automatically bind itself to Datacenter A. This means that VMs in Datacenter B will lose access to the vSAN Datastore.
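To make the outcome of the ISL failure concrete, here is a minimal sketch in Python. It is a toy model, not vSAN code: the function name and the site labels "A" and "B" are made up for illustration, and it assumes the Witness can still reach both sites when the ISL goes down.

```python
def sites_with_datastore_access(isl_up, preferred="A", secondary="B"):
    """Return the set of data sites that keep access to the vSAN datastore.

    Toy model (assumption): the Witness stays reachable from both sites,
    and on ISL failure it binds itself to the preferred site.
    """
    if isl_up:
        # Normal operation: both data sites can access the datastore.
        return {preferred, secondary}
    # ISL down: only the preferred site, together with the Witness,
    # keeps access. The secondary site loses the datastore.
    return {preferred}


print(sites_with_datastore_access(isl_up=True))
print(sites_with_datastore_access(isl_up=False))
```

With the ISL down, only the preferred site is returned, which matches the situation described above: Datacenter B is cut off from the vSAN Datastore.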

From an HA perspective, Datacenter A will have a primary (previously called master), and so will Datacenter B. Each primary will detect that there are VMs that are not running, and it will try to restart these VMs. This happens on both sides, and of course on the site where access to the vSAN datastore is lost the restart will fail.

Now here is the important aspect: depending on where and how vCenter Server is connected to these locations, it may or may not receive information about successful and unsuccessful restarts. I’ve seen situations where vCenter Server could only communicate with the primary in Datacenter B. This leads to nothing but unsuccessful failover messages, while in reality all VMs were restarted in Datacenter A. The UI can give you a hint when you are in that situation, by the way. It shows which host is the primary, and it also tells you whether there’s a “network isolation” or a “network partition”. In this case, of course, that would be a “network partition”.
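The mismatch between what vCenter Server reports and what actually happened can be sketched as follows. This is a simplified illustration in Python, not the vSphere HA implementation or API: all names (`RestartAttempt`, `attempt_restarts`, `vcenter_view`, the VM names) are hypothetical, and it assumes each site's primary attempts a restart for every affected VM.

```python
from dataclasses import dataclass


@dataclass
class RestartAttempt:
    vm: str
    site: str         # site whose HA primary attempted the restart
    succeeded: bool   # True only if that site still has datastore access


def attempt_restarts(vms, sites_with_storage, all_sites=("A", "B")):
    """Assumption: the primary in each partition tries to restart every VM."""
    return [
        RestartAttempt(vm, site, site in sites_with_storage)
        for site in all_sites
        for vm in vms
    ]


def vcenter_view(attempts, reachable_sites):
    """vCenter Server only hears from the primaries it can reach."""
    return [a for a in attempts if a.site in reachable_sites]


# ISL failure: only Datacenter A keeps access to the vSAN datastore.
attempts = attempt_restarts(["vm-01", "vm-02"], sites_with_storage={"A"})

# vCenter Server can only talk to the primary in Datacenter B.
seen = vcenter_view(attempts, reachable_sites={"B"})

actually_running = {a.vm for a in attempts if a.succeeded}
reported_failed = {a.vm for a in seen if not a.succeeded}

print("Restarted in reality:", sorted(actually_running))
print("Reported as failed:  ", sorted(reported_failed))
```

In this model both VMs end up in both sets: they were restarted in Datacenter A, yet vCenter Server only sees the failed attempts from Datacenter B.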


About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.
