Yellow Bricks

by Duncan Epping

stretched cluster

Some questions about Stretched Clusters with regards to power outages

Duncan Epping · Oct 9, 2012 ·

Today I received an email about the vSphere Metro Storage Cluster paper I wrote, or better said, about stretched clusters in general. I figured I would answer the questions in a blog post so that everyone can chip in and read along. So let's show the environment first so that the questions are clear. Below is an image of the scenario.

Below are the questions I received:

If a power outage occurs at Frimley the 2 hosts get a message from the UPS that there is a power outage. After 5 minutes (or any other configured value) the next action should start. But what will be the next action? If a scripted migration to a host at Bluefin starts, will DRS move some VMs back to Frimley? Or could the VMs get a mark to stick at Bluefin? Should the hosts at Frimley be placed into Maintenance mode so the migration will be done automatically? And what happens if there is a total power outage both at Frimley and Bluefin? How could a controlled shutdown across hosts be arranged?

Let's start breaking it down and answer where possible. The main question is how we handle power outages. As in any datacenter, this is fairly complex. Well, the powering-off part is easy; powering everything on in the right order isn't. So where do we start? First of all:

  1. If you have a stretched cluster environment and, in this case, the Frimley data center has a power outage, it is recommended to place the hosts there in maintenance mode. This way all VMs will be migrated to the Bluefin data center without disruption. Also, when power returns, it allows you to check the hosts before introducing them to the cluster again.
  2. If maintenance mode is not used and a scripted migration is done, DRS will probably migrate the virtual machines back. DRS is triggered every 5 minutes (at a minimum). Avoid this, use maintenance mode!
  3. If there is an expected power outage and the environment is brought down, it will need to be manually powered on in the right order. You can also script this, but a stretched cluster solution doesn't cater for this type of failure, unfortunately.
  4. If there is an unexpected power outage and the environment is not brought down, then vSphere HA will start restarting virtual machines when the hosts come back up again. This will be done using the “restart priority” that you can set with vSphere HA. It should be noted that the “restart priority” is only about the completion of the power-on task, not about the full boot of the virtual machine itself.
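To make "powering on in the right order" from point 3 a bit more concrete, here is a minimal sketch of how a script could work out that order. It is only an illustration: the VM names and their dependencies are made up, and it does not call any vSphere API. It simply groups virtual machines into power-on waves, where every VM in a wave depends only on VMs in earlier waves.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def power_on_waves(depends_on):
    """Group VMs into power-on waves.

    depends_on maps a VM name to the set of VMs that must be up first.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Everything returned here has all of its prerequisites completed.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical environment, purely for illustration.
dependencies = {
    "domain-controller": set(),
    "database": {"domain-controller"},
    "app-server": {"database"},
    "web-server": {"app-server"},
    "vcenter": {"domain-controller", "database"},
}

for number, wave in enumerate(power_on_waves(dependencies), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Keep in mind what was said in point 4: a completed power-on task does not mean the guest has finished booting. A real script would probably need to wait until the services in one wave are actually available before starting the next wave.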

I hope that clarifies things.

NetApp is now officially vMSC certified

Duncan Epping · Jul 27, 2012 ·

As I had many people asking about this over the last couple of months I figured I would share it. I just noticed that NetApp is now finally officially vSphere Metro Storage Cluster certified (see SAN HCL). NetApp has certified their platform for the following array types:

  • NFS
  • iSCSI

Yes indeed, FC is currently not listed… But for me the great news is that NFS is listed! A KB article has been published with all the details… make sure to read it if you are looking to deploy a stretched cluster with NetApp and vSphere 5.0.

VMworld 2012 here I come

Duncan Epping · Jun 27, 2012 ·

I just got the news that two of my VMworld sessions have been accepted. I wanted to share with you which ones so you can keep track (if you want):

  • BCO1159 – Architecting and Operating a vSphere Metro Storage Cluster by Lee Dilworth and Duncan Epping
    In this session Lee Dilworth and Duncan Epping will discuss the design and operational considerations for vSphere Metro Storage Cluster environments, also commonly referred to as stretched cluster environments. Best practices around implementation and design will be shared. Various failure scenarios which can occur in a stretched storage environment are discussed in-depth, including how vSphere 5.x responds to these failures. We will cover the implications for your vSphere HA, DRS and Storage DRS configuration and provide recommendations on how to increase availability and simplify operations!
  • VSP1504 – Ask the Expert vBloggers with Rick Scherer, Frank Denneman, Chad Sakac, Scott Lowe and Duncan Epping
    Back by high demand, the Ask the Expert vBloggers panel session. Show up and ask any question you like to a panel consisting of well-known community members! This was one of the best voted sessions last year and with people like Frank, Rick, Scott and Chad sitting next to me I know it is going to be awesome again. Let's just hope Rick brings his buzzer again so he can buzz Chad when he starts preaching again 🙂

In a couple of weeks when all sessions are listed I will also create a nice “Top 20 – VMworld Sessions” article again, but for now I want to thank everyone who voted and am hoping to see all of you at VMworld.

About the author

Duncan Epping is a Chief Technologist in the Office of CTO of the Cloud Platform BU at VMware. He is a VCDX (# 007), the author of the "vSAN Deep Dive", the “vSphere Clustering Technical Deep Dive” series, and the host of the "Unexplored Territory" podcast.

Copyright Yellow-Bricks.com © 2022