ESXi DCUI Shutdown vs vCenter Shutdown of a host

Today on the community forums someone mentioned that he had shut down his host and expected vSphere HA to restart his virtual machines. For whatever reason he had ended up in a situation where all of his VMs were still running but he couldn’t do much with them anymore, and as such he wanted to kill the host so that HA could safely restart the virtual machines. However, when he shut down his host nothing happened: the VMs remained powered off. Why did this happen?

I had seen this before, but it never really sank in until I saw the question from this customer. I figured I would test it just to see what happened and whether I could spot a difference in the vSphere HA logs. I powered on a VM on one of my hosts and moved off all other VMs. I then went to the DCUI of the host and initiated a “shutdown” using F12. I tailed the FDM log on one of my hosts and spotted the following log message:

2014-04-04T11:41:54.882Z [688C2B70 info 'Invt' opID=SWI-24c018b] [VmStateChange::SavePowerChange] vm /vmfs/volumes/4ece24c4-3f1ca80e-9cd8-984be1047b14/New Virtual Machine/New Virtual Machine.vmx curPwrState=unknown curPowerOnCount=0 newPwrState=powered off clnPwrOff=true hostReporting=host-113

In the above scenario the virtual machine was not restarted even though the host was shut down. I did the exact same exercise again, only this time I did the shutdown using the vCenter Web Client. After I witnessed the VM being restarted I also noticed a difference in the FDM log:

2014-04-04T12:12:06.515Z [68040B70 info 'Invt' opID=SWI-1aad525b] [VmStateChange::SavePowerChange] vm /vmfs/volumes/4ece24c4-3f1ca80e-9cd8-984be1047b14/New Virtual Machine/New Virtual Machine.vmx curPwrState=unknown curPowerOnCount=0 newPwrState=powered on clnPwrOff=false hostReporting=host-113

The difference is the power-off state that is reported by vSphere HA. In the first scenario the virtual machine is marked as “clnPwrOff=true”, which basically tells vSphere HA that an administrator has powered off the virtual machine. This is what happened when “shutdown” was initiated through the DCUI, and hence no restart took place. (It seems that ESXi initiates a shutdown of all running virtual machines.) In the second scenario vSphere HA reported that the VM was not cleanly powered off (“clnPwrOff=false”), and as such it restarted the virtual machine, as it assumed something bad had happened to it.
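If you want to check this flag across a longer FDM log, something along these lines could help. It is only a minimal sketch in Python: the function names are my own (hypothetical), and the key=value layout is assumed purely from the two log lines shown above, not from any documented FDM log format.

```python
import re

# key=value pairs; a value runs until the next " key=" or the end of the line,
# which lets values such as "powered off" contain a space.
FIELD_RE = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")
MARKER = "[VmStateChange::SavePowerChange]"

def parse_power_change(line):
    """Return the key=value fields of a SavePowerChange line, or None."""
    if MARKER not in line:
        return None
    # Only look at the text after the marker, so opID=... in the prefix is skipped.
    tail = line.split(MARKER, 1)[1]
    return dict(FIELD_RE.findall(tail))

def is_clean_power_off(fields):
    """True when the line reports clnPwrOff=true (no HA restart in my test)."""
    return fields.get("clnPwrOff") == "true"

# The two log lines from this post (DCUI shutdown, then Web Client shutdown).
VMX = ("/vmfs/volumes/4ece24c4-3f1ca80e-9cd8-984be1047b14/"
       "New Virtual Machine/New Virtual Machine.vmx")
DCUI_LINE = (
    "2014-04-04T11:41:54.882Z [688C2B70 info 'Invt' opID=SWI-24c018b] "
    "[VmStateChange::SavePowerChange] vm " + VMX + " "
    "curPwrState=unknown curPowerOnCount=0 newPwrState=powered off "
    "clnPwrOff=true hostReporting=host-113"
)
WEB_CLIENT_LINE = (
    "2014-04-04T12:12:06.515Z [68040B70 info 'Invt' opID=SWI-1aad525b] "
    "[VmStateChange::SavePowerChange] vm " + VMX + " "
    "curPwrState=unknown curPowerOnCount=0 newPwrState=powered on "
    "clnPwrOff=false hostReporting=host-113"
)

for name, line in (("DCUI", DCUI_LINE), ("Web Client", WEB_CLIENT_LINE)):
    fields = parse_power_change(line)
    print(name, "clean power-off:", is_clean_power_off(fields))
```

A line with clnPwrOff=true is the case where, in my test, HA did not restart the VM.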

So what did we learn? If you, for whatever reason, want vSphere HA to restart the virtual machines that are currently running on a host you want to shut down, make sure that you use the vCenter Web Client instead of the DCUI!

Disclaimer: my tests were conducted using vSphere 5.5 Update 1. I believe that at some point in the past “shutdown” via the DCUI would also allow HA to restart the VMs. I am now investigating why this has changed and when. When I find out I will update this post.

Startup News Flash part 16

Number 16 of the Startup News Flash, here we go:

Nakivo just announced the beta program for 4.0 of their backup/replication solution. It adds some new features like: recovery of Exchange objects directly from compressed and deduplicated VM backups, Exchange log truncation, and automated backup verification. If you are interested in testing it, make sure to sign up here. I haven’t tried it, but they seem to be a strong up-and-coming player in the backup and DR space for SMB.

SanDisk announced a new range of SATA SSDs called “CloudSpeed”. They released 4 different models with various endurance levels and workload targets, ranging in size from 100GB up to 960GB depending on the endurance level selected. Endurance ranges from 1 up to 10 full drive writes per day. (Just as an FYI, for VSAN we recommend 5 full drive writes per day as a minimum.) Performance numbers range from 15K to 20K write IOPS and from 75K to 88K read IOPS. More details can be found in the spec sheet here. What interests me most is the FlashGuard Technology that is included; it is interesting how SanDisk is capable of understanding wear patterns and workloads to a certain extent and placing data in a specific way to prolong the life of your flash device.
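To put “full drive writes per day” in perspective, here is a small back-of-the-envelope sketch in Python. The capacity/endurance pairings are made up for illustration, as the actual combinations per model are in the spec sheet, and the helper name is hypothetical.

```python
def daily_writes_gb(capacity_gb, drive_writes_per_day):
    """Data (in GB) a drive is rated to absorb per day at a given endurance."""
    return capacity_gb * drive_writes_per_day

# Hypothetical pairings, only to show the arithmetic.
print(daily_writes_gb(100, 5))   # 100GB drive at the 5 writes/day VSAN minimum
print(daily_writes_gb(960, 1))   # 960GB drive at 1 write/day
```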

CloudPhysics announced the availability of their Storage Analytics card. I gave it a try last week and was impressed. I was planning on doing a write-up on their new offering, but as various bloggers already covered it I felt there was no point in repeating what they said. I think it makes a lot more sense to just try it out. I am sure you will like it, as it will show you valuable info like “performance” and the impact of “thin disks” vs “thick disks”. Sign up here for a 30-day free trial!

30K for a VSAN host @theregister? I can configure one for 2250 USD!

I’ve been following the posts from the Register on VSAN and was surprised when they posted the cost of the hosts they configured: 30K each. With 3 at a minimum they concluded that for 90K you could buy yourself a nice legacy storage system. I don’t disagree with that to be honest… for 90K you can buy a nice legacy storage system. I guess you need to ask yourself first, though, what you will do with that 90K storage system by itself. Not much indeed, as you would need compute resources sitting next to it in order to do anything. So if you want to make a comparison, do not compare a full VSAN environment (or any other hyper-converged solution out there) to just a storage system, as it just doesn’t make sense.

Now that still doesn’t make these hosts cheap, I can hear you think, and again I agree with that, although I have absolutely no clue where the 30K came from. Judging by the tweets this morning most people don’t know either and feel it probably was overkill. Call me crazy, but I can configure a fully supported VSAN configuration for about 2250 USD (just HW) on the Dell website.

  • Dell T320
  • Intel Xeon E5-2420 1.90GHz 6 Core
  • Perc H310 Disk Controller
  • 32GB Memory
  • 1 x 7200RPM 1TB NL-SAS
  • 1 x 100GB Intel S3700 SSD (or dell equal drive)
  • 5 x 1GbE NIC Port

I would like to conclude that VSAN would be a lot cheaper than those legacy solutions; less than 7500 USD for 3 hosts is peanuts, right?!? Yes I know, the above configuration wouldn’t fit many use cases (except for maybe a ROBO deployment where only a couple of VMs are needed), and that was the whole point of the exercise: showing how pointless these exercises can be. You can twist these numbers any way you like, and you can configure your VSAN hosts any way you like as long as the components (HDD/SSD/Controller) are on the VSAN HCL and the system is on the vSphere HCL. PS: Dear Register, next time you run through the exercise, you may want to post the configuration you selected… It makes things a bit clearer.
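For those who want to see the arithmetic behind the numbers in this post, here is a trivial Python sketch. It only uses the figures mentioned above (roughly 2250 USD per host in my configuration, 30K per host in the Register’s, 3 hosts minimum); the variable names are my own.

```python
MIN_HOSTS = 3                      # minimum number of hosts for VSAN
MY_HOST_COST_USD = 2250            # the Dell T320 configuration above (HW only)
REGISTER_HOST_COST_USD = 30_000    # per-host cost quoted by the Register

my_cluster_usd = MIN_HOSTS * MY_HOST_COST_USD
register_cluster_usd = MIN_HOSTS * REGISTER_HOST_COST_USD

print("My 3-host cluster:      ", my_cluster_usd, "USD")        # 6750
print("Register 3-host cluster:", register_cluster_usd, "USD")  # 90000
```

Neither total says much on its own, which is exactly the point made above.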

VSAN – Misconfiguration Detected

Although Cormac Hogan already wrote about this I figured I would repeat some of his work. It seems like various folks are hitting this issue where an error is thrown while configuring VSAN: Misconfiguration Detected. The misconfiguration in this case refers to how the physical network has been configured. In order for VSAN to be successfully configured your layer 2 VSAN network will need to be enabled for multicast traffic. (below a screenshot of the error which I borrowed from Cormac… thanks Cormac)

In order to successfully configure VSAN you can do two things. Let’s be clear that I am not a networking expert, and personally I would always advise discussing the best option with your networking team. Here are your two options:

  • Enable IGMP Snooping for your VSAN network (VLAN) and define an IGMP Snooping Querier. Default setting on most Cisco switches is IGMP Snooping enabled but without an IGMP Snooping Querier. In this configuration VSAN will not be able to configure correctly!
  • Disable IGMP Snooping for your VSAN network (VLAN). Please note that you can typically disable IGMP Snooping globally and per VLAN, in this case if you want to disable it… disable it on your VLAN!

Please consult your network vendor documentation on how to do this.

Top 25 bloggers 2014 results are out…

The top 25 bloggers 2014 voting results are out. This year the competition was insane, and I know that I say this every year but if you look at bloggers like Cormac Hogan, Derek Seaman, Frank Denneman, Chris Wahl and William Lam you know what I am talking about.

1400+ people voted, 15 new blogs in the top 50, 5 new blogs in the Top 25, and a new blog in the Top 10. A big thank you to everyone who has voted for me again; I am honored and humbled to have been voted number 1. I want to call out the top 5, as I have worked closely with most of them over the last few years and it has been a great pleasure: William Lam(2), Frank Denneman(3), Cormac Hogan(4) and Scott Lowe(5). Each of them has consistently produced excellent material. I have been very very impressed by what they’ve released over the last year and hope everyone keeps putting out their material, as I very much enjoy reading it.

Congrats to everyone else who made the list. If you are curious who they are, head over to the full top bloggers list on Eric’s blog. Maybe even better, watch the awesome show Eric, John, Rick and David recorded… It is very entertaining!