I guess the most important part of Jeff’s post is this:
We’ve drilled into these scenarios further and asked customers who currently have Live Migration capabilities if they have changed their servicing process. In particular, when do they perform their hardware servicing? Is it during business hours, 9-5? The overwhelming answer is, “No, we still schedule server downtime and notify folks of the scheduled downtime.”
Even customers with Live Migration still wait until off hours to service the hardware.
I don’t know Jeff’s customers, but they don’t sound like the most brilliant system engineers in the world. I don’t know a single system engineer who would delay servicing hardware that is throwing warnings when he has the opportunity to live migrate. With an 8:1 consolidation ratio, the importance of fully functional hardware also increases eightfold. What are you going to tell your manager when a hardware component passes its threshold and just stops working? “Sorry, I know we have VMotion, but I wanted to do the servicing after business hours because I did not want to disturb anyone!” Well, I know what your manager’s reaction will be.
I’ve seen a lot of crooked comparisons, but this is by far the best I’ve seen in years. Especially the part about 5, 10, 20 seconds of downtime. What about your SQL Servers or Exchange? If you could avoid downtime, wouldn’t you want to? These are just the excuses Microsoft is looking for to justify not releasing a fully working product with real live migration functionality. Come on guys, you announced it, didn’t get the thing working in time, and now you’re telling the world that nobody needs it. Who are you kidding?
And about patching: the Windows Server 2008 Core footprint is indeed small compared to the full edition, but it doesn’t even come close to the 32MB ESXi footprint. I’m not even gonna talk about Microsoft’s patching reputation.
Jeff’s post also pointed me towards another blog where the writer James talks about the same issues. In the comments “vaibhavbagaria” points out a nice pro VMware detail:
The other annoying thing is that the MS solution needs two LUNs for each of the servers, one for Quorum and one for Storage. VMware shares a single LUN between up to 16 physical servers. So you could have 14 Active and 2 Standby servers for failover protection.
With Hyper-V, one would need 28 servers and 28 LUNs.
And with ESX 3.5 it’s 32 servers in a cluster and/or 32 servers attached to a single LUN. So make that 32 active ESX servers with no dedicated standby, because the failover capacity comes from the headroom on your running hardware. The MS score would be 32 active and 32 standby with 32 LUNs. Well, that would give you a nice consolidation ratio, I guess, and really reduce the energy costs. Talking green…
James O’Neill just replied to my post with the following:
You could also have 8 all active nodes and achieve the same thing. I think we only go to 8 so you would have to have each one running at 7/8th capacity. VMWare could run at 31/32, against our 28/32
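James’s fractions are just failover headroom arithmetic: in an all-active cluster of N nodes that must survive the loss of one node, each node can only be loaded to (N − 1)/N of its capacity so the survivors can absorb the failed node’s load. A minimal sketch of that calculation (my own illustration, not from either post; the function name is mine):

```python
def max_utilization(nodes: int, failures_tolerated: int = 1) -> float:
    """Highest safe per-node load for an all-active cluster that must
    keep running after `failures_tolerated` nodes fail."""
    return (nodes - failures_tolerated) / nodes

# An 8-node Hyper-V cluster tolerating one failed node:
print(max_utilization(8))    # 0.875, i.e. 7/8 -- or 28/32 in James's terms

# A 32-node ESX cluster under the same one-failure constraint:
print(max_utilization(32))   # 0.96875, i.e. 31/32
```

Which is exactly the 31/32 versus 28/32 gap James concedes: the larger the cluster, the thinner the slice of capacity you have to reserve for failover.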