
Yellow Bricks

by Duncan Epping


UML diagram your VM, vdisks and snapshots by @lucd22

Duncan Epping · Apr 7, 2010 ·

Somehow I missed this excellent script/blog post about diagramming your VMDKs and their associated snapshot trees when putting together the weekly Planet V12n Top 5 post I do for the VMTN Blog.

Luc Dekens is one of the leading PowerCLI script gurus and wrote this amazing script, which diagrams the relationship between VMs, VMDKs and snapshots. Now you might wonder what the use case would be when there is a simple one-to-one relationship like the following:

Many will understand the relationship when you have a single snapshot. But is that still the case when you have multiple snapshots running on multiple disks? Probably not; check this diagram to get an idea:

Great work Luc, and my apologies for not selecting it for the Planet V12n Top 5 as it definitely deserved a spot.

vShield Manager

Duncan Epping · Apr 6, 2010 ·

I was working on a vShield Zones setup a couple of days ago. I have done this a couple of times already, but somehow the following details seem to slip my mind every time and I find myself digging them up in the manual, hence this article. A reminder to myself:

vShield Manager login (page 24): admin/default
Configure the IP address with the following command (page 35): setup
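
Or, spelled out as a quick console sketch (nothing here beyond the notes above; the page numbers refer to the vShield Zones manual):

# vShield Manager console (sketch)
# log in with the default credentials: admin / default
setup    # configures the management IP address (manual, page 35)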

Where are my files?

Duncan Epping · Apr 1, 2010 ·

I was working on an automated build procedure for ESX hosts in a cloud environment yesterday. I stored my temporary post-configuration script in /tmp/, as I have been doing since 3.0.x. When the installation was finished the host rebooted and I waited for the second reboot to occur, which is part of my post configuration. The weird thing is that it never happened.

So I assumed I had made a mistake and went over my script. Funny thing is, it looked just fine. For troubleshooting purposes I decided to strip my script down and only do a “touch /tmp/test” in the %post section to see if the file would be created or not. I also removed the “automatic reboot” after the installation. When the installation was finished I went into the console and noticed my file “test” in /tmp. So I rebooted the system and checked /tmp again…. gone. HUH?
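
The stripped-down test boiled down to something like this (a sketch from memory, not the actual post-configuration script; the automatic reboot directive was removed from the kickstart file as well):

%post
# minimal test: drop a marker file in /tmp and check after the first boot
# whether it is still there
touch /tmp/test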

I figured it had something to do with the installer. I installed ESX manually, including a “/tmp” partition, and booted the server. I copied a bunch of random files into /tmp and rebooted the server… again the files were deleted. Now I might be going insane, but I am pretty certain this used to work just fine in the good old ESX 3.0.x days. Apparently something changed, but what?

After some googling and emailing I discovered that this change in behaviour is a known issue (see the release notes). When ESX 4.0 boots, “/etc/init.d/vmware” cleans out /tmp (see below). Something you might want to take into account when using /tmp.

# Clear /tmp to create more space
if IsLocalFileSystem /tmp ; then
    rm -rf /tmp/*
fi
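
If you want to check this on your own hosts, the cleanup is easy to spot in the init script (a quick sketch, assuming a standard ESX 4.0 installation):

# print the /tmp cleanup section of the init script
grep -A 3 "Clear /tmp" /etc/init.d/vmware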

I want to thank my colleague from VMware GSS Fintan Comyns for pointing this out.

What’s the point of setting “--IOPS=1”?

Duncan Epping · Mar 30, 2010 ·

To be honest and completely frank, I really don’t have a clue why people recommend setting “--IOPS=1” by default. I have been reading all these so-called best practices around changing the default behaviour of “1000” to “1”, but none of them contain any justification. Just to give you an example, take a look at the following guide: Configuration best practices for HP StorageWorks Enterprise Virtual Array (EVA) family and VMware vSphere 4. The HP document states the following:

Secondly, for optimal default system performance with EVA, it is recommended to configure the round robin load balancing selection to IOPS with a value of 1.

Now please don’t get me wrong, I am not picking on HP here as there are more vendors recommending this. I am however really curious how they measured “optimal performance” for the HP EVA. I have the following questions:

  • What was the workload exposed to the EVA?
  • How many LUNs/VMFS volumes were running this workload?
  • How many VMs per volume?
  • Was VMware’s thin provisioning used?
  • If so, what was the effect on the ESX host and the array? (was there an overhead?)

So far none of the vendors have published this info and I very much doubt, yes call me sceptical, that these tests have been conducted with a real-life workload. Maybe I just don’t get it, but when consolidating workloads a threshold of 1000 IOPS isn’t that high, is it? Why switch after every single IO? I can imagine that for a single VMFS volume this will boost performance, as all paths will be hit equally and load distribution on the array will be optimal. But in a real-life situation where you have multiple VMFS volumes this effect decreases. Are you following me? Hmmm, let me give you an example:

Test Scenario 1:

  • 1 ESX 4.0 host
  • 1 VMFS volume
  • 1 VM with IOMeter
  • HP EVA and IOPS set to 1 with Round Robin based on the ALUA SATP

Following HP’s best practices the host will have 4 paths to the VMFS volume. However, as the HP EVA is an asymmetric active/active array (ALUA), only two paths will be shown as “optimized”. (For more info on ALUA read my article here and Frank’s excellent article here.) Clearly, when IOPS is set to 1 and there is a single VM pushing IOs to the EVA on a single VMFS volume, the “stress” produced by this VM will be divided equally over all paths without causing any spiky behaviour, in contrast to what a change of paths every 1000 IOs might do. Although 1000 is not a gigantic number, it will cause spikes in your graphs.
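
As a side note, for those who want to see what the knob itself looks like: on ESX 4.0 the IO operation limit is set per device for the Round Robin PSP via esxcli, roughly along these lines (a sketch; naa.xxxx is a placeholder for your device identifier and the exact syntax and values should be verified against your build and your storage vendor’s documentation):

# set the Round Robin IO operation limit for a device to 1
esxcli nmp roundrobin setconfig --device naa.xxxx --type "iops" --iops=1

# show the current Round Robin configuration for that device
esxcli nmp roundrobin getconfig --device naa.xxxx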

Now let’s consider a different, more realistic scenario:

Test Scenario 2:

  • 8 ESX 4.0 hosts
  • 10 VMFS volumes
  • 16 VMs per volume with IOMeter
  • HP EVA and IOPS set to 1 with Round Robin based on the ALUA SATP

Again, each VMFS volume will have 4 paths, but only two of those will be “optimized” and thus be used. We will have 160 VMs in total on this 8-host cluster and 10 VMFS volumes, which means 16 VMs per VMFS volume. (Again following all best practices.) Now remember, we only have two optimized paths per VMFS volume and 16 VMs driving traffic to each volume, and that traffic is coming from 8 different hosts to these storage processors. Potentially each host is sending traffic down every single path to every single controller…

Let’s assume the following:

  • Every VM produces 8 IOps on average
  • Every host runs 20 VMs, of which 2 will be located on the same VMFS volume (20 VMs spread evenly over the 10 volumes)

This means that every ESX host changes the path to a specific VMFS volume roughly every 62 seconds (1000 / (2 × 8)); with 10 volumes that is a path change every 6 seconds on average per host. With 8 hosts in a cluster and just two storage processors… you see where I am going? I would be very surprised if we saw a real performance improvement when IOPS is set to 1 instead of the default 1000, especially when you have multiple hosts running multiple VMs hosted on multiple VMFS volumes. If you feel I am wrong here, or work for a storage vendor and have access to the scenarios used, please don’t hesitate to join the discussion.
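
To make that arithmetic explicit (a quick sketch using the assumptions listed above):

# back-of-the-envelope path switch interval per host
IOPS_LIMIT=1000        # default Round Robin IO operation limit
VMS_PER_VOLUME=2       # VMs per host sharing a single VMFS volume
IOPS_PER_VM=8          # average IOps per VM
VOLUMES=10             # VMFS volumes per host

PER_VOLUME=$(( IOPS_LIMIT / (VMS_PER_VOLUME * IOPS_PER_VM) ))
echo "path switch per volume: every ${PER_VOLUME}s"                          # ~62 seconds
echo "path switch across all volumes: every $(( PER_VOLUME / VOLUMES ))s"    # ~6 seconds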

Update: let me point out though that every situation is different. If you have discussed your specific requirements and configuration with your storage vendor and this recommendation was given… do not ignore it; ask why, and if it indeed fits, implement it! Your storage vendor has tested various configurations and knows when to implement what. This is just a reminder that implementing “best practices” blindly is not always the best option!

Cool new HA feature coming up to prevent a split brain situation!

Duncan Epping · Mar 29, 2010 ·

I already knew this was coming up but wasn’t allowed to talk about it. As it is out in the open on the VMTN community I guess I can talk about it as well.

One of the most common issues experienced with VMware HA is a split brain situation. Although currently undocumented, vSphere has a detection mechanism for these situations. Even more important, the upcoming ESX 4.0 Update 2 release will also automatically prevent it!

First let me explain what a split brain scenario is; let’s start by describing the most commonly encountered situation:

4 Hosts – iSCSI / NFS based storage – Isolation response: leave powered on

When one of the hosts is completely isolated, including the Storage Network, the following will happen:

Host ESX001 is completely isolated, including the storage network (remember, iSCSI/NFS based storage!), but the VMs will not be powered off because the isolation response is set to “leave powered on”. After 15 seconds the remaining, non-isolated hosts will try to restart the VMs. Because the iSCSI/NFS network is also isolated, the lock on the VMDK will time out and the remaining hosts will be able to boot up the VMs. When ESX001 returns from isolation it will still have the VMX processes running in memory. This is when you will see a “ping-pong” effect within vCenter, in other words VMs flipping back and forth between ESX001 and any of the other hosts.

As of version 4.0, ESX(i) detects that the lock on the VMDK has been lost and raises a question asking whether the VM should be powered off or not. Please note that you will (currently) only see this question if you connect directly to the ESX host. Below you can find a screenshot of this question.

With ESX 4.0 Update 2, however, the question will be auto-answered and the VM will be powered off to avoid the ping-pong effect and a split brain scenario! How cool is that…
