How to change the SRM power state change time-out value

One of my customers recently asked whether it is possible to change the time-out for a power state change. The same question was asked and answered on an internal mailing list at around the same time, so I thought it would be nice to document it. An example of a power state change task is the shutdown that SRM initiates when you run a recovery plan. The default value is 120 seconds, which might not be long enough and could lead to issues when a power off is forced. You can increase or decrease this value by editing the SRM configuration file (vmware-dr.xml). Look for the following section:

<Recovery>
<powerStateChangeTimeout>120</powerStateChangeTimeout>
</Recovery>

As stated above, the time-out value is in seconds. The default value is 120, and it can be changed according to your requirements. The change takes effect once the SRM service has been restarted. (If you can’t find this section in the XML file, just add it.)
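If you would rather script the change than edit the file by hand, something like the following Python sketch could do it. Note the assumptions: the root element name (Config) and the idea that the Recovery section sits directly under the root are mine, not taken from the SRM documentation, and the sample string is only a trimmed-down stand-in for the real vmware-dr.xml. ElementTree also drops XML comments when it parses a file, so work on a backup copy.

```python
import xml.etree.ElementTree as ET


def set_power_state_timeout(xml_text: str, seconds: int) -> str:
    """Return xml_text with Recovery/powerStateChangeTimeout set to `seconds`.

    Assumption: <Recovery> lives directly under the root element.
    The section is created when it does not exist yet.
    """
    root = ET.fromstring(xml_text)

    recovery = root.find("Recovery")
    if recovery is None:
        recovery = ET.SubElement(root, "Recovery")

    timeout = recovery.find("powerStateChangeTimeout")
    if timeout is None:
        timeout = ET.SubElement(recovery, "powerStateChangeTimeout")

    timeout.text = str(seconds)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed-down stand-in for vmware-dr.xml
sample = (
    "<Config><Recovery>"
    "<powerStateChangeTimeout>120</powerStateChangeTimeout>"
    "</Recovery></Config>"
)

updated = set_power_state_timeout(sample, 300)
print(updated)
```

Remember that the SRM service still has to be restarted before the new value is picked up.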

Partitioning your ESX host – part II

A while back I published an article on partitioning your ESX host. This was based on 3.5, and of course with vSphere this has slightly changed. Let me start by quoting a section from the install and configure guide.

You cannot define the sizes of the /boot, vmkcore, and /vmfs partitions when you use the graphical or text installation modes. You can define these partition sizes when you do a scripted installation.

The ESX boot disk requires 1.25GB of free space and includes the /boot and vmkcore partitions. The /boot partition alone requires 1100MB.

The reason for this is the fact that the service console is a VMDK. This VMDK is stored on the local VMFS volume by default in the following location: esxconsole-<system-uuid>/esxconsole.vmdk. By the way, “/boot” has been increased as a “safety net” for future upgrades to ESX(i).

So for manual installations there are three fewer partitions to worry about. I would advise using the following sizes for the rest of the partitions, and I would also recommend renaming the local VMFS partition during installation. The default name is “Storage1”; my recommendation would be “<hostname>-localstorage”.

Primary:
/     - 5120MB
Swap  - 1600MB
Extended Partition:
/var  - 4096MB
/home - 2048MB
/opt  - 2048MB
/tmp  - 2048MB

With the disk sizes these days you should have more than enough space for the roughly 18GB that ESX needs in total.
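For those who want to check where the 18GB comes from, here is a quick tally in Python. It simply adds up the sizes listed above plus the 1.25GB boot disk requirement from the quoted guide; treating 1GB as 1024MB is my assumption.

```python
# Partition sizes in MB, copied from the layout above
partitions_mb = {
    "/": 5120,
    "swap": 1600,
    "/var": 4096,
    "/home": 2048,
    "/opt": 2048,
    "/tmp": 2048,
}

# 1.25GB for /boot and vmkcore, per the quoted guide (assuming 1GB = 1024MB)
boot_disk_mb = 1280

console_mb = sum(partitions_mb.values())
total_mb = console_mb + boot_disk_mb

print(f"service console partitions: {console_mb} MB")
print(f"total including boot disk:  {total_mb} MB (~{total_mb / 1024:.1f} GB)")
```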

vSphere performance

Over the last couple of weeks I’ve seen all these vSphere performance numbers (most not publicly available, though), each more impressive than the last. I think everyone will agree that the latest one is really impressive: 364,000 IOPS is just insane. There’s no load vSphere can’t handle, when correctly sized of course.

But something that made an even bigger impression on me, as a consolidation fanatic, is the following line from the latest performance study:

VMware’s new paravirtualized SCSI adapter (pvSCSI) offered 12% improvement in throughput at 18% less CPU cost compared to LSI virtual adapter

Now this may not sound like much, but when you are running 50 hosts it will make a difference. It will save you on cooling / rack space / power / hardware / maintenance; in other words, this will have its effect on your ROI and TCO. This is the kind of info that I would love to see more of: where did we cut down on “overhead”? Which improvements will make our consolidation numbers go up?!

Block sizes and growing your VMFS

I had a discussion on block sizes after the post on thin-provisioned disks with some of my colleagues. For those that did not read this post here’s a short recap:

If you create a thin provisioned disk on a datastore with a 1MB blocksize, the thin provisioned disk will grow in increments of 1MB. Hopefully you can see where I’m going. A thin provisioned disk on a datastore with an 8MB blocksize will grow in 8MB increments. Each time the thin-provisioned disk grows, a SCSI reservation takes place because of metadata changes. As you can imagine, an 8MB blocksize will decrease the number of metadata changes needed, which means fewer SCSI reservations. Fewer SCSI reservations equals better performance in my book.
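To put rough numbers on the recap above, here is a small Python sketch that counts how many block allocations it takes to grow a thin disk by 1GB. It assumes exactly one metadata update (and thus one reservation) per allocated block, which is a simplification of what VMFS really does.

```python
import math


def growth_allocations(growth_mb: int, block_size_mb: int) -> int:
    """Number of block allocations needed to grow a thin disk by growth_mb.

    Simplifying assumption: one metadata update per allocated block.
    """
    return math.ceil(growth_mb / block_size_mb)


# Growing a thin disk by 1GB on each of the VMFS-3 block sizes
for block_size_mb in (1, 2, 4, 8):
    count = growth_allocations(1024, block_size_mb)
    print(f"{block_size_mb}MB block size: {count} allocations")
```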

As some of you know, the locking mechanism has been improved with vSphere; yes, there’s a good reason why they call it “optimistic locking”. In other words, why bother increasing your block size if the locking mechanism has improved?

Although the mechanism behaves differently, that does not mean locking no longer needs to occur. In my opinion it’s still better to have 1 lock vs 8 locks if a VMDK needs to grow. But there’s another good reason: with vSphere come growable VMFS volumes. You might start with a 500GB VMFS volume and a 1MB block size, but when you expand the volume this block size might not be sufficient for the new VMs you create. Keep in mind that you can’t modify the block size afterwards, while you may just have given people the impression that they can create disks beyond the limit of the block size. (Mind you, it’s not possible: they will receive an error.)
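The limit I am referring to is the maximum file size that goes with each VMFS-3 block size. The sketch below uses the commonly documented values (256GB, 512GB, 1TB and 2TB, each in reality slightly under the round number), so please verify them against the documentation for your version. The helper name is made up for illustration.

```python
# Commonly documented VMFS-3 maximum file size (GB) per block size (MB).
# The real limits sit slightly below these round numbers.
MAX_FILE_GB = {1: 256, 2: 512, 4: 1024, 8: 2048}


def smallest_block_size_mb(vmdk_gb: int) -> int:
    """Return the smallest block size (MB) that can hold a VMDK of vmdk_gb."""
    for block_mb in sorted(MAX_FILE_GB):
        # Strictly less than, because the real limit is just under the round value
        if vmdk_gb < MAX_FILE_GB[block_mb]:
            return block_mb
    raise ValueError(f"{vmdk_gb}GB does not fit in a single VMFS-3 file")


for size_gb in (100, 300, 600, 1500):
    print(f"{size_gb}GB VMDK needs at least a {smallest_block_size_mb(size_gb)}MB block size")
```

So a volume that started life at 500GB with a 1MB block size will still refuse a 300GB disk after you have grown it to 2TB.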

So what about overhead? Will my 1KB log files all be created in 8MB blocks? Because this would mean a large overhead and might be a valid reason to use 1MB block sizes!

No, it will not. VMFS-3 solves this issue by offering a sub-block allocator. Small files use a sub-block to reduce overhead. A sub-block of a 1MB block size volume is 1/16th the size of the block. For an 8MB block size volume it’s 1/128th. In other words, the sub-blocks are 64KB large in both cases and thus the overhead is the same in both cases as well.
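The arithmetic is easy to verify. This Python snippet works out the sub-block size from the two fractions mentioned above and shows the space a 1KB log file would waste in each case; the function name is just for illustration.

```python
def sub_block_kb(block_size_mb: int, divisor: int) -> int:
    """Sub-block size in KB when a sub-block is 1/divisor of the block."""
    return block_size_mb * 1024 // divisor


# block size in MB -> a sub-block is 1/N of that block
layouts = {1: 16, 8: 128}

for block_mb, divisor in layouts.items():
    sub_kb = sub_block_kb(block_mb, divisor)
    wasted_kb = sub_kb - 1  # a 1KB log file stored in a single sub-block
    print(f"{block_mb}MB block size: {sub_kb}KB sub-block, {wasted_kb}KB unused for a 1KB file")
```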

Now my question to you guys: what do you think? Would it make sense to always use an 8MB blocksize? I think it would.

Unique LUN IDs?

I posted an article on LUN IDs and VCB in November 2008. It still seems to be a misconception that ESX uses LUN IDs to uniquely identify a LUN. As of 3.5 this isn’t the case anymore. When an array has “NAA Identifier” capabilities these will be used for uniquely identifying LUNs. And yes most arrays, currently, have these capabilities.

The NAA ID is also what’s being used to identify SAN LUN snapshots/clones, by the way. ESX 3.5 compares the LUN ID to the metadata of the VMFS header; if it’s a different ID, ESX knows it can’t be the same LUN that’s being presented and ignores it. If you do want to use the LUN you would either have to resignature it or set “disallowsnapshotlun” to “0”, of course…

Keep in mind, it’s still a best practice to use consistent LUN IDs throughout your environment. ESX doesn’t care anymore, but your life as a sysadmin will be a lot easier if you use unique and consistent LUN numbering.

Determining vClone growth rate

A quick intro: my name is Ian Gibbs, and I’m a former VMware PS Consultant like Duncan currently is. I’ve just transitioned to a role with a VMware customer where I’m responsible for delivering VMware View for real to 3000 users. Duncan has kindly invited me to share the things I learn with the world. I hope to post some scripts and things I create to help make life easier for you all.

This week I have been redesigning the storage layout for the View implementation. There’ll be a few other posts to come on this as it has turned out to be a massive topic, but this sub-task has been to determine the rate at which the vClone disks are growing so that I can size the datastores properly. We have around 100 pilot VMs and I wanted to see how big each vClone disk was versus its age. This turns out to be harder than you’d imagine, as ESX/Linux/Unix file systems don’t store file creation times. Anyway, a script was required and duly created. I hope you too find it useful. To use it:

  1. Download the script here
  2. Get the script on to an ESX server that can see the DS that contains the VMs you are interested in
  3. Mark it executable with chmod +x <script-filename>
  4. Install the bc RPM from here
  5. Run it and redirect the output to a CSV file.

My results average out roughly like this:

3hrs: 560MB

20hrs: 700MB

90hrs: 850MB

We’ve moved the pagefile off C: to the UDD so my results will probably be lower than yours. Now to find out why it goes to half a gig so quickly…
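To turn those averages into a growth rate you can size datastores with, a few lines of Python are enough. This is only a sketch over the three averaged data points above, taking the sizes as megabytes; it is not the script mentioned earlier.

```python
# (age in hours, average vClone disk size in MB), from the results above
samples = [(3, 560), (20, 700), (90, 850)]

rates = []
for (h0, s0), (h1, s1) in zip(samples, samples[1:]):
    rate = (s1 - s0) / (h1 - h0)  # MB per hour between two measurements
    rates.append(rate)
    print(f"{h0}h -> {h1}h: {rate:.1f} MB/hour")
```

The growth clearly flattens out after the first day, which is worth keeping in mind before extrapolating linearly from a young pilot.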

Storage views, exploring the next version of ….

I was playing around with vSphere this weekend while replying to topics on the VMTN Community. One of the things often asked about is storage reporting (snapshot info, disk utilization, etc.). With ESX 3.5 / vCenter 2.5 it needs to be scripted and can be integrated into vCenter by using custom fields, but as you can imagine not everyone would like to add custom functionality to vCenter.

As of the next version of ESX/vCenter, aka vSphere, you can just click the storage tab on a host or VM. The following is the storage tab of a VM:

And of course the storage view on a host:

And what about the new map functionality, drilling down to HBA level?