SPC-2 set or not?

For those like me who see different types of arrays daily, it is hard to keep up with all the specific settings that need to be configured. This is especially true for enterprise-level storage, where every array comes with its own dependencies and requirements.

One of the settings that is often overlooked on EMC DMX storage is the SPC-2 bit. I noticed a while back what kind of impact it can have on your environment, and I witnessed it again today.

During the creation of a VMFS volume we received an error which basically stated that it was impossible to create the volume. The error message was a bit misleading, but I noticed in the details section that the LUN was identified as “sym.<identifier string>”. This should normally read “naa.<identifier string>”, which prompted me to check the documentation of the array.

When an additional front-end port is zoned to an ESX host, to provide further connectivity to devices, the SPC-2 bit must be set; otherwise, the Symmetrix devices will not be properly identified. Instead of identifying each device with its proper Network Address Authority (NAA) identifier, the devices will show up with a SYM identification number. Any device provisioned to the non-SPC-2-compliant port will be identified as a new device by the ESX host system.
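A quick way to spot this from the service console is to scan the device UID list for “sym.” entries. The helper below is just a sketch: the device list it parses is a fabricated sample, and on a live ESX 4.x host you would pipe the real esxcfg-scsidevs output into it instead.

```shell
# Flag devices that show up with a "sym." UID instead of "naa." -
# on a Symmetrix this suggests the SPC-2 bit is not set on that port.
check_spc2() {
  awk '$2 ~ /^sym\./ { print "SPC-2 likely not set: " $2 }'
}

# On a live host you would pipe the real device list into the helper:
#   esxcfg-scsidevs -u | check_spc2
# Here we feed it a fabricated sample:
printf '%s\n' \
  'vmhba1:C0:T0:L0 naa.60060480000190101234533030303133' \
  'vmhba1:C0:T0:L1 sym.0190101234000133' | check_spc2
```

Any line this prints is a LUN worth raising with the storage team before you try to create a VMFS volume on it.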

Again, it is hard to keep up with every single vendor out there, let alone all the different types of arrays and all the different settings. Luckily EMC acknowledged that and created the “EMC Storage Viewer for vSphere”. The EMC Storage Viewer actually shows you whether SPC-2 (amongst other settings) is enabled or not… This will save you a lot of pain and discussion with the storage team when push comes to shove. Definitely one of the reasons I would recommend using this plugin.

For those facing SPC-2 bit issues, make sure to read “H4116-enabling-spc2-compl-emc-symmetrix-dmx-vmware-envnmt-wp.pdf”. (Available via EMC’s Powerlink.)

Iomega IX4-200d

This Friday I received a package. I felt ten years old again; it was like unwrapping a Christmas present. One hell of a Christmas present I must say, and I want to thank EMC and especially Chad Sakac! I unboxed the two Iomega IX4-200d units and turned them on.

After a couple of minutes I had them up and running. It’s a matter of turning them on and waiting until they receive an IP address from your DHCP server. Of course I changed the DHCP address to a fixed address, which is literally a couple of clicks.

I guess that’s the story of the Iomega IX4-200d: everything is just a couple of clicks. You want to enable iSCSI? Three clicks. You want to set quotas? Three clicks. You want to add a user? Three clicks… I see a trend, don’t you?!

Within a matter of minutes I not only had both devices running, I had also set up a replication schedule for the CIFS share… That’s another great thing about this device: it offers CIFS, NFS, iSCSI, Apple File Sharing and FTP, so there should be at least one protocol that fits your needs. I will try to do some decent testing soon, when I receive my new desktop and a decent 1Gb switch…

For now I can recommend the Iomega IX4-200d to everyone. They are simple to use and look awesome.

Performance: Thin Provisioning

I had a discussion about Thin Provisioning with a colleague last week. One of the reasons for me not to recommend it yet for high-I/O VMs was performance. I had not seen a whitepaper or test yet that showed there was little impact from growing the VMDK. Eric Gray of VCritical.com had the scoop: VMware just published an excellent whitepaper called “Performance study of VMware vStorage Thin Provisioning”. I highly recommend it!

Surprisingly enough, there is no performance penalty for writing to a thin-provisioned VMDK when it comes to locking. I expected that due to SCSI reservations there would at least be some sort of hit, but there isn’t. (Except for zeroing of course; see the paragraph below.) The key takeaway for me is still: operational procedures.

Make sure you set the correct alarms when thin provisioning a VMDK. You need to regularly check the level of overcommitment, the total capacity, and the percentage of disk space still available.
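As a back-of-the-envelope sketch (the numbers below are made up), the overcommitment level is simply the total provisioned space divided by the datastore capacity:

```shell
# Overcommitment percentage: total provisioned space vs. datastore capacity.
overcommit_pct() {
  awk -v cap="$1" -v prov="$2" 'BEGIN { printf "%.0f%%\n", prov / cap * 100 }'
}

overcommit_pct 500 800   # 500 GB datastore with 800 GB provisioned -> 160%
```

Anything over 100% means thin disks can collectively grow past what the datastore can hold, which is exactly when those alarms start to matter.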

Another key takeaway is around performance though:

The figure shows that the aggregate throughput of the workload is around 180MBps in the post-zeroing phase of both thin and thick disks, and around 60MBps when the disks are in zeroing phase.

In other words, when the disk is zeroed out while writing there’s a HUGE, and I mean HUGE, performance hit. To avoid this for thick disks there’s an option called “eager zeroed thick”. Although this type is currently only available from the command line and takes longer to provision, as it zeroes out the disk on creation, it could lead to a substantial performance increase. This would only be beneficial for write-intensive VMs of course, but it definitely is something that needs to be taken into account.
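The disk types map to vmkfstools’ -d flag. Since the real command has to run on the host, the helper below only prints example invocations; the 20G size and datastore path are placeholders.

```shell
# Print example vmkfstools commands for the three main disk formats.
# "zeroedthick" is the default; "eagerzeroedthick" zeroes the whole
# disk at creation time, trading provisioning speed for run-time speed.
mkdisk_cmd() { printf 'vmkfstools -c %s -d %s %s\n' "$1" "$2" "$3"; }

for fmt in zeroedthick eagerzeroedthick thin; do
  mkdisk_cmd 20G "$fmt" "/vmfs/volumes/datastore1/myvm/$fmt.vmdk"
done
```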

Please note: On page two, bottom, it states that VMDKs on NFS are thin by default. This is not the case. It’s the NFS server that dictates the type of disks used. (Source: page 99)

VMFS Metadata size?

When designing your VMware vSphere / VI3 environment there are so many variables you need to take into account that it is easy to get lost. Something hardly anyone seems to take into account when creating VMFS volumes is that the metadata will also consume a specific amount of disk space. You might think that everyone has at least 10% disk space free on a VMFS volume, but this is not the case. Several of my customers have dedicated VMFS volumes for a single VMDK and noticed during the creation of a VMDK that they just lost a specific amount of MBs. Most of you will have guessed by now that this is due to the metadata, but how much disk space will the metadata actually consume?

There’s a simple formula that can be used to calculate how much disk space the metadata will consume. This formula used to be part of the “SAN System Design and Deployment Guide” (January 2008) but seems to have been removed in the updated versions.

Approximate metadata size in MB = 500 MB + ((LUN size in GB - 1) x 0.016 MB)

For a 500 GB LUN this would result in the following:

500 MB + ((500 - 1) x 0.016) = 507.984 MB
Roughly 1% of the total disk size used for metadata

For a 1500 MB (1.5 GB) LUN this would result in the following:

500 MB + ((1.5 - 1) x 0.016) = 500.008 MB
Roughly 33% of the total disk size used for metadata

As you can see, for a large VMFS volume (500 GB) the disk space taken up by the metadata is only 1% and can almost be neglected, but for a very small LUN it will consume a lot of the disk space and needs to be taken into account…
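The (since-retired) formula is easy to sketch: sizes go in as GB and come out as MB, with the 0.016 factor taken as MB per GB, matching the worked examples above.

```shell
# Approximate VMFS metadata size from the old SAN Design guide formula.
vmfs_meta_mb() {
  awk -v gb="$1" 'BEGIN { printf "%.3f\n", 500 + (gb - 1) * 0.016 }'
}

vmfs_meta_mb 500   # 500 GB LUN  -> 507.984
vmfs_meta_mb 1.5   # 1500 MB LUN -> 500.008
```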

[UPDATE]: As mentioned in the comments, the formula seems to be incorrect. I’ve looked into it and it appears that this is the reason it was removed from the documentation. The current limit for metadata is 1200 MB, and this is the number you should use when sizing your datastores.

Changing the block size of your local VMFS during the install…

I did not even know it was possible, but on the VMTN Community Forums user PatrickD revealed a workaround to set a different block size for your local VMFS. Of course the question remains why you would want to do this rather than create a dedicated VMFS for your Service Console and one for your VMs. Anyway, it’s most definitely a great workaround. Thanks, Patrick, for sharing!

There isn’t an easy way of doing that right now. Given that a number of people have asked for it we’re looking at adding it in future versions.

If you want to do this now, the only way to do it is by mucking around with the installer internals (and knowing how to use vi). It’s not that difficult if you’re familiar with using a command line. Try these steps for changing it with a graphical installation:

  1. boot the ESX installation DVD in text mode
  2. switch to the shell (Alt-F2)
  3. ps | grep Xorg
  4. kill the PID which comes up with something like “Xorg -br -logfile …”. On my system this comes up as PID 590, so “kill 590”
  5. cd /usr/lib/vmware/weasel
  6. vi fsset.py
  7. scroll down to the part which says “class vmfs3FileSystem(FileSystemType):”
  8. edit the “blockSizeMB” parameter to the block size that you want. it will currently be set to ‘1’. the only values that will probably work are 1, 2, 4, and 8.
  9. save and exit the file
  10. cd /
  11. /bin/weasel

After that, run through the installer as you normally would. To check that it worked, after the installer has completed you can go back to a different terminal (try Ctrl-Alt-F3, since weasel is now running on tty2) and look through /var/log/weasel.log for the vmkfstools creation command.
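Another way to double-check afterwards is vmkfstools -Ph on the volume, which reports the block size directly. The snippet below parses a fabricated sample of that output; the real invocation is shown in the comment.

```shell
# Fabricated sample of what 'vmkfstools -Ph /vmfs/volumes/Storage1' prints.
sample='VMFS-3.33 file system spanning 1 partitions.
File system label (if any): Storage1
Mode: public
Capacity 59.8 GB, 44.3 GB available, file block size 4 MB'

# On the host: vmkfstools -Ph /vmfs/volumes/Storage1 | grep -o 'file block size.*'
echo "$sample" | grep -o 'file block size.*'
```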

Hope that helps.

Block sizes, think before you decide

I wrote about block sizes a couple of times already, but I had the same discussion twice over the last couple of weeks, at a customer site and on Twitter (@VirtualKenneth), so let’s recap. First, the three articles that started these discussions: vSphere VM Snapshots and block size, That’s why I love blogging… and Block sizes and growing your VMFS.

I think the key takeaways are:

  • Block sizes do not impact performance, neither large nor small, as the guest OS dictates the block sizes used.
  • Large block sizes do not increase storage overhead, as sub-blocks are used for small files. The sub-blocks are always 64KB.
  • With thin provisioning there are theoretically more locks when a thin disk is growing, but the locking mechanism has been vastly improved with vSphere, which means this can be neglected. A thin-provisioned VMDK on a VMFS volume with a 1MB block size grows in chunks of 1MB, and so on.
  • When separating OS from data it is important to select the same block size for both VMFS volumes, as otherwise it might be impossible to create snapshots.
  • When using a virtual RDM for data, the OS VMFS volume must have an appropriate block size. In other words, the maximum file size must be at least the size of the RDM.
  • When growing a VMFS volume there is no way to increase the block size, and you may need to grow the volume to grow the VMDK, which could possibly take it beyond the maximum file size.
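To put numbers on those last points: on VMFS-3 the maximum file size scales linearly with the block size, roughly 256 GB per 1 MB of block size (the 8 MB maximum is actually 2 TB minus 512 bytes). A quick sketch:

```shell
# VMFS-3 maximum file size per block size (approximate; the 8 MB case
# is really 2 TB minus 512 bytes).
max_file_gb() { echo $(( $1 * 256 )); }

for bs in 1 2 4 8; do
  printf '%d MB block size -> %d GB max file size\n' "$bs" "$(max_file_gb "$bs")"
done
```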

My recommendation would be to forget about the block size. Make your life easier and standardize, go big and make sure you have the flexibility you need now and in the future.

EMC Powerpath/VE

My colleague Lee Dilworth, SRM/BC-DR specialist, pointed me to an excellent whitepaper by EMC. This whitepaper describes the differences between Powerpath/VE and MRU, Fixed and Round Robin.

Key results:

  • Powerpath/VE provides superior load-balancing performance across multiple paths using FC or iSCSI.
  • Powerpath/VE seamlessly integrates and takes control of all device I/O, path selection, and failover without the need for additional configuration.
  • VMware NMP requires that certain configuration parameters be specified to achieve improved performance.
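As an example of that last point, switching a device to Round Robin in vSphere 4 is done per device with esxcli. The helper below only prints the command, since it has to run on the host; the device ID is a made-up placeholder.

```shell
# Print the esxcli command that sets a device's path selection policy
# to Round Robin (vSphere 4 syntax); run the output on the ESX host.
nmp_rr_cmd() {
  printf 'esxcli nmp device setpolicy --device %s --psp VMW_PSP_RR\n' "$1"
}

nmp_rr_cmd naa.60060160a0b1220041fa2a4bfb54de11
```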

I recommend reading the whitepaper to get a good understanding of where a customer would benefit from using EMC Powerpath/VE. The whitepaper gives a clear picture of the load-balancing capabilities of Powerpath/VE compared to MRU, Fixed and Round Robin. It also shows that there’s less manual configuration to be done when using Powerpath/VE, and as just revealed by Chad Sakac on Twitter, an integrated patching solution will be introduced with ESX/vCenter 4.0 Update 1!