20% discount on the vSphere 4.0 Quick Start Guide

I just noticed that you can get a 20% discount if you order your vSphere Quick Start Guide via Lulu.com. The coupon code is “HOHOHO”. Merry Christmas to all of you indeed.

ESXi Lessons Learned 2 revised

I mentioned in ESXi Lessons Learned 2 that Jumbo Frames were not supported for the VMkernel. It turns out this is a mistake in the ESXi Config Guide, as mentioned in this blog article on the VMware ESXi Chronicles Blog. Charu had already contacted me about it, but we had to wait for the definitive green light before updating you guys.

Charu Chaubal:

I am happy to say that this is merely an error in the documentation.  In fact, ESXi 4.0 DOES support Jumbo Frames on VMkernel networking interfaces.  The correction will hopefully appear in a new release of the documentation, but in the meantime, go ahead and configure Jumbo frames for your ESXi 4.0 hosts.

Cool Tool Update: RVTools 2.7.3

Rob de Veij has just updated RVTools to version 2.7.3. This excellent tool now includes the following additional features and fixes:

Version 2.7.3 (December 19, 2009)

  • Files in .snapshot directories are no longer reported as zombies.
  • CTK files are no longer reported as zombies.
  • The problems with VM files which are placed in the root directory are now solved.
  • Under some condition the filter screen terminated with an exception. This is fixed now.
  • New fields on vDisk tab: ThinProvisioned and split.
  • New field on vTools tab: Virtual machine hardware version.

Read the full release notes/documentation here. And make sure to get this new version; in my opinion this is one of the most valuable tools out there.

Where should you get SRA’s from?

I received Michael White’s (VMware BCDR Specialist SE) weekly newsletter over the weekend, and the following is a question I also receive on a regular basis, so why not blog it?!

I had a disagreement with a friend about where to get SRA’s from. He was under the impression that we didn’t have the arrays in our premises for all of the SRA’s on the market and so it was OK to take an SRA from a vendor as they could test it.  The fact is we do have most, or all of  the arrays for each SRA in-house but that is actually not relevant.  It is important to only take SRA’s from the VMware web site for a different reason.  When a vendor finishes updating or writing an SRA, it is run against a special program that produces a log.  The SRA and log are sent to VMware and we check them out.  Sometimes they are sent back for improving or fixes.  This continues until the SRA passes and then it is posted on our web site.  If you took the SRA from the vendor you may accidentally get an SRA that in a week or a month we might decline and send back to be fixed.  So please, make sure you get the only safe copy of an SRA available, and that is from our web site!

vscsiStats output in esxtop format?

This week we (Frank Denneman and I) played around with vscsiStats. It’s a weird command and hard to get used to when you normally dive into esxtop when there are performance issues. While we were asking around for more info on the metrics and values, someone emailed us nfstop. I assumed it was under NDA, or at least not suitable for publication yet, but William Lam pointed me to a topic on the VMTN Communities which contains this great script. Definitely worth checking out. This tool parses the vscsiStats output into an esxtop format. Below is a screenshot of what that looks like:

vscsiStats

I was doing performance troubleshooting with Frank Denneman this week and we wanted to use “vscsiStats” to verify if there was any significant latency.

We checked multiple whitepapers before we went onsite, and our primary source was this excellent article by Scott Drummonds. After starting vscsiStats and receiving a “successful started” message, we waited for 15 minutes and checked whether we could see any data at all. Unfortunately we did not see anything. What was happening here? We checked the build/patch level and it was ESX 3.5 Update 4. Nothing out of the ordinary, I would say. After trying several VMs we still did not see anything with “vscsiStats -s -w <worldID>”. For some weird reason, contrary to what all the blog articles and Scott Drummonds state, we had to use the following command:

vscsiStats -s -t -w <worldID>

This might not be the case in most situations, but again, we had to add “-t” to capture any data. You can find the world ID of the VM whose performance you want to monitor by using the following command:

vscsiStats -l

After a couple of minutes you can verify if any data is being collected by using the following command:

vscsiStats -p all -w <worldID>

If you want to save your data in a CSV file to import it into Excel, use the following:

vscsiStats -p all -c -w <worldID> > /tmp/vmstats-<vmname>.csv

Don’t forget to stop the monitoring:

vscsiStats -x -w <worldID>
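
The sequence above can be kept in one place. What follows is a minimal Python sketch that only builds the command lines quoted in this post; it does not run anything. The function name, the use_t_flag parameter, and the example world ID and VM name are all made up for illustration, and the flags are exactly the ones shown above.

```python
def vscsistats_commands(world_id, vm_name, use_t_flag=True):
    """Build the vscsiStats command lines from this post, in order.

    Nothing is executed; the strings are meant to be pasted into the
    console. use_t_flag is a hypothetical switch for the extra "-t"
    we had to add before any data was captured.
    """
    start = "vscsiStats -s -t -w {0}" if use_t_flag else "vscsiStats -s -w {0}"
    return [
        "vscsiStats -l",                               # find the world ID of the VM
        start.format(world_id),                        # start collecting
        "vscsiStats -p all -w {0}".format(world_id),   # verify data is being collected
        "vscsiStats -p all -c -w {0} > /tmp/vmstats-{1}.csv".format(world_id, vm_name),  # save as CSV
        "vscsiStats -x -w {0}".format(world_id),       # stop the monitoring
    ]


# Example with an invented world ID and VM name.
for command in vscsistats_commands(12345, "myvm"):
    print(command)
```

Passing use_t_flag=False gives the start command as it appears in the articles we read, which is worth trying first since “-t” may not be needed in your environment.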

So what’s the outcome of all this? Well, with vscsiStats you can create great diagrams which, for instance, show the latency. This can be very useful in NFS environments, as esxtop does not show this info.

If you don’t want to do this by hand, check out this article by Gabe.

Cleaning up orphaned replicas in View

If, like me, you have been through all the versions of View Composer and the broker since its introduction, various bugs and broken recompositions will have left you with a large amount of detritus in your VMwareViewComposerReplicaFolder. That makes it hard to keep an eye on the proper operation of the Composer and, in my case, caused a datastore to run out of space and subsequent operations to fail. Time for a cleanup.

This is decently documented here, but how do you know which ones you can delete?

I don’t work, and have never worked, in the Composer team, so corrections and additions are welcome on the below, especially where I have marked (???). Observation of the tasks shows the process is as follows:

  1. Copy parent VM at certain snapshot to a new VM called temp-<ridiculousGUID> in the same place as the parent VM
  2. Delete that VM (??? Clearly something else is happening, but you watch the tasks)
  3. Copy that VM to each datastore and register as replica-<ridiculousGUID> (???)
  4. Create a linked clone off each replica in the same datastore, and register as source-<ridiculousGUID>
  5. For each VM, copy the source VM to a new directory and link it back to the replica

All well and good until this process breaks down and you’re left with the broken bodies of hapless VMs lying around. So you should have one source and one replica VM for each parent snapshot deployed in each datastore. The formula is

VMs in replica folder = <Num. parent snapshots in active use> x <Num. datastores> x 2

In my environment I have one parent VM snapshot in use by 40 VMs spread across 4 datastores. So:

1 snapshot x 4 datastores x 2 = 8 VMs in replica folder
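
As a quick sanity check, the same arithmetic can be written as a tiny Python function. This is only a sketch of the formula above; the function and parameter names are made up.

```python
def expected_replica_folder_vms(parent_snapshots_in_use, datastores):
    """Expected VM count in the replica folder.

    One source VM and one replica VM per parent snapshot in active
    use, per datastore, hence the factor of 2.
    """
    return parent_snapshots_in_use * datastores * 2


# My environment: 1 parent snapshot in use, 4 datastores.
print(expected_replica_folder_vms(1, 4))  # 8
```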

So I should have 8 in there. What do I actually have?

Where did those two temps come from?

Err, 10. Those two temp- VMs ought to have been deleted by the Composer. This is the view after I’d done a load of cleaning up. I originally had all sorts of dead source and replica VMs in there. How do I know which ones are actively in use and which can be deleted? A simple tip is to change the value of the Notes property of the parent VM, and redeploy your clones. Anything the Composer is still properly in charge of and not using will be deleted automatically. Anything else will be very visible. Look at that image again, and you’ll see that the two temp VMs have different date values in the Notes column. They are from a previous snapshot, and can be deleted. Follow the process in the link above to unprotect them, and then right-click and Delete from disk.
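
The Notes comparison can also be sketched in Python. This is only an illustration under loud assumptions: the VM names and Notes values below are invented, and the list is typed in by hand from the replica folder view rather than read from vCenter.

```python
def stale_vms(vms, current_notes):
    """Return the names of VMs whose Notes value differs from the current one.

    vms is a list of (name, notes) pairs copied by hand from the
    replica folder view; nothing here talks to vCenter.
    """
    return [name for name, notes in vms if notes != current_notes]


# Hypothetical inventory: names and Notes values are invented.
inventory = [
    ("replica-aaaa", "2009-12-20"),
    ("source-aaaa", "2009-12-20"),
    ("temp-bbbb", "2009-11-02"),
    ("temp-cccc", "2009-11-02"),
]

print(stale_vms(inventory, "2009-12-20"))
```

Anything this returns is only a candidate for deletion; check it against the tasks in vCenter before you unprotect and delete it.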

I have now deleted about 15 different source, replica and temp VMs in this way, and everything is still operating normally.

Ian
