ESX

Changing the IP-address of an ESX host and HA

Duncan Epping · Jun 4, 2008

On Monday evening a colleague changed the IP address of three VMware ESX hosts. He followed the standard VMware procedure, which usually works like a charm. In this case, HA stopped working after the IP address was changed. Disabling and re-enabling HA resulted in the following error: “Configuration of host IP address is inconsistent on host …”

After close inspection, the following error was found in /var/log/vmware/vpx-rupgrade.log:

VMwareerrortext=ft_gethostbyname and hostname -i return different addresses: 10.21.10.81, 10.21.5.12 and 10.21.1.21

The command “hostname -i” resulted in the following:

[root@bla-01 /var/log/vmware]# hostname -i
10.21.1.21

The command “ft_gethostbyname” returned the following:

[root@bla-01 /opt/vmware/aam/bin]# ./ft_gethostbyname
10.21.10.81 bla-01
10.21.5.12 bla-01

So hostname -i returned the new address, 10.21.1.21, while ft_gethostbyname still returned the old one, 10.21.10.81. The hosts file wasn’t the problem; the culprit was FT_HOSTS, which is automatically generated by the AAM client (High Availability):

[root@bla-01 /etc]# more FT_HOSTS
# Auto-generated FT_HOSTS file. Timestamp: Mon Jun 2 19:05:09 2008
10.21.10.81 bla-01
10.21.5.12 bla-01
10.21.10.82 bla-02
10.21.5.14 bla-02
10.21.10.83 bla-03
10.21.5.16 bla-03

So I moved FT_HOSTS to FT_HOSTS.BAK:

[root@bla-01 /etc]# mv FT_HOSTS FT_HOSTS.BAK

I then reconfigured the cluster for HA, and everything worked as expected again. FT_HOSTS was regenerated with the new addresses:

[root@bla-01 /etc]# more FT_HOSTS
# Auto-generated FT_HOSTS file. Timestamp: Wed Jun 4 10:39:52 2008
10.21.1.21 bla-01
10.21.5.12 bla-01
10.21.1.22 bla-02
10.21.5.14 bla-02
10.21.1.23 bla-03
10.21.5.16 bla-03

Deleting the cluster, removing the hosts from the cluster and/or reconfiguring HA never updated the FT_HOSTS file. I would expect every “Reconfigure for HA” action to update, or at least check, the FT_HOSTS file.
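
For anyone who wants to catch a stale entry before HA complains, here is a minimal sketch of a check that could be run on each host. It assumes FT_HOSTS keeps the simple “ip hostname” layout shown in the listings above. The function name ft_hosts_has_ip is a hypothetical helper, not a VMware tool.

```shell
#!/bin/sh
# ft_hosts_has_ip FILE HOST IP  (hypothetical helper, not part of ESX/AAM)
# Succeeds if FILE contains a line "IP HOST"; comment lines are skipped.
ft_hosts_has_ip() {
    file=$1
    host=$2
    ip=$3
    awk -v h="$host" -v ip="$ip" '
        /^#/ { next }                        # skip the auto-generated header
        $1 == ip && $2 == h { found = 1 }    # exact match on both columns
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Possible use on a host (assumes hostname -i prints a single address):
# ft_hosts_has_ip /etc/FT_HOSTS bla-01 "$(hostname -i)" || echo "FT_HOSTS is stale"
```

With the file from before the fix, the check for bla-01 and 10.21.1.21 would fail, which is exactly the mismatch the error message complained about.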

Good read: how many vm’s on 1 ESX host

Duncan Epping · May 25, 2008

Check out this topic on the VMTN forum by Gabrie. It’s a good read about how many VMs one would dare to run on an ESX host.

TexiWill:
This really depends. I know companies that are doing no more than a 10:1 or 20:1 compression, but there are other companies with 50+ VMs running on one box (at the time it was a DL760 with 8 CPUs and 64GB of memory). I do know that the max vCPUs you can put on a system is still 8 * pCores, and the largest box I have seen is the DL580G4 with 4 quad cores (16 cores) and 512GB of memory… So maximally 128 vCPUs…

Ken.Cline:
I make this decision based on a couple things:

* How important are the VMs in question?
  * If they’re truly “mission critical”, then I keep the number small – on the order of 10:1
  * If they’re “important”, then let’s look at 20:1
  * If they’re “who cares if they’re up”, then load ’em up!

* How large is the environment? I like to deploy a minimum of two hosts (three makes me happier)
  * 20 systems @ 2 hosts = 10:1, @ 3 hosts = 7:1
  * 100 systems @ 2 hosts = I wouldn’t do it, @ 3 hosts = 34:1
  * 1,000 systems – now you’re talking! @ 20 hosts = 50:1, @ 30 hosts = 34:1, @ 10 hosts = 100:1
  * 10,000 systems – you can bet I’m going to have a few hosts with 50 to 60 (or more) VMs and some hosts with 10 (or less) VMs!

So, there’s no single “right” answer (other than “it depends”)
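
The ratios quoted above are simply the number of VMs divided by the number of hosts, rounded up. A quick sketch in shell arithmetic reproduces them; the function name ratio is hypothetical and only there for illustration.

```shell
#!/bin/sh
# ratio VMS HOSTS -> prints VMs per host, rounded up (ceiling division)
ratio() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

ratio 20 3      # prints 7   (20 systems @ 3 hosts = 7:1)
ratio 100 3     # prints 34  (100 systems @ 3 hosts = 34:1)
ratio 1000 30   # prints 34  (1,000 systems @ 30 hosts = 34:1)

# TexiWill's ceiling: 8 vCPUs per core on a 16-core box
echo $(( 8 * 16 ))   # prints 128
```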

Site Recovery Manager, random thoughts

Duncan Epping · May 14, 2008

I’ve been disconnected from the internet the last couple of days because of a UPC screw up. So I missed out on all the SRM blogging. I am just reading all the new blogs that were created over the last couple of days. Here are just a couple random thoughts…

The SRM docs talk about a “protected” and a “recovery” site. Does this mean that SRM always needs an “active/passive” setup? In other words, can I only use one SAN for production?
Why do I ask? Well, I could imagine that one would benefit from some sort of load balancing with 2 VirtualDatacenters and 2 active SANs. Not only would uptime increase, because only half of the environment has to fail over when a site fails, but performance would also be less of an issue with half of your active VMs running on another SAN and cluster. And I’m not even going to talk about TCO when half of your environment is doing nothing.
What about the VirtualCenter licensing? Do I need 2 VirtualCenter licenses?
What about failing back? I played around with a beta and there’s no automated option for a failback at this moment. When will it be available?
What about a third site? Think about switching datacenters every week or so for the insurance / banking industry. (Think Anti-Terrorism.)

VCB and Solaris 32 Bit VM’s

Duncan Epping · Apr 29, 2008

One of my readers just emailed me the following. Thanks again for this info, which might be useful to any of you out there playing with VCB:

Today with the help of VMware Support I solved a strange problem.
With all my Solaris 10 32-bit VMs I was getting an error when I tried to back them up via VCB. Creating a snapshot of the VM failed with “Creating a quiesced snapshot failed because the (user-supplied) custom pre-freeze script in the virtual machine exited with a non-zero return code”. But there is no pre or post script in any of the VMs.

So as you know, no snapshot means no backup of this VM. I monitored hostd on the host where the VM was running. There I saw this message: “Could not run custom freeze/thaw operation: Insufficient permissions in guest operating system”.

VMware support told me that there is a problem within the VMTools in the Solaris VMs. They know about this problem (I didn’t find anything about it on the internet) and will solve it in a future patch.

For now, the only way is to use the “-Q 0” switch with the vcbmounter command. This way VCB will ignore any pre or post scripts.

Christoph P.

So in short, -Q 0 disregards any pre or post scripts. Thanks Christoph for contributing to my blog!

Pegasus error after installing ESX 3.5 update 1

Duncan Epping · Apr 28, 2008

After installing ESX 3.5 update 1 an error occurs during the boot process:

Parsing error: parse error: Error adding class VMware_IdentityMemberOfCollection to the repository: CIM_ERR_NOT_FOUND: The requested object could not be found: “VMware_Identity”
Compiling omc-smash-interop-schema.mof into root/PG_Interop

A quick search on the VMTN forum revealed that I wasn’t the only one experiencing these problems. Luckily Mike Laspina already discovered how to fix this problem:

Here is what you will need to do.

Edit the roleauth-schema compiler directive to include the VMware_Identity class definition using

nano /var/pegasus/vmware/install_queue/3_files/mofs/root/PG_Interop/roleauth-schema.mof

Add the VMware_Identity line (the first line below) above the pre-existing member directive.

#pragma include ("VMware_Identity.mof")
#pragma include ("VMware_IdentityMemberOfCollection.mof")

It also needs to be added in the standard cimv2 path.

nano /var/pegasus/vmware/install_queue/3_files/mofs/root/cimv2/roleauth-schema.mof

#pragma include ("VMware_Identity.mof")
#pragma include ("VMware_IdentityMemberOfCollection.mof")

Copy the missing file from the standard cimv2 path to the shared path.

cp /var/pegasus/vmware/install_queue/3_files/mofs/root/cimv2/VMware_Identity.mof /var/pegasus/vmware/install_queue/3_files/mofs/root/PG_Interop/

Stop and start the service with these commands.

/etc/init.d/pegasus stop
/etc/init.d/pegasus start
Once the script completes, the install_queues will be empty and the service will start much more quickly.
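
The manual edit above could be scripted along these lines. This is only a sketch: the helper name add_identity_include is hypothetical, it assumes the .mof file uses plain double quotes in its #pragma lines, and it assumes mktemp and awk are available in the service console. Try it on a copy of the file first.

```shell
#!/bin/sh
# add_identity_include FILE  (hypothetical helper)
# Inserts the VMware_Identity include directly above the first existing
# VMware_IdentityMemberOfCollection include. Does nothing if the
# VMware_Identity include is already present, so it is safe to re-run.
add_identity_include() {
    file=$1
    if grep -q '"VMware_Identity\.mof"' "$file"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    awk '
        /VMware_IdentityMemberOfCollection\.mof/ && !inserted {
            print "#pragma include (\"VMware_Identity.mof\")"
            inserted = 1
        }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the original file mode
    status=$?
    rm -f "$tmp"
    return $status
}

# Possible use, with the two paths named in the steps above:
# add_identity_include /var/pegasus/vmware/install_queue/3_files/mofs/root/PG_Interop/roleauth-schema.mof
# add_identity_include /var/pegasus/vmware/install_queue/3_files/mofs/root/cimv2/roleauth-schema.mof
```

The copy of VMware_Identity.mof and the Pegasus stop/start still have to be done as described in the steps above.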

According to the user mjilin, VMware support is also aware of this issue and it will be addressed soon:

Dear ESX users,

Thanks for your timely feedback regarding upgrading to ESX/ESXi 3.5 Update 1.

As one user correctly pointed out, we use Pegasus to provide system management information, which third-party vendors can incorporate into their management applications.

We have identified the root cause of the issue and will provide fixes in an upcoming patch release. More information can be found in the Knowledge Base article 1004257.

Thanks for your information sharing in the community forum and keeping the discussion lively. We appreciate your support and feedback.

Best regards,

VMware ESX Product Team