Service Console redundancy
Over the last couple of weeks, more blog posts and forum topics have appeared about the warning VirtualCenter gives when there’s no service console redundancy. Several people posted a workaround to clear this warning. The workaround is very easy: temporarily assign an additional nic to the service console vSwitch and reconfigure HA. Notice that I wrote “workaround”, because I definitely don’t see this as a solution to the problem. With the current technology there’s not much reason not to have a redundant service console in my opinion, especially when you are using HA. I know a nic hardly ever breaks, but in this case probably more than 8 VMs rely on this nic, the physical switch and the network cable it’s attached to.
When I do VMware implementations, it depends on the customer and the hardware which of the following three options I use. All are supported by VMware and each has its own pros and cons:
- Option 1: vSwitch0 - 2 physical nics (vmnic0 & vmnic2) - 2 portgroups (Service Console & VMkernel)
  Service Console active on vmnic0 and standby on vmnic2
  VMkernel active on vmnic2 and standby on vmnic0
  Each portgroup has a VLAN assigned and runs dedicated on its own nic. Only in case of a fault is it switched over to the standby nic, and it returns to the original nic when the connection is up again. This is achieved by setting Rolling Failover to No. In 3.5 this setting is named “Failback” and has to be set to Yes.
  Pros: Only 2 nics are needed in total for the Service Console and VMkernel, which is especially handy in Blade environments.
  Cons: If the connection drops several times, the nic will fail over a lot, which can cause HA to kick in. The Failure Detection Time needs to be set to 30 seconds, as opposed to the 20 seconds in option 3. VLANs need to be set up.
- Option 2: vSwitch0 - 2 physical nics (vmnic0 & vmnic2) - 1 portgroup (Service Console)
  Service Console active on vmnic0 and vmnic2 with “virtual port id” load balancing.
  vSwitch1 - 2 physical nics (vmnic1 & vmnic3) - 1 portgroup (VMkernel)
  VMkernel active on vmnic1 and vmnic3 with “virtual port id” load balancing.
  Each portgroup can have a VLAN ID assigned, but you can also set up the VLANs on the side of the physical switch.
  Pros: When network engineers want to keep the VLAN configuration on the physical switch, that’s possible with this setup. You can set Rolling Failover to Yes (or Failback to No) so it will not start “flapping”. Portgroups are active on both nics to keep the switchover time as low as possible.
  Cons: Needs extra nics and is less flexible with VLANs if they are not tagged by VMware. Best practice is to set Failure Detection Time to at least 30 seconds.
- Option 3: vSwitch0 - 1 physical nic (vmnic0) - 1 portgroup (Service Console)
  Service Console active on vmnic0.
  vSwitch1 - 2 physical nics (vmnic1 & vmnic3) - 2 portgroups (VMkernel & Secondary Service Console)
  VMkernel active on vmnic1 and vmnic3 with “virtual port id” load balancing.
  Secondary Service Console active with an IP on the same subnet as the VMkernel, but on a different subnet than the primary Service Console.
  Pros: You can define a lower “failure detection time” because the secondary Service Console is already active and doesn’t need to kick in. Failure Detection Time can be set to 20 seconds. No Spanning Tree problems will occur for the Service Console because it has two vswifs, and therefore two MAC addresses.
  Cons: Need to set an extra isolation address, and the secondary Service Console needs to be in a different subnet, because if you use the same subnet as the primary Service Console both IP addresses would resolve to the same MAC. (See the Theether link below for more info on that one.)
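To make the three layouts easier to compare, here is a minimal Python sketch that models them as plain data. It is only an illustration, not a VMware API: the names `PortGroup`, `pick_uplink`, `OPTIONS` and `console_survives` are made up for this example, and the model assumes that a portgroup simply uses the first nic in its active/standby order that still has link (so “virtual port id” load balancing is not modelled). Option 3’s portgroups are assumed to keep the default Failback setting, since the post doesn’t specify one.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PortGroup:
    """Hypothetical model of a portgroup's nic teaming policy."""
    name: str
    active: tuple          # nics used while healthy, in preference order
    standby: tuple = ()    # nics used only when no active nic has link
    failback: bool = True  # 3.5 "Failback: Yes" == 3.0 "Rolling Failover: No"


def pick_uplink(pg: PortGroup, up: set, current: Optional[str] = None) -> Optional[str]:
    """Return the nic the portgroup would use, or None if every nic is down."""
    # Without failback, stay on the current nic as long as it still has link.
    if not pg.failback and current in up:
        return current
    for nic in pg.active + pg.standby:
        if nic in up:
            return nic
    return None


# The three layouts from the list above.
OPTIONS = {
    1: [PortGroup("Service Console", ("vmnic0",), ("vmnic2",)),
        PortGroup("VMkernel", ("vmnic2",), ("vmnic0",))],
    2: [PortGroup("Service Console", ("vmnic0", "vmnic2"), failback=False),
        PortGroup("VMkernel", ("vmnic1", "vmnic3"), failback=False)],
    3: [PortGroup("Service Console", ("vmnic0",)),
        PortGroup("VMkernel", ("vmnic1", "vmnic3")),
        PortGroup("Secondary Service Console", ("vmnic1", "vmnic3"))],
}


def console_survives(option: int, failed_nic: str) -> bool:
    """True if at least one Service Console portgroup keeps an uplink."""
    portgroups = OPTIONS[option]
    all_nics = {nic for pg in portgroups for nic in pg.active + pg.standby}
    up = all_nics - {failed_nic}
    return any(pick_uplink(pg, up) is not None
               for pg in portgroups if "Service Console" in pg.name)


# Option 1: vmnic0 fails and later comes back.
sc = OPTIONS[1][0]
after_failure = pick_uplink(sc, {"vmnic2"}, current="vmnic0")
after_recovery = pick_uplink(sc, {"vmnic0", "vmnic2"}, current=after_failure)
# Same recovery with Failback set to No (Rolling Failover: Yes).
no_failback = pick_uplink(replace(sc, failback=False),
                          {"vmnic0", "vmnic2"}, current=after_failure)
print(after_failure, after_recovery, no_failback)
```

In this model every option survives the loss of any single nic. In option 3 the primary Service Console does lose its only uplink when vmnic0 fails, and it is the secondary Service Console on vSwitch1 that keeps the host reachable. The last lines show the Failback difference: with Failback on, the Service Console returns to vmnic0 once it is up again, and with Failback off it stays on vmnic2.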
I’ve implemented option 2 a lot, but it’s very prone to physical switch errors and spanning tree problems. That made me reconsider, and I now think option 3 is the least error prone; in case of a failover, or when HA needs to kick in, it will do so within 20 seconds.
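The Failure Detection Time and the extra isolation address mentioned above are set as advanced options on the HA cluster. The sketch below collects the values per option in Python. The helper `ha_advanced_options` and the example address are hypothetical. The key `das.failuredetectiontime` takes a value in milliseconds. The key name `das.isolationaddress2` for the additional isolation address is an assumption here and should be checked against the KB article on isolation addresses linked below.

```python
# Failure Detection Time per option, in seconds, as given in the post.
FAILURE_DETECTION_SECONDS = {1: 30, 2: 30, 3: 20}


def ha_advanced_options(option, extra_isolation_address=None):
    """Build a dict of HA advanced options for one of the three layouts."""
    seconds = FAILURE_DETECTION_SECONDS[option]
    # das.failuredetectiontime is expressed in milliseconds.
    options = {"das.failuredetectiontime": str(seconds * 1000)}
    if option == 3:
        # Option 3 needs an isolation address that the secondary
        # Service Console can reach on its own subnet.
        if extra_isolation_address is None:
            raise ValueError("option 3 needs an extra isolation address")
        # ASSUMPTION: key name for the additional isolation address.
        options["das.isolationaddress2"] = extra_isolation_address
    return options


# 192.0.2.1 is a documentation-only address used as a placeholder.
print(ha_advanced_options(1))
print(ha_advanced_options(3, extra_isolation_address="192.0.2.1"))
```

Keeping the values in one place like this makes it harder to forget the isolation address when moving a cluster to option 3.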
For more info check out these links:
VMware KB Article on Redundant SC’s
VMware KB Article on Isolation Addresses
VMware KB Article on HA best practices
Theether article on a secondary Service Console
Help on vSwitch settings for 3.5




January 14th, 2008 at 20:03
Service Console redundancy …
Duncan Epping over at Yellow Bricks revamped his new website and it looks great. The Yellow Bricks website just became online recently but already contains a lot of informative articles. Today Duncan published an article about how to workaround servic…
January 21st, 2008 at 20:56
[…] vSwitch will also remove the warning. Duncan of Yellow Bricks also goes into more detail on Service Console redundancy on his blog as […]
January 28th, 2008 at 18:22
Need a little clarification on ‘Service Console redundancy’ in a Blade environment.
My Blades only have 2 NICs total. How can I set up this redundancy (VM Kernel + SC) and still get my VM Network on a vSwitch?
In reading through the solution above it appears to have SC + VM Kernel on one vSwitch having the physical NICs teamed to that…
Being somewhat new to VMware, I am not seeing how to set up my limited 2 physical NICs to get everything we need. Right now the only solution I see is to create two vSwitches, one with SC + VM Network and a second vSwitch with SC + VM Kernel?
Pardon my ignorance, - John
January 28th, 2008 at 18:46
Not ignorance… 2 nics, you probably have Dell Blades?
I would suggest bundling the 2 nics into 1 vSwitch and creating portgroups for the VMkernel, the Service Console and the different VLANs you need. I did the same setup about a year ago for a customer with Dell 1855 Blades. Works perfectly if and when VLANs are in place!
February 27th, 2008 at 19:40
Hi,
Reading the “Service Console redundancy, January 14th, 2008” article, its 3rd option is not so clear about how the Secondary Service Console should be set up regarding the nic configuration. Is it to be set as active on both nics, or active/passive?
Could you please shed some more light on this subject and describe this config in more detail?
Thank you in advance,
JNBP
Reply to: jpacheco@dre.pt