
Yellow Bricks

by Duncan Epping



VMworld session on vSphere Metro Storage Cluster on YouTube!

Duncan Epping · Nov 16, 2013 ·

I didn't even realize this, but I just found out that the session Lee Dilworth and I did at VMworld on the subject of vSphere Metro Storage Clusters can actually be viewed for free on YouTube!

There are some more sessions up on YouTube, so make sure you have a look around!

VSAN performance: many low-capacity SAS drives vs. a few high-capacity SATA drives?

Duncan Epping · Nov 14, 2013 ·

Something that I have seen popping up multiple times now is the discussion around VSAN and spindles for performance. Someone mentioned on the community forums that they were going to buy 20 x 600GB SAS drives for each of the 3 hosts in their VSAN environment. These were 10K SAS disks, which obviously outperform 7200 RPM SATA drives. I figured I would do some math first:

  • Server with 20 x 600GB 10K SAS = $9,369.99 per host
  • Server with 3 x 4TB Nearline SAS = $4,026.91 per host

So that is about a 4300 dollar difference. Note that I did not spec out the full server; it was a base model without any additional memory etc., just to illustrate the performance vs. capacity point. Now, as mentioned, the 20 spindles would of course deliver additional performance, because after all you have more spindles and better performing spindles. So let's do the math on that one, taking some average numbers into account:

  • 20 x 10K RPM SAS with 140 IOps each = 2800 IOps
  • 3 x 7200 RPM NL-SAS with 80 IOps each = 240 IOps

That is a whopping 2560 IOps difference in total. That sounds like an awful lot, doesn't it? To a certain extent it is a lot, but will it really matter in the end? Well, the only correct answer here is: it depends.
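To make the comparison easy to tweak for your own drive counts and disk types, here is a minimal Python sketch of the raw spindle math. The per-disk IOps figures and per-host prices are simply the estimates used in this post, not measured or vendor-guaranteed numbers.

```python
# Raw spindle IOps for the two example configurations from this post.
# Per-disk IOps figures are rough averages, not measured values.
configs = {
    "20 x 600GB 10K SAS": {"disks": 20, "iops_per_disk": 140, "cost_per_host": 9369.99},
    "3 x 4TB NL-SAS":     {"disks": 3,  "iops_per_disk": 80,  "cost_per_host": 4026.91},
}

for name, c in configs.items():
    raw_iops = c["disks"] * c["iops_per_disk"]
    print(f"{name}: ~{raw_iops} IOps from spindles, ${c['cost_per_host']:,.2f} per host")

iops = [c["disks"] * c["iops_per_disk"] for c in configs.values()]
print(f"Difference in raw spindle IOps: ~{max(iops) - min(iops)}")
```

Keep in mind that this only looks at the magnetic disk layer; as described below, the SSD layer in front of it is what really drives VSAN performance.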

I mean, if we were talking about a regular RAID-based storage system it would be clear straight away… the 20 disks would win for sure. However, we are talking VSAN here, and VSAN leans heavily on SSD for performance: each disk group is fronted by an SSD, and that SSD is used for both read caching (70% of capacity) and write buffering (30% of capacity). This is illustrated in the diagram below.

The real question is: what is your expected IO pattern? Will most IO be served from read cache? Do you expect a high data change rate, and could de-staging therefore become problematic when backed by just 3 spindles? On top of that, how and when will data be de-staged? If data sits in the write buffer for a while, it could change 3 or 4 times before being de-staged, avoiding the need to hit the slow spindles at all. It all depends on your workload, your IO pattern, your particular use case. Looking at the difference in price, it makes sense to ask yourself what $4,300 could buy you.

For instance, it could buy you 3 x 400GB Intel S3700 SSDs, each capable of delivering 75k read IOps and 35k write IOps (~800 dollars per SSD). That is on top of what you need anyway: with the 20-disk server you would also still need to buy SSDs, and since the rule of thumb is roughly 10% of your disk capacity, you can see what either the savings or the performance benefits could be. In other words, you can double up on the cache without any additional cost compared to the 20-disk server.

Personally I would try to balance it a bit: I would go for higher capacity drives, but probably not all the way up to 4TB. It also depends on the server type you are buying: will it have 2.5″ or 3.5″ drive slots? How many drive slots will you have, and how many disks will you need to hit the capacity requirements? Are there any other requirements? This particular user, for instance, mentioned that he expected extremely high sustained IO and potentially daily full backups; as you can imagine, that could impact the number of spindles desired/required to meet performance expectations.
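As a quick illustration of that rule of thumb, here is a small Python sketch that sizes the flash layer at roughly 10% of the magnetic disk capacity and splits it into the 70% read cache / 30% write buffer ratio mentioned earlier. The 10% figure is just the rough guidance quoted above, so treat the output as a starting point rather than a hard requirement.

```python
# Rough flash sizing: ~10% of magnetic capacity, split 70/30 by VSAN
# into read cache and write buffer (ratios as described in this post).
def flash_sizing(num_disks, disk_capacity_gb, flash_ratio=0.10):
    hdd_gb = num_disks * disk_capacity_gb
    flash_gb = hdd_gb * flash_ratio
    return {
        "raw HDD capacity (GB)": hdd_gb,
        "suggested flash (GB)": flash_gb,
        "read cache (GB)": round(flash_gb * 0.7),
        "write buffer (GB)": round(flash_gb * 0.3),
    }

for label, disks, size_gb in [("20 x 600GB SAS", 20, 600), ("3 x 4TB NL-SAS", 3, 4000)]:
    print(label, flash_sizing(disks, size_gb))
```

Note that both example configurations end up around 12TB of raw magnetic capacity per host, so roughly 1.2TB of flash, which is exactly what 3 x 400GB S3700 drives would give you.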

The question remains, what should you do? To be fair, I cannot answer that question for you… I just wanted to show that these are all things one should think about before buying hardware.

Just a nice little fact: today a VSAN host can have 5 disk groups with 7 disks each, so 35 disks in total. With 32 hosts in a cluster that is 1120 disks… That is some nice capacity, right, with the 4TB disks that are available today?
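For those who like to see the back-of-the-envelope math, a tiny sketch of those maximums (raw capacity, so before any mirroring / failures-to-tolerate overhead is taken into account):

```python
# Maximums mentioned above: 5 disk groups of 7 disks per host, 32 hosts, 4TB drives.
disk_groups_per_host = 5
disks_per_group = 7
hosts = 32
disk_tb = 4

disks_per_host = disk_groups_per_host * disks_per_group      # 35 disks per host
disks_per_cluster = disks_per_host * hosts                    # 1120 disks per cluster
raw_capacity_tb = disks_per_cluster * disk_tb                 # raw TB, before replication

print(disks_per_host, disks_per_cluster, raw_capacity_tb)     # 35 1120 4480
```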

I also want to point out that a tool is being developed as we speak which will help you make certain decisions around hardware, cache sizing, etc. Hopefully more news on that soon.

** Update, as of 26/11/2013 the VSAN Beta Refresh allows for 7 disks in a disk group… **

 

VSAN and Network IO Control / VDS part 2

Duncan Epping · Nov 12, 2013 ·

About a week ago I wrote this article about VSAN and Network IO Control. I originally wrote a longer article that contained more options for configuring the network part, but decided to leave a section out for simplicity's sake. I figured that as more questions came in I would publish the rest of the content I had developed. I guess now is the time to do so.

In the configuration described below we will have two 10GbE uplinks teamed (often referred to as "etherchannel" or "link aggregation"). Due to the physical switch capabilities, the configuration of the virtual layer will be extremely simple. We will take the following recommended minimum bandwidth requirements into consideration for this scenario:

  • Management Network –> 1GbE
  • vMotion VMkernel –> 5GbE
  • Virtual Machine PG –> 2GbE
  • Virtual SAN VMkernel interface –> 10GbE

When the physical uplinks are teamed (Multi-Chassis Link Aggregation), the Distributed Switch load balancing mechanism is required to be configured as one of the following:

  1. IP-Hash
    or
  2. LACP

It is required to configure all portgroups and VMkernel interfaces on the same Distributed Switch using either LACP or IP-Hash, depending on the type of physical switch used. Please note that all uplinks should be part of the same etherchannel / LAG. Do not try to create anything fancy here, as a physically and virtually incorrectly configured team can and probably will lead to more downtime!

  • Management Network VMkernel interface = LACP / IP-Hash
  • vMotion VMkernel interface = LACP / IP-Hash
  • Virtual Machine Portgroup = LACP / IP-Hash
  • Virtual SAN VMkernel interface = LACP / IP-Hash

As various traffic types will share the same uplinks, we also want to make sure that no traffic type can push out other types of traffic during times of contention; for that we will use the Network IO Control shares mechanism.

For this exercise we will work under the assumption that only 1 physical port is available and all traffic types share that same physical port. Taking a worst-case scenario into consideration will guarantee performance even in a failure scenario. By taking this approach we can ensure that Virtual SAN always has 50% of the bandwidth at its disposal while leaving the remaining traffic types with sufficient bandwidth to avoid a potential self-inflicted DoS. When both uplinks are available this equates to 10GbE; when only one uplink is available the bandwidth is cut in half to 5GbE. It is recommended to configure shares for the traffic types as follows:

 

Traffic Type                      Shares    Limit
Management Network                20        n/a
vMotion VMkernel Interface        50        n/a
Virtual Machine Portgroup         30        n/a
Virtual SAN VMkernel Interface    100       n/a
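To make the effect of these share values concrete, here is a small Python sketch that translates them into worst-case bandwidth on a single surviving 10GbE uplink, assuming all four traffic types are contending at the same time (shares only come into play under contention; with idle neighbours a traffic type can consume more):

```python
# NIOC share values from the table above, applied to one 10GbE uplink
# in the worst case where all traffic types are active and contending.
shares = {
    "Management Network": 20,
    "vMotion VMkernel Interface": 50,
    "Virtual Machine Portgroup": 30,
    "Virtual SAN VMkernel Interface": 100,
}
uplink_gbe = 10
total_shares = sum(shares.values())  # 200

for traffic, share in shares.items():
    guaranteed_gbe = uplink_gbe * share / total_shares
    print(f"{traffic}: {share}/{total_shares} shares -> ~{guaranteed_gbe:.1f}GbE minimum")

# Virtual SAN gets 100/200 = 50% of the uplink, i.e. ~5GbE in this worst case,
# and ~10GbE when both uplinks in the LAG are healthy.
```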

 

The following diagram depicts this configuration scenario.

vSphere Metro Storage Cluster using Virtual SAN, can I do it?

Duncan Epping · Oct 31, 2013 ·

This question keeps coming up over and over again lately: vSphere Metro Storage Cluster (vMSC) using Virtual SAN, can I do it? (I guess the real question is "should you do it?".) It seems that Virtual SAN/VSAN is getting more and more traction, even though it is still in beta, and people are trying to come up with all sorts of interesting use cases. At VMworld various people asked during my sessions if they could use VSAN to implement a vMSC solution, and over the last couple of weeks this question just keeps coming up in emails etc.

I guess if you look at what VSAN is and does, it makes sense for people to ask this question. It is a distributed storage solution with a synchronous distributed caching layer that allows for high resiliency. You can specify the number of copies required of your data and VSAN will take care of the magic for you; if a component of your cluster fails, VSAN can respond to it accordingly. This is what you would like to see, I guess:

Now let it be clear, the above is what you would like to see in a stretched environment, but unfortunately it is not what VSAN can do in its current form. I guess if you look at the following it becomes clear why it might not be such a great idea to use VSAN for this use case at this point in time.

The problem here is:

  • Object placement: You will want that second mirror copy to be in Location B but you cannot control it today as you cannot define “failure domains” within VSAN at the moment.
  • Witness placement: Essentially you want the ability to place a witness on a 3rd site that functions as a tiebreaker when there is a partition / isolation event.
  • Support: No one has tested/certified VSAN over distance, in other words… not supported

For now, the answer to the question "can I use Virtual SAN to build a vSphere Metro Storage Cluster?" is: no, it is not supported to span a VSAN cluster over distance. The feedback and requests from many of you have been heard loud and clear by our developers and PM team… And at VMworld one of the developers already mentioned that he was intrigued by the use case and would be looking into it in the future. Of course, there was no mention of when this would happen or even if it would ever happen.

Virtual SAN and Network IO Control

Duncan Epping · Oct 29, 2013 ·

Since I started playing with Virtual SAN there was something that I more or less avoided / neglected, and that is Network IO Control. However, Virtual SAN and Network IO Control should go hand-in-hand (and as such the Distributed Switch). Note that when using VSAN (beta) the Distributed Switch and Network IO Control come with it. I guess I skipped it as there were more exciting things to talk about, but as more and more people are asking about it, I figured it is time to discuss Virtual SAN and Network IO Control. Before we get started, let's list the types of networks we will have within the VSAN cluster:

  • Management Network
  • vMotion Network
  • Virtual SAN Network
  • Virtual Machine Network

Considering it is recommended to use 10GbE with Virtual SAN, that is what I will assume in this blog post. In most of these cases, at least I would hope, there will be a form of redundancy, and as such we will have 2 x 10GbE at our disposal. So how would I recommend configuring the network?

Let's start with the various portgroups and VMkernel interfaces:

  • 1 x Management Network VMkernel interface
  • 1 x vMotion VMkernel interface (All interfaces need to be in the same subnet)
  • 1 x Virtual SAN VMkernel interface
  • 1 x Virtual Machine Portgroup

Some of you might be surprised that I have only listed the vMotion VMkernel interface and the Virtual SAN VMkernel interface once… After various discussions and some thought on this, I figured I would keep things as simple as possible, especially considering the average IO profile of server environments.

By default we can make sure the various traffic types are separated on different physical ports, but we can also set limits and shares when desired. I do not recommend using limits though: why limit a traffic type when you can use shares and "artificially limit" your traffic types based on resource usage and demand?! Also note that shares and limits are enforced per uplink.

So we will be using shares, as shares only come into play when there is contention. What we will do is take 20GbE into account and carve it up. The easiest way, if you ask me, is to say each traffic type gets a minimum number of GbE assigned, based on some of the recommendations out there for these types of traffic:

  • Management Network –> 1GbE
  • vMotion VMK –> 5GbE
  • Virtual Machine PG –> 2GbE
  • Virtual SAN VMkernel interface –> 10GbE

Now, as you can see, "management", "virtual machine" and "vMotion" traffic share Port 1 and "Virtual SAN" traffic uses Port 2. This way we have sufficient bandwidth for all the various types of traffic in a normal state. We also want to make sure that no traffic type can push out other types of traffic; for that we will use the Network IO Control shares mechanism.

Now let's look at it from a shares perspective. You will want to make sure that, for instance, vMotion and Virtual SAN always have sufficient bandwidth. I will work under the assumption that I only have 1 physical port available and all traffic types share the same physical port. We know this is not the case, but let's take a "worst case scenario" approach.

Let's take the share values below and assume a worst-case scenario where 1 physical 10GbE port has failed and only 1 is used for all traffic. By taking this approach you ensure that Virtual SAN always has 50% of the bandwidth at its disposal while leaving the remaining traffic types with sufficient bandwidth to avoid a potential self-inflicted DoS.

Traffic Type                      Shares    Limit
Management Network                20        n/a
vMotion VMkernel Interface        50        n/a
Virtual Machine Portgroup         30        n/a
Virtual SAN VMkernel Interface    100       n/a

You can imagine that when you select the uplinks used for the various types of traffic in a smart way, even more bandwidth can be leveraged by the various traffic types. After giving it some thought, this is what I would recommend per traffic type:

  • Management Network VMkernel interface = Explicit Fail-over order = P1 active / P2 standby
  • vMotion VMkernel interface = Explicit Fail-over order = P1 active / P2 standby
  • Virtual Machine Portgroup = Explicit Fail-over order = P1 active / P2 standby
  • Virtual SAN VMkernel interface = Explicit Fail-over order = P2 active / P1 standby

Why use Explicit Fail-over order for these types? The best explanation here is predictability. By separating traffic types we allow for optimal storage performance while also providing vMotion and virtual machine traffic sufficient bandwidth.
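To illustrate that predictability, here is a small Python sketch (using the per-traffic-type minimum bandwidth figures from earlier in this post) that shows which uplink each traffic type lands on when both uplinks are healthy, and how everything collapses onto the surviving uplink when one fails, which is exactly where the NIOC shares above take over:

```python
# Explicit failover layout recommended above: P1 active for management,
# vMotion and VM traffic; P2 active for Virtual SAN. Bandwidth figures
# are the per-traffic-type minimums listed earlier in this post.
active_uplink = {
    "Management Network": "P1",
    "vMotion": "P1",
    "Virtual Machine": "P1",
    "Virtual SAN": "P2",
}
min_bandwidth_gbe = {
    "Management Network": 1,
    "vMotion": 5,
    "Virtual Machine": 2,
    "Virtual SAN": 10,
}

def load_per_uplink(failed=None):
    """Sum the expected load per uplink; traffic fails over if its active uplink is down."""
    load = {}
    for traffic, uplink in active_uplink.items():
        target = uplink if uplink != failed else ("P1" if uplink == "P2" else "P2")
        load[target] = load.get(target, 0) + min_bandwidth_gbe[traffic]
    return load

print("both uplinks up:", load_per_uplink())      # {'P1': 8, 'P2': 10}
print("P2 failed:      ", load_per_uplink("P2"))  # {'P1': 18} -> NIOC shares arbitrate
```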

Also, vMotion traffic is bursty and can / will consume all available bandwidth, so when combined with Virtual SAN on the same uplink you can see how these two could potentially hurt each other, depending of course on the IO profile of your virtual machines and the type of operations being done. You can see how a vMotion of a virtual machine provisioned with a lot of memory can impact the available bandwidth for other traffic types. Don't ignore this: use Network IO Control!

Let's try to visualize things, as that makes it easier to digest. Just to be clear: dotted lines are "standby" and the others are "active".

[Diagram: Virtual SAN and Network IO Control]

I hope this provides some guidance on how to configure Virtual SAN and Network IO Control in a VSAN environment. Of course there are various ways of doing it; this is my recommendation and my attempt to keep things simple, based on experience with the products.
