
Yellow Bricks

by Duncan Epping



How to remove a host from your Virtual SAN cluster

Duncan Epping · Jan 14, 2014 ·

The question “How to remove a host from your Virtual SAN cluster” has now popped up various times, so I figured I would write a short article on the current procedure. It is fairly straightforward, to be honest. Here we go:

  1. Place the host in maintenance mode
  2. Delete the disk group once maintenance mode has completed
  3. Move the host out of the cluster
  4. Remove the VSAN VMkernel interface (not a requirement, but I prefer to clean things up)

That is it; now you can repurpose the host for anything else.
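For those who prefer to script step 1, here is a minimal pyVmomi sketch. The vCenter address, credentials and host name are placeholders, and this assumes a recent pyVmomi; pick the VSAN decommission mode that matches your situation.

from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

# Placeholder connection details
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local", pwd="secret")
try:
    # Find the host we want to remove from the cluster
    view = si.content.viewManager.CreateContainerView(
        si.content.rootFolder, [vim.HostSystem], True)
    host = next(h for h in view.view if h.name == "esxi01.example.com")
    view.Destroy()

    # "evacuateAllData" migrates all VSAN components off the host before
    # the disk group is deleted; "ensureObjectAccessibility" would be
    # faster but leaves objects with reduced redundancy.
    spec = vim.host.MaintenanceSpec(
        vsanMode=vim.vsan.host.DecommissionMode(objectAction="evacuateAllData"))
    WaitForTask(host.EnterMaintenanceMode_Task(
        timeout=0, evacuatePoweredOffVms=True, maintenanceSpec=spec))
finally:
    Disconnect(si)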

How about an All Flash Virtual SAN?

Duncan Epping · Jan 10, 2014 ·

Yeah, that title got your attention, right… For now it is just me writing about it; nothing has been announced or promised. At VMworld I believe it was Intel who demonstrated the possibilities in this space: an All Flash Virtual SAN. A couple of weeks back, during my holiday, someone pointed me to a couple of articles about SSD endurance. Typically these types of articles deal with the upper end of the spectrum and as such are irrelevant to most of us, and some of the articles I have read in the past on endurance were, to be honest, disappointing.

TechReport.com, however, decided to look at consumer-grade SSDs. We are talking about SSDs like the Intel 335, Samsung 840 series, Kingston Hyper-X and the Corsair Neutron. All of the SSDs used had a capacity of around 250GB and are priced anywhere between $175 and $275. Now if you look at the guarantees given in terms of endurance, we are talking about anything ranging from “20GB of writes per day for the length of its three-year warranty” for the Intel (22TB in total) to 192TB in total over three years for the Kingston, with the other SSDs anywhere in between.
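Those totals follow directly from the per-day ratings; a quick back-of-the-envelope check of the Intel figure quoted above:

# Endurance quoted for the Intel 335: 20GB of writes per day
# over a three-year warranty.
gb_per_day = 20
years = 3
total_tb = gb_per_day * 365 * years / 1000
print(round(total_tb, 1))  # 21.9, which rounds to the quoted 22TB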

Tech Report set their first checkpoint at 22TB. After running through a series of tests, which are described in the article, they compared the results between the various SSDs after 22TB of writes. Great to see that all SSDs did what they were supposed to do and promised: all of them passed the 22TB mark without any issues. The next checkpoint, at the 200TB mark, showed the first signs of weakness; as expected, the lower-end SSDs dropped out first. At the 300TB checkpoint they also added an unpowered retention test to see how well the drives retain data when unplugged. So far impressive results, and a blog series I will follow with interest. The articles clearly show that from an endurance perspective the SSDs perform a lot better than most had assumed in past years. It is fair to say that consumer-grade SSDs are up to the challenge.

Considering the low price points of these flash devices, I can see how an All Flash Virtual SAN solution would be possible, leveraging these consumer-grade SSDs as the capacity tier (reads) and using enterprise-grade SSDs to provide write performance (write buffer). Hopefully we will see the capacity of these types of devices increase even further; today some of them go up to 500GB, others up to 800GB. Wouldn't it be nice to have a 1TB (or more) version?

Anyway, I am excited and definitely planning on running some tests with an all flash Virtual SAN solution in the future… What about you?

Updates on the TechReport endurance series:

  • 500TB blog update!
  • 600TB blog update!
  • 1PB blog update!
  • 2PB blog update!
  • Conclusion

How to calculate what your Virtual SAN datastore size should be

Duncan Epping · Jan 8, 2014 ·

I have had this question so many times that I figured I would write an article about it: how to calculate what your Virtual SAN datastore size should be. Ultimately this determines which kind of server hardware you can use, which disk controller you need and which disks… so it is important that you get it right. I know the VMware Technical Marketing team is developing collateral around this topic; when that has been published I will add a link here. Let's start with a quote by Christian Dickmann, one of our engineers, as it is the foundation of this article:

In Virtual SAN your whole cluster acts as a hot-spare

Personally I like to work top-down, meaning that I start with an average for virtual machines or a total combined number. Let's go through an example; it makes the exercise a bit easier to digest.

Let's assume the average VM disk size is 50GB, that on average the VMs have 4GB of memory provisioned, and that we have 100 virtual machines in total that we want to run on a 4-host cluster. Based on that info the formula would look something like this:

(total number of VMs * average VM size) + (total number of VMs * average VM memory size) = total capacity required

In our case that would be:

(100 * 50GB) + (100 * 4GB) = 5400 GB

So that is it? Well, not really. Like every storage / file system there is some overhead, and we will need to take the “failures to tolerate” into account. If I set my “failures to tolerate” to 1 then I have 2 copies of my VMs, which means I need 5400 GB * 2 = 10800 GB. Personally I also add an additional 10% in disk capacity to ensure we have room for things like metadata, log files, vmx files and some small snapshots when required. Note that VSAN by default provisions all VMDKs as thin objects (swap files are thick, as Cormac explained here), so there should be room available regardless. Better safe than sorry though. This means that 10800 GB actually becomes 11880 GB, which I prefer to round up to 12TB. The formula I have been using thus looks as follows:

(((Number of VMs * Avg VM size) + (Number of VMs * Avg mem size)) * (FTT + 1)) + 10%
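As a sketch, the formula translates into a few lines of Python, using the example numbers from this post:

def vsan_datastore_size_gb(num_vms, avg_vm_size_gb, avg_mem_gb, ftt=1, overhead=0.10):
    raw = num_vms * (avg_vm_size_gb + avg_mem_gb)  # disk plus memory (swap) capacity
    protected = raw * (ftt + 1)                    # one extra replica per failure to tolerate
    return protected * (1 + overhead)              # 10% for metadata, logs, vmx files, snapshots

print(vsan_datastore_size_gb(100, 50, 4))  # 11880.0 GB, which I round up to 12TB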

Now the next step is to see how you divide that across your hosts. I mentioned we would have 4 hosts in our cluster. We have two options: we create a cluster that can re-protect itself after a full host failure, or one that cannot. Just to clarify: in order to have 1 host of spare capacity available, we will need to divide the total capacity by 3 instead of 4. Let's look at those two options and what the impact is:

  • 12TB / 3 hosts = 4TB per host (for each of the 4 hosts)
    • Allows you to re-protect (sync/mirror) all virtual machine objects even when you lose a full host
    • All virtual machines will maintain their availability levels when doing maintenance
    • Requires an additional 1TB per host!
  • 12TB / 4 hosts = 3TB per host (for each of the 4 hosts)
    • If all disk space is consumed and a host fails, virtual machines cannot be “re-protected”, as there would be no capacity left to sync/mirror the objects again
    • When entering maintenance mode, data availability cannot be maintained, as there would be no room to sync/mirror the objects to another disk

Now if you look at the numbers, we are talking about an additional 1TB per host. With 4 hosts, and assuming we are using 2.5″ SAS 900GB Hitachi drives, that would be 4 additional drives at a cost of around $1000 per drive. When using 3.5″ SATA drives the cost would be even lower. Although this is just a number I found on the internet, it does illustrate that the cost of providing additional availability can be small. Prices will differ depending on the server brand used, but even at double the cost I would go for the additional drive and, as such, the additional “hot spare capacity”.

To make life a bit easier I created a calculator. I hope this helps everyone who is looking at configuring hosts for their Virtual SAN based infrastructure.

What happens in a VSAN cluster in the case of an SSD failure?

Duncan Epping · Dec 19, 2013 ·

The question that keeps coming up over and over again at VMUG events, on my blog and on the various forums is: what happens in a VSAN cluster in the case of an SSD failure? I answered the question in one of my blog posts on failure scenarios a while back, but figured I would write it down in a separate post considering people keep asking for it. It makes it a bit easier to point people to the answer, and also a bit easier to find the answer on Google. Let's sketch a situation first; what does (or will) the average VSAN environment look like?

In this case what you are looking at is:

  • 4 host cluster
  • Each host with 1 disk group
  • Each disk group has 1 SSD and 3 HDDs
  • Virtual machines running with a “failures to tolerate” of 1

As you hopefully know by now, a VSAN disk group can hold up to 7 HDDs and requires an SSD on top of that. The SSD is used as a read cache (70%) and a write buffer (30%) for the components stored in it. The SSD is literally the first location IO is stored, as depicted in the diagram above. So what happens when the SSD fails?

When the SSD fails, the whole disk group and all of its components will be reported as degraded or absent. The state (degraded vs absent) depends on the type of failure; typically, when an SSD fails, VSAN will recognize this, mark the disk group as degraded and as such instantly create new copies of your objects (disks, vmx files, etc.), as depicted in the diagram above.

From a design perspective it is good to realize the following (for the current release):

  • A disk group can only hold 1 SSD
  • A disk group can be seen as a failure domain
    • As such, there could be a benefit in creating 2 disk groups of 3 HDDs + 1 SSD each versus a single 6 HDD + 1 SSD disk group (see the sketch below)
  • SSD availability is critical, so select a reliable SSD! Yes, some consumer-grade SSDs deliver great performance, but they typically also burn out fast.
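To illustrate the failure-domain bullet with some made-up numbers (900GB HDDs, purely hypothetical), here is the difference in capacity you would need to re-protect after a single SSD failure:

hdd_gb = 900  # hypothetical drive size

# One big disk group: 6 HDDs behind a single SSD
single_group_impact = 6 * hdd_gb
# Two smaller disk groups: 3 HDDs behind each SSD
split_group_impact = 3 * hdd_gb

print(single_group_impact)  # 5400 GB affected when the one SSD fails
print(split_group_impact)   # 2700 GB affected when one of two SSDs fails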

Let it be clear that if you run with the default storage policy you are protecting yourself against 1 component failure. This means that 1 SSD can fail, or 1 host can fail, or 1 disk group can fail, without loss of data, and as mentioned, typically VSAN will quickly recreate the impacted objects on top of that.

That doesn't mean you should try to save money on reliability, if you ask me. If you are wondering which SSD to select for your VSAN environment, I recommend reading this post by Wade Holmes on the VMware vSphere Blog. Especially take note of the Endurance Requirements section! If I had to give a recommendation though, the Intel S3700 still seems to be the sweet spot when it comes to price / endurance / performance!

Re: VMware VSAN VS the simplicity of hyperconvergence

Duncan Epping · Dec 11, 2013 ·

I was reading this awesome article by “the other” Scott Lowe (that is how he calls himself on Twitter). I really enjoyed the article and think it is a pretty fair write-up, although I'm not sure I agree with some of the statements or conclusions drawn. Again, do not get me wrong… I really like the article and the effort Scott has put in, and I hope everyone takes the time to read it!

A couple of things I want to comment on:

VMware VSAN VS the simplicity of hyperconvergence

I guess I should start with the title… Just as for companies like SimpliVity (hey guys, congrats on winning the well-deserved award for best converged solution) and Nutanix, their software is the enabler of their hyper-converged solution. Virtual SAN could be that too, if you buy a certain type of hardware, of course.

Hyper-converged infrastructure takes an appliance-based approach to convergence using, in general, commodity x86-based hardware and internal storage rather than traditional storage array architectures. Hyper-converged appliances are purpose-built hardware devices.

The keyword in this sentence, if you ask me, is “purpose-built”. In most cases there is nothing purpose-built about the hardware. (Except for SimpliVity, as they use a purpose-built component for deduplication.) In May of 2011 I wrote about the HPC servers that SuperMicro was selling and how they could be a nice platform for virtualization; I even asked in my article which company would be the first to start using them in a different way. Funny, as I didn't know back then that Nutanix was planning on leveraging them, which is something I found out in August of 2011. The servers used by most of the hyper-converged players today are those HPC servers, and they are very much generic hardware devices. The magic is not the hardware being used; the magic is the software, if you ask me, and I am guessing vendors like Nutanix will agree with me on that.

Due to its VMware-centric nature and that fact that VSAN doesn’t present typical storage constructs, such as LUNs and volumes, some describe it as a VMDK storage server.

Not sure I agree with this statement. What I personally like about VSAN is that it does present a “typical storage construct”, namely a (Virtual SAN) datastore. From a UI point of view it just looks like a regular datastore. When you deploy a virtual machine, the only difference is that you will be picking a VM Storage Policy on top of that; other than that it is just business as usual. For users, there is nothing new or confusing about it!

As is the case in some hybrid storage systems, VSAN can accelerate the I/O operations destined for the hard disk tier, providing many of the benefits of flash storage without all of the costs. This kind of configuration is particularly well-suited for VDI scenarios with a high degree of duplication among virtual machines where the caching layer can provide maximum benefit. Further, in organizations that run many virtual machines with the same operating system, this breakdown can achieve similar performance goals. However, in organizations in which there isn’t much benefit from cached data — highly heterogeneous, very mixed workloads — the overall benefit would be much less.

VSAN can accelerate ANY type of I/O, if you ask me. It has a write buffer and a read cache. Depending on the size of your working set (active data), the size of the cache and the type of policy used, you should always benefit, regardless of the type of workload. From a write perspective, as mentioned, I/O will always go to the buffer; from a read perspective, your working set should be in cache. Of course there are always types of workloads where this will not apply, but for the majority it should.
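A toy model, with made-up latency numbers rather than VSAN measurements, shows why this holds: as long as the working set fits in the read cache, the average read latency stays close to flash latency regardless of the workload type.

SSD_READ_MS = 0.2  # assumed flash read latency
HDD_READ_MS = 8.0  # assumed magnetic disk read latency

def avg_read_latency_ms(cache_hit_rate):
    # Weighted average of cache hits (served from SSD) and misses (from HDD)
    return cache_hit_rate * SSD_READ_MS + (1 - cache_hit_rate) * HDD_READ_MS

for hit_rate in (0.99, 0.90, 0.50):
    print(hit_rate, round(avg_read_latency_ms(hit_rate), 2))  # 0.28, 0.98, 4.10 ms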

VSAN is very much a “build your own” approach to the storage layer and will, theoretically, work with any hardware on VMware Hardware Compatibility list. However, not every hardware combination is tested and validated. This will be one of the primary drawbacks to VSAN…

This is not entirely true. VMware is working on a program called Virtual SAN Ready Nodes. These Ready Nodes will be pre-configured, certified and tested configurations which are optimized for things like performance, capacity, etc. I haven't seen the final list yet, but I can imagine vendors like Dell and HP will want to list specific types of servers with a specific number of disks and specific SSD types to ensure an optimal user experience. So although VSAN is indeed a “bring your own hardware” solution, I think that is the great thing about it… you have the flexibility to use the hardware you want to use. No need to change your operational procedures because you are introducing a new type of hardware; just use what you are familiar with.

PS: I want to point out there are some technical inaccuracies in Scott’s post. I’ve pointed these out and am guessing they will be corrected soon.

