Three weeks ago I announced the availability of the ebook of “Essential Virtual SAN”. Today I have the pleasure of informing you that the paper copy has also hit the streets and is being shipped by Amazon as of today. So for those who were holding off on ordering until the paper version was available… Go here, order it today, and have it in house by tomorrow! The book covers the architecture of Virtual SAN, operational and architectural gotchas, sizing guidance, design examples and much more. Just pick it up,
Working within R&D at VMware means you typically work with technology which is 1 – 2 years out, and discuss futures of products which are 2 – 3 years out. Especially in the storage space a lot has changed. Not just through innovations within the hypervisor by VMware like Storage DRS, Storage IO Control, VMFS-5, VM Storage Policies (SPBM), vSphere Flash Read Cache, Virtual SAN etc., but also through partners who offer software-based solutions like PernixData (FVP), Atlantis (ILIO) and SanDisk FlashSoft. Of course there is the whole Server SAN / Hyper-converged movement with Nutanix, ScaleIO, Pivot3, SimpliVity and others. Then there is the whole slew of new storage systems, some of which are scale-out and all-flash, while others focus more on simplicity; here we are talking about Nimble, Tintri, Pure Storage, XtremIO, Coho Data, SolidFire and many, many more.
Looking at it from my perspective, I would say there are multiple phases when it comes to the SDS journey:
- Phase 0 – Legacy storage with NFS / VMFS
- Phase 1 – Legacy storage with NFS / VMFS + Storage IO Control and Storage DRS
- Phase 2 – Hybrid solutions (Legacy storage + acceleration solutions or hybrid storage)
- Phase 3 – Object granular policy driven (scale out) storage
Maybe I should have abstracted a bit more:
- Phase 0 – Legacy storage
- Phase 1 – Legacy storage + basic hypervisor extensions
- Phase 2 – Hybrid solutions with hypervisor extensions
- Phase 3 – Fully hypervisor / OS integrated storage stack
I have written about Software Defined Storage multiple times in the last couple of years, and have worked with various solutions which are considered to be “Software Defined Storage”. I have a certain view of what the world looks like. However, when I talk to some of our customers, reality is different: some seem very happy with what they have in Phase 0. Although all of the above is the way of the future, and for some may be reality today, I do realise that Phase 1, 2 and 3 may be far away for many. I would like to invite all of you to share:
- Which phase you are in, and where you would like to go to?
- What you are struggling with most today that is driving you to look at new solutions?
I was reading the Virtual SAN Data Locality white paper. I think it is a well written paper, and really enjoyed it. I figured I would share the link with all of you and provide a short summary. (http://blogs.vmware.com/vsphere/files/2014/07/Understanding-Data-Locality-in-VMware-Virtual-SAN-Ver1.0.pdf)
The paper starts with an explanation of what data locality is (also referred to as “locality of reference”), and explains the different types of latency experienced in Server SAN solutions (network, SSD). It then explains how Virtual SAN caching works, how locality of reference is implemented within VSAN, and why VSAN does not move data around: the cost is high compared to the benefit. It also demonstrates how VSAN delivers consistent performance, even without a local read cache. The key words here are consistent performance, something that is not the case for all Server SAN solutions. In some cases, a significant performance degradation is experienced for minutes after a workload has been migrated. As hopefully all of you know, vSphere DRS runs every 5 minutes by default, which means that migrations can and will happen several times a day in most environments. (I have seen environments where 30 migrations a day was not uncommon.) The paper then explains where and when data locality can be beneficial, primarily when RAM is used and with specific use cases (like View), and then explains how CBRC aka View Accelerator (an in-RAM deduplicated read cache) could be used for this purpose. (It does not explain in depth how other Server SAN solutions leverage RAM for local read caching, but I am sure those vendors will have more detailed posts on that, which are worth reading!)
Couple of real gems in this paper, which I will probably read a couple of times in the upcoming days!
I mentioned the new disk IO scheduler in vSphere 5.5 yesterday. When discussing this new disk IO scheduler, one thing that was brought to my attention is a caveat around disk limits. Let's get started by saying that disk limits are a function of the host local disk scheduler and not, I repeat, not Storage IO Control. This is a mistake many people make.
Now, when setting a limit on a virtual disk you define a limit in IOPS. The IOPS specified is the maximum number of IOPS the virtual machine can drive. The caveat is as follows: the IOPS limit takes the IO size into account. (It does this because a 64KB IO has a different cost than a 4KB IO.) The calculation is in multiples of 32KB. Note that if you do a 4KB IO it is counted as one IO, however if you do a 64KB IO it is counted as two IOs. Any IO larger than 32KB will be 2 IOs at a minimum as it is rounded up. In other words, a 40KB IO would be 2 IOs and not 1.25 IOs. This also implies that there could be an unexpected result when you have an application doing relatively large block size IOs. If you set a limit of 100 IOPS but your app is doing 64KB IOs, then you will see your VM being limited to 50 IOPS as each 64KB IO will count as 2 IOs instead of 1. So the formula here is: ceil(IO Size in KB / 32).
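To make the arithmetic a bit more tangible, here is a minimal Python sketch of the rounding described above. It is only my own illustration of the formula, not anything taken from the scheduler itself, and the function names `io_cost` and `effective_iops` are made up for this example.

```python
import math

# Cost unit from the post: IOs are counted in multiples of 32KB.
NORMALIZATION_KB = 32


def io_cost(io_size_kb: float) -> int:
    """How many IOs a single IO of this size counts as against the limit.

    Hypothetical helper: it simply applies ceil(IO size in KB / 32).
    """
    if io_size_kb <= 0:
        raise ValueError("IO size must be positive")
    return math.ceil(io_size_kb / NORMALIZATION_KB)


def effective_iops(limit_iops: float, io_size_kb: float) -> float:
    """IOPS the application would actually observe under a given limit,
    assuming every IO it issues has the same size."""
    return limit_iops / io_cost(io_size_kb)


if __name__ == "__main__":
    for size_kb in (4, 32, 40, 64, 128):
        print(f"{size_kb}KB IO counts as {io_cost(size_kb)} IO(s); "
              f"a limit of 100 IOPS allows {effective_iops(100, size_kb):.0f} per second")
```

Running it shows the 64KB case from the example: each IO counts double, so a limit of 100 ends up as 50 IOs per second as seen by the application. Real workloads mix IO sizes, so treat the uniform-size assumption as a rough guide only.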
I think that is useful to know when you are limiting your virtual machines, especially because this is a change in behaviour compared to vSphere 5.1.
When 5.1 was released I noticed the mention of “mClock” in the advanced settings of a vSphere host. I tried enabling it but failed miserably. A couple of weeks back I noticed the same advanced setting again, but this time also noticed it was enabled. So what is this mClock thingie? Well mClock is the new disk IO scheduler used in vSphere 5.5. There isn’t much detail on mClock by itself other than an academic paper by Ajay Gulati.
The paper describes in depth why mClock was designed and developed: primarily to provide a better IO scheduling mechanism that allows for limits, shares and, yes, also reservations. The paper also describes some interesting details around how different IO sizes and latency are taken into account. I recommend that anyone who likes reading brain-hurting material take a look at it. I am also digging internally for some more human-readable material; if I find out more I will let you guys know!