Why Queue Depth matters!

A while ago I wrote an article about the queue depth of certain disk controllers, harvested some of the values, and posted them. William Lam did a “one up” this week and posted a script that gathers the info, which can then be submitted to a Google Docs spreadsheet. Brilliant if you ask me. (PLEASE run the script and let’s fill up the spreadsheet!!) But some of you may still wonder why this matters… (For those who missed it: read about the troubles one customer had with a low-end, shallow queue depth disk controller, and Chuck’s take on it here.) Considering the different layers of queuing involved, it probably makes most sense to show the picture from the virtual machine down to the device.

[Diagram: the queuing layers, from the virtual machine down to the device]

In this picture there are at least 6 different layers at which some form of queuing is done. Within the guest there is the vSCSI adapter, which has a queue. The next layer is the VMkernel/VSAN, which of course has its own queue and manages the IO that is pushed through the MPP (the multi-pathing layer) to the various devices on a host. At the next level the disk controller has a queue, and potentially (depending on the controller used) each disk controller port has its own queue. Last but not least, each device (i.e. a disk) has a queue of its own. Note that this is even a simplified diagram.
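
If you want to see some of these queues on a live host, esxtop exposes a few of them. A rough mapping against the 5.x esxtop views (field names may differ slightly between versions):

    # run esxtop in the ESXi shell, then switch views:
    esxtop
    #   d - disk adapter view: AQLEN shows the adapter (controller) queue depth
    #   u - disk device view:  DQLEN shows the per-device queue depth
    #   v - disk VM view:      per-VM outstanding and queued IO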

If you look closely at the picture you can see that the IO of many virtual machines all flows through the same disk controller, and that this IO goes to or comes from one or multiple devices. (Typically multiple devices.) Realistically, what are the potential choking points?

  1. Disk Controller queue
  2. Port queue
  3. Device queue

Let’s assume you have 4 disks; these are SATA disks, each with a queue depth of 32. Combined, this means you can handle 128 IOs in parallel. Now what if your disk controller can only handle 64? This will result in 64 IOs being held back by the VMkernel / VSAN. As you can see, it would be beneficial in this scenario to ensure that your disk controller queue can hold the same number of IOs (or more) as your device queues can hold.
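
If you want to check what your own devices report, the ESXi shell will show it. A minimal sketch, assuming a 5.x host (the grep simply filters the device name and the queue depth field out of the output):

    # list every device together with its maximum queue depth
    esxcli storage core device list | grep -E "^(naa|mpx|t10|eui)|Device Max Queue Depth"
    # add up the device queue depths and compare the total with your controller's queue depth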

When it comes to disk controllers there is a huge difference in maximum queue depth between vendors, and even between models from the same vendor. Let’s look at some extreme examples:

HP Smart Array P420i - 1020
Intel C602 AHCI (Patsburg) - 31 (per port)
LSI 2008 - 25
LSI 2308 - 600

For VSAN it is recommended to ensure that the disk controller has a queue depth of at least 256, but go higher if possible. As you can see in the example there are various ranges, but for most LSI controllers the queue depth is 600 or higher. Now the disk controller is just one part of the equation, as there is also the device queue. As I listed in my other post, a RAID device for LSI for instance has a default queue depth of 128, while a SAS device has 254 and a SATA device has 32. The one that stands out the most is the SATA device: with a queue depth of only 32, you can imagine it once again becoming a “choking point”. Fortunately, the shallow queue depth of SATA can easily be overcome by using NL-SAS (nearline serial attached SCSI) drives instead. NL-SAS drives are essentially SATA drives with a SAS connector and come with the following benefits:

  • Dual ports allowing redundant paths
  • Full SCSI command set
  • Faster interface compared to SATA, up to 20%
  • Larger (deeper) command queue depth

So what about the cost then? From a cost perspective the difference between NL-SAS and SATA is negligible for most vendors. For a 4TB drive the difference at the time of writing, across different websites, was on average around $30. I think it is safe to say that for ANY environment NL-SAS is the way to go and SATA should be avoided when possible.

In other words, when it comes to queue depth: spend a couple of extra bucks and go big… you don’t want to choke your own environment to death!

vCenter Availability and Performance survey

One of the VMware product managers asked me to share this with you guys and to ask you to please take the time to fill out this vCenter Server availability and performance survey. It is a very in-depth survey which should help the engineering team and the product management team make the right decisions when it comes to scalability and availability. So if you feel vCenter scalability and availability is important, take the time to fill it out!

tinyurl.com/VCPerf

Automation is scary!

Over the last couple of months I noticed something which led me to believe that automation is scary… I am sure that my friends William Lam, Luc Dekens and Alan Renouf are standing on their desks yelling at me right now, but once they have finished reading this post I am sure they will agree.

I have been blogging for a while, and when I started blogging, more than 6 years back, I posted a couple of simple scripts I wrote. The scripts allowed you to do simple things like checking the available disk space, finding snapshots, committing snapshots, finding unregistered VMs, cleaning up unregistered VMs, etc. Basically the kind of scripts I used when I was a consultant to clean up the mess I typically found a couple of months after a new environment was deployed.

Many people downloaded these scripts and ran them in their environments. Back then I didn’t think much of it, to be honest… I mean, I wrote some simple scripts and people were using them to their advantage, right? Well… yes and no. Yes, I did write scripts, yes, they were using them to their advantage, and yes, they were simple for sure. (I am not a scripting god.) So what is the problem then?

The problem is simple: 6+ years after I wrote those scripts I still receive questions about how to use them. Now note this: those scripts were written when ESX was still the core hypervisor to use. Those scripts were written to run within the service console. Those scripts were written for ESX version 2.5.x and 3.x. Today we use ESXi and the majority of folks will be on version 5.x. More than 6 years have passed, but people are still downloading them, putting them in places where they don’t belong, and trying to run them, without thinking about it.

That is why I think Automation is Scary! (Yes, in this case the capital S is warranted.) If you are looking to automate operational procedures or tasks (automation is NOT a replacement for proper operational procedures, by the way!), or to create or re-use (simple) reporting scripts or scripts which execute simple tasks for you, make sure you do the following (a simple read-only example follows the list):

  1. Read the script, and make sure you understand every line of code
  2. Test the script in a test environment
  3. Check for which version the script was written and validate it against your version
  4. Don’t trust ANYONE blindly, and when changes are introduced in your environment or to the script go to step 1 again!
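
To give a trivial example of what steps 1 and 2 mean in practice, below is a minimal, read-only sketch (ESXi shell, nothing gets modified or deleted) of the kind of check my old “unregistered VM” scripts did. Compare the two lists yourself before you let any script act on the result:

    # what does the host think is registered?
    vim-cmd vmsvc/getallvms
    # which .vmx files actually exist on the datastores?
    find /vmfs/volumes/ -name "*.vmx"
    # compare the two lists manually before any cleanup script is allowed to touch them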

Automation isn’t really scary of course; automation can make your life easier. Just make sure you don’t become that person who has to restore 80 VMs because the script that was cleaning up unregistered VMs unintentionally deleted production VMs… Make sure you understand what your scripts are doing and assess the potential risk.

Looking back: Software Defined Storage…

Over a year ago I wrote an article (multiple actually) about Software Defined Storage, VSAs and different types of solutions and how flash impacts the world. One of the articles contained a diagram and I would like to pull that up for this article. The diagram below is what I used to explain how I see a potential software defined storage solution. Of course I am severely biased as a VMware employee, and I fully understand there are various scenarios here.

As I explained, the type of storage connected to this layer could be anything: DAS/NFS/iSCSI/block, who cares… The key thing here is that there is a platform sitting in between your storage devices and your workloads. All your storage resources would be aggregated into a large pool, and the layer should sort things out for you based on the policies defined for the workloads running there. Now, I drew this layer coupled with the “hypervisor”, but that’s just because that is the world I live in.

Looking back at that article and looking at the state of the industry today, a couple of things stand out. First and foremost, the term “Software Defined Storage” has been abused by everyone and doesn’t mean much to me personally anymore. If someone says during a bloggers briefing “we have a software defined storage solution”, I will typically ask them to define it, or explain what it means to them. Anyway, why did I show that diagram? Well, mainly because I realised over the last couple of weeks that a couple of companies/products are heading down this path.

If you look at the diagram and, for instance, think about VMware’s own Virtual SAN product, then you can see what would be possible. I would even argue that technically a lot of it would be possible today; the product is still lacking in some of these areas (data services), but I expect that to be a matter of time. Virtual SAN sits right in the middle of the hypervisor, the API and policy engine are provided by the vSphere layer, and it has its own caching service… For now connecting SAN storage isn’t supported, but if I wanted to I could do it even today, simply by tagging “LUNs” as local disks.
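
For the record, that tagging trick roughly looks like the sketch below (unsupported, and the device identifier is just a placeholder): it claims a SAN LUN with the local SATP so that VSAN sees it as a local disk.

    # unsupported example: mark a SAN LUN as "local" so VSAN will consume it
    # (naa.60060160deadbeef is a placeholder for your device identifier)
    esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device=naa.60060160deadbeef --option="enable_local"
    esxcli storage core claiming reclaim --device=naa.60060160deadbeef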

Another product which comes to mind when looking at the diagram is PernixData’s FVP. PernixData managed to build a framework that sits in the hypervisor, in the data path of the VMs. They provide a highly resilient caching layer, and will be able to do both flash and memory caching in the near future. With the upcoming release they will also support different types of connected storage… If you ask me, they should be in the right position to slap additional data services like deduplication / compression / encryption / replication on top of it. I am just speculating here, and I don’t know the PernixData roadmap, so who knows…

Something completely different is EMC’s ViPR (read Chad’s excellent post on ViPR). Although it may not entirely fit the picture I drew, EMC is aiming to be that layer in between you and your storage devices: abstract it all for you, allow for a single API to ease automation, and do this “end to end”, including the storage networks in between. If they were to extend this to allow certain data services to sit in a different layer, they would pretty much be there.

Last but not least, Atlantis USX. Although Atlantis is a virtual appliance and as such a different implementation than Virtual SAN and FVP, they did manage to build a platform that basically does everything I mentioned in my original article. One thing it doesn’t directly solve is the management of the physical storage devices, but today neither does FVP or Virtual SAN (well, to a certain extent VSAN does…). I am confident that this will change when Virtual Volumes is introduced, as Atlantis should be able to leverage Virtual Volumes for those purposes.

Some may say, well, what about VMware’s Virsto? Indeed, Virsto would also fit the picture, but its end of availability was announced not too long ago. However, it has been hinted multiple times that Virsto technology will be integrated into other products over time.

Although by now “Software Defined Storage” is seen as a marketing buzzword-bingo term, the world of storage is definitely changing. The question, I guess, is: are you ready to change as well?

New beta of the vSphere 5.5 U1 Hardening Guide released

Mike Foley just announced the new release of the vSphere 5.5 U1 Hardening Guide. Note that it is still labeled a “beta” as Mike is still gathering feedback; the document should be finalized in the first week of June.

For those concerned about security, this is an absolute must-read! As always, before implementing ANY of these recommendations, make sure to try them on a test cluster and validate the expected functionality of both the vSphere platform and the virtual machines and applications running on top of it.

Nice work Mike!