Flash-based solid-state disks (SSDs) offer impressive performance capabilities and are all the rage these days. Rightly so? Let’s find out how you can assess the performance benefit of SSDs in your own datacenter before purchasing anything and without expensive, time-consuming and usually inaccurate proofs-of-concept.
** Please note: this article was written by Irfan Ahmad. Follow him on Twitter, make sure to attend his webinar on this topic on the 5th of June, and vote for CloudPhysics in the big data startup top 10. **
I was fortunate enough to have started the very first project at VMware that optimized ESX to take advantage of Flash and SSDs. Swap to Host Cache (aka Swap-to-SSD) shipped in vSphere 5. For those customers wanting to manage their DRAM spend, this feature can be a huge cost saving. It also continues to serve as a differentiator for vSphere against competitors.
Swap-to-SSD has the distinction of being the first VMware project to fully utilize the capabilities of Flash, but it is certainly not the only one. Since then, every established storage vendor has entered this area, not to mention a dozen awesome startups. Some have solutions that apply broadly to all compute infrastructures, while others have products designed specifically for the hypervisor platform.
The performance capabilities of Flash are indeed impressive. But it can cost a pretty penny. Marketing machines are in full force trying to convince you that you need a shiny new hardware or software solution. An important question remains: can the actual benefit keep up with the hype? The results are mixed and worth reading through.
I thought it’d be fun to look for a measurement of hype. I plotted Google Ngrams for the usage of the terms “hard disk drive” and “flash drive” in publications. The results were remarkable: there is an exponential increase in usage of the latter term alongside a slow decline of the former. Take it for what it’s worth 🙂
What is a virtualization team to do? For example, imagine the expense associated with buying $10K Flash PCI-e cards for read caching for all your servers. Is that cost justified? It could be justified if the performance benefit was clear. But the reality is that it is extremely difficult to predict whether there will be a significant performance benefit for any given VM.
Let’s continue with the server-side IO caching use case (in my presentation on the 5th of June, I’ll discuss several other use cases). My team was called in to help a customer in the automotive business with revenue in the multiple billions.
Case Study 1 (Automotive)—Disaster:
Customer Quick Facts:
- 4,000+ employees, $4B+ in revenue
- Large SSD caching project
- Feasibility study completed
- Proof-of-concept (POC) completed
- Production deployment completed
- No measurable benefit to production VMs!
The customer first did a time-consuming, detailed study to validate feasibility of deploying SSD in their datacenter. Once completed, they undertook an expensive proof of concept with their SSD cache vendor. After all of this, the customer experienced NO measurable benefit! While this came as a shock to both the customer and the vendor, the scenario is far too common.
Understanding the mismatch between a POC and reality is the key to avoiding such problems. It turns out the customer had plenty of areas where performance was hurting and could have been helped by IO caching, but they relied on experts to do back-of-the-envelope VM selection for the initial SSD rollout. The experts relied on application identity and their assumptions about workload characteristics to pick the VMs for the POC. Bingo! No surprise: the issue was with the assessment. They simply picked the wrong VMs. The operations team selected their top-tier application VMs for the SSD rollout. Except that, unbeknownst to them, the developers of that application tier, tired of bad performance, had already switched to a different architecture that changed the on-disk workload pattern. And you guessed it: that pattern did not speed up with caching.
In reality, the ops team should have selected the numerous other VMs backing the database store, which were still hurting badly. Instead, their POC blinded them to the fact that no benefit would come from their chosen VMs.
Aha! So different workloads respond differently to SSD IO caching in non-trivial ways. But argh! How do we figure that out? In the webinar I also provide more details on the types of workload characteristics that can affect performance.
To resolve this guessing game that is rampant in our industry, CloudPhysics engineers began discussing the idea of developing a card that would simulate the exact caching behavior of any VM without actually installing a caching solution. If we could do this, running the card against all the VMs in a datacenter would tell us which VMs would benefit and by how much. That sounded easy on paper, but pretty much everyone in the industry thought it was nearly impossible to reach that level of accuracy. Let it be known that CloudPhysics engineers aren’t ordinary engineers. Under the technical direction of Carl Waldspurger, the team nailed it (I’ll cover how we accomplished this amazing feat in another post).
Let this industry achievement sink in for a moment: we now have the capability to simulate the latency benefit to a VM of applying an SSD cache. Amazing predictive power.
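To make the idea concrete, here is a minimal sketch of how a cache simulation can turn a block I/O trace into a predicted hit ratio and latency. This is my illustration of the general technique, not the actual CloudPhysics implementation: it replays a trace through a simulated LRU cache of a candidate size, then weights assumed SSD and disk latencies by the hit and miss rates. The trace and latency numbers are made up purely for illustration.

```python
from collections import OrderedDict

def simulate_lru(trace, cache_blocks):
    """Replay a trace of logical block numbers through an LRU cache; return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    for lba in trace:
        if lba in cache:
            hits += 1
            cache.move_to_end(lba)         # refresh recency on a hit
        else:
            cache[lba] = True              # miss: bring the block into the cache
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(trace) if trace else 0.0

# Hypothetical trace and device latencies, purely for illustration:
trace = [1, 2, 3, 1, 2, 4, 1, 5, 2, 1]
hit_ratio = simulate_lru(trace, cache_blocks=3)

ssd_ms, disk_ms = 0.2, 8.0                 # assumed SSD and spinning-disk latencies
predicted_ms = hit_ratio * ssd_ms + (1 - hit_ratio) * disk_ms
print(f"hit ratio {hit_ratio:.0%}, predicted avg latency {predicted_ms:.2f} ms "
      f"(vs {disk_ms:.1f} ms uncached)")
```

Sweeping the cache size in a simulation like this gives you a hit-ratio curve per VM, which is what lets you compare candidate cache sizes before buying any hardware.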
Last year, we released the Caching Analytics Service card, which predicts how much any single VM, or all VMs running on a host, cluster or datacenter, could benefit from server-side IO caching. Naturally, we offered this to customers, and hundreds have already used it as a paid service. Let me share two results here; more details are covered in the webinar.
Case Study 2 (Finance)—the CloudPhysics Caching Assessment card predicted that 16% of VMs would see a latency reduction of 1.5x–4x
Case Study 3 (Education)—the CloudPhysics Caching Assessment card predicted that only 3% of VMs would see a latency reduction of greater than 1.5x
For both case studies, the customer saved a tremendous amount of money by targeting only the VMs that would actually benefit significantly. We’ve saved customers hundreds of thousands of dollars while delivering the intelligence to achieve superb performance improvements. Imagine being able to pinpoint exactly where to drop in SSD cards to extract the optimal performance for your VMs. Best of all, the assessment is completely transparent: you don’t have to install any agents, and it’s point-and-click easy. If you’d like to try out this service, please contact us.
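As a back-of-the-envelope check on what latency-reduction factors like those above imply, you can invert the same weighted-latency model to see what hit ratio a workload would need. The device latencies below are assumptions chosen for illustration, not numbers taken from these case studies.

```python
def required_hit_ratio(speedup, disk_ms=8.0, ssd_ms=0.2):
    """Hit ratio h such that h*ssd_ms + (1-h)*disk_ms == disk_ms / speedup."""
    return disk_ms * (1 - 1 / speedup) / (disk_ms - ssd_ms)

for s in (1.5, 4.0):
    print(f"a {s}x latency reduction needs roughly a {required_hit_ratio(s):.0%} hit ratio")
```

With these assumed latencies, a 1.5x reduction needs only about a 34% hit ratio while 4x needs closer to 77%, which is exactly why per-VM measurement matters: two workloads with similar IOPS can land on opposite sides of that gap.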
In conclusion, SSDs hold great promise, and you can find immediate benefit from them to resolve performance issues in your datacenter today. However, as we have shown, a blind rollout is wasteful, and the vast majority of the benefit can be captured with a detailed assessment. Sign up now for a free trial of all the various CloudPhysics cards.
Bio:
Irfan Ahmad is the CTO and co-founder of CloudPhysics. He was the lead engineer behind VMware’s flagship products Storage DRS and Storage I/O Control, and authored vscsiStats. His new company’s product takes the guesswork out of operations management, enabling you to model how your systems will behave by simulating different configurations. Irfan is leading a webinar this week delving deeper into the benefits and challenges of SSDs in virtualized datacenters.
Duncan says
As there is no TRIM support in VMFS-5, how does this work in the long term? I have a 480GB SSD in my 5.1U1 host, and I’m just finishing a two-day migration back to a non-SSD drive because once the drive is near full, latency goes to 800ms and everything crawls.
Irfan Ahmad says
What type of SSD drive is it?
Duncan says
A SanDisk_SDSSDX480GG25, which uses an SF-2281 controller that is supposed to do garbage collection, but days after moving half the guests off it’s still slow.
Unfortunate, but I guess I’ll just pass through a sata3 controller to the guest and run ext4 with the discard option.
David Mytton says
SSDs are great for high performance requirements. Memory is always going to be the fastest option but sometimes you have to degrade to disk, and when that happens the best option is using SSDs.
You can optimise cost versus performance by deploying different kinds of disks for different use cases. I benchmarked MongoDB performance on SSDs and spinning disks, and for databases you can do things like mounting different databases on different drives. This gives you quite fine-grained control over which use cases get access to the faster storage, while understanding the cost of doing so.
See http://blog.serverdensity.com/mongodb-benchmarks/
Irfan Ahmad says
David, good points. Assigning an entire database, application or VM to SSD is certainly one way to go about optimization. For critical workloads that makes sense, and the cost can easily be justified. For other workloads, it is enough to simply fit the application’s working set into an SSD cache, which is what we find customers doing most frequently. That is why CloudPhysics’ SSD analytics are initially focused on the caching use case: caching makes IO acceleration much more cost-effective and applicable to many more folks.
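For readers who want a feel for the working-set point, here is a rough sketch that estimates a workload’s working-set size from a trace by counting the unique blocks it touches over an observation window. This is an illustration of the general idea under assumed numbers, not the CloudPhysics analytics themselves.

```python
def working_set_bytes(trace_lbas, block_size=4096):
    """Approximate working-set size: unique blocks touched times the block size."""
    return len(set(trace_lbas)) * block_size

# Hypothetical trace of 4 KB logical block addresses:
trace = [10, 11, 10, 12, 500, 10, 11, 501]
print(f"working set ~{working_set_bytes(trace) // 1024} KiB "
      f"({len(set(trace))} unique blocks)")
```

On a real trace you would compute this per time window; if the working set fits comfortably in the candidate cache, caching is likely to help, and if it does not, a hit-ratio simulation like the one earlier in the article becomes the more useful tool.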
Ravi says
Duncan brings up a great point: people are misled into believing that SSDs in servers are a great way to speed up server applications. I recently spoke with a customer who had local SSD drives in their servers; a drive crashed, sending their entire DB environment for a toss and causing a major outage of several hours.
I am not trying to pitch one solution over the other, but customers need to be aware of the different options and where flash storage fits the bill.
Two points I want to make:
1. Storage vendors are building SAN-based devices for this very reason: reliability and DR.
2. Flash is a new medium, and simply retrofitting it into servers and storage arrays will not work. At Pure Storage we spent two-plus years perfecting the way we read from and write to flash, and adding innovative value such as inline data dedup, pattern elimination and compression.
–Ravi
vExpert 2013
Duncan says
I certainly can’t condone putting data you can’t afford to lose directly on SSD disks or PCIe flash in a host. Host caches are nice, but I have heaps of RAM, and I still keep regular backups and/or a master/slave on redundant storage.
Most people are also misled by flash array vendors. All the big storage players have already bought flash innovators/startups, but none are shipping much flash product. The only way to make all-flash arrays attractive price-wise is by adding things like dedupe and compression, yet almost no benchmark will tell you the performance impact of turning that on; once it is turned on, you can actually compare the performance/price against a mid-range SAS array with plenty of cache. It also assumes you don’t need a lot of lower-tiered SAS, nearline SAS or SATA in the complete storage solution.
I can’t see real benchmarks on the Pure Storage website, and I’m not going to read the whole lot before work; can you point me at some?
Ravi says
@Duncan, I know exactly which vendor you are talking about: the one that advertises dedup and compression but asks customers to turn them off during a POC. We see them from time to time in POC bake-offs, and customers seem to get a good sense of the over-marketing. Pure’s data reduction comprises three components (dedup, compression and pattern removal), and they are ON all the time, with no knobs to turn them off.
We don’t promise a million IOPS like some vendors do; I personally haven’t seen any customer, big or small, who needs that. Instead we advertise and deliver:
- 400K IOPS – 8K block size – 100% read
- 200K IOPS – 8K block size – 50% read/write
Source: http://www.purestorage.com/pdf/Pure_Storage_FlashArray_Datasheet.pdf
We publish performance charts for various workloads and block sizes and make them available to customers for every dot "O" release.
The real testimony comes from the customers who are using Pure not only to solve their I/O problems but to fundamentally change the way they think about storage management, operations (power and cooling) and the impact on their business.
OG McFly says
I am transferring a dozen VMs off an ESXi box with two mostly full 250GB HDDs onto a new one with a single virgin 500GB Samsung 840-series SSD. There is no NAS involved; it is a third-party (remote-to-remote) SCP running on a VM hosted on the origin server. Although I’ve been using ESX since 2006, I forgot the whole "in-transit expansion" thing until a few minutes after the huge transfer was kicked off, and a couple of the seldom-used VMDKs expanded on arrival, so I had to scramble to free up 50GB on the virgin SSD at the breakpoint between the two drives (so enough space would be left for the remaining 250GB). Dude, even though only 46% of the virgin target SSD had ever been written to, the first 50GB of the second drive’s transfer was literally half the speed of the first’s, due solely to the SSD’s write performance being cut in half (as VMFS5 went to reuse the 50GB of space I had just deleted and waited forever while the firmware performed a WIPE/EMPTY operation on each page of that 50GB before it performed the WRITE). That’s what happens when there is no TRIM support! For reference, the target system was ESXi 5.1 Update 1, and I’m not sure which model Samsung 840 I’m using (it was installed by others). With this in mind, one can easily imagine how using SSD-based host cache (or even local pagefiles) in a cluster could really hurt the performance of a server under ESXi (what with all the write/wipe/rewrite going on). At this point, their ignoring TRIM support is like Palm or BlackBerry ignoring touch screens.
OG McFly says
(correcting: 46% is what was left after the 50GB was deleted... hey, it’s 4:15am on Friday here in the real IT world)