When discussing design considerations for a Cloud environment there are always a few “hot” topics. Two of these usually stand out: Network and Storage.
This is not only the case with Cloud environments but with virtualization in general. I guess where Cloud differentiates itself from a regular virtual environment is storage tiering. A simple reason to implement storage tiering is cost: running every virtual machine on the same storage is not very cost effective and will ultimately drive up the price per unit. Whether that unit is a VM or consumption based is not even important at this point.
Many storage vendors offer an automated storage tiering concept. These typically migrate virtual disks or “blocks” based on load patterns. This might be a viable solution for your enterprise environment, but is that also the case for a Cloud provider? Or, better said, for the customers running their workloads within the Cloud?
Would you want your virtual disks, or blocks, to be migrated whenever the storage subsystem of your provider feels they should be? Or would you prefer predictable performance? I am hoping that you, as possible customers, can answer this question. Personally, I prefer to get what I paid for: if I have paid for RAID-5 on 15K disks, I want to be able to use that performance when my application requires it.
Now you might say: with the auto-migration mechanisms arrays have these days, you will be on fast storage before you know it. But is this actually the case? (Think EMC’s FAST, Compellent’s Data Progression or 3Par’s Adaptive Optimization.) Many of these mechanisms will only move data around when a threshold has been exceeded within a specific time frame. This might be too late; your job might have already completed. Now I am not, most definitely not, an expert on automated storage tiering, but I wonder who will benefit, and when, in the Cloud space. Maybe even more important: what if the mechanism chooses to move me to fast storage… will my Cloud service provider bill me for this?
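To make the timing concern concrete, here is a toy model of a “promote when a threshold is exceeded within a window” policy. The threshold, window, and workload numbers are entirely made up; this is not how FAST, Data Progression or Adaptive Optimization actually work internally, just an illustration of how a short job can finish before promotion ever fires.

```python
# Toy model of threshold-based tier promotion (all numbers invented).
PROMOTE_THRESHOLD = 1000   # I/Os within one window before promotion fires
WINDOW_MINUTES = 60        # hypothetical sampling window the array evaluates

def promotion_minute(iops_per_min, duration_min):
    """Return the minute at which promotion would fire, or None if never."""
    ios = 0
    for minute in range(1, duration_min + 1):
        if minute % WINDOW_MINUTES == 0:
            ios = 0                      # array resets its counter each window
        ios += iops_per_min
        if ios >= PROMOTE_THRESHOLD:
            return minute
    return None

# A 30-minute batch job hammering a cold block at 25 I/Os per minute:
print(promotion_minute(iops_per_min=25, duration_min=30))
# None -- the job completed before the threshold ever tripped
```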
I know I am not answering any questions here, and I guess this is one of those posts that raises more questions than it answers… I would like to open the floor to anyone who wants to share their thoughts.
Matt Simmons says
Have you seen any performance problems caused BY the migration of “slow” data from fast storage? I haven’t worked with the technology myself, but it seems it needs to be pretty carefully timed (or at least run with a much lower priority), since it takes up fast-storage IOPS to transfer relatively unimportant data.
mw says
We have seen traditional DAS environments reach capacity repeatedly. We purchase more storage, do a forklift migration, and start the cycle over again fresh for a couple of years. Our main interest in automated tiered storage is getting the most accessed blocks on the fastest array possible without having to babysit or forklift. This is especially true for our shop, where data is increasingly getting older and larger. Combining tiers with an expandable SAN gives us both the capacity growth and the data block placement we need.
Applying this to the virtual (private) cloud space: as long as there is standardization and planning for virtual instances, it should make for a best-case situation that is not set in stone. If we are rolling out a low-access file server, we implement it accordingly, knowing what to expect from tier 3 disks. We can also roll out a tier 1 instance that may not come to fruition for whatever reason, without having wasted an array of 15K RPM spindles. That is very appealing in our datacenter.
Jason Boche says
Tiering is ultimately an efficiency/cost-savings technology for whoever is paying the storage bill. In the cloud, storage costs are passed on to the customer. Pricing of storage tiering works most effectively/accurately with the consumption-based cost model, where the customer pays the true cost of their actual storage consumption on tiers X, Y, and Z on a regular basis (per hour, per day, per week, per month).
Opposite the consumption model, the other option is that the cloud provider calculates an up-front price menu of tiered storage which the customer is comfortable paying for, while also understanding that what they pay doesn’t tie 100% to the storage tier they are consuming all of the time.
It’s probably easier to dissect this topic into two discussion points:
1) Performance acceptance with storage tiering. This will differ per customer and will depend on the application and on the storage array vendor’s technology.
2) Cost model(s) which tie to #1 above. What is the customer willing to pay for, and at what price savings does the tipping point sit where a customer says “yes, let’s do it” and accepts slight performance risk? Some customer workloads are predictable enough to get data moved to Tier 1 storage before latency impacts application performance, which can mitigate the performance risk. A back-of-the-envelope comparison of the two cost models follows below.
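Here is a quick sketch of the gap between the two models. All prices and the tier names X, Y, Z are hypothetical; the point is only that the consumption model bills where the data actually sat, while the menu model bills the advertised tier regardless.

```python
# Back-of-the-envelope comparison of the two cost models (prices invented).
TIER_PRICE = {"X": 0.10, "Y": 0.04, "Z": 0.01}  # $/GB/day, hypothetical

def consumption_bill(daily_placement):
    """daily_placement: one {tier: GB} dict per day.
    The customer pays the true cost of where their data actually sat."""
    return sum(gb * TIER_PRICE[tier]
               for day in daily_placement
               for tier, gb in day.items())

def flat_menu_bill(tier, gb, days):
    """The customer pays the advertised tier price for the whole period,
    regardless of where the array actually kept the blocks."""
    return gb * TIER_PRICE[tier] * days

# 100 GB kept on tier X for 5 days, then demoted to Z for the other 25:
placement = [{"X": 100}] * 5 + [{"Z": 100}] * 25
print(consumption_bill(placement))   # 75.0  -- pays for actual placement
print(flat_menu_bill("X", 100, 30))  # 300.0 -- pays the tier X menu price
```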
RussellCorey says
I think this goes back to an SLA on a service offering. I’d imagine there would be two levels of offering: a generic low-cost one (you share, and your storage performance is not guaranteed) and a higher-cost one where you’re placed on premium disk that does not get migrated to a lower tier.
Since you’re reserving resources you pay a higher cost than the people who are willing to share the playground with other kids.
RussellCorey says
Actually wanted to comment on this:
“with the auto-migration mechanisms arrays have these days, you will be on fast storage before you know it. But is this actually the case? (Think EMC’s FAST, Compellent’s Data Progression or 3Par’s Adaptive Optimization.) Many of these mechanisms will only move data around when a threshold has been exceeded within a specific time frame.”
From what I recall, on a Compellent array, if there is available capacity on the top tier of storage then all writes begin there, and blocks are migrated downward if they aren’t frequently accessed. I can’t speak for 3Par on that, though.
Basically you’ve got a write going: Cache —> Tier 1 disk —| infrequent access? |—> Tier 2 disk —| infrequent access? |—> Tier 3 disk
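A minimal sketch of that waterfall, assuming a simple idle-days demotion rule. The rule and its numbers are my own invention for illustration, not Compellent’s actual metadata or thresholds:

```python
# Sketch of the demotion waterfall: new writes land on tier 1 and blocks
# trickle down when they sit idle too long (idle limit is invented).
IDLE_LIMIT = 7                    # days without access before dropping a tier
TIERS = ["tier1", "tier2", "tier3"]

class Block:
    def __init__(self):
        self.tier = 0             # writes start on tier 1 if it has capacity
        self.idle_days = 0

    def access(self):
        self.idle_days = 0        # any read/write resets the idle clock

    def nightly_sweep(self):
        self.idle_days += 1
        if self.idle_days >= IDLE_LIMIT and self.tier < len(TIERS) - 1:
            self.tier += 1        # demote one tier; the clock restarts
            self.idle_days = 0

b = Block()
for day in range(14):             # two weeks with no access at all
    b.nightly_sweep()
print(TIERS[b.tier])              # "tier3" -- trickled all the way down
```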
Duncan Epping says
@mw: private cloud within an enterprise is indeed a different ballgame; I should have specified that more clearly. Although it will depend on the SLA you have with your internal customer: they will also want what they pay for. If this has been clearly defined it is not a problem, but as you can imagine this is hardly ever the case, and a user’s opinion is often based on perception/impression rather than on facts.
@russell: exactly what I am discussing with a customer currently. Base your offering on three tiers and, depending on your customer’s requirements, move them up or down as needed. A storage system which is well integrated and eases the migration between tiers would of course be more than welcome.
Duncan Epping says
@russell: correct, that’s for writes, but what about reads? If 50% of my I/O is reads, I will still see degraded performance for half of my I/O.
Dave B says
To add to the questions…
Utilization rates on SANs are going up due to increasing consolidation (e.g. quicker hosts driving more data, thin provisioning, etc.). Even 15K FC drives can bog down.
No matter what the SLA says, drives getting bogged down is bad: queues/buffers/write caches fill up, which is a Bad Thing™.
Rather than seeing tiering as providers getting cheap, try thinking of it as a way of keeping arrays happy by migrating some data to SSDs. And if you’re happy migrating small bits of data to SSDs, why not migrate some data to SATA too?
RussellCorey says
Oh, definitely on reads. Sticking with Compellent, though: if we write a block, it starts on tier 1. If it keeps getting subsequent reads, the array tracks those accesses in the block metadata, so it will stay up at the top.
However, if time passes and the block is not accessed, then yes, it will trickle down and “archive” on tier 2 or tier 3 storage.
Unfortunately I’ve never touched 3Par or had any exposure to EMC FAST, so they might exhibit the very problem you’re concerned with.
That said, if I were a Compellent shop and wanted to charge for premium tier 1 storage, I would allocate a LUN that is only allowed to sit on tier 1 disks at whatever defined RAID level. The challenge, of course, is that I won’t ever get dedicated spindles, but if I’m aggregated across all of my SSD/15K RPM disks that may be less of an issue. I am charging more for disk, after all.
Doug says
Toss my vote in there for predictable performance.
In my opinion, data progression/migration between tiers is fine for internal applications, but I agree with Duncan that you need to get what you are paying for.
You’ve got to wonder whether you’re going to get a discount from the provider when your data is migrated to a lower tier of storage. Think about how much metadata would need to be tracked and harvested in order to give per-block data-residency pricing. That makes a big mess, and I’d guess the providers would skip that part: it gives them some flexibility and cost savings, but doesn’t really translate well to the consumers.
To further Duncan’s example: I might have a workload that needs to run every month. If the data trickles in during the month and then needs to be batch processed at the end of the month, odds are really good that most of it will have migrated to the slow storage. When I run my job, that data is STILL on the slower storage and my job takes longer. Based on data progression, the data may be available on the higher-tier storage TOMORROW, thanks to the recent access, but that doesn’t do me any good. I needed it there today.
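To put some shape on that metadata problem, here is what a minimal per-block residency log might look like. The schema and the per-block-hour prices are invented for illustration; real arrays would track far more, for millions of blocks per LUN:

```python
# Sketch of per-block residency billing (hypothetical schema and prices).
from collections import defaultdict

TIER_PRICE = {"tier1": 0.0005, "tier2": 0.0002, "tier3": 0.00005}  # $/block-hour

# One row per residency interval: (block_id, tier, hours_resident).
residency_log = [
    ("blk-001", "tier1", 24),    # hot for a day...
    ("blk-001", "tier3", 696),   # ...then demoted and cold for ~29 days
    ("blk-002", "tier1", 720),   # hot all month
]

bill = defaultdict(float)
for block, tier, hours in residency_log:
    bill[block] += hours * TIER_PRICE[tier]

for block, cost in sorted(bill.items()):
    print(block, round(cost, 4))   # blk-001 0.0468 / blk-002 0.36
# Harvesting and pricing this per block, per hour is the "big mess" above.
```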
RussellCorey says
Hey Doug,
I think I sat next to you in a beta VMware design class.
If there is a concern about the performance of a given dataset, then maybe it isn’t something you outsource off-premises? Or, if I do, I pay the premium bill rate to ensure I’m always on the same tier of storage.
Right now I’m seeing one extreme, “my data must perform at X,” being applied to the other extreme, “I am using automated storage tiering.” If a provider can’t meet a given SLA, then you find a new provider (either storage or “cloud”).
When you pay for the generic tier from any given provider, are you really buying guaranteed performance, or are you saving some dollars on compute resources?
Most arrays let you park data on a specific spindle type/tier of storage, even the automated-migration arrays like a Compellent. So if you’re using an honest provider and you pay for premium disk, you can stay on premium disk.
Chuck says
Russell, I think you have it right with the two storage tiers. There would need to be some sort of blended rate (10% SSD, 20% FC, 70% SATA?) for the lower, self-adjusting tier, and then the price of the premium tier.
Does anyone have real-world experience with the FAST-like technologies where an app that was hot once a month or once a quarter tried to get the quicker resources in a pool (for lack of a better term) and they were not available?
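For what it’s worth, that blended rate is simple arithmetic once you fix the mix and the per-tier prices. The mix below is Chuck’s 10/20/70 suggestion; the per-GB prices are invented:

```python
# Blended rate for the self-adjusting tier (per-GB prices are invented).
tier_mix   = {"SSD": 0.10, "FC": 0.20, "SATA": 0.70}  # Chuck's 10/20/70 split
tier_price = {"SSD": 1.00, "FC": 0.40, "SATA": 0.10}  # $/GB/month, hypothetical

blended = sum(tier_mix[t] * tier_price[t] for t in tier_mix)
print(blended)  # 0.25 -- $/GB/month for the lower, self-adjusting tier
```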
Rodos says
Um… a topic close to my heart, or brain rather. For cloud we need to move to more generic performance characteristics and not think like Enterprise storage (FC vs SATA, RAID types). Providers are going to do all sorts of things automagically underneath, usually to give you the performance you need or to keep the costs down to give you the price you want.
The industry has some way to go on this, and early on we will only see small bits bubble to the surface. Can you imagine how hard it is to cost-model, integrate the technology, and then bill tiering of storage at the block level (rather than per disk or per entire VM) back to a customer? I am sure the industry will get there one day.
Rodos
Suttoi says
Static disk = predictable performance?
Are you sure…
What about thin provisioning? If we buy 40GB for an OS vdisk (seems about right) on 15K FC disks, but the cloud provider is using thin provisioning (which they probably will be), we are only using about 8GB of it (base OS only). Consequently the density of I/Os on the FC disks is now 5x higher. That is a lot more I/O on disks that still spin at 15K, just like they did 5, 6, 7 years ago.
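Spelling that density math out (using the illustrative numbers from above):

```python
# Thin provisioning packs roughly 5x as many vdisks onto the same spindles,
# so per-spindle I/O density rises by about the same factor (numbers from
# the comment above, purely illustrative).
allocated_gb = 40   # what each customer bought per OS vdisk
consumed_gb  = 8    # what a base OS actually writes

density_multiplier = allocated_gb / consumed_gb
print(density_multiplier)  # 5.0 -- ~5x the I/O landing on each 15K spindle
```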
Unless the cloud provider sells dedicated datastores on unshared spindles, I’m not sure we can have a realistic expectation of predictable performance. I think that automated (FAST-type) load distribution might be our only hope, Obi-Wan.