I was talking to our engineering team a couple of weeks ago about our upcoming beta. During a demo I noticed that our UI said “Space Efficiency” instead of “Deduplication”. After some discussion it became clear to me why: it isn’t “just” deduplication that will be in the upcoming beta for Virtual SAN; the engineering team managed to get compression in there as well. A huge accomplishment if you ask me, and I hope it will make the official next release.
So where does compression fit? Compression happens after deduplication, and we expect to store roughly 2KB for every unique 4KB block. Combine that with “RAID-5/6” over the network (aka erasure coding), deduplication, and thin provisioning, and I bet you will see significant space savings! Great new functionality (if and when released), which will especially make the all-flash configuration a no-brainer if you ask me. With flash prices still dropping, and now almost on par with SAS drives, this becomes a very compelling solution.
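To get a feel for how these savings stack, here is a back-of-the-envelope sketch. The dedupe ratio and dataset size are my own illustrative assumptions, not official figures; only the ~2:1 compression on unique 4KB blocks and the RAID-1 vs RAID-5 (3+1) overheads come from the post.

```python
# Rough space-savings sketch for a hypothetical all-flash VSAN dataset.
# dedupe_ratio and logical_gb are illustrative assumptions.

logical_gb = 1000          # logical data written by VMs (assumed)
dedupe_ratio = 2.0         # assume half the 4KB blocks are duplicates
compress_ratio = 2.0       # ~2KB stored per unique 4KB block (per the post)

unique_gb = logical_gb / dedupe_ratio        # data left after dedupe
stored_gb = unique_gb / compress_ratio       # data left after compression

# Protection overhead: RAID-1 mirroring (FTT=1) doubles the footprint,
# while RAID-5 erasure coding (3+1) adds only 33%.
raid1_gb = stored_gb * 2.0
raid5_gb = stored_gb * 4 / 3

print(f"stored: {stored_gb:.0f} GB, RAID-1: {raid1_gb:.0f} GB, RAID-5: {raid5_gb:.0f} GB")
```

Under these assumptions, 1TB of logical data lands at roughly a third of its mirrored footprint once dedupe, compression, and erasure coding are all in play.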
Is compression worth the CPU load ?
Maybe for a very finite workload set?
We have used compression in our ZFS appliance over dedupe because the LZJB algorithm we use achieves a 1.65x compression ratio and doesn’t tax the CPU the way a GZIP or BZIP2 algorithm would. So I would counter by suggesting that dedupe may be more CPU intensive than compression, if the proper algorithms are leveraged.
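The light-vs-heavy trade-off Joseph describes is easy to demonstrate. LZJB isn’t in the Python standard library, so in this sketch zlib level 1 stands in for a cheap algorithm and level 9 for an aggressive one; the payload is a made-up repetitive buffer, and actual ratios will vary with the data.

```python
# Illustration of the cheap-vs-aggressive compression trade-off.
# zlib level 1 / level 9 are stand-ins for LZJB vs GZIP-class effort.
import time
import zlib

data = b"some moderately repetitive storage payload " * 4096  # assumed sample

for level in (1, 9):
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: ratio {len(data) / len(out):.1f}x in {elapsed * 1000:.2f} ms")
```

On repetitive data the cheap level already gets most of the ratio, while the aggressive level burns noticeably more CPU for a modest extra gain, which is the argument for something LZJB-like inline.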
Exactly Joseph, dedupe is definitely more expensive than compression in this case. And we used a non-aggressive algorithm for a reason indeed: nice savings at little to no cost.
Perhaps dedupe would be most worthwhile if you could specify what you want to dedupe. For instance, a desktop OS in a persistent VDI deployment. Then use compression for all other data. Would something like that be doable, or make any sense?
Rick,
You can deliver “persistent desktops” while still using linked clones, application layering, and profile virtualization to keep the desktops “lean”. The Horizon suite gaining UEM and App Volumes has really reduced the need for true full-clone desktops.
Terribly written if you ask me.
Thanks for your valuable feedback.
Terribly Commented if you ask me. I found this information valuable myself.
At what level are these settings configurable? Are things like dedupe or compression applied at the storage policy level per VM, or is it at the disk group level per host?
Configured at the cluster level, on/off per cluster. Dedupe happens within a disk group.
Hi Duncan,
One question about the functionality this beta will offer: is it possible to combine “erasure coding” with a Stretched Cluster?
What I would like to know is whether, during an outage of one site, we could still benefit from some redundancy on the remaining one (if we have enough hosts, of course), like we would with traditional replicated arrays that still provide RAID of some sort on each local array.
Thanks
Sylvain
So will the new features be available for the base VSAN or will it require all flash?
I hope they will allow HYB configurations. Many storage vendors are pushing all flash, which is good to help bring down flash costs and make them more affordable for all of us. However, for us, the IOPS requirements that we have just aren’t anywhere near what flash provides. With a proper cache layer for read prefetching + write buffer with destaging, we are perfectly happy to enjoy the benefits of increased capacity available with the HYB configurations.
When we spoke about dedupe and compression at VMworld with Christian, I asked whether it would be possible to use a GPU card to offload those workloads using CUDA. A terrible idea on second thought, but what about using a SmartNIC such as Mellanox/Tilera for inline dedupe/compression, or Cavium for encryption? Those network cards are economically viable (compared to an Intel X710, a Cavium card is hardly 10% more expensive) and we would use them for a lot of other purposes (DPI, advanced networking, inline Snort, etc.).
A Cavium card typically ships with 2GB of RAM, so sufficient for a small to medium dedupe base.
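As a rough sanity check on that “small to medium dedupe base” claim, here is a sizing sketch. The ~64 bytes of metadata per unique 4KB block (fingerprint plus block pointer and bookkeeping) is my own assumption, not a number from any vendor.

```python
# Rough sizing of how large a dedupe base 2GB of card RAM could index.
# entry_bytes is an assumption: fingerprint + pointer + bookkeeping.

ram_bytes = 2 * 1024**3        # 2GB of RAM on the card (per the comment)
entry_bytes = 64               # assumed metadata per unique block
block_bytes = 4 * 1024         # 4KB dedupe granularity (per the post)

entries = ram_bytes // entry_bytes
indexed_gb = entries * block_bytes / 1024**3
print(f"~{entries / 1e6:.0f}M entries, indexing ~{indexed_gb:.0f} GB of unique data")
```

At these assumed sizes, 2GB of RAM indexes on the order of 128GB of unique data entirely in card memory, which indeed reads as small-to-medium rather than array-scale.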
Very excited for this!!!! C’mon beta pick me 🙂
Do you have an all-flash cluster Daniel?
What if a customer already has VSAN implemented with an HDD/SSD combo? What’s the roadmap to take advantage of the new compression/dedupe functionality once they upgrade to the new code?