
Yellow Bricks

by Duncan Epping


cloudphysics

Awesome paper/presentation: Efficient MRC Construction with SHARDS

Duncan Epping · Apr 16, 2015 ·

When I joined VMware I read a white paper on memory reclamation techniques a dozen times, and I was still left with a bunch of questions. So I emailed the engineer who had authored it back in the day. I asked him a couple of “simple” questions and received a one-page email full of answers. Even that email I had to read twice. Not because it was insanely complex, but because there was so much information in it that it was impossible to digest in one go. Carl Waldspurger was that engineer. I had seen some of his talks when he was still at VMware, but he had gone “dark” for a while.

Carl joined CloudPhysics in the early stages of the company. He has been working on various projects, and one of those projects is called SHARDS. I had not seen the result yet, but a couple of weeks ago I watched the presentation. Excellent presentation skills, but more importantly amazing research with a very important result. Some people may have been wondering what you can do with a platform like CloudPhysics and what you can harvest from the data; well, I think it is fair to say that this is one of the results of all the hard data mining work that has been done over the last years. Here is the abstract with a link to the online presentation. I didn’t want to share everything here, so as to drive some traffic to USENIX in support. Before you watch the video, a warning… this isn’t a high-level overview, it is a serious deep dive.

Efficient MRC Construction with SHARDS

Reuse-distance analysis is a powerful technique for characterizing temporal locality of workloads, often visualized with miss ratio curves (MRCs). Unfortunately, even the most efficient exact implementations are too heavyweight for practical online use in production systems.

We introduce a new approximation algorithm that employs uniform randomized spatial sampling, implemented by tracking references to representative locations selected dynamically based on their hash values. A further refinement runs in constant space by lowering the sampling rate adaptively. Our approach, called SHARDS (Spatially Hashed Approximate Reuse Distance Sampling), drastically reduces the space and time requirements of reuse-distance analysis, making continuous, online MRC generation practical to embed into production firmware or system software. SHARDS also enables the analysis of long traces that, due to memory constraints, were resistant to such analysis in the past.

We evaluate SHARDS using trace data collected from a commercial I/O caching analytics service. MRCs generated for more than a hundred traces demonstrate high accuracy with very low resource usage. MRCs constructed in a bounded 1 MB footprint, with effective sampling rates significantly lower than 1%, exhibit approximate miss ratio errors averaging less than 0.01. For large traces, this configuration reduces memory usage by a factor of up to 10,800 and run time by a factor of up to 204.
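To get a feel for the mechanism before diving in: below is a minimal Python sketch of the fixed-rate spatial sampling idea described in the abstract. This is my own illustration, not CloudPhysics code; the adaptive constant-space refinement is deliberately left out, and all names and parameters are made up.

```python
import hashlib
from collections import OrderedDict

def shards_reuse_distances(trace, rate=0.01, modulus=1 << 24):
    """Fixed-rate SHARDS-style sampling (illustrative sketch only).

    Only locations whose hash falls below rate * modulus are tracked,
    so roughly `rate` of all distinct locations are sampled. Each
    sampled reuse distance is scaled by 1/rate to estimate the
    distance in the full, unsampled trace.
    """
    threshold = int(rate * modulus)
    stack = OrderedDict()              # sampled locations in LRU order
    distances = []
    for loc in trace:
        digest = hashlib.blake2b(str(loc).encode(), digest_size=8).digest()
        if int.from_bytes(digest, "big") % modulus >= threshold:
            continue                   # spatial filter: not a sampled location
        if loc in stack:
            # Reuse distance = number of distinct sampled locations touched
            # since the previous reference, scaled up to the full trace.
            keys = list(stack)
            distances.append((len(keys) - 1 - keys.index(loc)) / rate)
            stack.move_to_end(loc)
        else:
            distances.append(float("inf"))   # first touch: cold miss
            stack[loc] = None
    return distances

def miss_ratio_curve(distances, cache_sizes):
    """Under LRU, a reference misses iff its reuse distance >= the cache
    size, so the miss ratio at size C is the fraction of such references."""
    total = len(distances) or 1
    return {c: sum(d >= c for d in distances) / total for c in cache_sizes}

# A looping trace with a working set of 5,000 blocks: the estimated miss
# ratio should drop sharply once the cache size covers the working set.
trace = (i % 5_000 for i in range(50_000))
print(miss_ratio_curve(shards_reuse_distances(trace, rate=0.05),
                       cache_sizes=[100, 1_000, 5_000, 10_000]))
```

Note that a real implementation would keep the distance computation efficient with a tree structure rather than the linear scan above, and would cap total memory by shrinking the sampling threshold as new locations appear; this sketch only shows the hash-based filter and the 1/rate scaling.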

You can find the slides, the paper, and the video for download below.

Waldspurger PDF
View the slides
Download Video

Enjoy 🙂

CloudPhysics Storage Analytics and new round of funding

Duncan Epping · Jun 24, 2014 ·

When I just woke up I saw the news was out… a new round of funding for CloudPhysics! CloudPhysics raised $15 million in a series C investment round, bringing the company’s total funding to $27.5 million! Congratulations folks, I can’t wait to see what this new injection will result in. One of the things CloudPhysics has heavily invested in over the past 12 months is the storage side of the house. In their SaaS-based solution, one of the major pillars today is Storage Analytics, alongside General Health Checks and Simulations.

The Storage Analytics section is available to everyone as of today! It allows you to monitor things like “datastore contention” and “unused VMs”, and shows everything there is to know about capacity savings, ranging from inside the guest down to datastore-level details. If you ever wondered how “big data” could be of use to you, I am sure you will understand once you start using CloudPhysics. Not only are their monitoring and simulation cards brilliant; the Card Builder is definitely one of their hidden gems. If you need to convince your management, then all you need to do is show them the screenshot above: savings opportunity!

Of course there is a lot more to it than I will be able to write about in this short post. In my opinion if you truly want to understand what they bring to the table, just try it out for free for 30 days here!

PS: How about this brilliant infographic… the people who taught you how to fight the noisy neighbour now show you how to defeat that bully!

**Disclaimer: I am an advisor to CloudPhysics.**

Startup News Flash part 16

Duncan Epping · Apr 2, 2014 ·

Number 16 of the Startup News Flash, here we go:

Nakivo just announced the beta program for version 4.0 of their backup/replication solution. It adds some new features like recovery of Exchange objects directly from compressed and deduplicated VM backups, Exchange log truncation, and automated backup verification. If you are interested in testing it, make sure to sign up here. I haven’t tried it myself, but they seem to be a strong up-and-coming player in the backup and DR space for SMB.

SanDisk announced a new range of SATA SSDs called “CloudSpeed”. They released four different models with various endurance levels and workload targets, ranging in size from 100GB up to 960GB depending on the endurance level selected. Endurance levels range from 1 up to 10 full drive writes per day. (Just as an FYI, for VSAN we recommend 5 full drive writes per day as a minimum; at that rating a 960GB drive would be designed to absorb roughly 4.8TB of writes every single day.) Performance numbers range between 15K and 20K write IOPS and 75K and 88K read IOPS. More details can be found in the spec sheet here. What interests me most is the included FlashGuard Technology; it is interesting how SanDisk is capable of understanding wear patterns and workloads to a certain extent and placing data in a specific way to prolong the life of your flash device.

CloudPhysics announced the availability of their Storage Analytics card. I gave it a try last week and was impressed. I was planning on doing a write-up on their new offering, but as various bloggers have already covered it I felt there was no point in repeating what they said. I think it makes a lot more sense to just try it out; I am sure you will like it, as it will show you valuable info like “performance” and the impact of “thin disks” vs “thick disks”. Sign up here for a 30-day free trial!

Startup News Flash part 2

Duncan Epping · Aug 13, 2013 ·

The first part of the Startup News Flash was published a couple of weeks ago, and as many things have happened since, I figured I would publish another. At times I will no doubt miss a news fact or a new company; if that happens, don’t hesitate to leave a comment with your findings/opinion or just a link to what you feel is newsworthy! As mentioned in part 1, the primary focus of this article is startup news and flash-related news. As you can see, this round is mostly flash related, except for one item.

Nimbus Data launched two brand new arrays: the Gemini F400 and F600. These are all-flash arrays and bring something unique to the table for sure, and that is cost: the price per usable gigabyte is $0.78. Yes, that is low indeed. How do they bring it down? Partly through very efficient deduplication and compression, and on top of that by leveraging standard hardware and getting all the smarts from software. According to the press release these new arrays will provide between 3TB and 48TB of capacity (I almost said disk space there…) and will be shipping at the end of this year! Although Nimbus declared hybrid storage officially dead, mainly because of the cost of the Nimbus all-flash solution (the F400 starts under US$60,000, the F600 under US$80,000), I still think there is a lot of room for growth in that space and many customers will be interested in those solutions. My question to Nimbus on Twitter yesterday was which configuration they did the math with to declare hybrid dead, because cost per gigabyte is one thing, the upfront investment to reach that price point is another. It will be interesting to see how they do over the coming 12-18 months, but needless to say they will be going after their competition aggressively. Talking about competition…

Last year at VMworld I briefly stopped at the Tegile booth; besides the occasional tweet I kind of lost track of them until recently, as Tegile just announced series C funding… Not pocket money, I would say, but a serious round: $35 million, led by Meritech Capital Partners with original stakeholder August Capital and strategic partners Western Digital and SanDisk. For those who don’t know, Tegile is a storage company that sells both a hybrid and an all-flash solution, and they have done this in an interesting modular fashion (all-flash placed in front of spinning disks = modular hybrid). Of course they also offer functionality like dedupe/compression and replication. Although I haven’t heard too much from them lately, it is a booth I will surely stop by at VMworld. Again, there is a lot of competition in this space and it would be interesting to see an “all-flash / hybrid storage bake-off”. Tegile vs Nimbus, Nimble vs Tintri, Pure Storage vs Violin…

Violin Memory just announced the 6264 Flash Memory Array. This new all-flash storage system can provide a capacity of 64 TiB / 70.3 TB in a footprint of just 3U, and that is impressive if you ask me. On top of that, it can provide up to 1 million IOPS at ultra-low latency! Who doesn’t want 1 million IOPS at their disposal, right? (More specs can be found here.) To me, though, what was more exciting in this press release was the announcement of a management tool called Symphony. Symphony provides a single pane of glass for all your Violin devices (read more details here). It provides a smart management interface that allows you to create custom dashboards, comprehensive reporting, tagging and filtering, and of course a RESTful API for you admins out there who love to automate things. Nice announcement from Violin Memory, and for those already running Violin hardware I would definitely recommend evaluating Symphony, as the video looks promising.

CloudPhysics just announced that the Card Store is GA as of today (13th of August 2013) along with a new round of funding ($10 million) led by Kleiner Perkins Caufield & Byers. Previous investors the Mayfield Fund, Mark Leslie, Peter Wagner, Carl Waldspurger, Nigel Stokes, Matt Ocko and VMware co-founders also participated in this round. I would say an exciting day for CloudPhysics. Many have asked over the last year why I have always been enthusiastic about what they do. I think John Blumenthal (CEO) explains it best:

Our servers receive a daily stream of 80+ billion samples of configuration, performance, failure and event data from our global user base with a total of 20+ trillion data points to date. This ‘collective intelligence,’ combined with CloudPhysics’ patent-pending datacenter simulation and unique resource management techniques, empowers enterprise IT to drive Google-like operations excellence using actionable analytics from a large, relevant, continually refreshed data set.

If you are interested in testing their solution, sign up for a free trial at cloudphysics.com. Pricing starts at $49/month per physical server; more details here. For those wondering what CloudPhysics has to do with flash, well, they’ve got a card for that!

That was it for part 2; I hope you found it a useful round-up, and I expect to be able to publish another Startup News Flash within two weeks!


CloudPhysics KB Advisor, how cool is that?

Duncan Epping · Jul 30, 2013 ·

Just imagine: you have 3-8 hosts, an EMC array, Dell hardware, some FibreChannel cards, specific versions of firmware, and specific versions of ESXi and vCenter… How do you know what works and what does not? Well, you go to kb.vmware.com, you do a search, and you try to figure out what applies to you and what does not. In this depicted environment of only 3-8 hosts that should be simple, right? Well, with thousands of KB articles I can assure you that it is not… Now imagine that you have 2 arrays and 2 clusters of 8 hosts, or you add iSCSI to the mix. It gets extremely complicated really quickly; in fact, I would say it is impossible to figure out manually what does and does not apply to your environment. How do you solve that?

Well, you don’t solve that yourself; it requires a big database and an analytics engine behind it… a big data platform, even. Luckily, the smart folks at CloudPhysics have solved it for you, as the sketch below illustrates. Sign up, download the appliance, and let them do the work for you… It doesn’t get any easier than that, if you ask me. Some more details can be found in the press release.
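Stripped to its essence, the matching problem they automate looks something like the hypothetical sketch below. Every field name and KB entry here is made up purely for illustration; it says nothing about how the actual appliance is implemented.

```python
# Hypothetical sketch of matching KB applicability metadata against an
# environment inventory; all names and entries are invented for illustration.

environment = {
    "storage_array": "EMC VNX",
    "server_vendor": "Dell",
    "hba": "FibreChannel",
    "esxi_version": "5.1",
}

kb_articles = [
    {"id": "KB-A", "applies_to": {"storage_array": "EMC VNX",
                                  "esxi_version": "5.1"}},
    {"id": "KB-B", "applies_to": {"server_vendor": "HP"}},
]

def applicable(article, env):
    # An article applies when every constraint it lists matches the
    # environment; fields it does not mention are ignored.
    return all(env.get(k) == v for k, v in article["applies_to"].items())

print([a["id"] for a in kb_articles if applicable(a, environment)])
# -> ['KB-A']
```

Trivial for two articles and one host; the hard part, and the reason it takes a big data platform, is doing this against thousands of KB articles, fuzzier applicability rules, and a constantly changing inventory.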

I knew the CPhy guys were working on this; to be honest, it surprises me that no one else has done it before. What an elegant / simple / awesome solution! Thanks CloudPhysics for making my life once again a whole lot easier.

