Yellow Bricks

vSphere 4.1 released

Duncan Epping · Jul 13, 2010 ·

I just wanted to let you know that vSphere 4.1 has been released and is available for download. No point in me rehashing the same “what’s new” info everyone is posting today and probably for the rest of the week. Expect some more detailed blog posts over the coming weeks.

Reservations primer

Duncan Epping · Jul 8, 2010 ·

My colleague Craig Risinger wrote the below and was kind enough to share it with us. Thanks Craig!

A quick primer on VMware Reservations (not that anyone asked)…

A Reservation is a guarantee.

There’s a difference between reserving a resource and using it. A VM can use more or less than it has reserved. Also, if a reservation-holder isn’t using all the reserved resource, it will share CPU but not RAM. In other words, CPU reservations are friendly but memory reservations are greedy.

Reservation admission control:

If a VM has a reservation defined, the ESX host must have at least that much resource unreserved (not just unused, but unreserved) or else it will refuse to power on the VM. Reservations cannot overlap. A chunk of resource can be reserved by only one entity at a time; there can’t be two reservations on it.
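As a minimal sketch of that rule (not how ESX implements it), the check boils down to comparing the new reservation against whatever capacity is still unreserved. The function name `can_power_on` and the numbers are purely illustrative, and virtualization overhead is ignored.

```python
def can_power_on(host_capacity, existing_reservations, new_reservation):
    """Return True if the host still has enough *unreserved* capacity.

    Hypothetical helper for illustration only: current usage is deliberately
    not an input, because admission control looks at reservations, not usage.
    """
    unreserved = host_capacity - sum(existing_reservations)
    return new_reservation <= unreserved


# Illustrative numbers: a 16 GB host where one VM already reserves 10 GB.
print(can_power_on(16, [10], 6))  # True: 6 GB is still unreserved
print(can_power_on(16, [10], 7))  # False: only 6 GB is unreserved
```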

Scenario #1

Given:

An ESX host has 16 GHz (= 8 x 2 GHz cores) and 16 GB.
VM-1 and VM-2 each have 8 vCPUs and 16 GB of vRAM.
VM-1 has reserved 13 GHz of CPU resources and 10 GB of memory.
VM-1 is currently using 11 GHz of CPU resources and 9 GB of memory. (Using != reserving.)

Consequently:

VM-2 can use up to 5 GHz. (Not 3 GHz. CPU reservations are friendly.)
VM-2 can reserve up to 3 GHz. (Using != reserving. Reservations don’t overlap.)
VM-2 can use up to 6 GB. (Not 7 GB. Memory reservations are greedy.)
VM-2 can reserve up to 6 GB. (Reservations don’t overlap.)

Please note that if VM-2 had a 7 GB reservation defined, it would not power on. (Reservation admission control.)
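The arithmetic behind Scenario #1 can be written out as a short sketch. It assumes VM-1 stays within its reservation and ignores virtualization overhead. The dictionary keys and the function name `vm2_headroom` are made up for this example.

```python
HOST = {"cpu_ghz": 16, "mem_gb": 16}
VM1_RESERVED = {"cpu_ghz": 13, "mem_gb": 10}
VM1_USED = {"cpu_ghz": 11, "mem_gb": 9}


def vm2_headroom(host, vm1_reserved, vm1_used):
    """Compute what VM-2 can use and reserve, per the rules in the scenario."""
    return {
        # CPU is friendly: reserved-but-idle cycles can be borrowed,
        # so only VM-1's actual usage is off the table.
        "cpu_usable": host["cpu_ghz"] - vm1_used["cpu_ghz"],
        # Memory is greedy: reserved RAM is off limits even while idle.
        "mem_usable": host["mem_gb"] - vm1_reserved["mem_gb"],
        # Reservations never overlap, for CPU and memory alike.
        "cpu_reservable": host["cpu_ghz"] - vm1_reserved["cpu_ghz"],
        "mem_reservable": host["mem_gb"] - vm1_reserved["mem_gb"],
    }


print(vm2_headroom(HOST, VM1_RESERVED, VM1_USED))
# {'cpu_usable': 5, 'mem_usable': 6, 'cpu_reservable': 3, 'mem_reservable': 6}
```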

It’s also possible for VM-1 to use more resources than it has reserved. That makes the discussion a bit more complex. VM-1 is guaranteed whatever it’s reserved, and it also gets to fight VM-2 for more resources, assuming VM-2 hasn’t reserved the excess. I’ll come up with example scenarios for that too if you like.

There’s good reason why CPU reservations are friendly but memory reservations are greedy. Say a reservation holder is not using all of a resource, and it lets an interloper use the resource for a while; later, the reservation holder wants to use all it has reserved. An interloper can be kicked off a pCPU quickly. CPU instructions are transient, quickly finished. But RAM holds data. If an interloper was holding pRAM, its data would have to be swapped to disk before the reservation holder could repurpose that pRAM to satisfy its reservation. That swapping would take significant time and delay the reservation holder unfairly. So ESX doesn’t allow reserved pRAM to be used by an interloper.

For a more detailed discussion that gets into Resource Pools, how memory reservations do or don’t prevent host-level swapping, and more, see the following post I wrote several months ago, http://www.yellow-bricks.com/2010/03/03/cpumem-reservation-behaviour/.

Author: Craig Risinger

Are you using DPM? We need you!

Duncan Epping · Jul 7, 2010 ·

I just received a request from our Product Management team. Gil Adato is working on the next generation of DPM and is looking for current and past DPM users, and he asked me to post the following message on my blog. I hope you can help Gil and VMware take DPM to the next level!

Dear VMware Customers,

Product management is starting planning efforts for the 2012 vSphere release, and we are considering numerous DPM improvements. In order to make the right decision, it’s critical that we get a better understanding of our customers’ experiences with the current product, and get customers’ feedback on what needs to be on the roadmap. We’re also interested in customers’ improvement suggestions and in sharing with them several improvement ideas that we have in mind. Customers’ input will be very helpful to us, and customers would see the benefit of communicating their comments, requirements and suggestions/wish-list directly to the product team.

If you’re a current or past DPM user and you’d like to be a part of the process and help shape the next generation of VMware’s products, please contact me directly.

My contact info is the following:

Gil Adato
[email protected]

When contacting me, please send me the following initial information:

Company Name
Customer contact name
Location (Country)
Customer’s email address

VMware’s product management team is looking forward to hearing from you and making you a part of the product development process.

Thank you for your cooperation,

Gil Adato

Memory Limits

Duncan Epping · Jul 6, 2010 ·

We had a discussion internally around memory limits and what the use case would be for using them. I got some great feedback on my reply and comments so I decided to turn the whole thing into a blog article.

A comment made by one of our developers, whom I highly respect, is what triggered my reply. Please note that this is not VMware’s view or use case but what some of our customers feed back to our development team.

An admin may impose a limit on VMs executing on an unloaded host to better reflect the actual service a VM will likely get once the system is loaded (I’ve heard this use case from several admins).

From a memory performance perspective that is probably the worst thing an Admin can do in my humble opinion. If you are seriously overcommitting your hosts up to the point where swapping or ballooning will occur you need to think about the way you are provisioning. I can understand, well not really, people doing it on a CPU level as the impact is much smaller.

Andrew Mitchell commented on the same email and his reply is key to understanding the impact of memory limits.

“When modern OS’s boot, one of the first things they do is check to see how much RAM they have available then tune their caching algorithms and memory management accordingly. Applications such as SQL, Oracle and JVMs do much the same thing.”

I guess the best way to explain in one line is: The limit is not exposed to the OS itself and as such the App will suffer and so will the service provided to the user.

The funny thing about this is that although the App might request everything it can get, it might not even need it. In that case, which is more common than we think, it is better to decrease provisioned memory than to create an artificial boundary by applying a memory limit. The limit will more than likely impose an unneeded and unwanted performance impact. Simply lowering the amount of provisioned memory might impact performance, but most likely will not, as the OS will tune its caching algorithms and memory management accordingly.

Changes to Snapshot mechanism “Delete All”

Duncan Epping · Jul 5, 2010 ·

Don’t know if anyone noticed it or not, but with the latest set of patches VMware changed the “Delete All” mechanism that is part of the Snapshot feature. I wrote multiple articles about the “Delete All” functionality as it often led to completely filled up VMFS volumes when someone used it without knowing the inner workings.

Source

When using the Delete All option in Snapshot Manager, the snapshot farthest from the base disk is committed to its parent, causing that parent snapshot to grow. When the commit is complete, that snapshot is removed and the process starts over on the newly updated snapshot to its parent. This continues until every snapshot has been committed.

This method can be relatively slow since data farthest from the base disk might be copied several times. More importantly, this method can aggressively use disk space if the snapshots are large, which is especially problematic if a limited amount of space is available on the datastore. The space issue is troublesome in that you might choose to delete snapshots explicitly to free up storage.

This issue is resolved in this release in that the order of snapshot consolidation has been modified to start with the snapshot closest to the base disk instead of farthest. The end result is that copying data repeatedly is avoided.

Just to give an example, 4 snapshots:

Old situation (pre vSphere 4 Update 2)

Base disk – 15GB
Snapshot 1 – 1GB –> possibly grows to 13GB
Snapshot 2 – 1GB –> possibly grows to 12GB
Snapshot 3 – 1GB –> possibly grows to 11GB
Snapshot 4 – 10GB

Snapshot 4 is copied into Snapshot 3, Snapshot 3 into Snapshot 2, Snapshot 2 into Snapshot 1 and Snapshot 1 into your Base disk. After the copy of Snapshot 1 into the Base disk all Snapshots will be deleted. Please note that the total amount of disk space consumed before the “Delete All” was 28GB. Right before the final merge the consumed disk space is 61GB. This is just an example, just imagine what could happen with a 100GB data disk!

New situation

Base disk – 15GB
Snapshot 1 – 1GB
Snapshot 2 – 1GB
Snapshot 3 – 1GB
Snapshot 4 – 10GB

Snapshot 1 is copied into the Base disk, Snapshot 2 is copied into the Base disk, Snapshot 3 into the Base disk and Snapshot 4 into your Base disk. After the copy of Snapshot 4 into the Base disk all Snapshots will be deleted. Please note that the total amount of disk space consumed before the “Delete All” was 28GB. Right before the final merge the consumed disk space is still 28GB. Not only did VMware reduce the chances of running out of disk space, the time to commit the snapshots using “Delete All” has also decreased with this new mechanism.
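To make the 61GB versus 28GB figures easy to reproduce, here is a rough sketch of the arithmetic behind the two examples. It is a worst-case model, not VMware’s implementation. It assumes that in the old order each parent grows by the full size of the child committed into it, that no snapshot is removed until the final merge, and that the base disk is preallocated and does not grow. The function names are made up for this example.

```python
def peak_space_old(base_gb, snapshots_gb):
    """Worst-case space right before the final merge, old order.

    Snapshots are listed from closest to the base disk to farthest.
    The farthest snapshot is committed into its parent first, and each
    parent is assumed to grow by the child's full size.
    """
    sizes = list(snapshots_gb)
    for i in range(len(sizes) - 1, 0, -1):
        sizes[i - 1] += sizes[i]  # parent absorbs the (already grown) child
    return base_gb + sum(sizes)


def peak_space_new(base_gb, snapshots_gb):
    """Space right before the final merge, new order.

    Each snapshot is committed straight into the base disk, closest first,
    so no snapshot grows along the way.
    """
    return base_gb + sum(snapshots_gb)


base, snaps = 15, [1, 1, 1, 10]
print(peak_space_old(base, snaps))  # 61
print(peak_space_new(base, snaps))  # 28
```

Changing `snaps` to larger values shows how quickly the old order could fill a datastore compared to the new one.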