Block sizes and growing your VMFS

After the post on thin-provisioned disks I had a discussion about block sizes with some of my colleagues. For those who did not read that post, here’s a short recap:

If you create a thin-provisioned disk on a datastore with a 1MB block size, the disk will grow in increments of 1MB. Hopefully you can see where I’m going. A thin-provisioned disk on a datastore with an 8MB block size will grow in 8MB increments. Each time the thin-provisioned disk grows, a SCSI reservation takes place because of metadata changes. As you can imagine, an 8MB block size will decrease the number of metadata changes needed, which means fewer SCSI reservations. Fewer SCSI reservations equals better performance in my book.
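
To put rough numbers on the recap, here is a minimal sketch. It assumes, as a simplification, that every block allocation costs exactly one metadata update (and thus one reservation); the function name `allocations_needed` is made up for this illustration and real VMFS behaviour may differ.

```python
MB = 1024 * 1024

def allocations_needed(growth_bytes: int, block_size_bytes: int) -> int:
    """Block allocations needed to grow a thin disk by growth_bytes.

    Assumption for illustration only: one allocation per block, and
    each allocation triggers one metadata update.
    """
    # Integer ceiling division: a partially used block still costs a full allocation.
    return -(-growth_bytes // block_size_bytes)

growth = 64 * MB  # the guest writes 64MB of new data
for block_mb in (1, 2, 4, 8):
    count = allocations_needed(growth, block_mb * MB)
    print(f"{block_mb}MB block size: {count} allocations to grow by 64MB")
```

Under that assumption, the same amount of growth needs eight times fewer allocations on an 8MB volume than on a 1MB volume.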

As some of you know, the locking mechanism has been improved with vSphere; yes, there’s a good reason why they call it “optimistic locking”. In other words, why bother increasing your block size if the locking mechanism has improved?

Although the mechanism behaves differently, that does not mean locking no longer needs to occur. In my opinion it’s still better to have 1 lock vs 8 locks if a VMDK needs to grow. But there’s another good reason: with vSphere comes growable VMFS volumes. You might start with a 500GB VMFS volume and a 1MB block size, but once you expand the volume this block size might not be sufficient for the new VMs you create. Keep in mind that you can’t modify the block size of an existing volume, while the extra space just might have given people the idea that they can create disks beyond the limit of the block size. (Mind: you will receive an error; it’s not possible.)
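
The block size limit can be sketched as a simple lookup. The figures below are the commonly cited approximate VMFS-3 maximum file sizes per block size; they are not taken from this post (apart from the 256GB figure that comes up in the comments), and the helper name `smallest_block_size_mb` is hypothetical.

```python
# Commonly cited VMFS-3 maximum file size (GB) per block size (MB).
# Approximate values, assumed here for illustration.
MAX_FILE_GB_BY_BLOCK_MB = {1: 256, 2: 512, 4: 1024, 8: 2048}

def smallest_block_size_mb(disk_gb: int):
    """Return the smallest block size (MB) that can hold a disk of disk_gb,
    or None if no block size in the table is large enough."""
    for block_mb in sorted(MAX_FILE_GB_BY_BLOCK_MB):
        if disk_gb <= MAX_FILE_GB_BY_BLOCK_MB[block_mb]:
            return block_mb
    return None

for disk_gb in (200, 300, 1500):
    print(f"{disk_gb}GB disk needs at least a "
          f"{smallest_block_size_mb(disk_gb)}MB block size")
```

A volume created at 500GB with a 1MB block size can later be grown past 256GB, but a single disk on it still cannot exceed the limit of the original block size.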

So what about overhead? Will my 1KB log files all be created in 8MB blocks? Because that would mean a large overhead and might be a valid reason to use a 1MB block size!

No, they will not. VMFS-3 solves this issue by offering a sub-block allocator. Small files use a sub-block to reduce overhead. A sub-block of a 1MB block size volume is 1/16th the size of the block. For an 8MB block size volume it’s 1/128th. In other words, the sub-blocks are 64KB in both cases, and thus the overhead is the same in both cases as well.
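
The sub-block arithmetic can be checked with a few lines. This is a sketch only: the fractions come from the paragraph above, while the rule that a file no larger than one sub-block occupies exactly one sub-block (and anything larger is rounded up to full blocks) is a simplifying assumption, and the function names are invented for the example.

```python
KB = 1024
MB = 1024 * KB

# Sub-block fraction per block size, as described in the post.
SUB_BLOCK_DIVISOR = {1 * MB: 16, 8 * MB: 128}

def sub_block_size(block_size: int) -> int:
    """Size of a sub-block on a volume with the given block size."""
    return block_size // SUB_BLOCK_DIVISOR[block_size]

def space_allocated(file_size: int, block_size: int) -> int:
    """Space consumed by a file under the simplified rule assumed here:
    files that fit in one sub-block use one sub-block; larger files are
    rounded up to whole blocks."""
    sub = sub_block_size(block_size)
    if file_size <= sub:
        return sub
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size

for block_size in (1 * MB, 8 * MB):
    used = space_allocated(1 * KB, block_size)
    print(f"{block_size // MB}MB block size: sub-block is "
          f"{sub_block_size(block_size) // KB}KB, a 1KB log file uses {used // KB}KB")
```

Both volumes end up spending 64KB on the 1KB log file, which is the point made above.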

Now my question to you guys: what do you think? Would it make sense to always use an 8MB block size? I think it would.





16 Responses to “Block sizes and growing your VMFS”

  1. Ken Cline says:

    I’ve always felt that larger block sizes were more appropriate. VMFS deals with (mostly) large files, so take advantage of that and allocate in large chunks :)

  2. The option should be removed from the interface and VMware should always allocate 8MB blocks!!! I’ve seen too many cases where customers wanted to grow their VMDK but were limited to 256GB. Totally agree with you!

  3. Roger Lund says:

    Agreed, I always use the 8 MB block size for my datastore.

  4. Justin says:

    We used 2MB block size when we upgraded to 3.5. I may consider using 8MB in our next upgrade.

  5. glnsize says:

    Until I understood how VMFS-3 utilized sub-blocks I always used 1MB. Once it was explained to me how it all worked I switched to 8MB. Sub-block allocation negated my only concern with using the larger block size.

  6. Sid Smith says:

    I agree. With the changes made as part of vSphere, using an 8MB block size is the logical choice.

  7. jl says:

    We will wait to see how other people experience thin-provisioned disks performance-wise and whether the SCSI reservations are going to cause problems. Just saying that vSphere has optimized locking technology is not enough; we need a detailed white paper that describes why it is better.

  8. I don’t think you really understood my post… Or I am really confused about what you are trying to say.

  9. craig says:

    I am just a little curious here: if an 8MB block size is the right setting to go for, why does the block size recommendation from VMware in the administration console always tie a certain block size to a specific datastore size? Any idea? Just to clarify.

  10. Good question and I honestly don’t know. That’s one of the reasons I asked you guys to chip in…

  11. Sharninder says:

    8MB all the way. There aren’t many files less than 64kb on a vmfs anyway and the sub-blocks solve the internal fragmentation problem to a large extent. 8MB sounds like the logical choice for vsphere.

    Although I’m not sure how thin-provisioned disks will perform compared to thick disks.

  12. Satyam Vaghani says:

    I am a VMware employee and I wrote VMFS with a few cronies, but the following is a personal opinion:

    Forget about locking. Period. Yes, SCSI reservations do happen (and I am not trying to defend that here) and there will be some minor differences in performance, but the suggestion on the (very well written) blog post goes against the mission of VMFS, which is to simplify storage virtualization.

    Here’s a counter example: if you have a nearly full 8MB VMFS volume and a less full 1MB VMFS volume, you’ll still encounter less IO overhead allocating blocks on the 1MB VMFS volume compared to the 8MB volume, because the resource allocator will sweat more trying to find a free block in the nearly full volume. This is just one scenario, but my point is that there are tons of things to consider if one wants to account for overheads in a holistic manner, and the VMFS engineers don’t want you to bother with these “tons” of things. Let us handle all that for you.

    So in summary, block sizes and thin provisioning should be treated orthogonally. Since thin provisioning is an official feature, the thing for users to know is that it will work “well” on all VMFS block size configurations that we support. Thinking about reservations or the number of IOs the resource manager does, queue sizes on a host vs the block size, etc. will confuse the user with assertions that are not valid all the time.

    I like the post in that it explains blocks vs sub-blocks. It also appeals to power users, so that’s great too. But reservation vs. thin provisioning considerations should be academic only. I can tell you about things like non-blocking retries, optimistic IO (not optimistic locking) and tons of other things that we have done under the covers to make sure reservations and thin provisioning don’t belong in the same sentence with vSphere 4. But conversely, I challenge any user to prove that 1MB incurs a significant overhead compared to 8MB with thin provisioning :)

  13. Duncan says:

    This is honestly one of the best replies I ever had on my blog. Thank you very much for your insights! I really appreciate it and I know for sure that all my readers will appreciate it.

  14. canalha says:

    A valuable discussion, guys.

    Question on top of it: won’t VMFS-3 also need a lock to allocate the sub-block?

  15. Tom says:

    It seems that block size only becomes important if one has a lot of I/O issues to think about (big companies, big apps, etc.). The majority of SMBs implementing ESX/vSphere are not going to have gigantic I/O issues, and for them the block size is just one more thing to remember. It makes more sense for everyone to set up their VMs aligned. I created my gold templates on aligned C:\ drives because I was creating new ones anyway. And I align any new partitions I create… I really liked Satyam’s comment. It convinced me that I don’t have too much to worry about by going with the defaults and making the effort to align disks.

  16. pwallace says:

    I am not sure I see where Satyam’s comments help dismiss disk aligning, but I am new to this so any explanation would be appreciated.
