
Yellow Bricks

by Duncan Epping


Block sizes and growing your VMFS

Duncan Epping · May 14, 2009

After the post on thin-provisioned disks, I had a discussion on block sizes with some of my colleagues. For those who did not read that post, here's a short recap:

If you create a thin-provisioned disk on a datastore with a 1MB block size, the thin-provisioned disk will grow in increments of 1MB. Hopefully you can see where I'm going: a thin-provisioned disk on a datastore with an 8MB block size will grow in 8MB increments. Each time the thin-provisioned disk grows, a SCSI reservation takes place because of metadata changes. As you can imagine, an 8MB block size will decrease the number of metadata changes needed, which means fewer SCSI reservations. Fewer SCSI reservations equals better performance in my book.
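To put rough numbers on that, here is a minimal back-of-the-envelope sketch (plain Python, not anything VMware ships); the 40GB growth figure is just an illustrative assumption:

```python
# Rough sketch: compare how many block allocations (and therefore metadata
# updates and SCSI reservations) a thin-provisioned VMDK triggers while it
# grows, for different datastore block sizes.
import math

def growth_events(written_gb: float, block_size_mb: int) -> int:
    """Number of new block allocations needed to back written_gb of data."""
    return math.ceil(written_gb * 1024 / block_size_mb)

written_gb = 40  # hypothetical amount of new data written into the thin disk
for block_size_mb in (1, 2, 4, 8):
    print(f"{block_size_mb}MB blocks: {growth_events(written_gb, block_size_mb):>6} allocations")
# An 8MB block size needs one eighth the allocations of a 1MB block size
# for the same amount of growth, hence fewer locking events.
```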

As some of you know, the locking mechanism has been improved with vSphere; yes, there's a good reason why they call it "optimistic locking". In other words, why bother increasing your block size if the locking mechanism has improved?

Although the mechanism behaves differently, that does not mean locking no longer needs to occur. In my opinion it's still better to have 1 lock vs 8 locks when a VMDK needs to grow. But there's another good reason: with vSphere come growable VMFS volumes. You might start with a 500GB VMFS volume and a 1MB block size, but when you expand the volume that block size might not be sufficient for the new VMs you create. Keep in mind that you can't modify the block size afterwards, while you just might have given people the impression that they can create disks beyond the maximum file size the block size allows. (Mind you: you will receive an error; it's not possible.)
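For reference, the limit I'm referring to is the maximum file size a VMFS-3 volume supports, which is determined by the block size chosen at creation time. The sketch below uses the commonly cited VMFS-3 figures; treat them as an assumption and verify against the VMware documentation for your release:

```python
# Commonly cited VMFS-3 maximum file (VMDK) sizes per block size.
# These figures are assumptions for illustration; check the official docs.
VMFS3_MAX_FILE_SIZE_GB = {
    1: 256,   # 1MB block size
    2: 512,   # 2MB block size
    4: 1024,  # 4MB block size
    8: 2048,  # 8MB block size
}

def max_vmdk_gb(block_size_mb: int) -> int:
    """Largest virtual disk a VMFS-3 datastore with this block size can hold."""
    return VMFS3_MAX_FILE_SIZE_GB[block_size_mb]

# A grown 1MB-block datastore still cannot hold a VMDK larger than 256GB.
print(max_vmdk_gb(1))
```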

So what about overhead? Will my 1KB log files all be created in 8MB blocks? Because that would mean a large overhead and might be a valid reason to use 1MB block sizes!

No, it will not. VMFS-3 solves this issue by offering a sub-block allocator: small files use a sub-block to reduce overhead. A sub-block on a 1MB block size volume is 1/16th the size of the block; on an 8MB block size volume it's 1/128th. In other words, the sub-blocks are 64KB in both cases, and thus the overhead is the same in both cases as well.
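The arithmetic is trivial, but here it is spelled out as a quick sketch:

```python
# Sub-block size on VMFS-3 works out the same regardless of the block size.
MB = 1024 * 1024

sub_block_1mb = (1 * MB) // 16    # 1/16th of a 1MB block
sub_block_8mb = (8 * MB) // 128   # 1/128th of an 8MB block

assert sub_block_1mb == sub_block_8mb == 64 * 1024  # both are 64KB
print(f"Sub-block size: {sub_block_1mb // 1024}KB on both 1MB and 8MB volumes")
# So a 1KB log file lands in a 64KB sub-block either way: identical overhead.
```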

Now my question to you guys: what do you think? Would it make sense to always use an 8MB block size? I think it would.

Server Storage, vmfs, vSphere, vstorage

Comments

  1. Ken Cline says

    14 May, 2009 at 15:30

    I’ve always felt that larger block sizes were more appropriate. VMFS deals with (mostly) large files, so take advantage of that and allocate in large chunks 🙂

  2. Bouke Groenescheij says

    14 May, 2009 at 16:13

    The option should be removed from the interface and VMware should always allocate 8MB blocks!!! I've seen too many cases where customers wanted to grow their VMDK but were limited to 256GB. Totally agree with you!

  3. Roger Lund says

    14 May, 2009 at 19:09

    Agreed, I always use the 8 MB block size for my datastore.

  4. Justin says

    14 May, 2009 at 20:20

    We used 2MB block size when we upgraded to 3.5. I may consider using 8MB in our next upgrade.

  5. glnsize says

    14 May, 2009 at 20:56

    Until I understood how VMFS-3 utilized sub-blocks I always used 1MB. Once it was explained to me how it all worked I switched to 8MB. Sub-block allocation negated my only concern with using the larger block size.

  6. Sid Smith says

    14 May, 2009 at 21:49

    I agree; with the changes made as part of vSphere, using an 8MB block size is the logical choice.

  7. jl says

    14 May, 2009 at 22:10

    We will wait and see how other people experience thin-provisioned disks performance-wise and whether the SCSI reservations are going to cause problems. Just saying that vSphere has optimized locking technology is not enough; we need a detailed white paper that describes why it is better.

  8. Duncan Epping says

    14 May, 2009 at 22:57

    I don’t think you really understood my post… Or I am really confused about what you are trying to say.

  9. craig says

    17 May, 2009 at 03:41

    I am just a little curious here: if an 8MB block size is the right setting to go for, why does the block size recommendation from VMware in the administration console always tie a certain block size to a specific datastore size? Any idea? Just to clarify.

  10. Duncan Epping says

    17 May, 2009 at 06:38

    Good question and I honestly don’t know. That’s one of the reasons I asked you guys to chip in…

  11. Sharninder says

    19 May, 2009 at 07:21

    8MB all the way. There aren't many files smaller than 64KB on a VMFS anyway, and the sub-blocks solve the internal fragmentation problem to a large extent. 8MB sounds like the logical choice for vSphere.

    Although I'm not sure how thin-provisioned disks will behave compared to thick disks in terms of performance.

  12. Satyam Vaghani says

    29 May, 2009 at 18:51

    I am a VMware employee and I wrote VMFS with a few cronies, but the following is a personal opinion:

    Forget about locking. Period. Yes, SCSI reservations do happen (and I am not trying to defend that here) and there will be some minor differences in performance, but the suggestion in the (very well written) blog post goes against the mission of VMFS, which is to simplify storage virtualization.

    Here's a counter-example: if you have a nearly full 8MB VMFS volume and a less full 1MB VMFS volume, you'll still encounter less I/O overhead allocating blocks on the 1MB VMFS volume compared to the 8MB volume, because the resource allocator will sweat more trying to find a free block in the nearly full volume. This is just one scenario, but my point is that there are tons of things to consider if one wants to account for overheads in a holistic manner, and the VMFS engineers don't want you to bother with these "tons" of things. Let us handle all that for you.

    So in summary, block sizes and thin provisioning should be treated orthogonally. Since thin provisioning is an official feature, the thing for users to know is that it will work "well" on all VMFS block size configurations that we support. Thinking about reservations, the number of I/Os the resource manager does, queue sizes on a host vs. the block size, etc. will confuse the user with assertions that are not valid all the time.

    I like the post in that it explains blocks vs. sub-blocks. It also appeals to power users, so that's great too. But reservation vs. thin provisioning considerations should be academic only. I can tell you about things like non-blocking retries, optimistic I/O (not optimistic locking) and tons of other things that we have done under the covers to make sure reservations and thin provisioning don't belong in the same sentence with vSphere 4. But conversely, I challenge any user to prove that 1MB incurs a significant overhead compared to 8MB with thin provisioning 🙂

  13. Duncan says

    29 May, 2009 at 23:45

    This is honestly one of the best replies I ever had on my blog. Thank you very much for your insights! I really appreciate it and I know for sure that all my readers will appreciate it.

  14. canalha says

    30 May, 2009 at 04:06

    A valuable discussion, guys.

    Question on top of it: won’t VMFS-3 also need a lock to allocate the sub-block?

  15. Tom says

    30 May, 2009 at 17:43

    It seems that block size might be important only if one has a lot of I/O issues to think about (big companies, big apps, etc.). But the majority of SMBs implementing ESX/vSphere are not going to have gigantic I/O issues, and the block size is just one more thing to remember. It makes more sense for everyone to set up their VMs aligned. I created my gold templates on aligned C:\ drives because I was creating new ones anyway, and I align any new partitions I create… I really liked Satyam's comment. It convinced me that I don't have too much to worry about by going with the defaults and making the effort to align disks.

  16. pwallace says

    10 December, 2009 at 21:09

    I am not sure I see where Satyam’s comments help dismiss disk aligning, but I am new to this so any explanation would be appreciated.

  17. Anders Olsson says

    22 April, 2010 at 14:10

    According to http://kb.vmware.com/kb/1003565 sub blocks don’t exist:

    “With a block size of 1MB, every file that is created uses at least 1MB of space on the storage, regardless of its actual size. With an 8MB block size, a 1KB file still occupies 8MB of space. The unused space in that block is wasted. The larger block size is only required when a file is so large that it requires an extended addressing space. Being aware of the intended use helps with your planning and efficient use of space on the data store.”

  18. Golddiggie says

    23 April, 2010 at 14:44

    Is the overhead with thin provisioning the reason why some VMs just run better if the drives are thick provisioned?

    I've moved from having all of a VM's drives thin provisioned to making the C drive (where the OS is installed, and not much else) thick, with the other drives mostly thin (at least for now). I'm actually going to make my SQL server thick on both drives now, so that there's less overhead and (possibly) better performance on the VM. It would explain why, on slower storage, thin provisioning may not be such a good idea.

    Where there's been plenty of storage, I've typically used the block size on each LUN that made sense for the size of the virtual disks on the VMs. If we're looking at small drives that easily fit under the 256GB size limit for the 1MB block size, then I use that. I have had VMs that (occasionally) needed larger drives, so larger block sizes were used on those LUNs.

    I see it really coming down to proper planning when deciding what block size to use. Have a baseline that will work for the majority of the VMs, but still have some LUNs that are different for when you need to deviate from the standard. It also goes along with the size you make the LUNs… If you're using 500GB (or less) LUNs, then 1MB (or 2MB) block sizes make a lot more sense than 8MB blocks. If you're using 2TB LUNs, then the larger block sizes probably make more sense. I would also not make a single LUN the only one of its size with that block size.

  19. Ken says

    12 July, 2010 at 01:45

    It looks like the KB (http://kb.vmware.com/kb/1003565) has been fixed. It now reads:

    “VMFS3 uses sub blocks for directories and small files with size smaller than 1 MB. When the VMFS uses all the sub block (4096 sub blocks of 64 KB each), file blocks will be used. For files of 1 MB or higher, file blocks are used. The size of the file block depends on the block size you selected when the Datastore was created.”

  20. urs weber says

    21 September, 2010 at 17:47

    I’ve one more question concerning block size:

    I prefer bigger block sizes too, but how is the changed block tracking feature affected?

    Is changed block tracking also using the VMFS block size?
    If yes, bigger blocks mean less granularity and so may lead to lower compression and deduplication ratios.

    Are my assumptions right, or does the block size not affect changed block tracking?

    Regards

    Urs

    • David says

      15 July, 2011 at 15:05

      I know this is an older article but found it very interesting as I was searching for articles on VMFS versions and block sizes.

      We recently moved to ESXi 4.1 from ESX 4.0 and began having issues with backups using Vranger. For performance gains, Vranger used the console in ESX to process and proxy the data before sending it across the network to the repository. In ESXi the console is removed, so this local proxying could not be done. To compensate for the lack of a console, the Vranger folks now recommend running Vranger on a VM and using a technology called Hot Add, where the VM acts as the proxy before sending data across the network to the repository.

      Now this is where the block size comes in. It appears that there is a limitation in the VDDK that states the following:

      “•Hot Add limitation when VMFS block sizes are mismatched
      Hot Add cannot be used if the VMFS block size of the datastore containing the virtual machine folder for the target virtual machine does not match the VMFS block size of the datastore containing the proxy virtual machine. For example, if you back up virtual disk on a datastore with 1MB blocks, the proxy must also be on a datastore with 1MB blocks.”

      This article can be found in the VDDK release notes
      http://www.vmware.com/support/developer/vddk/VDDK-1.2.1-Relnotes.html

      This became a big issue when trying to back up larger VMs that were on datastores with different block sizes than where the Vranger proxy was running. Hot Add reduced the backup time by a factor of 2-3: a backup that normally would take over 24 hours would now finish in under 10 hours.

      • Duncan Epping says

        15 July, 2011 at 16:37

        Thanks, great info

  21. Sir Mix-A-lot says

    2 December, 2010 at 15:28

    “I like big blocks, and I cannot lie”
