
Yellow Bricks

by Duncan Epping


Software Defined

Frequently asked questions about Virtual SAN / VSAN

Duncan Epping · Sep 16, 2013 ·

After I published the vSphere Flash Read Cache FAQ many asked if I would also do a blog post for frequently asked questions about Virtual SAN / VSAN. I guess it makes sense considering Virtual SAN / VSAN being such a hot topic. So here are the questions I have received so far, followed by the answers of course. If you have a question do not hesitate to leave a comment.

** updated to reflect VSAN GA **

  • Can I add a host to a VSAN cluster which does not have local disks?
    • Yes, a VSAN cluster can consist of hosts which are not contributing to VSAN storage. You will need to create a VSAN VMkernel interface on the host and simply add the host to the cluster. Note that you will need at a minimum 3 hosts which contribute storage to VSAN.
  • VSAN requires an SSD, what is it used for?
    • The SSD is used for read caching (70%) and write buffering (30%). Every write will go to SSD first and will be destaged to HDD later.
  • When creating my VSAN VM Storage Policy, when do I use “failures to tolerate” and when do I use “stripe width”?
    • Failures to tolerate is all about availability: this is what you define when your virtual machine needs to remain available after a host or disk group has failed. So if you want to take 1 host failure into account, you set the policy to 1. This will then create 2 data objects and 1 witness in your cluster. Stripe width is about performance (read performance when not in cache and write destaging). Setting it to two or higher will result in data being striped across multiple disks. When used in conjunction with “failures to tolerate” this could potentially result in data of a single VM being stored on multiple disks on multiple hosts.
  • Is there a default storage policy for VSAN?
    • Yes, there is a policy applied by default to all VMs on a VSAN datastore, but you cannot see this policy within the vSphere UI. You can see that a default policy is defined for various classes using the following command: esxcli vsan policy getdefault. By default an N+1 failures to tolerate policy is applied, so that objects are made resilient even when a user forgets to create and set a policy. It is not recommended to change the default policy.
  • How is data striped across multiple disks on a host when stripe width is set to 2?
    • First of all, when stripe width is set to 2 there is no guarantee that the data is striped across disks within a host. VSAN has its own algorithm to determine where data should be placed, and as such it could happen that, although you have sufficient disks in all hosts, your data is striped across multiple hosts instead of across disks within a host. When data is striped this is done in chunks of 1MB.
  • What is the purpose of “disk groups” since VSAN will create one datastore anyway?
    • A disk group defines the SSD that is used for caching/buffering in front of a set of HDDs. Basically a disk group is a way of mapping HDDs to an SSD. Each disk group will have 1 SSD and a maximum of 7 HDDs.
  • How many disks can a single host contribute to VSAN?
    • A maximum of 5 disk groups per host
    • Each disk group needs 1 SSD and 1 HDD at a minimum and 7 HDDs at a maximum
    • HDD count max per host = 5 x 7 = 35
    • SSD count max per host = 5 x 1 = 5
  • Are both SSD and PCIe Flash cards supported?
    • Yes, both are supported, but check the HCL for more details as there are guidelines and requirements.
  • Is 10GbE a hard requirement for VSAN?
    • 10GbE is not a hard requirement for VSAN. VSAN works perfectly fine in smaller environments, including labs, with 1GbE. Do note that 10GbE is a recommendation.
  • Why is it recommended for HA’s isolation response to be configured to “powered-off”?
    • When VSAN is enabled vSphere HA uses the VSAN VMkernel network for heartbeating. When a host does not receive any heartbeats, it is most likely that the host is also isolated/partitioned from a VSAN perspective from the rest of the cluster. In this state it is recommended to power-off the virtual machine as a new copy will be powered-on by HA on the remaining hosts in the cluster automatically. This way when the host comes out of isolation the situation where 2 VMs with the same identity are on the network does not occur.
  • Can I partition my SSD or disks so that I can use them for other (install ESXi / vFlash) purposes?
    • No, you cannot partition your SSD or HDD(s). Virtual SAN will only, and always, claim entire disks. With VSAN it probably makes most sense to install ESXi on an internal USB/SD card, in order to maximize the capacity available to VSAN.
  • Does VSAN support deduplication or compression?
    • In the current version VSAN does not support deduplication or compression. The most expensive resource in your VSAN cluster is SSD/Flash, hence duplication of data is most relevant on that layer. While having multiple copies of your data results in two copies on HDDs, and two temporary copies in the distributed write buffer (30% of the SSDs), the distributed read cache portion of the Flash (70%) will only contain a single copy of any cached data.
  • Can VSAN leverage SAN/NAS datastores?
    • VSAN currently does not support the use of SAN/NAS datastores. Disks will need to be “local” and directly passed to the host.
  • I was told VSAN does thin disks by default, if I set Object Space Reservation to 100% does that mean the VMDK will be eager zero thick provisioned?
    • No, defining Object Space Reservation does not mean the VM, or a portion of it for that matter, will be thick provisioned. Object Space Reservation is all about the numbers used by VSAN when calculating used disk space / available disk space etc. When Object Space Reservation is set to 100% on a disk of 25GB then this disk will still be a thin provisioned disk, but VSAN will do its math with 100% of the 25GB counted as used. I guess you can compare it to a memory reservation.
  • Does VSAN use iSCSI or NFS to connect hosts to the datastore?
    • VSAN does not use either of these two to connect hosts to a datastore. It uses a proprietary mechanism.
  • What is the impact of maintenance mode in a VSAN enabled cluster?
    • There are three ways of placing a host which is providing storage to your VSAN datastore in maintenance mode:
      1) Full Data Migration – All data residing on the host will be migrated. Impact: Could take a long time to complete.
      2) Ensure accessibility – VSAN ensures that all VMs will remain accessible by migrating the required data to other hosts. Impact: Potentially availability policies are violated.
      3) No Data Migration – No data will be migrated. Impact: Depending on the “failures to tolerate” policy defined some VMs might become unusable.
      The safest option is option 1, with option 2 being the preferred and default as it is the fastest to complete. I guess the question is why you are placing the host in maintenance mode and how fast it will become available again. Option 3 is a fallback, in case you really need to get into maintenance mode fast and don’t care about potential data loss.
  • Are there any features of vSphere which aren’t supported/compatible with VSAN?
    • Currently vSphere Distributed Power Management, Storage DRS and Storage IO Control are not supported with VSAN.
  • How do I add a Virtual SAN / VSAN license?
    • VSAN licenses are applied on a cluster level. Open the Web Client, click on your VSAN enabled cluster, then click the “Manage” tab followed by “Settings”. Under “Configuration” click “Virtual SAN Licensing” and then click “Assign License Key”.
  • How will Virtual SAN be priced / licensed?
    • VSAN is licensed per socket, the price is $2,495 per socket or $50 per VDI user. Note that the license includes the Distributed Switch and VM Storage Policies, even when using a vSphere license lower than Enterprise Plus!
  • If a host has failed and as such data is lost and all VMs were protected N+1, how long will it take before VSAN starts rebuilding the lost data?
    • VSAN will identify which objects are out of compliance (those which had N+1 and were stored on that host) and starts a time-out period of 60 minutes. It has a time-out period to avoid an unnecessary and costly full sync of data. If the host returns within those 60 minutes then only the differences will be copied to that host. When a VM has multiple mirrors it doesn’t notice the failure. This 60 minute period is all about going back to full policy compliance, i.e. being able to satisfy additional failures should they occur.
  • When a virtual machine moves around in a cluster will its objects follow to keep IO local?
    • No, objects (virtual disks for instance) do not follow the virtual machine. Just imagine what the cost/overhead of moving virtual disks between hosts would be each time DRS suggests a migration. Instead IO can be done remotely. Meaning that although your virtual machine might run on host-1 from a CPU/Mem perspective, its virtual disks could be physically located on host-2 and host-3.
  • When a Virtual Machine is migrated to another host, is the SSD cache lost after the vMotion (temporary performance hit) and rebuilt over time?
    • No, the cache will not be lost and there is no need to rebuild/warm up the cache again. The cache will be accessed remotely when needed.
  • Does VSAN support Fault Tolerance aka FT?
    • No, VSAN does not support Fault Tolerance in this release.
  • The SSD in my host is being reported in vSphere as “non-SSD”. According to support this is a known issue with the generation of server I am using. Will this “mis-reporting” of the disk type affect my ability to configure a VSAN?
    • Yes it will. You will need to tag the SSD as local and as SSD using the command below (the example is what I use in my lab, your device identifier will be different). In this case I claim it as being “local” and as “SSD”.
      esxcli storage nmp satp rule add --satp VMW_SATP_LOCAL --device mpx.vmhba2:C0:T0:L0 --option "enable_local enable_ssd"
  • It was mentioned that it will take 60 minutes after a failure before VSAN starts the automatic repair. Is it possible to shorten this time-out value?
    • **disclaimer: Although I do not recommend changing this value, I was told it is supported**
      Yes it is possible to shorten this time-out value by configuring the advanced setting named “VSAN.ClomRepairDelay” on every host in your VSAN cluster.
  • Why can’t I use datastore heartbeat functionality in a VSAN-only cluster?
    • There is no requirement for heartbeat datastores. The reason you do not have this functionality when you only have a VSAN datastore is that HA will use the VSAN network for heartbeats. So if a host is isolated from the VSAN network and cannot send heartbeats, it is safe to say that it will also not be able to update a heartbeat region remotely, which makes it pointless to enable this feature in a VSAN-only environment.
  • Are there specific Best Practices around deploying View on VSAN?
    • Yes there are, primarily around availability / caching and capacity reservations. Andre Leibovici wrote an article on this topic, read it!
  • Can the VSAN VMkernel of hosts in a cluster be part of a different subnet?
    • VSAN VMkernel interfaces need to be part of the same subnet. A different subnet for one (or multiple) hosts within a VSAN cluster is not supported. When using multiple VMkernel interfaces per host, each interface needs to be part of a different subnet!
  • Does VSAN support being stretched across multiple geographical locations?
    • In the current version VSAN will not support “metro” clustering.
  • Is there a difference between a host failing and a disk gradually failing?
    • Yes, there is a difference. There are various failure states, and the state determines how fast VSAN will spin up a new mirror. The two failure states are “absent” and “degraded”. Degraded is where a disk has failed and the system has recognized this as such and knows it isn’t coming back. In this case VSAN recognizes this “degraded” state and will create a new mirror of the impacted objects immediately, as there is no point in waiting for 60 minutes when you know it isn’t coming back soon. The “absent” state means that VSAN doesn’t know if it is coming back any time soon. This could be a host that has failed or, for instance, a disk that you yank. In this case the 60 minute time-out starts.
  • Is there any explanation around how VSAN handles disk failures or host failures?
    • Yes, I wrote an article on this topic. Please read “How VSAN handles a disk or host failure” for more details.
  • What happens when an SSD fails in a VSAN cluster?

    • An SSD sits in front of a Disk Group as the read cache / write buffer. When the SSD fails, the disk group and all the components stored on it are marked as degraded. VSAN will then instantiate new mirror copies where applicable and when sufficient disk capacity is available. For more details read this post.
  • Does vSphere support TRIM for SSDs?
    • No, TRIM is currently not supported/leveraged.
  • What are the Maximum Numbers for Virtual SAN GA?
    • 32 hosts per cluster
    • 100 VMs per host maximum
    • 3200 VMs per cluster maximum
    • 2048 VMs HA protected per cluster maximum
    • 2 million IOPS tested
  • How do I size a VSAN datastore / cluster?
    • I developed a sizing calculator which can be found here.
  • How do I monitor VSAN performance?
    • Performance can easily be monitored using the VSAN Observer tool. This has been discussed by various people: here, here and here, here.
  • What’s likely to affect VSAN performance?
    • Performance is most likely affected by leveraging cheap flash devices or incorrectly configured policies. In the case a workload is highly random and has a large “working set” it could be that many of the IOs will need to come from disk, this can also impact performance depending on the disk type used and the number of disk stripes.
  • Why is Storage DRS not supported in VSAN?
    • VSAN only provides a single datastore and has its own placement and balancing algorithms.
  • What will happen when the whole environment goes down and powers back on again? Do we run some sort of integrity check?
  • Is VSAN dependent on vCenter? Can I configure VSAN if vCenter is down?
    • VSAN is not dependent on vCenter. It can be configured from the console using “esxcli” and can even be configured and used before vCenter is up and running. William Lam wrote two articles around how to bootstrap vCenter on a single host running VSAN. (here and here)
  • Could you have locality in VSAN? Does locality make sense at all compared to other solutions?
    • By default VSAN does not have a “data locality” concept as I explained here. However, for View environments CBRC is fully supported and that provides a local read cache for desktops.
  • Is vCops aware of VSAN datastore?
    • The current release of VC Ops has limited functionality when it comes to VSAN. The upcoming version of VC Ops will include more statistics and ways of monitoring a VSAN datastore.
  • How do you back up your VMs in VSAN? Just the usual existing backup procedures?
    • VDP supports VSAN and various storage vendors are going through testing/releasing a new version of their product as we speak. VMs stored on a VSAN datastore should not be treated differently than regular VMs.
  • Does VSAN support any data reduction mechanisms like deduplication or compression?
    • In the current version deduplication or compression is not included.
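
To make some of the numbers in the answers above a bit more tangible, here is a minimal Python sketch of the arithmetic. It only uses figures quoted in this FAQ (5 disk groups per host, 1 SSD and up to 7 HDDs per disk group, the 70% / 30% read cache / write buffer split). The function names are made up for illustration, and the "n failures to tolerate means n + 1 data copies" rule is my generalization of the "policy of 1 creates 2 data objects" example. Witness components and metadata overhead are ignored, so treat it as a rough estimate rather than a sizing tool.

```python
# Limits quoted in the FAQ above (Virtual SAN GA).
MAX_DISK_GROUPS_PER_HOST = 5
SSDS_PER_DISK_GROUP = 1
MAX_HDDS_PER_DISK_GROUP = 7


def max_disks_per_host():
    """Return (max HDDs, max SSDs) a single host can contribute."""
    hdds = MAX_DISK_GROUPS_PER_HOST * MAX_HDDS_PER_DISK_GROUP
    ssds = MAX_DISK_GROUPS_PER_HOST * SSDS_PER_DISK_GROUP
    return hdds, ssds


def data_copies(failures_to_tolerate):
    """Assumption: tolerating n failures keeps n + 1 data copies (1 -> 2 in the FAQ)."""
    return failures_to_tolerate + 1


def hdd_capacity_consumed_gb(vmdk_gb, failures_to_tolerate):
    """HDD capacity a fully written VMDK would consume; witnesses are ignored."""
    return vmdk_gb * data_copies(failures_to_tolerate)


def ssd_split_gb(ssd_gb):
    """Split an SSD into read cache (70%) and write buffer (30%)."""
    read_cache = ssd_gb * 70 / 100
    write_buffer = ssd_gb * 30 / 100
    return read_cache, write_buffer


if __name__ == "__main__":
    print(max_disks_per_host())             # (35, 5)
    print(hdd_capacity_consumed_gb(25, 1))  # 50
    print(ssd_split_gb(100))                # (70.0, 30.0)
```

So a 25GB disk with a policy of 1 failure to tolerate can end up consuming 50GB of HDD capacity once fully written, which is worth keeping in mind when sizing.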

If you have a question, please don’t hesitate to ask… Over time I will add more and more to this list so come back regularly.

vSphere Flash Read Cache and esxcli

Duncan Epping · Sep 13, 2013 ·

As with most features in vSphere these days, you can configure vSphere Flash Read Cache using the awesome esxcli command. I’ve already mentioned esxcli in my vSphere Flash Read Cache FAQ blog but I wanted to call it out explicitly here as I found it very useful. You can get some nice details using the esxcli command. So where do we start?

First thing would be:

esxcli storage vflash

This will show that there are 3 namespaces: cache, module and device. Let’s start top down with device. The command “esxcli storage vflash device list” will show you a list of all flash devices and whether each has been configured for vFRC or not. The module namespace can provide you some more details around, for instance, cache block sizes. If you run the command this is what the output looks like:

~ # esxcli storage vflash module get
 Min Supported Module Version: 1.0.0.0
 Revision: 1.0.0.0
 Supported Cache Block Size Max: 1048576
 Supported Cache Block Size Min: 4096
 Supported Cache Size Max: 214748364800
 Supported Cache Size Min: 4194304
 Supported Disk Size Max: 17592186044416
 Supported Mode Mask: WriteThru
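
The sizes in this output are reported in bytes, which makes them a bit hard to read. As a small illustration, the Python sketch below parses the output shown above and converts the numeric values into binary units. The sample text is copied from the output above; the helper names (human, parse_module_output, readable_sizes) are made up for this example and are not part of esxcli.

```python
# Output of "esxcli storage vflash module get" as shown above.
SAMPLE_OUTPUT = """\
 Min Supported Module Version: 1.0.0.0
 Revision: 1.0.0.0
 Supported Cache Block Size Max: 1048576
 Supported Cache Block Size Min: 4096
 Supported Cache Size Max: 214748364800
 Supported Cache Size Min: 4194304
 Supported Disk Size Max: 17592186044416
 Supported Mode Mask: WriteThru
"""

UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]


def human(num_bytes):
    """Convert a byte count to the largest binary unit that divides it evenly."""
    index = 0
    while num_bytes >= 1024 and num_bytes % 1024 == 0 and index < len(UNITS) - 1:
        num_bytes //= 1024
        index += 1
    return f"{num_bytes} {UNITS[index]}"


def parse_module_output(text):
    """Turn 'Key: value' lines into a dictionary of strings."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values


def readable_sizes(values):
    """Keep only the purely numeric fields and make them human readable."""
    return {key: human(int(value)) for key, value in values.items() if value.isdigit()}


if __name__ == "__main__":
    for key, size in readable_sizes(parse_module_output(SAMPLE_OUTPUT)).items():
        print(f"{key}: {size}")
```

Read this way, the module supports cache block sizes from 4 KiB to 1 MiB and cache sizes from 4 MiB to 200 GiB.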


Frequently asked questions about vSphere Flash Read Cache

Duncan Epping · Sep 11, 2013 ·

Last week I received many emails on the topic of vFlash so I figured I would start collecting these questions and answer them in a “frequently asked questions about vSphere Flash Read Cache” article. That way the info is out in the open and it should be fairly easy to find using Google. If you haven’t done so yet, I would recommend starting by reading my introduction to vSphere Flash Read Cache. If you already have, here are the questions (and answers) I received so far:

  • In a cluster where some hosts have vFlash resources and others do not, can I vMotion a VM with vFlash assigned to a host without vFlash resources?
    • No. In order to successfully migrate a VM, vFlash resources are required on the destination host.
  • When migrating a VM I have various options, in essence I can move the cache or drop it. What is the impact of moving the cache?
    • When selecting to move the cache, essentially an “X-vMotion” (cross vMotion / non-shared-disk migration) is performed. Meaning that the “local” vFlash cache file that is stored on the VFFS filesystem is copied from source to destination. This results in the migration taking longer, but also means that the cache is hot instantly and does not need to be warmed up (the impact of dropping the cache is the warm up that needs to happen again).
  • What are the requirements and considerations to run vSphere Flash Read Cache from a VM/Guest perspective?
    • The requirements from a guest perspective are minimal: other than VM hardware level 10 (5.5 compatibility) there are no requirements. Considerations are also minimal. Of course the IO pattern will matter: if you do 90% writes then the benefit will be minimal. If you have a specific IO size then a consideration could be changing the block size.
  • Will a VM be moved by DRS when it has vFlash enabled on a disk?
    • DRS treats a virtual machine with vFlash enabled as if “should vm-host” affinity rules are applied. Meaning that it will only move a virtual machine when absolutely needed, example being when entering maintenance mode or when it is the only way to solve over-utilization. (Similar to “low latency VM” functionality)
  • What additional considerations (if any) need to be made for backing up vFRC supported guests; at guest and/or VM layer?
    • There are no direct considerations with regards to backup. Considering vSphere Flash Read Cache is “write-through” there is no such thing as “dirty data”. Both cache and storage will be in-sync at all times. One thing to realize though is that when you do a full VM image level backup, the VM will need to be restored to a host that has vFlash enabled in order for it to be powered on. Also note that with other solutions which do write-back caching, backup could be a potential caveat / concern. Discuss the potential impact with those vendors as each implementation differs.
  • What is the purpose of the “block size” which I can specify in the advanced settings? What is the minimum and maximum size?
    • The block size can be configured from 4KB to 1024KB. In order to optimize for I/O performance and minimize overhead try to align the vFlash blocksize to the blocksize used by your Guest OS / Application. You can leverage a tool like vscsiStats to identify the blocksize used by your virtual machine(s).
  • What happens to a virtual machine which is accelerated and the flash device on the host it is running on fails?
    • When a flash device fails which is being used for vFlash and a VM is actively using it then the VM will continue running but will experience a performance degradation as all reads will need to come from the storage device instead of SSD. In other words, an SSD failure does not cause any form of disruption to the workload but could potentially result in a degraded performance (similar to the performance before enabling vFlash)
  • What if I run out of capacity on my flash device?
    • When a flash device runs out of capacity you will simply not be able to reserve flash resources for new virtual machines or virtual machines for which you want to allocate flash resources.
  • Does vSphere Flash Read Cache do anything for writes like other solutions do?
    • Well I guess the short answer is “no, not directly”. It is called “vSphere Flash Read Cache” for a reason I guess… But when you think about it, although it doesn’t provide write-back caching the “read caching” portion will free up expensive array resources used for those reads. These expensive resources can now be used for… indeed… writes. So although vSphere Flash Read Cache does not provide write-back caching, only write-through, it might lead to accelerated writes simply because of the decrease of stress on your storage system.
  • Is it possible to configure vSphere Flash Read Cache on a virtual machine with Raw Device Mappings (RDM)?
    • vSphere Flash Read Cache does not support Physical RDMs. vSphere Flash Read Cache does support Virtual RDMs.
  • Can I use vFlash on a virtual machine which is stored on a VSAN datastore?
    • No, you cannot enable vFlash on a virtual machine which is stored on a VSAN datastore. Considering VSAN already provides read caching and write buffering there is no real value in enabling that second layer of caching and adding complexity along with it.
  • If I have VSAN enabled and also a NAS/SAN datastore, can my VM that is running on the NAS/SAN datastore be enabled for vFRC?
    • Yes, when the VM is stored on the NAS/SAN datastore you can enable vFRC for this particular VM.
  • Is there a minimum size for vSphere Flash Read Cache allocation to a virtual machine?
    • The minimum size for vSphere Flash Read Cache is dependent on the selected block size. With a 4KB block size the minimum is 4MB, and with a 1MB block size the minimum is 276MB.
  • What happens if I have my block size configured to 1024KB and I do a 4KB IO?
    • Within that 1024KB cache block your 4KB IO will be stored, the rest of the block (1020KB in this scenario) will be marked as “empty”. Hence the reason it is important to align your IO block size with your cache block size to avoid wastage.
  • What happens to my cache when I resize it on a powered-on VM?
    • When the cache is resized all cache is discarded. This results in a period of time where the cache will need to warm up again.
  • If a host fails and virtual machines with vFRC enabled need to be failed over but there are insufficient resources available, what happens?
    • vFRC allocations are reservations. The behavior is similar to that of a CPU or memory reservation. When there is not enough unreserved capacity available then HA will not be able to power-on the virtual machine.
  • When I configure vSphere Flash Read Cache the maximum cache size seems to be limited to 200GB, what is supported and how do I change this?
    • The maximum cache size per virtual machine is 400GB, however in the current release the cache size is limited to 200GB by default. You can however change the advanced setting called “VFLASH.MaxCacheFileSizeMB” to 409600 if desired. Do note that you will need to change this setting on every host if you want the virtual machine to be able to vMotion between hosts or want vSphere HA to be able to restart a virtual machine.
  • How do I monitor a VM to see how much cache is being used? Is there anything in vCenter or ESXTOP to help me?
    • vCenter contains multiple metrics on a virtual machine level. Just look at the advanced section and more explicitly the virtual disk counters starting with “Virtual Flash …”. There are also various helpful metrics to be found using “esxcli storage vflash cache stats get -c <cache name>”. This will for instance provide you with the cache hit rate, latency etc. I have not found any esxtop counters yet.
  • When a virtual machine with vSphere Flash Read Cache enabled is vMotioned including the cache which network is used to transfer the cache?
    • When a VM is migrated / vMotioned including the cache then the “XvMotion / enhanced vMotion / non-shared migration” logic is used to move the cache content from source to destination. The data is transferred over the vMotion network. Ensure that sufficient bandwidth is available to shorten the migration time.
  • When flash content is migrated along with the virtual machine, will the migration process make a distinction between hot and cold cache?
    • XvMotion is responsible for migrating the cache from source to destination. Today it doesn’t know the difference and will copy the full file from source to destination. Meaning that if you have a cache size of 10GB the full 10GB will be copied regardless of it being used or not.
  • Can I partition a single disk and use it for both vSphere Flash Read Cache and the VSAN SSD tier?
    • This is not supported! Both VSAN and vSphere Flash Read Cache require their own flash device.
  • The SSD in my host is being reported in vSphere as “non-SSD”. According to support this is a known issue with the generation of server I am using. Will this “mis-reporting” of the disk type affect my ability to configure a vFlash? <added 17/Sept/2013>
    • Yes it will. You will need to tag the SSD as local and as SSD using the command below (the example is what I use in my lab, your device identifier will be different). In this case I claim it as being “local” and as “SSD”.
      esxcli storage nmp satp rule add --satp VMW_SATP_LOCAL --device mpx.vmhba2:C0:T0:L0 --option "enable_local enable_ssd"
  • If I disable Strict Admission Control for vSphere HA, would it be able to power-on a VM with vFRC enabled? <added 17/Sept/2013>
    • No, admission control does not have anything to do with powering on virtual machines. Even if you disable admission control, a virtual machine with vFRC enabled cannot be powered on when there are insufficient flash resources available.
    • Also note that HA Admission Control does not take vFRC resources into account whatsoever. So in an environment running vFRC you can technically use 100% of all flash resources, whilst Admission Control was set up in an N+1 fashion.
  • Is vFRC as good as the caching solution by vendor X? <added 18/Sept/2013>
    • I have had this question many times. It is difficult to answer as it depends. It depends on:
      • Your requirements from a functional perspective (read vs write back caching)
      • Type of workload you are running (read vs write, VDI vs Server)
      • Support requirements (Not all solutions out there are on the VMware compatibility list)
      • Operational requirements (vFRC needs to be enabled per virtual disk, some vendors do per datastore for instance etc)
      • Budget and even current vSphere licenses used (vFRC is included with Enterprise Plus)
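
A few of the answers above boil down to simple arithmetic, so here is a short Python sketch of them: the space left empty when an IO is smaller than the cache block, the reservation check that decides whether a VM can be powered on, and the conversion of the 409600 value for the maximum cache file size. The function names are hypothetical and only serve this example. The rounding up to whole blocks for IOs larger than a single block is my assumption, the FAQ only describes the 4KB IO in a 1024KB block case.

```python
import math


def wasted_cache_kb(block_kb, io_kb):
    """Cache space left empty when an IO of io_kb lands in blocks of block_kb.

    Assumption: an IO larger than one block occupies whole blocks, rounded up.
    """
    blocks_used = math.ceil(io_kb / block_kb)
    return blocks_used * block_kb - io_kb


def can_power_on(requested_cache_gb, unreserved_flash_gb):
    """vFRC allocations are reservations: power-on needs enough unreserved flash."""
    return requested_cache_gb <= unreserved_flash_gb


def mb_to_gb(megabytes):
    """Convert a size in MB to GB (1GB = 1024MB)."""
    return megabytes / 1024


if __name__ == "__main__":
    print(wasted_cache_kb(1024, 4))  # 1020, the example from the FAQ
    print(wasted_cache_kb(4, 4))     # 0, block size aligned with IO size
    print(can_power_on(10, 8))       # False, not enough unreserved flash
    print(mb_to_gb(409600))          # 400.0, the maximum cache size in GB
```

The first two calls show why aligning the cache block size with the IO size used by your guest matters: the misaligned case leaves 1020KB of every block unused.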

If you have any other questions, don’t hesitate to drop them here and I will add them to the FAQ when applicable to the broader audience.

Startup News Flash part 5

Duncan Epping · Sep 10, 2013 ·

After the VMworld storm has slowly died down it is back to business again… This also means less frequent updates, although we are slowly moving towards VMworld Barcelona and I suspect there will be some new announcements at that time. So what happened in the world of flash/startups the last two weeks? This is Startup News Flash part 5, and it seems to be primarily around either funding rounds or acquisitions.

Probably one of the biggest rounds of funding I have seen for a “new world” storage company… $150 million. Yes, that is a lot of money. Congrats Pure Storage! I would expect an IPO at some point in the near future, and hopefully they will be expanding their EMEA based team. Pure Storage is one of those companies which has always intrigued me. As GigaOm suggests, this boost will probably be used to lower the prices, but personally I would prefer a heavy investment in things like disaster recovery and availability. It is an awesome platform, but in my opinion it needs dedupe-aware sync and async replication! That should include VMware SRM integration from day 1 of course!

Flash is hot… Virident just got acquired by Western Digital for $685 million. Makes sense if you consider WD is known as the “hard disk” company, they need to keep on growing their business and the business of hard disks is going to be challenging in the upcoming years with SSD becoming cheaper and cheaper. Considering this is the second acquisition (sTec being the other) related to flash by WD you can say that they mean business.

I just noticed that Cisco announced they intend to acquire Whiptail for $415 million in cash. Interesting to see Cisco moving into the storage space and definitely a smart move if you ask me. With UCS for compute and Whiptail for storage they will be able to deliver the full stack, considering they more or less already own the world of networking. It will be interesting to see how they integrate it in their UCS offerings. For those who don’t know, Whiptail is an all flash array (AFA) which leverages a “scale out” approach, so start small and increase capacity by adding new boxes. Of course they offer most functionality other AFA vendors do, for more details I recommend reading Cormac’s excellent article.

To be honest, no one knew what to expect from this public VSAN Beta announcement. Would we get a couple of hundred registrations or thousands? Well I can tell you that they are going through the roof, make sure to register though if you want to be a part of this! Run it at home nested, run it on your test cluster at the office, do whatever you want with it… but make sure to provide feedback!

 

VMware vSphere Virtual SAN design considerations…

Duncan Epping · Sep 9, 2013 ·

I have been playing a lot with vSphere Virtual SAN (VSAN) in the last couple of months… I figured I would write down some of my thoughts around creating a hardware platform or constructing the virtual environment when it comes to VSAN. There are some recommended practices and there are some constraints, I aim to use this blog post to gather all of these Virtual SAN design considerations. Please read the VSAN introduction, how to install VSAN in your virtual lab and “How do you know where an object is located” to get a better understanding of the product. There is a long list of VSAN blogs that can be found here: vmwa.re/vsan

The below is all based on vSphere 5.5 Virtual SAN (public) Beta and my interpretation and thoughts based on various conversations with colleagues, engineering and reading various documents.

  • vSphere Virtual SAN (VSAN) clusters are limited to a maximum total of 32 hosts and there is a minimum of 3 hosts. VSAN is also currently limited to 100 VMs per host, resulting in a maximum of 3200 VMs in a 32 host cluster. Please note that HA currently has a limit of 2048 protected VMs in a single Datastore.
  • It is recommended to dedicate a 10GbE NIC port to your VSAN VMkernel traffic, although 1GbE is fully supported it could be a limiting factor in I/O intensive environments. Both VSS and VDS are supported.
  • It is recommended to have a VSAN VMkernel on every physical NIC! Make sure to configure them in an “active/standby” configuration so that when you have 2 physical NIC ports and 2 VSAN VMkernel interfaces each of them will have its own port. Do note that multiple VSAN VMkernel NICs on a single host on the same subnet is not a supported configuration; in different subnets it is supported.
  • IP Hash Load Balancing is supported by VSAN, but due to the limited number of IP addresses between source and destination the load balancing benefits could be limited. In other words, an EtherChannel formed out of 4x1GbE NICs will most likely not give you 4Gbps of throughput.
  • Although Jumbo Frames are fully supported with VSAN they do add a level of operational complexity. When Jumbo Frames are enabled ensure these are enabled end-to-end!
  • VSAN requires at a minimum 1 SSD and 1 Magnetic Disk per diskgroup on a host which is contributing storage. Each diskgroup can have a maximum of 1 SSD and 7 magnetic disks. When you have more than 7 HDDs or two or more SSDs you will need to create additional diskgroups.
  • Each host that is providing capacity to the VSAN datastore has at least one local diskgroup. There is a maximum of 5 disk groups per host!
  • It can be beneficial to create multiple smaller disk groups instead of larger disk groups. More disk groups means smaller failure domains and more cache drives / queues.
  • When sizing your environment, make sure to take data replicas into account. If your environment needs N+1 or N+2 (etc.) resiliency, factor this in accordingly.
  • SSD capacity does not count towards total VSAN datastore capacity. When sizing your environment, do not include SSD capacity in your total capacity calculation.
  • It is a recommended practice to have a minimum 1:10 ratio of SSD capacity to HDD capacity in each disk group. In other words, when you have 1TB of HDD capacity, it is recommended to have at least 100GB of SSD capacity. Note that VMware’s recommendation has changed since the beta; the new recommendation is:
    • 10 percent of the anticipated consumed storage capacity before the number of failures to tolerate is considered
  • By default, 70% of the available SSD capacity will be used as read cache and 30% will be used as a write buffer. As in most designs, when it comes to cache/buffer –> more = better.
  • Selecting the SSD with the right performance profile can easily make a 5x-10x difference in VSAN performance, so choose carefully and wisely. Both SSD and PCIe flash solutions are supported, but there are requirements! Make sure to check the HCL before purchasing new hardware. My tip: the Intel S3700, great price/performance balance.
  • VSAN relies on VM Storage Policies for policy based management. There is a default policy under the hood, but you cannot see this within the UI. As such it is a recommended practice to create a new standard policy for your environment after VSAN has been configured. It is recommended to start with all settings set to default, ensure “Number of failures to tolerate” is configured to 1. This guarantees that when a single host fails virtual machines can be restarted and recovered from this failure with minimal impact on the environment. Attach this policy to your virtual machines when migrating them to VSAN or during virtual machine provisioning.
  • Configure vSphere HA isolation response to “power-off” to ensure that virtual machines which reside on an isolated host can be safely restarted.
  • Ensure the vSphere HA admission control policy (“host failures to tolerate” or “percentage based”) aligns with your VSAN availability strategy. In other words, ensure that both compute and storage are configured using the same “N+x” availability approach.
  • When defining your VM Storage Policy avoid unnecessary usage of “flash read cache reservation”. VSAN has internal read cache optimization algorithms, trust it like you trust the “host scheduler” or DRS!
  • VSAN does not support virtual machine disks greater than 2TB minus 512 bytes. VMs which require larger VMDKs are not suitable candidates for VSAN at this point in time.
  • VSAN does not support FT, DPM, Storage DRS or Storage I/O Control. It should be noted though that VSAN internally takes care of scheduling and balancing when required. Storage DRS and SIOC are designed for SAN/NAS environments.
  • Although non-uniform configurations are supported by VSAN, it is recommended practice to keep the host/disk configuration within a VSAN cluster similar. A non-uniform cluster configuration could lead to variations in performance and could make it more complex to stay compliant with defined policies after a failure.
  • When adding new SSDs or HDDs ensure these are not pre-formatted. Note that when VSAN is configured to “automatic mode” disks are added to existing disk groups or new disk groups are created automatically.
  • Note that vSphere HA behaves slightly differently in a VSAN enabled cluster. Here are some of the changes / caveats:
    • Be aware that when HA is turned on in the cluster, FDM agent (HA) traffic goes over the VSAN network and not the Management Network. However, when a potential isolation is detected HA will ping the default gateway (or specified isolation address) using the Management Network.
    • When enabling VSAN ensure vSphere HA is disabled. You cannot enable VSAN when HA is already configured. Either configure VSAN during the creation of the cluster or disable vSphere HA temporarily when configuring VSAN.
    • When there are only VSAN datastores available within a cluster then Datastore Heartbeating is disabled. HA will never use a VSAN datastore for heartbeating.
    • When changes are made to the VSAN network it is required to re-configure vSphere HA.
  • VSAN requires a RAID Controller / HBA which supports passthrough mode or pseudo passthrough mode. Validate with your server vendor if the included disk controller has support for passthrough. An example of a passthrough mode controller which is sold separately is the LSI SAS 9211-8i.
  • Ensure log files are stored externally to your ESXi hosts and VSAN by leveraging vSphere’s syslog capabilities.
  • ESXi can be installed on: USB, SD and Magnetic Disk. Hosts with 512GB or more memory are only supported when ESXi is installed on magnetic disk.
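
The capacity and flash sizing rules in the list above are easier to reason about with a small calculation. The following is only a rough sketch: the `DiskGroupPlan` class and all function names are made up for illustration and are not part of any VMware tool. It assumes that tolerating n failures stores n + 1 full replicas, and it ignores witness components and other overhead.

```python
from dataclasses import dataclass


@dataclass
class DiskGroupPlan:
    """Hypothetical sizing input; field names are illustrative only."""
    hosts: int                  # hosts contributing storage
    disk_groups_per_host: int
    hdds_per_disk_group: int
    hdd_capacity_gb: int
    failures_to_tolerate: int = 1


def raw_capacity_gb(plan: DiskGroupPlan) -> int:
    # Only magnetic disks count; SSD capacity is cache/buffer only.
    return (plan.hosts * plan.disk_groups_per_host
            * plan.hdds_per_disk_group * plan.hdd_capacity_gb)


def usable_capacity_gb(plan: DiskGroupPlan) -> float:
    # Assumption: n failures to tolerate means n + 1 full replicas.
    # Witness components and other overhead are ignored here.
    return raw_capacity_gb(plan) / (plan.failures_to_tolerate + 1)


def recommended_flash_gb(anticipated_consumed_gb: float) -> float:
    # 10 percent of the anticipated consumed capacity,
    # before replicas (failures to tolerate) are counted.
    return anticipated_consumed_gb * 10 / 100


def flash_split_gb(flash_gb: float) -> tuple:
    # Default split: 70% read cache, 30% write buffer.
    read_cache = flash_gb * 70 / 100
    return read_cache, flash_gb - read_cache


if __name__ == "__main__":
    plan = DiskGroupPlan(hosts=4, disk_groups_per_host=1,
                         hdds_per_disk_group=5, hdd_capacity_gb=1000)
    print("raw GB:", raw_capacity_gb(plan))
    print("usable GB at FTT=1:", usable_capacity_gb(plan))
    flash = recommended_flash_gb(8000)
    print("flash GB:", flash, "split:", flash_split_gb(flash))
```

Treat the results as a starting point for a design discussion, not as an official sizing.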

That is it for now. When more comes to mind I will add it to the list!
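
The hard limits from the list can also be checked mechanically before you order hardware. This is a minimal sketch with hypothetical names (`check_cluster`, `max_protected_vms`). The numbers are the beta-era limits quoted in this post, and “2TB” is assumed to mean 2 * 1024**4 bytes.

```python
from typing import List

# Limits as listed in this post (vSphere 5.5 Virtual SAN beta).
MIN_CONTRIBUTING_HOSTS = 3
MAX_HOSTS = 32
MAX_DISK_GROUPS_PER_HOST = 5
MAX_HDDS_PER_DISK_GROUP = 7
MAX_VMS_PER_HOST = 100
HA_MAX_PROTECTED_VMS = 2048
# Assumption: "2TB" is read as 2 * 1024**4 bytes.
MAX_VMDK_BYTES = 2 * 1024**4 - 512


def check_cluster(hosts: List[List[int]]) -> List[str]:
    """Return limit violations for a hypothetical cluster layout.

    `hosts` holds one list per host; each entry is the HDD count of one
    disk group (exactly one SSD per disk group is implied). An empty
    list is a host that contributes no storage.
    """
    problems = []
    if len(hosts) > MAX_HOSTS:
        problems.append(f"{len(hosts)} hosts exceeds the maximum of {MAX_HOSTS}")
    contributing = sum(1 for groups in hosts if groups)
    if contributing < MIN_CONTRIBUTING_HOSTS:
        problems.append(
            f"only {contributing} hosts contribute storage, "
            f"need at least {MIN_CONTRIBUTING_HOSTS}")
    for index, groups in enumerate(hosts):
        if len(groups) > MAX_DISK_GROUPS_PER_HOST:
            problems.append(
                f"host {index}: {len(groups)} disk groups, "
                f"maximum is {MAX_DISK_GROUPS_PER_HOST}")
        for hdds in groups:
            if not 1 <= hdds <= MAX_HDDS_PER_DISK_GROUP:
                problems.append(
                    f"host {index}: disk group with {hdds} HDDs, "
                    f"allowed range is 1-{MAX_HDDS_PER_DISK_GROUP}")
    return problems


def max_protected_vms(host_count: int) -> int:
    # Both the per-host VM limit and the HA per-datastore limit apply.
    return min(host_count * MAX_VMS_PER_HOST, HA_MAX_PROTECTED_VMS)


if __name__ == "__main__":
    print(check_cluster([[7], [7], [7]]))   # three hosts, one full disk group each
    print(check_cluster([[8], [1], [1]]))   # first host has too many HDDs
    print(max_protected_vms(32))
```

An empty list means the layout fits within the limits listed here. It does not mean the hardware is on the HCL.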



About the Author

Duncan Epping is a Chief Technologist and Distinguished Engineering Architect at Broadcom. Besides writing on Yellow-Bricks, Duncan is the co-author of the vSAN Deep Dive and the vSphere Clustering Deep Dive book series. Duncan is also the host of the Unexplored Territory Podcast.
