
    Performance Nugget: DQLEN value in VMware ESXi


      Someone asked a great question on our joint VMware + Nimble performance video:

      Storage Performance Technical Deep Dive & Demo - YouTube

       

      The question is the following:

      " in esxtop, DQLEN is set to 128 in your case, is that at the Queue Depth at LUN level? what is the typical Q-depth at the Array Port in Nimble storage? 128 seems high, is that because Nimble is a Flash storage and LUNs are capable of processing high amount of commands?"


      If you are wondering what "DQLEN" is and where it shows up: go into the ESXi Shell, invoke esxtop, and press "u" for the disk device view - the "DQLEN" column is easy to spot:

       

       

      [Screenshot: esxtop disk device view showing the DQLEN column]
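      If you prefer pulling the value from the command line rather than the interactive esxtop screen, esxcli can show the same thing. A quick sketch - the device ID below is just a placeholder, substitute your own naa identifier:

      # interactive: start esxtop, then press "u" to switch to the disk device view
      esxtop

      # non-interactive: list devices and look for the "Device Max Queue Depth" field
      esxcli storage core device list | grep -i -E "Display Name|Queue Depth"

      # or narrow it down to a single volume (placeholder device ID)
      esxcli storage core device list -d naa.xxxxxxxxxxxxxxxx | grep -i "Queue Depth"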

      What does it really mean?  Thanks to @Eric Forgette from engineering & @Rick Jooss from product management, I got a thorough explanation from both of them:

      Per the VMware KB, DQLEN is defined as "The value listed under DQLEN is the queue depth of the storage device. This is the maximum number of ESX VMKernel active commands that the device is configured to support."

       

      Does this strictly mean queue length for the device?

      Actually, the queue is for both ends of the wire, since it's queuing IOs that are actively being worked on.  The ESX host has to have a slot to hold and track each IO, and so does the storage device (the Nimble array) on the target side.

       

      Where does the "128" value come from?

      The 128 is actually set by Nimble.  iSCSI is better than FC because the target can tell the initiator how much queue depth it has, and the initiator can then use that value (or a lower one).   In our case we respond with 128.  That value is per session, so if you have multiple paths you'll actually get (#_paths * 128) as the total queue depth for that LUN.


      Is "128" the Queue Depth at LUN level?

      The queue depth is actually at the connection level.  If there are 2 connections/sessions, then it would actually be 256 per LUN/volume.
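      To see how that multiplies out on a host, here is a rough sketch you can run in the ESXi Shell (the device ID is a placeholder, and the grep string assumes the usual esxcli path-list output format):

      # count the active paths to one volume (placeholder device ID)
      PATHS=$(esxcli storage core path list -d naa.xxxxxxxxxxxxxxxx | grep -c "Runtime Name")

      # each iSCSI session/connection advertises a queue depth of 128
      echo "Total queue depth for this volume: $((PATHS * 128))"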


      What is the typical Q-depth at the Array Port in Nimble storage? 128 seems high, is that because Nimble is a Flash storage and LUNs are capable of processing high amount of commands?

      The value of 128 was selected so that queue depth would not generally be a performance limitation, so yes, it's because the Nimble arrays are high performance.  The cost of more queue depth is also not high in our architecture.  It should be noted that the q-depth is not per port but per iSCSI session/connection.


      Last but not least, please remember queue depth has a direct impact on latency.  If the queue depth is set too low, the array does not get utilized to its full potential.  If the queue depth is set too high, queued IO sits in line too long, which translates to higher latency.  Always question a storage vendor's latency claims: if the tests are run with a queue depth of 1, of course the latency is super low.
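      To put some made-up numbers on that, Little's Law says average latency ≈ outstanding IOs / IOPS, so the same array at the same IOPS shows very different average latency depending purely on how deep the queue is kept (and at a queue depth of 1, the reported latency is just the service time of a single IO with nothing waiting behind it):

      # Little's Law sketch with hypothetical numbers - runs in any shell with awk
      awk 'BEGIN { printf "256 outstanding @ 50000 IOPS -> %.2f ms average latency\n", 256/50000*1000 }'
      awk 'BEGIN { printf " 32 outstanding @ 50000 IOPS -> %.2f ms average latency\n",  32/50000*1000 }'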

        • Re: Performance Nugget: DQLEN value in VMware ESXi
          Ashwin Pawar

          Hi Wen,

           

          Thanks for providing consolidated answers on this post. I really liked it.

           

          Above, you mentioned that:

          The queue depth is actually at the connection level.  If there are 2 connections/sessions, then it would actually be 256 per LUN/volume.

           

          So, based on this:

           

          My question is: suppose I have a high-performance scenario with 2 iSCSI sessions/connections actively running to the volume. That means my queue depth is technically 128*2 = 256 - in other words, my storage LUN can soak up to 256 outstanding IOs from the host?

           

          Does that mean I need to adjust the Maximum Queue Depth for Software iSCSI?

           

          Can this be examined in the esxtop output? For example, if my LUN queue depth is set to 128 and I go to esxtop and look at the ACTV, QUEUED, and %USD stats, I should see a utilization percentage of 100%. Does that mean I should change this value from 128 to 256?

           

           

          Can you please shed some light here?

           

           

          Also providing a link to the VMware KB on esxtop [just as explained in the webcast] - I found it very useful:

          Checking the queue depth of the storage adapter and the storage device (1027901)

          http://kb.vmware.com/selfservice/documentLinkInt.do?micrositeID=&popup=true&languageId=&externalID=1027901

           

           

          Thanks,

          -Ashwin


            • Re: Performance Nugget: DQLEN value in VMware ESXi

              Hi Ashwin, good questions - I don't believe you should increase the max queue depth for software iSCSI, as it would simply queue up more commands than the storage target could support, and that could translate to higher latency.  Also, good pointer to the VMware KB on checking the ACTV & %USD values.  I want to monitor those values while running IO testing on a Nimble volume - I'll post what I learn here.
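
              For anyone who wants to follow along, here is roughly what I plan to run (the interval and count are arbitrary, and the module parameter name may differ slightly between ESXi versions):

              # capture device stats (DQLEN, ACTV, %USD, etc.) every 5 seconds for 5 minutes in batch mode
              esxtop -b -d 5 -n 60 > /tmp/esxtop_capture.csv

              # check the current software iSCSI LUN queue depth without changing it
              esxcli system module parameters list -m iscsi_vmk | grep -i qdepth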

               

              -wen