5 Replies Latest reply: May 7, 2015 3:17 PM by Nev Finch

    iSCSI disconnects in VMware 5.5

    Nev Finch Newbie
      Hi, I am hoping someone may be able to help with an issue we are having in our production environment.

       

      We have recently upgraded our servers to 5 new Cisco UCS C220 M4 servers. These are connected to our Nimble CS300 via two Brocade ICX6610-24 switches configured in a stack. Each host is running the latest Cisco ISO of VMware. During installation everything worked fine, but now that we have a production load on the environment we are seeing the iSCSI NICs disconnecting. The timing is random and it is mostly one NIC at a time. On occasion, though, both NICs disconnect, removing access to the storage.

      It appears to be a VMware issue, as when the disconnect occurs both the Brocade and the UCS see their adapters as still up. VMware, though, are adamant that it is a Brocade issue.

      I currently have open cases with Cisco, VMware and Brocade but really don't appear to be getting anywhere. I have also opened a case with Nimble but they are not seeing any issues on the storage at all.

      Today I have tried downgrading the Cisco Firmware. This also hasn't made any difference.

      The other piece of information that may be relevant is that the problem becomes more frequent when we put the system under load. We are using Veeam as our backup solution, and when this runs at night the disconnects are more frequent.
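One way to put numbers on the "more frequent under load" observation is to bucket the disconnect times by hour and see what share falls inside the backup window. This is only a rough sketch: the function names, the sample timestamps and the 22:00 to 06:00 window are all made-up placeholders, not values from our environment, and the timestamps would have to be pulled by hand from whichever log records the link-down events.

```python
from collections import Counter
from datetime import datetime


def disconnects_per_hour(timestamps):
    """Count disconnect events per hour of day (0-23)."""
    return Counter(ts.hour for ts in timestamps)


def share_in_window(timestamps, start_hour, end_hour):
    """Fraction of events whose hour lies in [start_hour, end_hour).

    The window may wrap past midnight (e.g. 22 -> 6).
    """
    if not timestamps:
        return 0.0

    def in_window(hour):
        if start_hour <= end_hour:
            return start_hour <= hour < end_hour
        return hour >= start_hour or hour < end_hour

    hits = sum(1 for ts in timestamps if in_window(ts.hour))
    return hits / len(timestamps)


if __name__ == "__main__":
    # Placeholder events only; replace with times taken from the real logs.
    events = [
        datetime(2015, 1, 1, 23, 5),
        datetime(2015, 1, 2, 1, 30),
        datetime(2015, 1, 2, 14, 0),
        datetime(2015, 1, 2, 23, 45),
    ]
    print(dict(disconnects_per_hour(events)))
    # Assumed backup window of 22:00-06:00 (hypothetical).
    print(share_in_window(events, 22, 6))
```

If most events land inside the backup window, that supports the load theory. An even spread would point elsewhere.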

      Below is our complete setup. What I am looking for is anyone who has had a similar problem or any advice on how to continue troubleshooting.

       

      Servers (5 in Total):

      UCSC-C220-M4S
      BMC Version Info: 2.0(3i)
      BIOS Version: C220M4.2.0.3d.0
      Product Serial Number: [FCH1850V0A4]

      VMware ESXi 5.5.0-2068190-custom-cisco.5.5.2.2
      + Updated one to ESXi 5.5.0-2068190-custom-cisco.5.5.2.3 to see if this made any improvement

      Storage:

      Nimble CS300 storage
      NOS Version: 2.2.6.0-22959-opt

      Storage Switches (2 in Total):

       

      Brocade ICX6610-24
      Boot Image: 10.1.00T7f5

      10 GbE Connections (10 in Total):

       

      Cisco – Twinax Cables
      Product # SFP-10G-AOC3M
      Part # 10-2847-01


      VIC/CNA:
      ---
        Slot: 1
        Description: Cisco UCS Dual Port 10Gb Ethernet and 4G Fibre Channel CNA SFP+
        PID: UCSC-PCIE-CSC-02
        powMin: 14
        powMax: 25
        Vendor: 0x1137
        Device: 0x0042
        SubVendor: 0x1137
        SubDevice: 0x0085

      ---
        Slot: MLOM
        Description: Cisco UCS 1227 Dual Port 10Gb Ethernet and 4G Fibre Channel CNA SFP+ MLOM
        PID: UCSC-MLOM-CSC-02
        powMin: 14
        powMax: 25
        Vendor: 0x1137
        Device: 0x0042
        SubVendor: 0x1137
        SubDevice: 0x012e
       
      Version: Version 1.6.0.12, Build: 1331820, Interface: 9.2 Built on: Jun 12 2014

      UCS support matrix: Checked UCS HCL, FNIC drivers seem supported with C220 M4/ESXi 5.5 U with VIC 1225/1227

      Connections:

      The Cisco UCS servers are connected to the Brocade switches using the 10 GbE Twinax cables. There are two of these plugged into Slot 1 on each server. One is connected to Switch One in the Brocade stack, the other to Switch Two.

       

      As part of the troubleshooting process we have also tried the following:

      Brocade 10G Active 3M FCoE – Part # 58-1000027-01 Cable

      A Cisco SFP-10G-SR (10-2415-03) with Brocade 57-0000075-01.

      These connections also experience the dropouts.

       

      Notes:

      Have applied VMware KB Article 1030265 (Interrupt Mapping) as recommended by Cisco

      http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1030265

       

      Open Case with Cisco is SR 634639537

      Have spoken to VMware and they believe the issue is in the Brocade

      Open case with VMware is Support Request # 15657062404

      Have now opened case with Brocade. Case#1438624  

        • Re: iSCSI disconnects in VMware 5.5
          Nick Dyer Navigator

          Hello Nev,

           

          Thanks for posting this up. Whilst this is intriguing, it's not really the place to publish support requests (this is not a support forum) - can I suggest you open a case with Nimble support so they can start troubleshooting the issue please (if you haven't done so already).


          Would you mind answering this question though - have you noticed the same issue if you were using 5.5 update 1 instead of update 2?

            • Re: iSCSI disconnects in VMware 5.5
              Nev Finch Newbie

              Nick

               

              Sorry for posting here. We do have an open case with Nimble, as well as Brocade, Cisco and VMware. We just aren't getting anywhere very quickly.

               

              To answer your question we haven't tried 5.5 update 1. We have tried 5.1 Update 3 (Cisco Release) and 5.5 Update 2 (Cisco Release and Standard) all with the same result. We have also tried installing the latest enic VMware driver.

               

              Yesterday's update is that the Cisco is actually seeing the NIC go down locally at the same time as VMware sees it. We are now trying to look more closely to see if the Brocade is seeing any errors at the same time.
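For the "same time" comparison, something like the sketch below could help line up events from two devices. It assumes the link-down times have already been extracted from each device's log into Python datetimes, and that the clocks on both sides are in sync (e.g. via NTP). The function names, the 5-second tolerance and the sample times are hypothetical, not taken from any vendor tool or from our logs.

```python
from datetime import datetime, timedelta


def correlate(host_events, switch_events, tolerance=timedelta(seconds=5)):
    """Pair each host event with the switch events within +/- tolerance."""
    switch_sorted = sorted(switch_events)
    pairs = []
    for host_ts in sorted(host_events):
        nearby = [s for s in switch_sorted if abs(s - host_ts) <= tolerance]
        pairs.append((host_ts, nearby))
    return pairs


def unmatched(host_events, switch_events, tolerance=timedelta(seconds=5)):
    """Host events with no switch event inside the tolerance window."""
    return [
        host_ts
        for host_ts, nearby in correlate(host_events, switch_events, tolerance)
        if not nearby
    ]


if __name__ == "__main__":
    # Placeholder times only; replace with values from the real logs.
    host = [datetime(2015, 1, 1, 2, 0, 0), datetime(2015, 1, 1, 3, 0, 0)]
    switch = [datetime(2015, 1, 1, 2, 0, 3)]
    for host_ts, nearby in correlate(host, switch):
        print(host_ts, "->", nearby)
    print("no switch-side event for:", unmatched(host, switch))
```

Host-side drops with no matching switch-side entry would suggest the switch is not the one taking the link down.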

                • Re: iSCSI disconnects in VMware 5.5

                  Nev,

                   

                  We recently introduced an ESX host with a Brocade NIC and we are seeing a similar situation, with random disconnects and the ports appearing to go down under load. All of our other NICs are based on Emulex silicon and we don't seem to see the problem there. Like you, we are running ESXi 5.5 U3 and have tried patching firmware etc. Our environment, however, is in an HP C7000 bladecenter with BL460c G8 and G9 systems. We're working with VMware on this as well, but have just started. Can you share your case with us? I'd love to reference it with VMware support as they dig into it, and I'd love to hear about anything you find. If we find anything useful I'll share as well.

                    • Re: iSCSI disconnects in VMware 5.5

                      Nev,

                       

                      I think you can disregard my last post. After reading more carefully (sorry), I realized that I was reading Brocade and thinking Broadcom. Our problem is in a Broadcom environment, so it is likely not related to your issue.

                        • Re: iSCSI disconnects in VMware 5.5
                          Nev Finch Newbie

                          Nick

                           

                          Happy to share the case with you. It is VMware Support Request 15657062404.

                           

                          We have narrowed it down to the Cisco UCS seeing an error and taking the port down. From the Brocade end it is running fine. This tells me it is an issue between the vNIC on the UCS and VMware. I am waiting for more details from Cisco but suspect it is an issue with the firmware of the NIC and the VMware driver.