6 Replies Latest reply: Dec 20, 2015 6:19 AM by Vlad Valeriu Velciu

    How does failover of controllers work?

    SA Vault Newbie

      Hi there, I have a standard 1Gb network configuration with eth1 and eth2 for management only and eth3 to eth6 for data only.

       

      Each management port and data port on each controller is plugged into a separate switch, as documented in the Nimble networking best practices.

       

       

       

      I was wondering what triggers a failover of the controller?

       

      There are a few scenarios I'm considering:

       

      • If S1 fails, eth1, eth3 and eth5 lose their connections. However, we still have connections to eth2, eth4 and eth6, so I guess no failover occurs because the controller still has connectivity, just in a degraded state? And is it the same if S2 fails, but with eth2, eth4 and eth6?

       

      If an individual port on either the switch or the controller failed, what would happen?

       

      • In the case of the management network, does failover rely on the availability of the group management IP address, which 'floats' between the 2 management ports, so that failover occurs if that address becomes unavailable?
      • In the case of the data network, if one or more ports are unavailable on a controller then what happens?

       

      I am not in a position to test failover scenarios at present but would like to know the theory behind failover.

        • Re: How does failover of controllers work?
          Justin Rich Adventurer

          Hopefully support or an engineer will answer this, but based on testing, a controller only fails over if there is a complete outage on that controller. So if you lose one data/mgmt port, it will stay on the same controller.

            • Re: How does failover of controllers work?
              Mitch Gram Adventurer

              Justin is correct. There are other conditions that will trigger a controller failover, but from a network perspective, it happens because a network (management or data) that is defined on any interface is no longer able to communicate with other devices on that network. So the loss of a single interface on a controller will not trigger a failover unless there are no other available paths on that controller for the related network.
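
              A minimal sketch of the network-related rule described above, assuming the port layout from the original question (eth1 and eth2 for management, eth3 to eth6 for data). The names `NETWORKS`, `networks_without_path` and `should_fail_over` are made up for illustration and are not part of any Nimble tooling; this only models the rule as stated in this reply, not the array's actual logic.

```python
from typing import Dict, List

# Assumed layout from the original question: eth1-2 management, eth3-6 data.
NETWORKS: Dict[str, List[str]] = {
    "management": ["eth1", "eth2"],
    "data": ["eth3", "eth4", "eth5", "eth6"],
}


def networks_without_path(link_up: Dict[str, bool]) -> List[str]:
    """Return the networks that have no working interface left on this controller."""
    return [
        name
        for name, ports in NETWORKS.items()
        if not any(link_up.get(port, False) for port in ports)
    ]


def should_fail_over(link_up: Dict[str, bool]) -> bool:
    """Model of the stated rule: fail over only when some network has lost every path."""
    return bool(networks_without_path(link_up))


# Example states for a single controller.
all_up = {port: True for ports in NETWORKS.values() for port in ports}
one_data_port_down = {**all_up, "eth3": False}
all_mgmt_down = {**all_up, "eth1": False, "eth2": False}

print(should_fail_over(one_data_port_down))  # False: eth4-6 still carry data
print(should_fail_over(all_mgmt_down))       # True: management has no path left
```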

                • Re: How does failover of controllers work?
                  SA Vault Newbie

                  Thanks for the responses and clarification guys!

                    • Re: How does failover of controllers work?
                      Tim Crowley Newbie

                      I have the same network config as SA vault has.

                      If I disconnect the cable from eth3 on controller A, should the traffic fail over to eth3 on controller B while eth4 to eth6 stay on controller A?

                       

                      Thanks

                      Tim

                        • Re: How does failover of controllers work?
                          SA Vault Newbie

                          Hi Tim,


                          I summarized it as follows and tested accordingly:


                          Switch failure:


                          • If switch S1 fails then eth1, eth3 and eth5 lose connectivity, however there are still connections to eth2, eth4 and eth6 from the controller so no failover occurs, as there is still connectivity from the controller to both networks (management and data) but in a degraded state. (And the same if S2 fails but with eth2, eth4 and eth6 losing connections.)

                           

                          If an individual port either on the switch or controller fails:

                          • In the case of the management network, failover relies on the availability of the group management IP address, which 'floats' between the 2 management ports on the controller. If the group management address becomes unavailable then failover occurs, so both management ports would need to be unavailable before management fails over to the other controller.
                          • Likewise, in the case of the data network, if one or more ports are unavailable on a controller then no failover occurs unless all data ports are unavailable.


                          If a controller fails then of course all connections fail over.
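
                          The switch-failure case in this summary can be sketched as a small simulation. It assumes the cabling described in the thread (S1 carries eth1, eth3 and eth5; S2 carries eth2, eth4 and eth6) and the rule that failover needs a whole network to lose every port. `SWITCH_PORTS`, `NETWORK_PORTS`, `surviving_ports` and `failover_expected` are hypothetical names used only for this illustration.

```python
from typing import Dict, Iterable, Set

# Assumed cabling from the thread: odd ports on switch S1, even ports on switch S2.
SWITCH_PORTS: Dict[str, Set[str]] = {
    "S1": {"eth1", "eth3", "eth5"},
    "S2": {"eth2", "eth4", "eth6"},
}

# Assumed roles from the thread: eth1-2 management, eth3-6 data.
NETWORK_PORTS: Dict[str, Set[str]] = {
    "management": {"eth1", "eth2"},
    "data": {"eth3", "eth4", "eth5", "eth6"},
}


def surviving_ports(failed_switches: Iterable[str]) -> Set[str]:
    """Ports on one controller that still have a working switch behind them."""
    down: Set[str] = set()
    for switch in failed_switches:
        down |= SWITCH_PORTS[switch]
    all_ports = set().union(*SWITCH_PORTS.values())
    return all_ports - down


def failover_expected(failed_switches: Iterable[str]) -> bool:
    """True when at least one network has no surviving port on the controller."""
    alive = surviving_ports(failed_switches)
    return any(not (ports & alive) for ports in NETWORK_PORTS.values())


for scenario in (["S1"], ["S2"], ["S1", "S2"]):
    print(scenario, sorted(surviving_ports(scenario)), failover_expected(scenario))
```

                          With a single switch down, each network keeps at least one port, so the model reports a degraded state without failover. Note that losing both switches would affect both controllers equally, so in that case there would be nowhere better to fail over to.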

                           

                          Also Mitch's answer above clarifies what happens when interfaces fail on either management or data network:
                          "the loss of a single interface on a controller will not trigger a failover unless there are no other available paths on that controller for the related network."

                           

                            • Re: How does failover of controllers work?
                              Vlad Valeriu Velciu Wayfarer

                              I would like to add some info to the above after we did some failover tests.

                               

                              A switch misconfiguration (e.g. wrong VLAN) on the data paths will not trigger a failover. So if the storage ports have link but cannot be reached, the array does not fail over; we waited for 80 seconds to confirm this.

                               

                              Later edit: However, having one data path fail (NIC down) on a controller will trigger a failover even though there is still one data path available. The message logged was: "Attempting failover to active role because controller B has better IP network connectivity."

                               

                              Software version: 2.3.9.0-296119-opt
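
                              One possible reading of these two observations, offered purely as a guess and not as documented array behaviour, is that the controllers compare link state rather than end-to-end reachability, and that the standby takes over when it has more healthy links. The sketch below models that reading; `Port`, `link_score` and `standby_preferred` are hypothetical names, and the two-data-port layout simply matches the test described in this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Port:
    link_up: bool    # physical link state (NIC and cable)
    reachable: bool  # whether hosts can actually reach it (e.g. correct VLAN)


def link_score(ports: List[Port]) -> int:
    """Count ports with link. Reachability is ignored on purpose (assumption)."""
    return sum(port.link_up for port in ports)


def standby_preferred(active: List[Port], standby: List[Port]) -> bool:
    """Assumed rule: fail over when the standby has strictly more links up."""
    return link_score(standby) > link_score(active)


healthy = [Port(True, True), Port(True, True)]

# Wrong VLAN on the active controller: links stay up but are unreachable.
wrong_vlan = [Port(True, False), Port(True, False)]
print(standby_preferred(wrong_vlan, healthy))  # False: scores are equal

# One NIC down on the active controller, one data path still available.
nic_down = [Port(False, False), Port(True, True)]
print(standby_preferred(nic_down, healthy))    # True: standby has more links
```

                              This differs from the "all paths lost" rule given earlier in the thread, so behaviour may depend on the software version in use.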