6 Replies Latest reply: Aug 30, 2016 9:36 AM by Cazi Brasga

    Proof of concept DR test

    Erik Maurer Newbie

      We currently have two Nimble arrays (one at production, one at DR site) using protection groups to replicate volumes from the primary array to the DR array every two hours over a gigabit link.
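As a rough sanity check on the two-hour replication interval, the arithmetic below estimates the theoretical ceiling of a gigabit link per window. It is only a sketch: it assumes the full nominal line rate is available and ignores protocol overhead, compression, and competing traffic, so real numbers will be lower. The 50 GB change size is a made-up illustration, not a figure from this environment.

```python
# Rough upper bound on how much data one replication window can carry.
# Assumption: the whole nominal line rate is usable (it never is in practice).
LINK_BITS_PER_SECOND = 1_000_000_000  # gigabit link, as described in the post
WINDOW_HOURS = 2                      # replication interval, as described in the post


def max_bytes_per_window(link_bps: int, window_hours: float) -> float:
    """Theoretical maximum number of bytes transferable in one window."""
    return link_bps / 8 * window_hours * 3600


def transfer_seconds(changed_bytes: float, link_bps: int) -> float:
    """Best-case time to push a given amount of changed data over the link."""
    return changed_bytes * 8 / link_bps


ceiling = max_bytes_per_window(LINK_BITS_PER_SECOND, WINDOW_HOURS)
print(f"Theoretical ceiling per window: {ceiling / 1e9:.0f} GB")

# Hypothetical example: 50 GB of changed blocks since the last snapshot.
minutes = transfer_seconds(50e9, LINK_BITS_PER_SECOND) / 60
print(f"Best-case time for 50 GB of changes: {minutes:.1f} minutes")
```

The same function gives a lower bound on how long to wait for a manual snapshot to finish replicating before starting work at the DR site.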

      We have three VMware ESXi 5.5 hosts at each location connected to the Nimble via iSCSI.

      We would like to perform a small "proof of concept" DR test with 5 Server 2008 VMs / volumes to satisfy one of our department's regulatory compliance requirements.

      The overall goal is to take down the VMs at the production site, bring them up at the DR site, have the staff verify everything works, then delete the VMs at the DR site and turn the VMs back on at the production site.

       

      What is the best procedure to accomplish this without using third party software?

       

      We are thinking:

      1. Shut down the five Windows servers
      2. Take a manual Nimble snapshot of the five volumes
      3. Take the five volumes offline on the production Nimble
      4. On the DR Nimble, find the snapshots we just took and set them online
      5. "Rescan all" in VMware to discover newly online datastores on the DR Nimble
      6. Browse datastores, add .VMX files to VMware inventory, and power on
      7. Verify functionality, then power off servers
      8. Take volumes offline on DR Nimble
      9. Set volumes online on production Nimble
      10. Power on servers

       

      Maybe it's not necessary to take the production volumes offline on the Nimble since the servers are shut down.

      Would it be better to clone the snapshots and bring the clones online on the DR Nimble?

       

      Am I missing anything? Are we going about this totally wrong? Sorry for my ignorance; this is sadly the first time we've tried a DR test since we implemented the Nimble solution.

        • Re: Proof of concept DR test
          Newbie

          Assuming that each side has its own vCenter and each volume is a datastore where a w2k8 server resides, you could consider:

           

          Production Site (Site P)

          Secondary Site (Site S)

           

          Test Failover - The goal is to test DR for these 5 servers without impacting current operation.

            1. (Site P) If you need the latest server changes, take a manual Nimble snapshot of the 5 volumes with the "replicate" option and wait for replication to complete.

            2. (Site S) For each volume, locate the latest replicated snapshot and create a clone.

            3. (Site S) Associate the appropriate ACL with the 5 newly cloned volumes.

            4. (Site S) Rescan ESXi, browse datastores, add the VMs to inventory, power on, and verify functionality.

            5. (Site S) Properly unmount the datastores and detach these 5 volumes from ESXi.

            6. (Site S) To clean up, take these 5 cloned volumes offline and delete them.
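The Test Failover steps above can be captured as a simple ordered checklist, which makes it easy to see that every mandatory step happens at Site S. This is a minimal sketch for tracking the runbook only. `Step`, `TEST_FAILOVER`, and `sites_required` are hypothetical names, and nothing here talks to a Nimble array or vCenter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    site: str               # "P" = production site, "S" = secondary (DR) site
    action: str             # what the operator does, paraphrased from the list above
    optional: bool = False  # True when the step may be skipped


# The six Test Failover steps, in order.
TEST_FAILOVER = [
    Step("P", "Take manual snapshot with replicate option; wait for replication",
         optional=True),
    Step("S", "Clone the latest replicated snapshot of each volume"),
    Step("S", "Associate the appropriate ACL with the cloned volumes"),
    Step("S", "Rescan ESXi, add VMs to inventory, power on, verify"),
    Step("S", "Unmount datastores and detach the cloned volumes from ESXi"),
    Step("S", "Take the cloned volumes offline and delete them"),
]


def sites_required(steps):
    """Return the set of sites that have at least one mandatory step."""
    return {s.site for s in steps if not s.optional}


def print_checklist(steps):
    """Print the runbook as a numbered checklist an operator can tick off."""
    for number, step in enumerate(steps, start=1):
        suffix = " (optional)" if step.optional else ""
        print(f"[ ] {number}. (Site {step.site}) {step.action}{suffix}")


print_checklist(TEST_FAILOVER)
print("Sites with mandatory steps:", sorted(sites_required(TEST_FAILOVER)))
```

Because the only Site P step is optional and is just a snapshot, the production VMs can stay powered on for the whole test.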

           

          Planned Migration - The goal is to migrate these 5 servers to run at Site S (without replicating back to Site P).

            1. (Site P) Properly shut down these 5 w2k8 servers. Take a manual Nimble snapshot of the 5 volumes with the "replicate" option and wait for replication to complete.

            2. (Site P) Properly unmount the datastores and detach these 5 volumes from ESXi.

            3. (Site P) Take these 5 volumes offline. Optionally set these 5 volumes to read-only to avoid accidental writes if the environment is not completely under control.

            4. (Site S) Locate the volume collection(s) for these 5 volumes. Promote all applicable volume collections.

            5. (Site S) Associate the appropriate ACL with the 5 newly promoted volumes. (It's OK to swap the order of steps #4 and #5.)

            6. (Site S) Rescan ESXi, browse datastores, add the VMs to inventory, and power on.

           

          Failover - When Site P is completely inaccessible, the goal is to bring up Site S.

            1. Same as Planned Migration, but perform only steps #4 through #6.
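To make the relationship between the two procedures concrete, the sketch below writes Planned Migration as an ordered list and derives Failover as its last three entries. The list names are hypothetical and the step text is paraphrased from this post. It only models the ordering and does not call any Nimble or VMware API.

```python
# Planned Migration as (site, action) pairs, in the order given above.
PLANNED_MIGRATION = [
    ("P", "Shut down the 5 servers; take manual snapshot with replicate option; "
          "wait for replication"),
    ("P", "Unmount the datastores and detach the volumes from ESXi"),
    ("P", "Take the volumes offline (optionally set them read-only)"),
    ("S", "Promote the applicable volume collections"),
    ("S", "Associate the appropriate ACL with the promoted volumes"),
    ("S", "Rescan ESXi, browse datastores, add VMs to inventory, power on"),
]

# Failover reuses steps #4 through #6, i.e. list indexes 3, 4 and 5.
FAILOVER = PLANNED_MIGRATION[3:]

# When Site P is unreachable, nothing in the procedure may depend on it.
assert all(site == "S" for site, _ in FAILOVER)

for number, (site, action) in enumerate(FAILOVER, start=1):
    print(f"{number}. (Site {site}) {action}")
```

The assertion reflects the point of the Failover variant: every remaining step runs at Site S, so it still works when Site P cannot be reached.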

           

          Planned Migration with Nimble handover - The goal is to migrate these 5 servers to run at Site S and replicate back to Site P.

            1. (Site P) Properly shut down these 5 w2k8 servers. Take a manual Nimble snapshot of the 5 volumes with the "replicate" option and wait for replication to complete.

            2. (Site P) Properly unmount the datastores and detach these 5 volumes from ESXi.

            3. (Site P) Locate the volume collection(s) for these 5 volumes. Hand over all applicable volume collections.

            4. (Site S) Associate the appropriate ACL with the 5 newly promoted volumes.

            5. (Site S) Rescan ESXi, browse datastores, add the VMs to inventory, and power on.

           

          The "Test Failover" procedure is probably what you are looking for. As you pointed out, cloned volumes can be used for a test failover, and this can be done without bringing down the production systems. This is also how VMware SRM performs a test failover with the Nimble SRA. Please let us know if you have any questions and how it goes.
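The choice among the four procedures described in this reply can be summarized as a small decision helper. This is only a sketch of the selection logic as stated above. `choose_procedure` and its parameter names are hypothetical, and a real decision would also weigh factors this thread does not cover.

```python
def choose_procedure(site_p_reachable: bool,
                     move_workload: bool,
                     replicate_back: bool = False) -> str:
    """Pick one of the four procedures from this reply.

    site_p_reachable: is the production site (Site P) accessible?
    move_workload:    should the servers actually start running at Site S,
                      as opposed to a non-disruptive test on clones?
    replicate_back:   after moving, should Site S replicate back to Site P?
    """
    if not site_p_reachable:
        return "Failover"
    if not move_workload:
        return "Test Failover"
    if replicate_back:
        return "Planned Migration with Nimble handover"
    return "Planned Migration"


# The proof-of-concept case from the original question: production stays up,
# and the DR copies are thrown away afterwards.
print(choose_procedure(site_p_reachable=True, move_workload=False))
```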

          • Re: Proof of concept DR test
            Erik Maurer Newbie

            How exactly does one take a snapshot "with replicate option"?

            I don't see that on our Nimble.

            • Re: Proof of concept DR test
              Cazi Brasga Wayfarer

              I would just recommend using the Nimble vSphere plugin to take the snapshots and clone the datastores. It prevents ID conflicts and also saves time by performing the rescans across all hosts in the cluster.