This question is for anyone running a virtualized Exchange 2010 or 2013 DAG and taking snapshots of the actual datastores the VMs are located on.
In our environment we run a couple of Exchange 2013 DAG members and take an array snapshot of the datastores those members are on. As with every snapshot/backup system that leverages VMware snapshot technology, taking the array snapshot triggers a VMware snapshot. When the VMware snapshot occurs, it causes a DAG failover. I've read several blog entries about changing cluster delays and similar settings to eliminate that issue.
None, however, seem to stand out any more than the Veeam article here.
Granted, we aren't using Veeam, but as I said above, Nimble and more or less everyone else who relies on VMware for backups and snaps uses it in about the same way.
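For context, the cluster delay tuning those articles describe comes down to simple arithmetic, sketched here in Python. This assumes the failover cluster treats a node as down once it has missed SameSubnetThreshold heartbeats sent every SameSubnetDelay milliseconds. The 1000 ms / 5 defaults are my understanding of Windows Server 2008 R2 and 2012; the relaxed values and the 8-second stun are made-up numbers for illustration, not measurements from our environment.

```python
# Rough heartbeat-tolerance arithmetic for a Windows failover cluster.
# Assumption: a node is considered down after `threshold` consecutive
# missed heartbeats, with one heartbeat sent every `delay_ms` milliseconds.

def heartbeat_tolerance_ms(delay_ms: int, threshold: int) -> int:
    """Approximate time a node can go silent before the cluster reacts."""
    return delay_ms * threshold


def survives_stun(stun_ms: int, delay_ms: int, threshold: int) -> bool:
    """True if a VM stun of `stun_ms` fits inside the heartbeat tolerance."""
    return stun_ms < heartbeat_tolerance_ms(delay_ms, threshold)


# Assumed defaults (SameSubnetDelay / SameSubnetThreshold).
DEFAULT = {"delay_ms": 1000, "threshold": 5}
# Illustrative relaxed values, not a recommendation.
RELAXED = {"delay_ms": 2000, "threshold": 10}
# Hypothetical stun duration during the VMware snapshot.
HYPOTHETICAL_STUN_MS = 8000

for name, settings in (("default", DEFAULT), ("relaxed", RELAXED)):
    tolerance = heartbeat_tolerance_ms(**settings)
    ok = survives_stun(HYPOTHETICAL_STUN_MS, **settings)
    print(f"{name}: tolerance {tolerance} ms, survives stun: {ok}")
```

With these assumed numbers the default settings would not ride out the stun and the relaxed ones would, which is the whole idea behind raising the cluster delays.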
Question 1: Has anyone else seen similar behavior, and how did you fix it?
Question 2: The 5th line in that Veeam KB above talks about reducing snapshot.maxConsolidateTime to 1 second to limit how long the VM is stunned. The upside, of course, is that it shortens the time the VM has to be stunned. The downside is that it doesn't give the array nearly as long as the default 6 seconds VMware sets for that value, and it could cause the array to fail the snapshot if it doesn't have enough time to complete the snap. So do the Nimble engineers see any problem with this?