Tip: How not to shoot yourself in the foot

Blog Post created by rfenton Employee on Oct 25, 2014

Over the years, I have seen many a customer spend serious amounts of time, effort and money deploying a variety of backup and replication solutions to ensure they can recover data when it matters. Of course this is extremely wise and has long been a necessity for the majority of production environments.  However, it's rarely a flood, fire, rogue application or natural disaster that causes the majority of outages.  Many studies suggest that up to 75% of the time, it's us (human error) that is the root cause of datacentre outages.  It's no wonder, with increasing complexity and the lack of time we have, that mistakes are made.  One of the things I really like about Nimble Storage is that a lot of the complexity in managing the storage layer is simply removed, giving time back to the infrastructure administrator to focus on other activities.

 

One of my customers recently requested that we add a 'waste-basket' to the GUI, so that deleted volumes could be moved into a holding area and removed for good at a later date.  His primary requirement was to avoid human error when deprovisioning and decommissioning services.   I pointed out to him that Nimble arrays already have a similar feature, and I thought it was worth sharing, as often it's the little things that can save an inordinate amount of time and heartache and save you from proverbially 'shooting yourself in the foot'.

 

shoot-in-foot.jpg

 

Each volume in Nimble OS has an Online status.  Typically when a volume is first provisioned, it is marked as online, so that any host in its associated Initiator Group can reprobe and access the volume.   In order to delete the volume, it must first be placed offline (removing host access) and only then deleted.   Often as administrators, in our haste, we perform both actions in quick succession when clearing up volumes.


Here's how you take a volume offline... Browse to the Volume:


Offline1.jpg


Click Set Offline and you will receive a warning that connected hosts will lose access to the volume....


Offline2.jpg


Click OK and the volume will be placed offline, turning grey in the process....


Offline3.jpg


Offline4.jpg



Now of course the hosts will lose access when a volume is placed offline, but if you inadvertently chose the wrong volume, placing it back online is a very simple operation!


This is where my Top Tip comes in... STOP!


Rather than treating decommissioning as a single atomic transaction of offline then delete... take the volume offline and wait a while... have a cup of tea/coffee, wait until the end of the day/week and, once nobody is complaining that their application is no longer available... then delete the volume!!   It's a simple technique, but one I couldn't recommend more highly when working with any storage system, as most will allow you to take a volume offline or unmask it before deleting.
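If you script your decommissioning, the same "offline, wait, then delete" habit can be built into the tooling. The following is only a minimal sketch of the idea in Python: the `Volume` class, the helper functions and the seven-day grace period are all hypothetical illustrations and not part of Nimble OS or any Nimble API. It simply records when a volume went offline and refuses to approve deletion until the grace period has passed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Volume:
    """Hypothetical in-memory stand-in for a volume (not a Nimble API object)."""
    name: str
    online: bool = True
    offline_since: Optional[datetime] = None


def set_offline(vol: Volume, now: datetime) -> None:
    """Step 1: remove host access and remember when it happened."""
    if vol.online:
        vol.online = False
        vol.offline_since = now


def set_online(vol: Volume) -> None:
    """Undo step 1 if the wrong volume was chosen."""
    vol.online = True
    vol.offline_since = None


def can_delete(vol: Volume, now: datetime,
               grace: timedelta = timedelta(days=7)) -> bool:
    """Step 2 is allowed only after the volume has sat offline for the grace period.

    The 7-day default is an assumption for illustration; pick whatever
    "end of the day/week" means in your environment.
    """
    if vol.online or vol.offline_since is None:
        return False
    return now - vol.offline_since >= grace


if __name__ == "__main__":
    vol = Volume("old-app-data")
    t0 = datetime(2014, 10, 25, 9, 0)
    set_offline(vol, t0)
    print(can_delete(vol, t0 + timedelta(hours=1)))  # too soon
    print(can_delete(vol, t0 + timedelta(days=7)))   # grace period has passed
```

The point of keeping the timestamp with the volume is that the delete step can check it independently, so even a hurried administrator (or script) cannot collapse the two steps back into one.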


Hopefully this is sound advice for anyone, but I'd be interested to hear what other simple and effective tips you've picked up along the way. Please share them in the comments section or, better still, write your own document/blog on Nimble Connect!
