A majority of Nimble users rely on a combination of snapshots and replication for data protection, with good reason. Earlier discussions have highlighted InfoSight statistics regarding RPOs and snapshot retention durations observed in the Nimble install base.
New users tend to ask two questions: how much capacity will I need for snapshots, and how much bandwidth will replication require? If you are a Nimble customer, the InfoSight portal has a handy planning section (in the Data Protection tab) that offers replication sizing guidance specific to your environment. If you are brand new to Nimble, or have not used snapshots extensively, here is some InfoSight data culled from thousands of Nimble systems/volumes that might help you with planning. Note that this data is specific to Nimble: thanks to our use of configurable block sizes and pointer-based snapshots (which only store compressed, block-level changes), Nimble snapshots use considerably less space than other implementations.
First, let’s define a useful term: daily snapshot change rate. The daily change rate is the capacity, expressed as a percentage of your stored data set, that you will need on average to store a day’s worth of snapshots for a particular type of data. Note that we’re not necessarily talking about one daily snapshot; we’re talking about the typical mix of snapshot RPOs used for applications, ranging from minutes to hours apart. If you take very frequent snapshots (more frequent than the install base average for that application), change rates could be slightly higher because you are potentially capturing more versions of each data block in a day. Conversely, if you only capture a single daily snapshot, daily change rates would likely be slightly lower.
So if you have a 100GB dataset and an average daily change rate of 0.5%, then to store 60 days of snapshots you will need 100GB x 0.5% x 60 = 30GB of capacity. Bear in mind that Nimble snapshots (aside from being very thin) are retained on high-density, low-cost HDDs to make retention cost effective.
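If you want to try this arithmetic on your own numbers, here is a minimal Python sketch of the calculation above. The function name and parameters are my own for illustration (they are not part of InfoSight or any Nimble tool), and it assumes the change rate stays constant over the retention period.

```python
def snapshot_capacity_gb(dataset_gb, daily_change_rate_pct, retention_days):
    """Estimate the capacity (in GB) needed to retain snapshots.

    Hypothetical helper: assumes a constant average daily change rate,
    expressed as a percentage of the stored data set.
    """
    if dataset_gb < 0 or daily_change_rate_pct < 0 or retention_days < 0:
        raise ValueError("inputs must be non-negative")
    return dataset_gb * (daily_change_rate_pct / 100.0) * retention_days


# The example from the text: 100GB data set, 0.5% daily change, 60 days retained.
estimate = snapshot_capacity_gb(100, 0.5, 60)
print(f"Estimated snapshot capacity: {estimate:.1f} GB")
```

Treat the result as a planning estimate only, since actual change rates vary by application and over time.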
Change rate is also useful for estimating WAN bandwidth requirements for replication. For the example above, you would need to transfer just 0.5GB (500MB) of data across the WAN each day. Depending on your daily replication windows (Nimble allows you to implement QoS policies to avoid using WAN bandwidth during peak hours, for example), you can estimate bandwidth requirements from there. Note: as mentioned previously, the planning section of the InfoSight Data Protection tab provides a bandwidth estimate as well.
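As a rough illustration of that bandwidth estimate, the sketch below converts a daily change volume into the average link speed needed to move it within a replication window. The function name and the 8-hour window are assumptions of mine. It uses decimal units (1GB = 1000MB, as in the 0.5GB = 500MB example) and ignores protocol overhead, retransmits and any change in size on the wire, so read it as a lower bound rather than a sizing rule.

```python
def replication_bandwidth_mbps(dataset_gb, daily_change_rate_pct, window_hours):
    """Average bandwidth (megabits per second) needed to replicate one day's
    changes within the given window.

    Hypothetical helper: assumes changes are spread evenly across the window
    and ignores protocol overhead.
    """
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    daily_change_gb = dataset_gb * daily_change_rate_pct / 100.0
    megabits = daily_change_gb * 1000 * 8  # GB -> MB -> megabits (decimal units)
    return megabits / (window_hours * 3600)


# 100GB data set, 0.5% daily change, replicated during an assumed 8-hour
# off-peak window.
mbps = replication_bandwidth_mbps(100, 0.5, 8)
print(f"Average bandwidth needed: {mbps:.3f} Mbps")
```

A shorter window raises the required bandwidth proportionally, which is why the choice of replication window matters as much as the change rate itself.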
Having just defined this handy term, I will say that the daily snapshot change rate is not necessarily an immutable property of your data set. It depends on how much new data is being written each hour or day, and how much the stored data set grows over time (the two need not be directly connected because of overwrites). We’ll revisit the topic of change rate trending in a future post with some interesting data.
With that definition (and caveat) out of the way, we can now look at how daily change rates look for a wide range of data types across the Nimble install base. The chart below shows change rate by the “Performance Policy” (a type of application profile) that Nimble users have assigned to their data sets.
First let’s look at the category of data sets I have grouped together as belonging to “end user applications” such as Microsoft Exchange, SharePoint, VDI and File Servers. You’ll notice that most of them have daily change rates ranging from 0.3% to just under 1%, with the exception of Exchange logs, which top 1.5%.
Virtualization infrastructure data sets (spanning many flavors of VMware vSphere, Microsoft Hyper-V and Citrix XenServer) range from 0.4% to just over 1%.
Databases range from 0.25% to 1.5%, with SQL Server logs topping 2.5%.
All “Other” applications (spanning thousands of volumes) average 0.3%.
Finally, the Nimble average across all of the above is 0.5%.
What are you seeing for your data sets?