This has come up in conversation a number of times since I joined Nimble, where data compression is such a key component of our system design. It may be old news to a lot of folks but there's bound to be someone out there who'll benefit from this.

Thanks to the analytics engine behind InfoSight, we have really good real-world data indicating how much compression you can get out of typical enterprise workloads. This is almost always expressed as a compression ratio such as 1.5X, which is really shorthand for 1.5 to 1, or simply 1.5:1. But what most people want to know is, "By how much will the data footprint of my workload be reduced? How much space do I really need for this particular application once Nimble's compression has worked on my data?"

To answer this, I found it helpful to "rephrase" the compression ratio concept. Mathematically, a data compression ratio is defined as the size of the uncompressed data divided by the size of the compressed data:

                        Uncompressed data
  Compression ratio = ---------------------
                         Compressed data

So if an application had 3 TB of uncompressed data but it compressed down to 2 TB, the resulting compression ratio would be 3 TB / 2 TB = 1.5:1 or 1.5X.
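The arithmetic is simple enough to sketch in a few lines of Python. The function name and the 3 TB / 2 TB figures are just the example from above; nothing here is specific to Nimble's implementation:

```python
def compression_ratio(uncompressed_tb: float, compressed_tb: float) -> float:
    """Ratio of uncompressed size to compressed size (1.5 means 1.5:1 or 1.5X)."""
    return uncompressed_tb / compressed_tb

# 3 TB of uncompressed data that compresses down to 2 TB:
print(compression_ratio(3.0, 2.0))  # 1.5
```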

But many people find it more useful to think of data compression in terms of space savings or reduction percentages, which you would calculate as follows:

                         Uncompressed data - Compressed data           Compressed data
  Reduction percentage = ------------------------------------- = 1 - -------------------
                                 Uncompressed data                     Uncompressed data

Using our previous example, we calculate the space savings in going from 3 TB uncompressed down to 2 TB compressed as 1 - ( 2 TB / 3 TB) = 0.33, or 33%.
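The same conversion can be sketched in Python. The second function shows the handy shortcut implied by the formula above: given only a compression ratio r, the savings are 1 - 1/r. Again, the function names are illustrative, not part of any product API:

```python
def reduction_percentage(uncompressed_tb: float, compressed_tb: float) -> float:
    """Space savings as a fraction: 1 - (compressed / uncompressed)."""
    return 1 - compressed_tb / uncompressed_tb

def savings_from_ratio(ratio: float) -> float:
    """Convert a compression ratio (e.g. 1.5 for 1.5X) directly to savings."""
    return 1 - 1 / ratio

print(round(reduction_percentage(3.0, 2.0), 2))  # 0.33, i.e. 33%
print(round(savings_from_ratio(1.5), 2))         # 0.33 -- same answer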

