When we are storing data, we typically assume that our storage system of choice returns that data later just as we put it in. However what guarantees do we have that this is actually the case? The case made here is the case of bitrot, the silent degradation of the physical charges that physically make up today's storage devices. To counter this type of problem, one can employ data checksumming, as it is done by both btrfs and ZFS. However, while in the long run btrfs might be the tool of choice for this, it is fairly complex and not yet too mature, whereas ZFS, the most prominent candidate for this type of features, is not without hassle and it must be recompiled for every kernel update (although automation exists). In this blogpost, we'll therefore take a look into a storage design that actually checks whether the returned data is actually valid and not silently corrupted inside our storage system and is completely designed with components available in Linux itself without the need to recompile and test your storage layer on every kernel upgrade. We find that this storage design, while fulfilling the same purpose as ZFS, does not only yield comparable performance, but actually in some cases even able to significantly outperform it, as the benchmarks at the end indicate.
A storage design that checks whether the returned data is valid and not silently corrupted inside a storage system, designed with Linux components.