File recipe compression in data deduplication systems

Dirk Meister, André Brinkmann, Tim Süß
2013 USENIX Conference on File and Storage Technologies  
Data deduplication systems discover and exploit redundancies between different data blocks. The most common approach divides data into chunks and identifies redundancies via fingerprints. The file content can be rebuilt by combining the chunk fingerprints which are stored sequentially in a file recipe. The corresponding file recipe data can occupy a significant fraction of the total disk space, especially if the deduplication ratio is very high. We propose a combination of efficient and
... compression schemes to shrink the file recipes' size. A trace-based simulation shows that these methods can compress file recipes by up to 93%.
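
To make the notion of a file recipe concrete, here is a minimal sketch of the chunk/fingerprint/recipe structure the abstract describes. It does not reproduce the paper's compression schemes. Fixed-size chunking, the 8-byte chunk size, SHA-1 as the fingerprint function, and all names (`build_recipe`, `rebuild`, `recipe_fraction`, `store`) are assumptions chosen for illustration only.

```python
import hashlib

CHUNK_SIZE = 8  # assumption: tiny fixed-size chunks, for illustration only


def build_recipe(data: bytes, store: dict) -> list:
    """Split data into chunks, store each unique chunk once,
    and return the file recipe (fingerprints in file order)."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha1(chunk).digest()  # 20-byte fingerprint
        store.setdefault(fingerprint, chunk)        # keep only the first copy
        recipe.append(fingerprint)
    return recipe


def rebuild(recipe: list, store: dict) -> bytes:
    """Reassemble the file by looking up each fingerprint in order."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


def recipe_fraction(recipe: list, store: dict) -> float:
    """Fraction of total stored bytes that the recipe itself occupies."""
    recipe_bytes = sum(len(fingerprint) for fingerprint in recipe)
    chunk_bytes = sum(len(chunk) for chunk in store.values())
    return recipe_bytes / (recipe_bytes + chunk_bytes)


if __name__ == "__main__":
    store = {}
    data = b"ABCDEFGH" * 100  # highly redundant input
    recipe = build_recipe(data, store)
    assert rebuild(recipe, store) == data
    print(len(recipe), "recipe entries,", len(store), "unique chunk(s)")
    print("recipe share of stored bytes:", recipe_fraction(recipe, store))
```

With highly redundant input, only one chunk is stored while the recipe still holds one fingerprint per chunk, so the recipe dominates the stored bytes. This is the situation the abstract points to when the deduplication ratio is very high.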
dblp:conf/fast/MeisterBS13 fatcat:23blfyy3cjepleqrvaj2axprai