Ddelta: A deduplication-inspired fast delta compression approach

Wen Xia, Hong Jiang, Dan Feng, Lei Tian, Min Fu, Yukun Zhou
2014 Performance evaluation (Print)  
Delta compression, a promising data reduction approach capable of finding the small differences (i.e., delta) among very similar files and chunks, is widely used for optimizing replicate synchronization, backup/archival storage, cache compression, etc. However, delta compression is costly because of its time-consuming word-matching operations for delta calculation. Our in-depth examination suggests that there exists strong word-content locality for delta compression, which means that contiguous duplicate words appear in approximately the same order in their similar versions. This observation motivates us to propose Edelta, a fast delta compression approach based on a word-enlarging process that exploits word-content locality. Specifically, Edelta first tentatively finds a matched (duplicate) word, and then greedily stretches the matched word boundary to find a likely much longer (enlarged) duplicate word. Hence, Edelta effectively reduces a potentially large number of the traditional time-consuming word-matching operations to a single word-enlarging operation, which significantly accelerates the delta compression process. Our evaluation based on two case studies shows that Edelta achieves an encoding speedup of 3X to 10X over the state-of-the-art Ddelta, Xdelta, and Zdelta approaches without noticeably sacrificing the compression ratio.
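The word-enlarging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 4-byte word size, the plain dictionary index over the base (standing in for whatever fingerprinting the real system uses), the `("copy", offset, length)` / `("insert", bytes)` instruction format, and the names `encode` and `decode` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

WORD = 4  # hypothetical word size in bytes (an assumption, not from the paper)


def encode(base: bytes, target: bytes, word: int = WORD) -> List[Tuple]:
    """Delta-encode target against base using a match-then-enlarge strategy."""
    # Index every word-sized slice of the base by its first occurrence.
    index: Dict[bytes, int] = {}
    for pos in range(len(base) - word + 1):
        index.setdefault(base[pos:pos + word], pos)

    ops: List[Tuple] = []
    literal = bytearray()  # unmatched bytes waiting to be emitted
    i = 0
    while i < len(target):
        # Step 1: tentatively look for a matched (duplicate) word.
        j = index.get(target[i:i + word]) if i + word <= len(target) else None
        if j is None:
            literal.append(target[i])
            i += 1
            continue
        # Step 2: greedily stretch the match boundary byte by byte,
        # replacing many word lookups with one enlarging pass.
        length = word
        while (i + length < len(target) and j + length < len(base)
               and target[i + length] == base[j + length]):
            length += 1
        if literal:
            ops.append(("insert", bytes(literal)))
            literal.clear()
        ops.append(("copy", j, length))
        i += length
    if literal:
        ops.append(("insert", bytes(literal)))
    return ops


def decode(base: bytes, ops: List[Tuple]) -> bytes:
    """Rebuild the target from the base and the delta instructions."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)
```

In this sketch, a single dictionary hit followed by the enlarging loop covers an entire duplicate region, so the per-word lookup is skipped for every byte inside that region. That is the intuition behind the speedup the abstract claims, though the actual gains depend on the real system's chunking and hashing, which this example does not model.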
doi:10.1016/j.peva.2014.07.016 fatcat:4mdsayyxwnhbrg7twc334zcnuu