SDC Resilient Error-bounded Lossy Compressor [article]

Sihuan Li, Sheng Di, Kai Zhao, Xin Liang, Zizhong Chen, Franck Cappello
2020 arXiv pre-print
Lossy compression is one of the most important strategies for coping with the volume of big science data; however, little work has been done to make it resilient against silent data corruptions (SDC). SDC is becoming non-negligible, both because exascale computing drives complex scientific simulations that produce vast volumes of data and because some instruments and devices (such as interplanetary space probes) need to transfer large amounts of data in error-prone environments. In this paper, we propose an SDC-resilient error-bounded lossy compressor built on the SZ compression framework. Specifically, we adopt a new independent-block-wise model that decomposes the entire dataset into many independent sub-blocks to compress. We then design and implement a series of error detection/correction strategies based on SZ. We are the first to extend algorithm-based fault tolerance (ABFT) to lossy compression. Our proposed solution incurs negligible execution overhead in the absence of soft errors. When soft errors do occur, it keeps the decompressed data bounded within the user's requirement, with very limited degradation of compression ratio.
arXiv:2010.03144v1 fatcat:35b7j4goc5fm3fbqzes77vmyam
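
The abstract does not spell out the detection strategies, so the following is only a minimal sketch of the general idea of independent blocks with per-block corruption detection. It is not the SZ implementation: the function names `compress` and `decompress` are hypothetical, plain uniform quantization stands in for SZ's prediction-based encoding, and a CRC32 stands in for the paper's ABFT checksums.

```python
import struct
import zlib


def compress(data, error_bound, block_size):
    """Quantize each block independently and attach a checksum to it.

    Assumption: uniform quantization with step 2 * error_bound, which keeps
    every reconstructed value within error_bound of the original.
    """
    step = 2.0 * error_bound
    blocks = []
    for start in range(0, len(data), block_size):
        codes = [round(v / step) for v in data[start:start + block_size]]
        payload = struct.pack(f"<{len(codes)}q", *codes)
        # CRC32 is a stand-in detector, not the ABFT scheme from the paper.
        blocks.append((payload, zlib.crc32(payload)))
    return blocks


def decompress(blocks, error_bound):
    """Rebuild each block; report corrupted blocks instead of decoding them."""
    step = 2.0 * error_bound
    restored, corrupted = [], []
    for index, (payload, checksum) in enumerate(blocks):
        if zlib.crc32(payload) != checksum:
            corrupted.append(index)
            restored.append(None)  # damage stays confined to this block
            continue
        codes = struct.unpack(f"<{len(payload) // 8}q", payload)
        restored.append([c * step for c in codes])
    return restored, corrupted
```

Because every block carries its own checksum and is decoded without reference to its neighbours, a bit flip in one payload leaves the other blocks decodable and still within the error bound. This sketch only detects corruption; recovering the damaged block would need extra redundancy that is not modelled here.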