Floating-point data compression at 75 Gb/s on a GPU

Molly A. O'Neil, Martin Burtscher
2011 Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units - GPGPU-4  
Numeric simulations often generate large amounts of data that need to be stored or sent to other compute nodes. This paper investigates whether GPUs are powerful enough to make real-time data compression and decompression possible in such environments, that is, whether they can operate at the 32- or 40-Gb/s throughput of emerging network cards. The fastest parallel CPU-based floating-point data compression algorithm operates below 20 Gb/s on eight Xeon cores, which is significantly slower than the network speed and thus insufficient for compression to be practical in high-end networks. As a remedy, we have created the highly parallel GFC compression algorithm for double-precision floating-point data. This algorithm is specifically designed for GPUs. It compresses at a minimum of 75 Gb/s, decompresses at 90 Gb/s and above, and can therefore improve internode communication throughput on current and upcoming networks by fully saturating the interconnection links with compressed data.
doi:10.1145/1964179.1964189 dblp:conf/asplos/ONeilB11 fatcat:u2mohscypfba5ombgcjxivzn7m
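
The throughput argument in the abstract can be illustrated with a small back-of-the-envelope model. This is a sketch under stated assumptions, not the authors' analysis: it assumes the quoted compression and decompression rates are measured in uncompressed data, it treats the pipeline as limited by its slowest stage, and the compression ratio of 1.5 is a hypothetical value that does not come from the abstract. The function names are likewise illustrative.

```python
def effective_throughput_gbps(link_gbps, compress_gbps, decompress_gbps, ratio):
    """Uncompressed data moved per second by a compress -> send -> decompress pipeline.

    ratio is uncompressed size / compressed size (hypothetical, not from the paper).
    Assumes compress_gbps and decompress_gbps are measured in uncompressed data.
    """
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    # The link carries compressed bits, so it moves link_gbps * ratio of original data.
    return min(compress_gbps, link_gbps * ratio, decompress_gbps)


def link_saturated(link_gbps, compress_gbps, ratio):
    """True if the compressor emits compressed data at least as fast as the link drains it."""
    return compress_gbps / ratio >= link_gbps


if __name__ == "__main__":
    LINK = 40.0   # Gb/s, network speed quoted in the abstract
    RATIO = 1.5   # hypothetical compression ratio, chosen only for illustration

    # GPU rates quoted in the abstract: 75 Gb/s compression, 90 Gb/s decompression.
    print(effective_throughput_gbps(LINK, 75.0, 90.0, RATIO))
    print(link_saturated(LINK, 75.0, RATIO))

    # A 20 Gb/s compressor (the CPU upper bound quoted in the abstract) becomes
    # the bottleneck and delivers less than the uncompressed 40 Gb/s link.
    print(effective_throughput_gbps(LINK, 20.0, 90.0, RATIO))
```

Under these assumptions, a compressor slower than the link reduces end-to-end throughput, while one that outpaces the link lets the link's effective capacity scale with the compression ratio.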