Deca
2019
ACM Transactions on Computer Systems
In-memory caching of intermediate data and active combining of data in shuffle buffers have been shown to be very effective in minimizing the re-computation and I/O cost in big data processing systems such as Spark and Flink. However, it has also been widely reported that these techniques create a large number of long-lived data objects in the heap. These objects may quickly saturate the garbage collector, especially when handling a large dataset, and hence limit the
doi:10.1145/3310361
fatcat:d5z767ar4rd6xdp4z4sxnpkefi