Eager Memory Management for In-Memory Data Analytics

Hakbeom JANG, Jonghyun BAE, Tae Jun HAM, Jae W. LEE
2019, IEICE Transactions on Information and Systems
This paper introduces e-spill, an eager spill mechanism that dynamically finds the optimal spill-threshold by monitoring garbage collection (GC) time at runtime and thereby prevents expensive GC overhead. e-spill adopts a slow-start model that gradually increases the spill-threshold until it reaches the optimal point without incurring substantial GCs. We prototype e-spill as an extension to Spark and evaluate it using six workloads on three different parallel platforms. Our evaluations show that e-spill improves
... ance by up to 3.80× and saves the cost of cluster operation on Amazon EC2 cloud by up to 51% over the baseline system following Spark Tuning Guidelines.
doi:10.1587/transinf.2018edl8199 fatcat:mgwlb2pivfhh3co6j7fuv7shky
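
The abstract describes a slow-start model that raises the spill-threshold step by step while watching GC time. The following is a minimal illustrative sketch of that idea only, not the authors' implementation: the class name `SlowStartSpillThreshold`, the doubling growth factor, the GC-ratio limit, and the fall-back-to-last-good rule are all assumptions made for illustration, and nothing here uses Spark's actual APIs.

```python
from dataclasses import dataclass


@dataclass
class SlowStartSpillThreshold:
    """Hypothetical slow-start controller for a spill threshold (in bytes)."""

    initial: int            # starting threshold (assumed value)
    maximum: int            # upper bound, e.g. available execution memory
    gc_ratio_limit: float   # GC time / elapsed time treated as "substantial"
    growth: float = 2.0     # multiplicative growth factor (assumption)

    def __post_init__(self) -> None:
        self.threshold = self.initial
        self.last_good = self.initial
        self.converged = False

    def update(self, gc_time: float, elapsed: float) -> int:
        """Feed one monitoring interval; return the threshold to use next."""
        if self.converged or elapsed <= 0:
            return self.threshold
        if gc_time / elapsed > self.gc_ratio_limit:
            # GC became substantial: fall back to the last threshold
            # that did not trigger it, and stop growing.
            self.threshold = self.last_good
            self.converged = True
        else:
            # GC is still cheap: remember this threshold and grow it.
            self.last_good = self.threshold
            self.threshold = min(int(self.threshold * self.growth), self.maximum)
            if self.threshold == self.last_good:
                self.converged = True  # reached the upper bound
        return self.threshold
```

In this sketch a caller would sample GC time per monitoring interval (for example from JVM GC counters) and call `update` once per interval; the threshold grows while the GC share of elapsed time stays below `gc_ratio_limit` and settles at the last value observed before GC became substantial.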