DI-MMAP—a scalable memory-map runtime for out-of-core data-intensive applications

Brian Van Essen, Henry Hsieh, Sasha Ames, Roger Pearce, Maya Gokhale
2013 Cluster Computing  
We present DI-MMAP, a high-performance runtime that memory-maps large external data sets into an application's address space and shows significantly better performance than the Linux mmap system call. Our implementation is particularly effective when used with high performance locally attached Flash arrays on highly concurrent, latency-tolerant data-intensive HPC applications. We describe the kernel module and show performance results on a benchmark test suite, a new bioinformatics metagenomic classification application, and on a level-asynchronous Breadth-First Search (BFS) graph traversal algorithm. Using DI-MMAP, the metagenomics classification application performs up to 4× better than standard Linux mmap. A fully external memory configuration of BFS executes up to 7.44× faster than traditional mmap. Finally, we demonstrate that DI-MMAP shows scalable out-of-core performance for BFS traversal in main memory constrained scenarios. Such scalable memory constrained performance would allow a system with a fixed amount of memory to solve a larger problem as well as provide memory QoS guarantees for systems running multiple data-intensive applications.
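
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of standard file memory-mapping, using Python's `mmap` module as a thin wrapper over the operating system's mmap facility. It is not DI-MMAP and does not reproduce the authors' kernel module; the function names `make_demo_file` and `sum_first_byte_per_page`, and the 4096-byte page size, are assumptions made only for illustration.

```python
import mmap
import os
import tempfile


def make_demo_file(num_pages: int, page_size: int = 4096) -> str:
    """Write a small file in which page i is filled with the byte value i % 256.

    Hypothetical helper used only to give the example something to map.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i % 256]) * page_size)
    return path


def sum_first_byte_per_page(path: str, page_size: int = 4096) -> int:
    """Map a file read-only and touch one byte in each page.

    Each access may fault the page in from storage; the operating
    system, not the application, decides what stays resident.
    """
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        # Length 0 maps the whole file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, size, page_size):
                total += mm[offset]  # indexing a mapping yields an int
    return total


if __name__ == "__main__":
    demo_path = make_demo_file(4)
    try:
        print(sum_first_byte_per_page(demo_path))
    finally:
        os.remove(demo_path)
```

In this access pattern the application reads the data set as if it were ordinary memory, which is the programming model the abstract describes; the buffering and eviction behavior that the paper evaluates lies below this interface and is not shown here.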
doi:10.1007/s10586-013-0309-0 fatcat:xqor673yfzb5bb5tjvqhhvi4c4