Cache locality optimization for recursive programs

Jonathan Lifflander, Sriram Krishnamoorthy
2017 Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2017  
We present an approach to optimizing cache locality for recursive programs by dynamically splicing (recursively interleaving) the execution of distinct function invocations. By utilizing data effect annotations, we identify concurrency and data reuse opportunities across function invocations and interleave them to reduce reuse distance. We present algorithms that efficiently track effects in recursive programs, detect interference and dependencies, and interleave execution of function invocations using user-level (non-kernel) lightweight threads. To enable multi-core execution, a program is parallelized using a nested fork/join programming model. Our cache optimization strategy is designed to work in the context of a random work-stealing scheduler. We present an implementation using the MIT Cilk framework that demonstrates significant improvements in sequential and parallel performance, competitive with a state-of-the-art compile-time optimizer for loop programs and a domain-specific optimizer for stencil programs.
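As a rough illustration of the interference check the abstract mentions, the sketch below models each invocation's data effects as read and write sets of array index ranges. It is not the paper's algorithm or annotation syntax: the names `Region`, `Effects`, `interferes`, and `may_interleave`, and the choice of half-open ranges, are all assumptions made for this example.

```python
from dataclasses import dataclass, field


# Hypothetical effect representation (not from the paper): a half-open
# index range [lo, hi) on a named array.
@dataclass(frozen=True)
class Region:
    array: str
    lo: int
    hi: int

    def overlaps(self, other: "Region") -> bool:
        # Two ranges overlap only if they are on the same array and intersect.
        return (self.array == other.array
                and self.lo < other.hi
                and other.lo < self.hi)


# Hypothetical per-invocation annotation: the regions it reads and writes.
@dataclass
class Effects:
    reads: list = field(default_factory=list)
    writes: list = field(default_factory=list)


def _any_overlap(xs, ys) -> bool:
    return any(x.overlaps(y) for x in xs for y in ys)


def interferes(a: Effects, b: Effects) -> bool:
    """True if a write in one invocation overlaps a read or write in the other."""
    return (_any_overlap(a.writes, b.writes)
            or _any_overlap(a.writes, b.reads)
            or _any_overlap(a.reads, b.writes))


def may_interleave(a: Effects, b: Effects) -> bool:
    """Assumed policy: interleave only when the invocations read common
    data (a reuse opportunity) and do not interfere."""
    return _any_overlap(a.reads, b.reads) and not interferes(a, b)


# f and g read overlapping parts of A and write disjoint parts of B.
f = Effects(reads=[Region("A", 0, 8)], writes=[Region("B", 0, 4)])
g = Effects(reads=[Region("A", 4, 12)], writes=[Region("B", 4, 8)])
# h reads what f writes, so h depends on f.
h = Effects(reads=[Region("B", 0, 4)], writes=[Region("C", 0, 4)])
```

Under these assumptions, `f` and `g` are candidates for interleaving because they share reads of `A` without conflicting, while `f` and `h` are not because `h` reads a region that `f` writes.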
doi:10.1145/3062341.3062385 dblp:conf/pldi/LifflanderK17 fatcat:r24zpnzwbnexffnu6gga72nxoq