A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit <a rel="external noopener" href="https://static.aminer.org/pdf/20170130/pdfs/ppopp/oh7lfbwdvrvqhg06z84c3jiuixbexeys.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<i title="Association for Computing Machinery (ACM)">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/xu5bk2lj5rbdxlx6222nw7tsxi" style="color: black;">SIGPLAN notices</a>
Standard cache-oblivious recursive divide-and-conquer algorithms for evaluating dynamic programming recurrences have optimal serial cache complexity but often have lower parallelism compared with iterative wavefront algorithms due to artificial dependencies among subtasks. Very recently, cache-oblivious recursive wavefront (COW) algorithms have been introduced which do not have any artificial dependencies. Though COW algorithms are based on fork-join primitives, they extensively use atomic operations, and as a result, performance guarantees provided by state-of-the-art schedulers for programs with fork-join primitives do not apply. In this work, we show how to systematically transform standard cache-oblivious recursive divide-and-conquer algorithms into recursive wavefront algorithms to achieve optimal parallel cache complexity and high parallelism under state-of-the-art schedulers for fork-join programs. Unlike COW algorithms, these new algorithms do not use atomic operations. Instead, they use closed-form formulas to compute at what time each recursive function must be launched in order to achieve high parallelism without losing cache performance. The resulting implementations are arguably much simpler than implementations of known COW algorithms.
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3155284.3019031">doi:10.1145/3155284.3019031</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cnvgigdefnde5cbkjr6dwguxii">fatcat:cnvgigdefnde5cbkjr6dwguxii</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190218134430/https://static.aminer.org/pdf/20170130/pdfs/ppopp/oh7lfbwdvrvqhg06z84c3jiuixbexeys.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">Web Archive [PDF]</a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/3155284.3019031">acm.org</a>