General-purpose join algorithms for large graph triangle listing on heterogeneous systems

Daniel Zinn, Haicheng Wu, Jin Wang, Molham Aref, Sudhakar Yalamanchili
2016. Proceedings of the 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit - GPGPU '16
We investigate applying general-purpose join algorithms to the triangle listing problem on heterogeneous systems that feature a multi-core CPU and multiple GPUs. In particular, we consider an out-of-core context where graph data are available on secondary storage such as a solid-state disk (SSD) and may not fit in CPU main memory or GPU device memory. We focus on Leapfrog Triejoin (LFTJ), a recently proposed, worst-case optimal join algorithm, and present "boxing": a novel yet conceptually simple approach for partitioning out-of-core input data and feeding it to LFTJ. The boxing algorithm integrates well with a GPU-optimized LFTJ algorithm for triangle listing. We achieve significant performance gains on a heterogeneous system comprising GPUs and a CPU by exploiting the massively parallel computation capability of GPUs. Our experimental evaluations on real-world and synthetic data sets (some of which contain more than 1.2 billion edges) show that out-of-core LFTJ is competitive with specialized graph algorithms for triangle listing. By using one or two GPUs, we achieve additional speedups on the same graphs.
CCS Concepts: • Information systems → Relational parallel and distributed DBMSs; • Theory of computation → Data structures and algorithms for data management; Massively parallel algorithms
doi:10.1145/2884045.2884054 dblp:conf/ppopp/ZinnWWAY16 fatcat:yo7ln5ce6ze6pmrmj3x7e33yhe
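
To make the abstract's framing concrete, triangle listing can be read as a three-way self-join of the edge relation, E(a,b) ⋈ E(b,c) ⋈ E(a,c) with a < b < c. The following is a minimal in-memory sketch of that idea using merge-style intersection of sorted neighbor lists. It is not the paper's LFTJ, boxing, or GPU implementation; the function name `list_triangles` and the edge-list input format are assumptions made for illustration only.

```python
from collections import defaultdict


def list_triangles(edges):
    """Return every triangle (a, b, c) with a < b < c exactly once.

    Hypothetical helper: treats the graph as undirected and evaluates
    the join E(a,b), E(b,c), E(a,c) by intersecting sorted neighbor lists.
    """
    # Orient each edge from the smaller to the larger vertex id.
    # Using sets drops duplicate edges; self-loops are skipped.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[min(u, v)].add(max(u, v))

    # Sorted neighbor lists allow a linear-time merge intersection.
    out = {a: sorted(ns) for a, ns in adj.items()}

    triangles = []
    for a in sorted(out):
        na = out[a]
        for b in na:
            nb = out.get(b, [])
            # Any c in both lists satisfies E(a,c) and E(b,c), and c > b
            # because nb only holds vertices larger than b.
            i = j = 0
            while i < len(na) and j < len(nb):
                if na[i] < nb[j]:
                    i += 1
                elif na[i] > nb[j]:
                    j += 1
                else:
                    triangles.append((a, b, na[i]))
                    i += 1
                    j += 1
    return triangles


if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 5)]
    print(list_triangles(sample))
```

The sketch assumes the whole graph fits in memory and runs on a single thread; the out-of-core setting described in the abstract would additionally need a partitioning step to feed pieces of the input to the join, which is not attempted here.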
