Collective Asynchronous Remote Invocation (CARI): A High-Level and Effcient Communication API for Irregular Applications

Wakeel Ahmad, Bryan Carpenter, Aamir Shafi
2011 Procedia Computer Science  
The Message Passing Interface (MPI) standard continues to dominate the landscape of parallel computing as the de facto API for writing large-scale scientific applications. But the critics argue that it is a low-level API and harder to practice than shared memory approaches. This paper addresses the issue of programming productivity by proposing a high-level, easy-to-use, and efficient programming API that hides and segregates complex low-level message passing code from the application specific
more » ... ode. Our proposed API is inspired by communication patterns found in Gadget-2, which is an MPI-based parallel production code for cosmological N-body and hydrodynamic simulations. In this paper-we analyze Gadget-2 with a view to understanding what high-level Single Program Multiple Data (SPMD) communication abstractions might be developed to replace the intricate use of MPI in such an irregular application-and do so without compromising the efficiency. Our analysis revealed that the use of low-level MPI primitives-bundled with the computation code-makes Gadget-2 difficult to understand and probably hard to maintain. In addition, we found out that the original Gadget-2 code contains a small handful of-complex and recurring-patterns of message passing. We also noted that these complex patterns can be reorganized into a higherlevel communication library with some modifications to the Gadget-2 code. We present the implementation and evaluation of one such message passing pattern (or schedule) that we term Collective Asynchronous Remote Invocation (CARI). As the name suggests, CARI is a collective variant of Remote Method Invocation (RMI), which is an attractive, high-level, and established paradigm in distributed systems programming. The CARI API might be implemented in several ways-we develop and evaluate two versions of this API on a compute cluster. The performance evaluation reveals that CARI versions of the Gadget-2 code perform as well as the original Gadget-2 code but the level of abstraction is raised considerably.
doi:10.1016/j.procs.2011.04.004 fatcat:jizuyhhgerb2tkibpisasj3qiy