Overhead of a decentralized gossip algorithm on the performance of HPC applications

Ely Levy, Amnon Barak, Amnon Shiloh, Matthias Lieber, Carsten Weinhold, Hermann Härtig
2014 Proceedings of the 4th International Workshop on Runtime and Operating Systems for Supercomputers - ROSS '14  
Gossip algorithms can provide online information about the availability and the state of the resources in supercomputers. These algorithms require minimal computing and storage capabilities at each node and when properly tuned, they are not expected to overload the nodes or the network that connects these nodes. These properties make gossip interesting for future exascale systems. This paper examines the overhead of a decentralized gossip algorithm on the performance of parallel MPI applications running on up to 8192 nodes of an IBM BlueGene/Q supercomputer. The applications that were used in the experiments include PTRANS and MPI-FFT from the HPCC benchmark suite as well as the coupled weather and cloud simulation model COSMO-SPECS+FD4. In most cases, no gossip overhead was observed when the gossip messages were sent at intervals of 256 ms or more. As expected, the overhead that is observed at higher rates is sensitive to the communication pattern of the application and the amount of gossip information being circulated.
doi:10.1145/2612262.2612271 dblp:conf/ics/LevyBSLWH14 fatcat:wxoyadt57zbexbtpjqn6brht2i