A distributed dynamic load balancer for iterative applications

Harshitha Menon, Laxmikant Kalé
2013 Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '13)
For many applications, computation load varies over time. Such applications require dynamic load balancing to improve performance. Centralized load balancing schemes, which perform the load balancing decisions at a central location, are not scalable. In contrast, fully distributed strategies are scalable but typically do not produce a balanced work distribution as they tend to consider only local information. This paper describes a fully distributed algorithm for load balancing that uses information about the global state of the system to perform load balancing. This algorithm, referred to as GrapevineLB, consists of two stages: global information propagation using a lightweight algorithm inspired by epidemic [21] algorithms, and work unit transfer using a randomized algorithm. We provide analysis of the algorithm along with detailed simulation and performance comparison with other load balancing strategies. We demonstrate the effectiveness of GrapevineLB for adaptive mesh refinement and molecular dynamics on up to 131,072 cores of BlueGene/Q.
doi:10.1145/2503210.2503284 dblp:conf/sc/MenonK13 fatcat:uurwcwv5frd3zordofyz7ej7dy