Improving HPC Application Performance in Cloud through Dynamic Load Balancing

A. Gupta, O. Sarood, L. V. Kale, D. Milojicic
2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing
Driven by the benefits of elasticity and the pay-as-you-go model, cloud computing is emerging as an attractive alternative and addition to in-house clusters and supercomputers for some High Performance Computing (HPC) applications. However, poor interconnect performance, a heterogeneous and dynamic environment, and interference by other virtual machines (VMs) are some bottlenecks for efficient HPC in cloud. For tightly-coupled iterative applications, one slow processor slows down the entire application, resulting in poor CPU utilization. In this paper, we present a dynamic load balancer for tightly-coupled iterative HPC applications in cloud. It infers the static hardware heterogeneity in virtualized environments, and also adapts to the dynamic heterogeneity caused by the interference arising due to multi-tenancy. Through continuous live monitoring, instrumentation, and periodic refinement of task distribution to VMs, our load balancer adapts to the dynamic variations in cloud resources. Through experimental evaluation on a private cloud with 64 VMs using benchmarks and a real science application, we demonstrate performance benefits up to 45%. Finally, we analyze the effect of load balancing frequency, problem size, and computational granularity (problem decomposition) on the performance and scalability of our techniques.
• We analyze the impact of load balancing frequency, grain size, and problem size on achieved performance (§ VI).
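To make "periodic refinement of task distribution to VMs" concrete, the following is a minimal sketch of one greedy refinement step. It is not the authors' algorithm. The names `VM`, `refine`, `speed`, and `tolerance` are hypothetical, and the per-VM speed estimates are assumed to come from the kind of live monitoring the abstract mentions. Because a tightly-coupled iteration finishes only when its slowest VM finishes, the sketch migrates tasks away from the VM with the largest predicted iteration time while doing so lowers that peak.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    speed: float  # assumed relative speed estimate from monitoring (1.0 = nominal)
    tasks: dict = field(default_factory=dict)  # task id -> measured work units

    def time(self) -> float:
        # Predicted time for this VM to finish one iteration.
        return sum(self.tasks.values()) / self.speed


def refine(vms, tolerance=1.05, max_moves=1000):
    """Greedily migrate tasks off the slowest VM; return the list of moves made."""
    moves = []
    for _ in range(max_moves):
        avg = sum(vm.time() for vm in vms) / len(vms)
        src = max(vms, key=VM.time)
        if src.time() <= tolerance * avg:
            break  # balanced within tolerance
        best = None
        for tid, work in src.tasks.items():
            for dst in vms:
                if dst is src:
                    continue
                # Predicted times of the two affected VMs after the migration.
                new_peak = max(dst.time() + work / dst.speed,
                               src.time() - work / src.speed)
                if new_peak < src.time() and (best is None or new_peak < best[0]):
                    best = (new_peak, tid, dst)
        if best is None:
            break  # no single migration improves on the current peak
        _, tid, dst = best
        dst.tasks[tid] = src.tasks.pop(tid)
        moves.append((tid, src.name, dst.name))
    return moves


# Usage: eight equal tasks split evenly over one nominal VM and one VM
# running at half speed (for example, because of interference).
vms = [VM("fast", 1.0, {i: 1.0 for i in range(4)}),
       VM("slow", 0.5, {i: 1.0 for i in range(4, 8)})]
print(max(vm.time() for vm in vms))  # 8.0 before refinement
refine(vms)
print(max(vm.time() for vm in vms))  # 6.0 after refinement
```

Calling `refine` once per load balancing period, with refreshed speed and work measurements, is one way to follow dynamic variation. The `tolerance` threshold is an illustrative knob for avoiding migrations whose benefit would be small.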
doi:10.1109/ccgrid.2013.65 dblp:conf/ccgrid/GuptaSKM13 fatcat:poatj7saofdvdpijockg4ehmky