Locality-aware dynamic VM reconfiguration on MapReduce clouds

Jongse Park, Daewoo Lee, Bokyeong Kim, Jaehyuk Huh, Seungryoul Maeng
2012 Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing - HPDC '12  
Cloud computing based on system virtualization has been expanding its services to distributed data-intensive platforms such as MapReduce and Hadoop. Such a distributed platform on clouds runs in a virtual cluster consisting of a number of virtual machines. In the virtual cluster, demands on computing resources for each node may fluctuate due to data locality and task behavior. However, current cloud services use a static cluster configuration, fixing or manually adjusting the computing capability of each virtual machine (VM). The fixed homogeneous VM configuration may not adapt to changing resource demands in individual nodes. In this paper, we propose a dynamic VM reconfiguration technique for data-intensive computing on clouds, called Dynamic Resource Reconfiguration (DRR). DRR can adjust the computing capability of individual VMs to maximize the utilization of resources. Among several factors causing resource imbalance in Hadoop platforms, this paper focuses on data locality. Although assigning tasks to the nodes containing their input data can improve the overall performance of a job significantly, the fixed computing capability of each node may not allow such locality-aware scheduling. DRR dynamically increases or decreases the computing capability of each node to enhance locality-aware task scheduling. We evaluate the potential performance improvement of DRR on a 100-node cluster, and its detailed behavior on a small-scale cluster with constrained network bandwidth. DRR can improve the throughput of Hadoop jobs by 15% on average on the 100-node cluster, and by 41% on the private cluster with the constrained network connection.
doi:10.1145/2287076.2287082 dblp:conf/hpdc/ParkLKHM12 fatcat:hvb5mh7hprbehj2n5dqdg4n2u4