MARLA: MapReduce for Heterogeneous Clusters

Zacharia Fadika, Elif Dede, Jessica Hartog, Madhusudhan Govindaraju
2012 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)  
MapReduce has gradually become the framework of choice for "big data". The MapReduce model allows for efficient and swift processing of large scale data with a cluster of compute nodes. However, the efficiency here comes at a price. The performance of widely used MapReduce implementations such as Hadoop suffers in heterogeneous and load-imbalanced clusters. We show the disparity in performance between homogeneous and heterogeneous clusters in this paper to be high. Subsequently, we present
more » ... , a MapReduce framework capable of performing well not only in homogeneous settings, but also when the cluster exhibits heterogeneous properties. We address the problems associated with existing MapReduce implementations affecting cluster heterogeneity, and subsequently present through MARLA the components and trade-offs necessary for better MapReduce performance in heterogeneous cluster and cloud environments. We quantify the performance gains exhibited by our approach against Apache Hadoop and MARIANE in data intensive and compute intensive applications.
doi:10.1109/ccgrid.2012.135 dblp:conf/ccgrid/FadikaDHG12 fatcat:t6ff7z2jzfhwvciqyn4j7xjjlq