An Empirical Study of Hadoop's Energy Efficiency on a HPC Cluster

Nidhi Tiwari, Santonu Sarkar, Umesh Bellur, Maria Indrawan
2014 Procedia Computer Science  
The map-reduce programming model is commonly used for efficient scientific computations because it executes tasks in a parallel and distributed manner on large data volumes. HPC infrastructure can effectively increase the parallelism of map-reduce tasks. However, such an execution incurs high energy and data transmission costs. Here we empirically study how the energy efficiency of a map-reduce job varies with increasing parallelism and network bandwidth on an HPC cluster. We also investigate the effectiveness of power-aware systems in managing the energy consumption of different types of map-reduce jobs. We find that for some jobs the energy efficiency degrades at a high degree of parallelism, and for some it improves at a low CPU frequency. Consequently, we suggest strategies for configuring the degree of parallelism, network bandwidth, and power management features in an HPC cluster for energy-efficient execution of map-reduce jobs.
doi:10.1016/j.procs.2014.05.006 fatcat:n6q3cidruba3te6ldjw4gncj6m