An Experimental and Comparative Benchmark Study Examining Resource Utilization in Managed Hadoop Context [article]

Uluer Emre Ozdil, Serkan Ayvaz
2021 arXiv pre-print
Transitioning cloud-based Hadoop from IaaS to PaaS, which are commercially conceptualized as pay-as-you-go or pay-per-use, often reduces the associated system costs. However, managed Hadoop systems present black-box behavior to end users, who cannot see the inner performance dynamics and hence cannot be sure of the benefits of leveraging them. In this study, we aimed to understand the managed Hadoop context in terms of resource utilization. We utilized three experimental Hadoop-on-PaaS proposals as they come out-of-the-box and ran Hadoop-specific workloads of the HiBench Benchmark Suite. During the benchmark executions, we collected system resource utilization data on the worker nodes. The results indicated that the same property specifications among cloud services guarantee neither similar performance outputs across services nor consistent results within a single service. We assume that the managed systems' architectures and pre-configurations play a significant role in the performance.
arXiv:2112.10134v1 fatcat:yuc2gdfd6vbdbnyod2dspmzlfy