Performance Comparison of Spark Clusters Configured Conventionally and a Cloud Service

Hameeza Ahmed, Muhammad Ali Ismail, Muhammad Faraz Hyder, Syed Muhammad Sheraz, Nida Fouq
2016 Procedia Computer Science  
Apache Spark is an open-source cluster computing technology designed specifically for large-scale data processing. This paper deals with the deployment of a Spark cluster as a cloud service on an OpenStack-based cloud. The HiBench benchmark suite is used to compare the performance of the Spark cluster deployed as a service with that of a conventionally configured Spark cluster. The results show that Spark as a cloud service gives more promising outcomes in terms of time, effort, and throughput.
doi:10.1016/j.procs.2016.04.014 fatcat:6ramsiir2jfilpnocif2hfov5q