A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is
The recent boom of big data, coupled with the challenges of its processing and storage gave rise to the development of distributed data processing and storage paradigms like MapReduce, Spark, and NoSQL databases. With the advent of cloud computing, processing and storing such massive datasets on clusters of machines is now feasible with ease. However, there are limited tools and approaches, which users can rely on to gauge and comprehend the performance of their big data applications deployedarXiv:1711.09138v2 fatcat:by6vcwv44bdqlfjgftwsp2qwua