Supporting Bulk Synchronous Parallelism in Map-Reduce Queries
2012
2012 SC Companion: High Performance Computing, Networking Storage and Analysis
One of the major drawbacks of the Map-Reduce (MR) model is that, to simplify reliability and fault tolerance, it does not preserve data in memory across consecutive MR jobs: an MR job must dump its data to the distributed file system before they can be read by the next MR job. This restriction imposes a high overhead on complex MR workflows and graph algorithms, such as PageRank, which require repetitive MR jobs. The Bulk Synchronous Parallelism (BSP) programming model, on the other hand, has
doi:10.1109/sc.companion.2012.129
dblp:conf/sc/Fegaras12
fatcat:ag66ssu27bhpbi5x4spmrucxwu