Petascaling Machine Learning Applications With MR-MPI

Cevdet Aykanat
2014 Zenodo  
This whitepaper addresses the applicability of the Map/Reduce paradigm for scalable and easy parallelization of fundamental data mining approaches, with the aim of exploring and enabling the processing of terabytes of data on PRACE Tier-0 supercomputing systems. To this end, we first test the use of the MR-MPI library, a lightweight Map/Reduce implementation that uses the MPI library for inter-process communication, on PRACE HPC systems; then propose MR-MPI-based implementations of a number of machine learning algorithms and constructs; and finally provide an experimental analysis measuring the scaling performance of the proposed implementations. We test multiple machine learning algorithms with different datasets. The results show that the Map/Reduce paradigm can be a strong enabler on the road to petascale.
doi:10.5281/zenodo.823040 fatcat:f4olnrq2zfcsjksd7e4k5ej5vy