A study of the effects of machine geometry and mapping on distributed transpose performance
2008
Proceedings of the 2008 conference on Computing frontiers - CF '08
This paper describes a parallel strategy to extend the scalability of a small 3D FFT on thousands of Blue Gene/L processors. The approach is to execute the intermediate phases of the 3D FFT on smaller processor subsets. Performance measurements of the standalone 3D FFT with two communication protocols, MPI and BG/L ADE [19], are presented. While the MPI-based and BG/L ADE-based implementations of the 3D FFT exhibited qualitatively similar behavior, the BG/L ADE-based version has
doi:10.1145/1366230.1366243
dblp:conf/cf/EleftheriouFRWHG08
fatcat:azncncvaqvbovj5lc5zyk3vmym