An empirically tuned 2D and 3D FFT library on CUDA GPU
2010
Proceedings of the 24th ACM International Conference on Supercomputing - ICS '10
This paper proposes a Cooley-Tukey-based framework for multidimensional FFT computation on GPUs. The framework generalizes the decomposition of multidimensional FFTs on GPUs using an I/O tensor representation, and thereby provides a systematic description of the possible FFT implementations on GPUs. It is geared toward the efficiency of multidimensional FFTs on GPU architectures. In particular, no global transposition among dimensions is performed and some previously unnoticed
doi:10.1145/1810085.1810127
dblp:conf/ics/GuLS10
fatcat:yncl3tty5vauzoe3gfs3bufcsa
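
The abstract mentions a Cooley-Tukey decomposition of a multidimensional FFT that avoids a global transposition between dimensions. As a rough illustration only, the sketch below computes a 2D FFT on the CPU with NumPy by applying a radix-2 Cooley-Tukey 1D FFT along each axis through strided views, without transposing the array. It is not the paper's I/O tensor framework and not a GPU implementation; the names `fft1d` and `fft2d` are hypothetical, and power-of-two sizes are assumed.

```python
import numpy as np


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT (hypothetical helper).

    Assumes len(x) is a power of two.
    """
    x = np.asarray(x, dtype=complex)
    n = x.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return x
    # Split into even- and odd-indexed halves and transform each.
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    # Twiddle factors combine the two half-size transforms.
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])


def fft2d(a):
    """2D FFT as 1D FFTs along each axis (hypothetical helper).

    Columns are reached through strided slices, so the array is
    never explicitly transposed.
    """
    a = np.asarray(a, dtype=complex)
    out = np.empty_like(a)
    for i in range(a.shape[0]):   # pass 1: transform every row
        out[i, :] = fft1d(a[i, :])
    for j in range(a.shape[1]):   # pass 2: transform every column
        out[:, j] = fft1d(out[:, j])
    return out
```

A result from `fft2d` can be checked against `np.fft.fft2` with `np.allclose`. On a GPU the column pass is the costly one because its memory accesses are strided, which is presumably why the handling of transposition matters for the framework the abstract describes.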