A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2013; you can also visit the original URL.
The file type is application/pdf
.
Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU
2013
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '13
The performance of Graphic Processing Units (GPU) is sensitive to irregular memory references. Some recent work shows the promise of data reorganization for eliminating non-coalesced memory accesses that are caused by irregular references. However, all previous studies have employed simple, heuristic methods to determine the new data layouts to create. As a result, they either do not provide any performance guarantee or are effective to only some limited scenarios. This paper contributes a
doi:10.1145/2442516.2442523
dblp:conf/ppopp/WuZZJS13
fatcat:37gzd4pntfh5rfvdjseygzkbmm