Efficient Adaptive Algorithms for Transposing Small and Large Matrices on Symmetric Multiprocessors
2006
Informatica
Matrix transpose in parallel systems typically involves costly all-to-all communications. In this paper, we provide a comparative characterization of various efficient algorithms for transposing small and large matrices on the popular symmetric multiprocessor (SMP) architecture, which carries a relatively low communication cost due to its large aggregate bandwidth and low-latency inter-process communication. We analyze the cost of data sending/receiving and the memory
doi:10.15388/informatica.2006.153
fatcat:qf2e2i7775dn3caqplua3zyu4m