Efficient Adaptive Algorithms for Transposing Small and Large Matrices on Symmetric Multiprocessors

Rami Al Na'mneh, W. David Pan, Seong-Moo Yoo
2006 Informatica  
Matrix transpose in parallel systems typically involves costly all-to-all communication. In this paper, we provide a comparative characterization of various efficient algorithms for transposing small and large matrices on the popular symmetric multiprocessor (SMP) architecture, which carries a relatively low communication cost due to its large aggregate bandwidth and low-latency inter-process communication. We analyze the cost of sending and receiving data and the memory [...] of these matrix-transpose algorithms. We then propose an adaptive algorithm that can minimize the overhead of the matrix transpose operation given parameters such as the data size, the number of processors, the start-up time, and the effective communication bandwidth.
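The idea of adapting the transpose strategy to these parameters can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it assumes the textbook start-up-plus-bandwidth cost model and compares two standard all-to-all schedules (a direct pairwise exchange and a hypercube-style exchange that requires a power-of-two processor count). All function names, the 8-byte element size, and the numeric values are hypothetical.

```python
import math

# Illustrative only: these cost formulas are the standard latency/bandwidth
# estimates for two generic all-to-all schedules, not the paper's algorithms.

def direct_exchange_cost(n, p, alpha, bandwidth, elem_size=8):
    """Each processor sends one (n/p x n/p) block to each of the other p-1."""
    block_bytes = (n * n * elem_size) / (p * p)
    return (p - 1) * (alpha + block_bytes / bandwidth)

def hypercube_exchange_cost(n, p, alpha, bandwidth, elem_size=8):
    """log2(p) steps; each step forwards half of the locally held data.
    Assumes p is a power of two."""
    local_bytes = (n * n * elem_size) / p
    steps = math.log2(p)
    return steps * (alpha + (local_bytes / 2) / bandwidth)

def choose_strategy(n, p, alpha, bandwidth, elem_size=8):
    """Return the name of the cheaper schedule and both estimated costs.

    n: matrix dimension, p: processors, alpha: start-up time in seconds,
    bandwidth: effective bandwidth in bytes per second.
    """
    costs = {
        "direct": direct_exchange_cost(n, p, alpha, bandwidth, elem_size),
        "hypercube": hypercube_exchange_cost(n, p, alpha, bandwidth, elem_size),
    }
    return min(costs, key=costs.get), costs

if __name__ == "__main__":
    # Hypothetical machine: 10 microsecond start-up, 1 GB/s effective bandwidth.
    for n in (64, 8192):
        best, costs = choose_strategy(n, p=16, alpha=1e-5, bandwidth=1e9)
        print(n, best, costs)
```

Under this model the start-up term dominates for small matrices, so the schedule with fewer messages is preferred, while for large matrices the bandwidth term dominates and the schedule that moves less data is preferred.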
doi:10.15388/informatica.2006.153 fatcat:qf2e2i7775dn3caqplua3zyu4m