Asynchronous transpose-matrix architectures

J.A. Tierno, P. Kudva
Proceedings International Conference on Computer Design VLSI in Computers and Processors  
The matrix transposition operation is a necessary step in several image and video compression and decompression algorithms, in particular the discrete cosine transform (DCT) and its inverse (IDCT), and in some distributed arithmetic applications. These algorithms have to be performed at high data rates, and with a minimum of power dissipation for portable applications. In this paper we describe how the clocked solution is usually implemented, and we present two new asynchronous architectures that perform matrix transposition. These architectures, one based on two-phase signaling and one based on four-phase signaling, have better characteristics than the clocked solution in terms of latency and power, at no cost in area or throughput. We discuss the characteristics of these three architectures and evaluate the relative advantages of each one.
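
As a purely illustrative software sketch (not the hardware architectures described in the paper), the following shows why a transposition step appears in a separable 2-D DCT: a 1-D transform is applied to every row, the intermediate matrix is transposed, and the same 1-D transform is applied again. The function names `dct_1d`, `transpose`, and `dct_2d` are hypothetical, and the orthonormal DCT-II scaling is an assumption made for the example.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence (direct O(n^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """Row-column 2-D DCT: row pass, transpose, row pass, transpose back."""
    row_pass = [dct_1d(r) for r in block]
    # Transposing lets the same row-oriented 1-D transform process the columns.
    col_pass = [dct_1d(r) for r in transpose(row_pass)]
    return transpose(col_pass)


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    coeffs = dct_2d(flat)
    # A constant block should concentrate its energy in the DC coefficient.
    print(round(coeffs[0][0], 6))
```

In hardware, the intermediate transpose is the step that a dedicated transposition memory or array would perform between the two 1-D passes; the sketch only mirrors that data flow in software.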
doi:10.1109/iccd.1997.628904 dblp:conf/iccd/TiernoK97 fatcat:r7ik6gk6wfatnfdttkjufjiqqi