A scalable and compact systolic architecture for linear solvers

Kevin S. H. Ong, Suhaib A. Fahmy, Keck-Voon Ling
2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors
We present a scalable design for accelerating the solution of dense linear systems of equations using LU decomposition. A novel systolic array architecture that can be used as a building block in scientific applications is described and prototyped on a Xilinx Virtex-6 FPGA. This solver has a throughput of around 3.2 million linear systems per second for matrices of size N=4 and around 80 thousand linear systems per second for matrices of size N=16. In comparison with similar work, our design offers up to a 12-fold improvement in speed whilst requiring up to 50% fewer hardware resources. As a result, a linear system of size N=64 can be implemented on a single FPGA, whereas previous work was limited to a size of N=12 and resorted to complex multi-FPGA architectures to scale. Finally, the scalable design can be adapted to problems of different sizes with minimal effort.
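For readers unfamiliar with the computation being accelerated, the following is a minimal software sketch of solving a dense system Ax = b by LU decomposition followed by forward and back substitution. It is not the authors' systolic architecture and makes no claim about their hardware dataflow; it assumes a Doolittle factorisation without pivoting (so every pivot must be nonzero), and the function name lu_solve is hypothetical.

```python
def lu_solve(a, b):
    """Solve a x = b for a dense square matrix a (list of rows).

    Assumption: no pivoting is performed, so every pivot must be nonzero.
    """
    n = len(a)
    lu = [[float(v) for v in row] for row in a]  # work on a copy

    # In-place Doolittle factorisation: L (unit diagonal) is stored below
    # the diagonal, U on and above it.
    for k in range(n):
        if lu[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch has no pivoting")
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]  # multiplier l_ik
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]

    # Forward substitution: solve L y = b.
    y = [float(v) for v in b]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]

    # Back substitution: solve U x = y.
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x
```

The factorisation loop costs on the order of N^3 operations, while the two substitutions cost on the order of N^2, which is why the decomposition step is the natural target for hardware acceleration.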
doi:10.1109/asap.2014.6868658 dblp:conf/asap/OngFL14 fatcat:3j23w7boyffb7issoqhobvblei