Variable-Size Batched Condition Number Calculation on GPUs

Hartwig Anzt, Jack Dongarra, Goran Flegar, Thomas Grutzmacher
2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
We present a kernel designed to quickly compute the condition numbers of a large collection of tiny matrices on a graphics processing unit (GPU). The matrices can differ in size, and the process integrates pivoting to ensure a numerically stable matrix inversion. The performance assessment reveals that, in double-precision arithmetic, the new GPU kernel achieves up to 550 GFLOPs (billions of floating-point operations per second) and 800 GFLOPs on NVIDIA's P100 and V100 GPUs, respectively. The results also demonstrate a considerable speed-up over a workflow that computes the condition number by launching a set of four batched kernels. In addition, we present a variable-size batched kernel for computing the matrix infinity norm. We show that this memory-bound kernel achieves up to 90% of the sustainable peak bandwidth.
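As a rough CPU-side illustration of the computation described above, the following sketch evaluates a condition number for each matrix in a variable-size batch. It is not the GPU kernel from the paper. It assumes the condition number is taken in the infinity norm, i.e. the product of the infinity norms of the matrix and its inverse, and it assumes the inversion uses Gauss-Jordan elimination with partial (row) pivoting. The abstract states only that pivoting and an explicit inversion are used, so both choices, and all function names, are illustrative.

```python
def inf_norm(a):
    """Matrix infinity norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def invert_with_pivoting(a):
    """Invert a square matrix by Gauss-Jordan elimination.

    Assumption: partial (row) pivoting; the abstract does not name
    the pivoting strategy used on the GPU.
    """
    n = len(a)
    # Build the augmented system [A | I] in floating point.
    m = [
        [float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        # Choose the row at or below the diagonal with the largest entry
        # in this column, and swap it into the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot entry becomes 1.
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    # The right half of the augmented system now holds the inverse.
    return [row[n:] for row in m]


def batched_condition_numbers(batch):
    """Infinity-norm condition number of each matrix in the batch.

    The matrices may differ in size; each one is processed independently,
    which is what makes the problem suitable for batching.
    """
    return [inf_norm(a) * inf_norm(invert_with_pivoting(a)) for a in batch]
```

For example, calling `batched_condition_numbers` on a batch holding the 1x1 matrix `[[2.0]]`, the 2x2 matrix `[[1.0, 2.0], [3.0, 4.0]]`, and the 3x3 diagonal matrix with entries 4, 2, 1 gives approximately 1, 21, and 4. The `inf_norm` helper on its own corresponds to the second, memory-bound kernel mentioned in the abstract: it reads every entry once and does almost no arithmetic per entry.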
doi:10.1109/cahpc.2018.8645907 dblp:conf/sbac-pad/AnztDFG18 fatcat:hogahlzelvhnnag6hzl4xz4ikq