A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf.
Batched matrix computations on hardware accelerators based on GPUs
2015
The International Journal of High Performance Computing Applications
Contractions can often be implemented as index reordering plus batched GEMM (and hence be highly efficient).
doi:10.1177/1094342014567546
fatcat:lb3idu5ksvgdtk3tmtpfd2putq
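
The remark in this record that contractions can often be implemented as index reordering plus batched GEMM can be illustrated with a small sketch. This is not code from the paper: the contraction C[m, n, p] = sum_k A[k, m, p] * B[p, n, k], the function name `contract_via_batched_gemm`, and the array sizes are assumptions chosen for illustration, and NumPy's `matmul` stands in for a GPU batched GEMM routine.

```python
import numpy as np


def contract_via_batched_gemm(A, B):
    """Compute C[m, n, p] = sum_k A[k, m, p] * B[p, n, k] (illustrative contraction)."""
    # Index reordering: bring the shared index p to the front as the batch
    # dimension and place the contracted index k where matmul expects it.
    A_b = np.transpose(A, (2, 1, 0))  # shape (p, m, k)
    B_b = np.transpose(B, (0, 2, 1))  # shape (p, k, n)

    # Batched GEMM: matmul treats the leading axis as a batch, so this is
    # p independent (m x k) @ (k x n) matrix products.
    C_b = np.matmul(A_b, B_b)  # shape (p, m, n)

    # Reorder the result back to the requested index layout.
    return np.transpose(C_b, (1, 2, 0))  # shape (m, n, p)


# Hypothetical sizes, picked only to keep the example small.
rng = np.random.default_rng(0)
K, M, N, P = 4, 3, 5, 6
A = rng.standard_normal((K, M, P))
B = rng.standard_normal((P, N, K))

C = contract_via_batched_gemm(A, B)

# Reference result computed directly from the index expression.
C_ref = np.einsum("kmp,pnk->mnp", A, B)
```

The transposes here only change how the data is indexed. On an accelerator the reordering may require an explicit copy, so whether this route pays off depends on the sizes involved and on how the batched GEMM is implemented.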