A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf.
FusionStitching: Boosting Memory Intensive Computations for Deep Learning Workloads
[article]
2021
arXiv
pre-print
We show in this work that memory-intensive computations can cause severe performance problems, due to off-chip memory access and CPU-GPU context-switch overheads, in a wide range of deep learning models. Current just-in-time (JIT) kernel fusion and code generation techniques have limitations for this problem, such as coarse fusion plan exploration strategies and limited code generation ability. We propose FusionStitching, a deep learning compiler capable of fusing memory-intensive operators,
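To illustrate the problem the abstract describes, the sketch below contrasts an unfused chain of elementwise operators, where each step materializes a full intermediate array (extra off-chip memory traffic on a GPU), with a fused version that carries each element through the whole chain in one pass. This is a conceptual NumPy sketch only, not the FusionStitching implementation; the function names and the `x * 2 + 1` / ReLU chain are illustrative assumptions.

```python
import numpy as np

def unfused(x):
    # Three separate "kernels": each materializes a full intermediate
    # array, costing two extra round trips through memory.
    a = x * 2.0
    b = a + 1.0
    return np.maximum(b, 0.0)

def fused(x):
    # One "kernel": every element flows through the whole operator
    # chain in a single pass, with no intermediate arrays.
    out = np.empty_like(x)
    for i in range(x.size):
        out.flat[i] = max(x.flat[i] * 2.0 + 1.0, 0.0)
    return out
```

A fusing JIT compiler performs this transformation automatically on the operator graph, which is what removes the intermediate memory traffic the abstract refers to.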
arXiv:2009.10924v2
fatcat:6lrkhmesljgfrlysahmnp64nhe