Out-of-core Training for Extremely Large-Scale Neural Networks With Adaptive Window-Based Scheduling
[article]
2020
arXiv pre-print
While large neural networks demonstrate higher performance in various tasks, training large networks is difficult due to limitations on GPU memory size. We propose a novel out-of-core algorithm that enables faster training of extremely large-scale neural networks whose sizes exceed the allotted GPU memory. Under a given memory budget constraint, our scheduling algorithm locally adapts the timing of memory transfers according to the memory usage of each function, which improves overlap between computation and memory transfers.
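To make the scheduling idea in the abstract concrete, here is a toy Python sketch of budget-constrained, window-based transfer scheduling. Everything in it is an assumption made for illustration: the `Function` record, the `schedule_transfers` name, the one-step-ahead prefetch, and the oldest-first eviction policy are hypothetical, and this is not the paper's actual algorithm or API.

```python
# Hypothetical sketch of window-based memory-transfer scheduling for
# out-of-core training. All names and sizes are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Function:
    """One operation in the execution order, with the device memory
    its inputs and weights occupy while it runs."""
    name: str
    mem_bytes: int


def schedule_transfers(functions, budget_bytes):
    """Walk the execution order and emit (action, tensor, when) events.

    Data is prefetched host->device one step ahead so the copy can
    overlap with the previous function's computation; when the budget
    would be exceeded, the oldest resident data is offloaded back to
    host. The window of resident data thus adapts locally to how much
    memory each function needs."""
    events = []
    resident = []            # (index, bytes) currently on the device
    used = 0
    for i, f in enumerate(functions):
        # Offload oldest data until the next function's working set fits.
        while resident and used + f.mem_bytes > budget_bytes:
            j, m = resident.pop(0)
            used -= m
            events.append(("offload", functions[j].name, f"before {f.name}"))
        used += f.mem_bytes
        resident.append((i, f.mem_bytes))
        when = f"during {functions[i - 1].name}" if i else "at step start"
        events.append(("prefetch", f.name, when))
    return events


if __name__ == "__main__":
    # Six ~0.5 GB layers under a 2 GB device budget.
    layers = [Function(f"layer{k}", 512 * 2**20) for k in range(6)]
    for event in schedule_transfers(layers, budget_bytes=2 * 2**30):
        print(event)
```

Issuing each prefetch while the previous function computes is what creates the compute/transfer overlap the abstract refers to; a real implementation would issue asynchronous copies on separate CUDA streams rather than build a simulated event list.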
arXiv:2010.14109v1
fatcat:dvftcewz3bebdpjxvun22yir5y