A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf
.
On the Acceleration of Deep Learning Model Parallelism with Staleness
[article]
2022
arXiv
pre-print
Training the deep convolutional neural network for computer vision problems is slow and inefficient, especially when it is large and distributed across multiple devices. The inefficiency is caused by the backpropagation algorithm's forward locking, backward locking, and update locking problems. Existing solutions for acceleration either can only handle one locking problem or lead to severe accuracy loss or memory inefficiency. Moreover, none of them consider the straggler problem among devices.
arXiv:1909.02625v3
fatcat:3go6lc3u7fdfxlu4tcbxvpu6ea