A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is application/pdf.
Self-Distillation: Towards Efficient and Compact Neural Networks
2021
IEEE Transactions on Pattern Analysis and Machine Intelligence
Deep neural networks have achieved remarkable results in the last several years. However, breakthroughs in neural network accuracy are always accompanied by explosive growth in computation and parameters, which severely limits model deployment. In this paper, we propose a novel knowledge distillation technique named self-distillation to address this problem. Self-distillation attaches several attention modules and shallow classifiers at different depths of a neural network and distills knowledge from the deepest classifier into the shallower classifiers.
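The abstract only sketches the mechanism, so the following is a minimal, hypothetical PyTorch illustration of the core idea: shallow classifier heads attached at intermediate depths of a backbone, trained with ground-truth labels plus a distillation term that pulls them toward the deepest classifier's soft predictions. The stage sizes, loss weights, temperature, and the omission of the paper's attention modules are all assumptions for illustration; this is not the authors' implementation.

```python
# Hypothetical sketch of self-distillation (not the authors' code):
# a backbone with one classifier head per stage, where the deepest head
# acts as the teacher for the shallower heads.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfDistillationNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Four backbone stages; channel sizes are illustrative only.
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
                          nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))
            for c_in, c_out in [(3, 64), (64, 128), (128, 256), (256, 512)]
        ])
        # One shallow classifier per stage; the last head is the deepest
        # classifier used as the in-network teacher.
        self.heads = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(c, num_classes))
            for c in [64, 128, 256, 512]
        ])

    def forward(self, x):
        logits = []
        for stage, head in zip(self.stages, self.heads):
            x = stage(x)
            logits.append(head(x))
        return logits  # ordered shallow -> deep

def self_distillation_loss(logits, targets, alpha=0.5, temperature=3.0):
    """Cross-entropy on labels for every head, plus a KL term that pulls
    each shallow head toward the deepest head's softened predictions.
    alpha and temperature are assumed hyperparameters."""
    teacher = logits[-1].detach()
    loss = F.cross_entropy(logits[-1], targets)
    for student in logits[:-1]:
        ce = F.cross_entropy(student, targets)
        kd = F.kl_div(F.log_softmax(student / temperature, dim=1),
                      F.softmax(teacher / temperature, dim=1),
                      reduction="batchmean") * temperature ** 2
        loss = loss + (1 - alpha) * ce + alpha * kd
    return loss

# Usage sketch: every head is trained jointly; at deployment, shallow heads
# can serve as compact early-exit classifiers.
model = SelfDistillationNet(num_classes=10)
images, labels = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = self_distillation_loss(model(images), labels)
loss.backward()
```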
doi:10.1109/tpami.2021.3067100
pmid:33735074
fatcat:6cymqo72bbbchbcoyy76ih7jqq