
MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation [article]

Benlin Liu, Yongming Rao, Jiwen Lu, Jie Zhou, Cho-jui Hsieh
2020, arXiv pre-print
Utilizing the soft targets learned from the intermediate feature maps of the model, we can achieve better self-boosting of the network in comparison with the state-of-the-art.  ...  Knowledge Distillation (KD) has been one of the most popular methods to learn a compact model.  ...  The generator is used to perform top-down distillation to generate the soft teacher targets to facilitate the self-boosting of the main model.  ...
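The abstract refers to training the main model against soft teacher targets, a knowledge-distillation idea. A minimal sketch of a temperature-scaled soft-target loss (the standard KD loss of Hinton et al., not MetaDistiller's meta-learned variant) is shown below; all names and the temperature value are illustrative, and NumPy stands in for a deep-learning framework.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T produces softer target distributions.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    # Soft-target distillation loss: KL(teacher || student) at temperature T,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()

# A student whose logits match the teacher incurs (near-)zero loss;
# a mismatched student is penalized.
teacher = np.array([[2.0, 0.5, -1.0]])
student_far = np.array([[-1.0, 0.5, 2.0]])
print(kd_loss(teacher, teacher))      # ~0.0
print(kd_loss(student_far, teacher))  # > 0
```

In MetaDistiller the teacher targets are not fixed logits but are produced top-down by a generator from the model's own intermediate feature maps; the loss shape above is the common starting point such schemes build on.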
arXiv:2008.12094v1