Distilling Knowledge From Graph Convolutional Networks

Yiding Yang, Jiayan Qiu, Mingli Song, Dacheng Tao, Xinchao Wang
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Figure 1: (a) Unlike existing knowledge distillation methods that focus only on the prediction or the intermediate activations, our method explicitly distills knowledge about how the teacher model embeds the topological structure and transfers it to the student model. (b) We display the structure of the feature space, visualized by the distance between the red point and the others on a point cloud dataset. Here, each object is represented as a set of 3D points. Top Row: structures obtained from the teacher; Middle Row: structures obtained from the student trained with the local structure preserving (LSP) module; Bottom Row: structures obtained from the student trained without LSP. Features in the middle and bottom rows are taken from the last layer of the model after training for ten epochs. The model trained with LSP learns a structure similar to that of the teacher, while the model without LSP fails to do so.
doi:10.1109/cvpr42600.2020.00710 dblp:conf/cvpr/YangQSTW20