Graph Distillation for Action Detection with Privileged Modalities [article]

Zelun Luo, Jun-Ting Hsieh, Lu Jiang, Juan Carlos Niebles, Li Fei-Fei
2018 arXiv   pre-print
In this work, we propose a method termed graph distillation that incorporates rich privileged information from a large-scale multimodal dataset in the source domain, and improves the learning in the target  ...  We propose a technique that tackles action detection in multimodal videos under a realistic and challenging condition in which only limited training data and partially observed modalities are available  ...  To this end, we propose the novel graph distillation method to learn a dynamic distillation across multiple modalities for action detection in multimodal videos.  ... 
arXiv:1712.00108v2 fatcat:vbvqueff4faahg365uzzgoffoq

PMD-Net: Privileged Modality Distillation Network for 3D Hand Pose Estimation from a Single RGB Image

Kewen Wang, Xilin Chen
2020 British Machine Vision Conference  
In this paper, we propose a Privileged Modality Distillation Network (PMD-Net), which improves the RGB-based hand pose estimation by excavating the privileged information from depth prior during training  ...  Different from existing methods, the PMD-Net is composed of three sub-networks to regress X, Y, and Z coordinates respectively and distills the privileged information from the depth network to the RGB  ...  For example, privileged modality distillation is used to tackle action detection [5, 23], image classification [14], and vessel border detection [7].  ... 
dblp:conf/bmvc/WangC20 fatcat:ocwmvsoauzb3ddbbztkj57a7uq

Learning an Augmented RGB Representation with Cross-Modal Knowledge Distillation for Action Detection

Rui Dai, Srijan Das, Francois Bremond
2021 2021 IEEE/CVF International Conference on Computer Vision (ICCV)  
To this end, we aim at learning an augmented RGB representation for action detection, taking advantage of additional modalities at training time through KD.  ...  Extensive experimental analysis shows that our proposed distillation framework is generic and outperforms other popular cross-modal distillation methods in action detection task.  ...  The authors are also grateful to the OPAL infrastructure from Université Côte d'Azur for providing resources and support.  ... 
doi:10.1109/iccv48922.2021.01281 fatcat:r45zgztljbcthffhy5mukc6hse

Learning an Augmented RGB Representation with Cross-Modal Knowledge Distillation for Action Detection [article]

Rui Dai, Srijan Das, Francois Bremond
2021 arXiv   pre-print
To this end, we aim at learning an augmented RGB representation for action detection, taking advantage of additional modalities at training time through KD.  ...  Extensive experimental analysis shows that our proposed distillation framework is generic and outperforms other popular cross-modal distillation methods in action detection task.  ...  The authors are also grateful to the OPAL infrastructure from Université Côte d'Azur for providing resources and support.  ... 
arXiv:2108.03619v1 fatcat:dzzxoqp76nbalcastnv33rsnbu

Modality Distillation with Multiple Stream Networks for Action Recognition [chapter]

Nuno C. Garcia, Pietro Morerio, Vittorio Murino
2018 Lecture Notes in Computer Science  
This paper presents a new approach for multimodal video action recognition, developed within the unified frameworks of distillation and privileged information, named generalized distillation.  ...  Diverse input data modalities can provide complementary cues for several tasks, usually leading to more robust algorithms and better performance.  ...  The authors propose a graph-based distillation method that is able to distill information from all modalities at training time, while also passing through a validation phase on a subset of modalities.  ... 
doi:10.1007/978-3-030-01237-3_7 fatcat:tohgxdi4a5d7vlf4rfe6ckkhsq

Modality Distillation with Multiple Stream Networks for Action Recognition [article]

Nuno Garcia, Pietro Morerio, Vittorio Murino
2018 arXiv   pre-print
This paper presents a new approach for multimodal video action recognition, developed within the unified frameworks of distillation and privileged information, named generalized distillation.  ...  Code available at https://github.com/ncgarcia/modality-distillation .  ...  Modality Distillation with Multiple Stream Networks for Action Recognition -Supplementary material Nuno C.  ... 
arXiv:1806.07110v2 fatcat:pdnky2bningcnpipoxiql5q5tq

Learning with privileged information via adversarial discriminative modality distillation [article]

Nuno C. Garcia, Pietro Morerio, Vittorio Murino
2018 arXiv   pre-print
This paper presents a new approach in this direction for RGB-D vision tasks, developed within the adversarial learning and privileged information frameworks.  ...  Heterogeneous data modalities can provide complementary cues for several tasks, usually leading to more robust algorithms and better performance.  ...  ACKNOWLEDGMENTS The authors would like to thank Riccardo Volpi for useful discussion on adversarial training and GANs.  ... 
arXiv:1810.08437v1 fatcat:emrj23ga3ngprlp2zmtxvms3qy

Fusion-GCN: Multimodal Action Recognition using Graph Convolutional Networks [article]

Michael Duhme, Raphael Memmesheimer, Dietrich Paulus
2021 arXiv   pre-print
With Fusion-GCN, we propose to integrate various sensor data modalities into a graph that is trained using a GCN model for multi-modal action recognition.  ...  In this paper, we present Fusion-GCN, an approach for multimodal action recognition using Graph Convolutional Networks (GCNs).  ...  [30] propose a graph distillation method to incorporate rich privileged information from a large-scale multi-modal dataset in the source domain, and improves the learning in the target domain. More fundamentally  ... 
arXiv:2109.12946v1 fatcat:3z5t4er56vf2zhiwxpvl7o5f6a

Knowledge Distillation: A Survey [article]

Jianping Gou, Baosheng Yu, Stephen John Maybank, Dacheng Tao
2021 arXiv   pre-print
In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks.  ...  However, it is a challenge to deploy these cumbersome deep models on devices with limited resources, e.g., mobile phones and embedded devices, not only because of the high computational complexity but  ...  These methods are the examples of spatiotemporal modality distillation with a different knowledge transfer for action recognition.  ... 
arXiv:2006.05525v6 fatcat:aedzaeln5zf3jgjsgsn5kvjrri

RGB-based 3D Hand Pose Estimation via Privileged Learning with Depth Images [article]

Shanxin Yuan, Bjorn Stenger, Tae-Kyun Kim
2018 arXiv   pre-print
This paper proposes a method for hand pose estimation from RGB images that uses both external large-scale depth image datasets and paired depth and RGB images as privileged information at training time  ...  Experiments on three public datasets show that the method outperforms the state-of-the-art methods for hand pose estimation using RGB image input.  ...  [15] recently proposed graph distillation for action detection with privileged modalities (RGB, depth, skeleton, and flow), where a novel graph distillation layer was used to dynamically learn to distill  ... 
arXiv:1811.07376v1 fatcat:lrfsih64arccxfsgn37duwvjbq

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks [article]

Lin Wang, Kuk-Jin Yoon
2021 arXiv   pre-print
Then, we provide a comprehensive survey on the recent progress of KD methods together with S-T frameworks typically for vision tasks.  ...  To achieve faster speeds and to handle the problems caused by the lack of data, knowledge distillation (KD) has been proposed to transfer information learned from one model to another.  ...  [290] focused on video concept learning for action recognition and event detection by using web videos and images.  ... 
arXiv:2004.05937v6 fatcat:yqzo7nylzbbn7pfhzpfc2qaxea

Spatio-Temporal Graph for Video Captioning With Knowledge Distillation

Boxiao Pan, Haoye Cai, De-An Huang, Kuan-Hui Lee, Adrien Gaidon, Ehsan Adeli, Juan Carlos Niebles
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
In this paper, we propose a spatio-temporal graph model to explicitly capture such information for video captioning. Yellow boxes represent object proposals from Faster R-CNN [12].  ...  Video sample from MSVD [3] with the caption "A cat jumps into a box." Best viewed in color.  ...  We thank our anonymous reviewers, Andrey Kurenkov, Chien-Yi Chang, and Ranjay Krishna, for helpful comments and discussion.  ... 
doi:10.1109/cvpr42600.2020.01088 dblp:conf/cvpr/PanCHLGAN20 fatcat:3mafcbc6o5d4rnyn554ly2awfa

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation [article]

Boxiao Pan, Haoye Cai, De-An Huang, Kuan-Hui Lee, Adrien Gaidon, Ehsan Adeli, Juan Carlos Niebles
2020 arXiv   pre-print
In this paper, we propose a novel spatio-temporal graph model for video captioning that exploits object interactions in space and time.  ...  We demonstrate the efficacy of our approach through extensive experiments on two benchmarks, showing our approach yields competitive performance with interpretable predictions.  ...  We thank our anonymous reviewers, Andrey Kurenkov, Chien-Yi Chang, and Ranjay Krishna, for helpful comments and discussion.  ... 
arXiv:2003.13942v1 fatcat:puqoouhvfnguzngzeeaaftdbfy

Distillation of Human-Object Interaction Contexts for Action Recognition [article]

Muna Almushyti, Frederick W. Li
2021 arXiv   pre-print
Modeling spatial-temporal relations is imperative for recognizing human actions, especially when a human is interacting with objects, while multiple objects appear around the human differently over time  ...  More importantly, we investigate how knowledge from these graphs can be distilled to their counterparts for improving human-object interaction (HOI) recognition.  ...  For action recognition task, the knowledge is distilled between multiple modalities (e.g., skeleton, RGB), which can be considered as privilege information and not all of them are available during inference  ... 
arXiv:2112.09448v1 fatcat:tqjx6nyuvvhrzmyfifvsx56gde

Online Sensor Hallucination via Knowledge Distillation for Multimodal Image Classification [article]

Saurabh Kumar, Biplab Banerjee, Subhasis Chaudhuri
2019 arXiv   pre-print
In order to ensure better knowledge transfer during modality hallucination, we explicitly incorporate concepts of knowledge distillation for the purpose of exploring the privileged (side) information in  ...  We deal with the problem of information fusion driven satellite image/scene classification and propose a generic hallucination architecture considering that all the available sensor information are present  ...  [40] recently proposed a graph-based distillation method to learn in presence of side information for action detection and action recognition tasks in videos.  ... 
arXiv:1908.10559v1 fatcat:s6mknx6nhrfctkci5pwogjofbu