1,184 Hits in 6.4 sec

Jointly learning heterogeneous features for RGB-D activity recognition

Jian-Fang Hu, Wei-Shi Zheng, Jianhuang Lai, Jianguo Zhang
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
In this paper, we focus on heterogeneous feature learning for RGB-D activity recognition.  ...  In addition, a novel RGB-D activity dataset focusing on human-object interaction is collected for evaluating the proposed method, which will be made available to the community for RGB-D activity benchmarking  ...  In this paper, we propose a heterogeneous feature learning model for RGB-D activity recognition.  ... 
doi:10.1109/cvpr.2015.7299172 dblp:conf/cvpr/HuZLZ15 fatcat:3abelf3zdff5niel423mgdnom4

Jointly Learning Heterogeneous Features for RGB-D Activity Recognition

Jian-Fang Hu, Wei-Shi Zheng, Jianhuang Lai, Jianguo Zhang
2017 IEEE Transactions on Pattern Analysis and Machine Intelligence  
In this paper, we focus on heterogeneous features learning for RGB-D activity recognition.  ...  A new RGB-D activity dataset focusing on human-object interaction is further contributed, which presents more challenges for RGB-D activity benchmarking.  ...  CONCLUSION We have proposed a new RGB-D method called joint heterogenous features learning (JOULE) model to jointly learn heterogeneous features with different number of dimensions for RGB-D activity recognition  ... 
doi:10.1109/tpami.2016.2640292 pmid:28026749 fatcat:peoubp5fcbfodo2khfjwwlofeq

Cooperative Training of Deep Aggregation Networks for RGB-D Action Recognition [article]

Pichao Wang and Wanqing Li and Jun Wan and Philip Ogunbona and Xinwang Liu
2017 arXiv   pre-print
power of the deeply learned features and weakens the undesired modality discrepancy by jointly optimizing a ranking loss and a softmax loss for both homogeneous and heterogeneous modalities.  ...  the two kinds of features for action recognition.  ...  One typical challenge in deep learning based action recognition is how a RGB-D sequence could be effectively represented and fed to deep neural networks for recognition.  ... 
arXiv:1801.01080v1 fatcat:6vvgkqdgjrf3tpm4lttds3qtqm

Cooperative Cross-Stream Network for Discriminative Action Representation [article]

Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen
2019 arXiv   pre-print
features and reduces the undesired modality discrepancy by jointly optimizing a modality ranking constraint and a cross-entropy loss for both homogeneous and heterogeneous modalities.  ...  The jointly spatial and temporal stream networks feature extraction is accomplished by an end-to-end learning manner.  ...  To efficiently explore the relation of the RGB stream and optical flow stream, we propose a cross-modality feature extraction paradigm to jointly learn spatiotemporal features for two heterogeneous modalities  ... 
arXiv:1908.10136v1 fatcat:hwu2pnudxffmfg3iajq7s3ymsm

Robust object recognition in RGB-D egocentric videos based on Sparse Affine Hull Kernel

Shaohua Wan, J.K. Aggarwal
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
In this paper, we propose a novel kernel function for recognizing objects in RGB-D egocentric videos.  ...  Our kernel function also allows convenient integration of heterogeneous data modalities beyond RGB and depth.  ...  This can be attributed to the benefit of learning proper weights for heterogeneous data integration. Figure 4 plots the learned weights of the RGB and depth channel for a subset of object classes.  ... 
doi:10.1109/cvprw.2015.7301302 dblp:conf/cvpr/WanA15 fatcat:wyub2j6o55hmhgqk5cehgrpiy4

Viewpoint Invariant Action Recognition using RGB-D Videos [article]

Jian Liu, Naveed Akhtar, Ajmal Mian
2018 arXiv   pre-print
We use the complementary RGB and Depth information from the RGB-D cameras to address this problem.  ...  The heterogeneous features from the two streams are combined and used as a dictionary to predict the label of the test samples.  ...  [8] proposed to learn heterogeneous features for the RGB-D video based action recognition.  ... 
arXiv:1709.05087v2 fatcat:pok3b2pmubeh3d2rdl3tfwn2lm

Bilinear heterogeneous information machine for RGB-D action recognition

Yu Kong, Yun Fu
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
This paper proposes a novel approach to action recognition from RGB-D cameras, in which depth features and RGB visual features are jointly used.  ...  Rich heterogeneous RGB and depth data are effectively compressed and projected to a learned shared space, in order to reduce noise and capture useful information for recognition.  ...  Bilinear Heterogeneous Information Machine The goal of this work is to utilize heterogeneous features from RGB-D action videos, and learn shared crossmodal features for action recognition.  ... 
doi:10.1109/cvpr.2015.7298708 dblp:conf/cvpr/KongF15 fatcat:rbik4szu5vdhlor5m4dkken6a4

Audio-Visual Contrastive Learning for Self-supervised Action Recognition [article]

Haoyuan Lan, Yang Liu, Liang Lin
2022 arXiv   pre-print
In this paper, we present an end-to-end self-supervised framework named Audio-Visual Contrastive Learning (AVCL), to learn discriminative audio-visual representations for action recognition.  ...  The underlying correlation between audio and visual modalities within videos can be utilized to learn supervised information for unlabeled videos.  ...  Therefore, using only one model to jointly learn the audiovisual is not feasible.  ... 
arXiv:2204.13386v1 fatcat:tqdppujyardkjj3phuwhgbroe4

PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities [article]

Lan Wang, Chenqiang Gao, Luyu Yang, Yue Zhao, Wangmeng Zuo, Deyu Meng
2018 arXiv   pre-print
For example, the RGB surveillance cameras are often restricted from private spaces, which is in conflict with the need of abnormal activity detection for personal security.  ...  and infrared features.  ...  , where features of multiple modalities are jointly learned and combined.  ... 
arXiv:1804.06248v1 fatcat:5sojriw56zfkvass2c47hthuwa

Multi-Modal Human Action Recognition With Sub-Action Exploiting and Class-Privacy Preserved Collaborative Representation Learning

Chengwu Liang, Deyin Liu, Lin Qi, Ling Guan
2020 IEEE Access  
This paper proposes a segmental architecture to exploit the relations of sub-actions, jointly with heterogeneous information fusion and Class-privacy Preserved Collaborative Representation (CPPCR) for  ...  INDEX TERMS Action recognition, feature fusion, class-privacy preserved, sub-action sharing. This work is licensed under a Creative Commons Attribution 4.0 License.  ...  In [57], RGB and depth features are fused for RGB-D video-based action recognition.  ... 
doi:10.1109/access.2020.2976496 fatcat:357i4iqivzhjpitbgttmjix3ri

VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living [article]

Srijan Das, Rui Dai, Di Yang, Francois Bremond
2021 arXiv   pre-print
Many attempts have been made towards combining RGB and 3D poses for the recognition of Activities of Daily Living (ADL).  ...  One is to transfer the Pose knowledge into RGB through a feature-level distillation and the other towards mimicking pose driven attention through an attention-level distillation.  ...  ACKNOWLEDGEMENT We are grateful to INRIA Sophia Antipolis -Mediterranean "NEF" computation cluster for providing resources and support.  ... 
arXiv:2105.08141v1 fatcat:vtopa4qekbdelez6cyw4xk24t4

Modality Compensation Network: Cross-Modal Adaptation for Action Recognition [article]

Sijie Song, Jiaying Liu, Yanghao Li, Zongming Guo
2020 arXiv   pre-print
With the prevalence of RGB-D cameras, multi-modal video data have become more available for human action recognition.  ...  , that the network learns to compensate for the loss of skeletons at test time and even at training time.  ...  [46] proposed a multi-modal feature learning framework for RGB-D object recognition to learn not only modal-specific patterns but also modal-shared features.  ... 
arXiv:2001.11657v1 fatcat:xcmed5yqx5bwrjpqzxsyguo7ga

Correlated and Individual Multi-Modal Deep Learning for RGB-D Object Recognition [article]

Ziyan Wang, Jiwen Lu, Ruogu Lin, Jianjiang Feng, Jie Zhou
2016 arXiv   pre-print
Unlike most conventional RGB-D object recognition methods which extract features from the RGB and depth channels individually, our CIMDL jointly learns feature representations from raw RGB-D data with  ...  In this paper, we propose a new correlated and individual multi-modal deep learning (CIMDL) method for RGB-D object recognition.  ...  [26] devised a distance metric learning approach [31, 34] to fuse heterogeneous feature representations for RGB-D object recognition.  ... 
arXiv:1604.01655v3 fatcat:l5sp75obdfcxdirkmhawngngs4

Human action recognition using associated depth and skeleton information

Nick C. Tang, Yen-Yu Lin, Ju-Hsuan Hua, Ming-Fang Weng, Hong-Yuan Mark Liao
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Each action instance for recognition in RGB representation is then augmented with the borrowed depth and skeleton features.  ...  Addressing this problem in this work, we aim at action recognition in RGB videos with the aid of Kinect.  ...  ., $D = \{(x_i, d_i, s_i)\}_{i=1}^{N}$. Recognition with augmented features The training data for action recognition have been expanded from D to D .  ... 
doi:10.1109/icassp.2014.6854475 dblp:conf/icassp/TangLHWL14 fatcat:vp7jn5wcmbhujmdnrxgn2k24si

Deep Dynamic Neural Networks for Multimodal Gesture Segmentation and Recognition

Di Wu, Lionel Pigou, Pieter-Jan Kindermans, Nam Do-Hoang Le, Ling Shao, Joni Dambre, Jean-Marc Odobez
2016 IEEE Transactions on Pattern Analysis and Machine Intelligence  
A semi-supervised hierarchical dynamic framework based on a Hidden Markov Model (HMM) is proposed for simultaneous gesture segmentation and recognition where skeleton joint information, depth and RGB images  ...  This paper describes a novel method called Deep Dynamic Neural Networks (DDNN) for multimodal gesture recognition.  ...  for RGB-D data.  ... 
doi:10.1109/tpami.2016.2537340 pmid:26955020 fatcat:h3bpphgchfeqlartq4ewgbllfq