Jointly learning heterogeneous features for RGB-D activity recognition
2015
2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
In this paper, we focus on heterogeneous feature learning for RGB-D activity recognition. ...
In addition, a novel RGB-D activity dataset focusing on human-object interaction is collected for evaluating the proposed method, which will be made available to the community for RGB-D activity benchmarking ...
In this paper, we propose a heterogeneous feature learning model for RGB-D activity recognition. ...
doi:10.1109/cvpr.2015.7299172
dblp:conf/cvpr/HuZLZ15
fatcat:3abelf3zdff5niel423mgdnom4
Jointly Learning Heterogeneous Features for RGB-D Activity Recognition
2017
IEEE Transactions on Pattern Analysis and Machine Intelligence
In this paper, we focus on heterogeneous features learning for RGB-D activity recognition. ...
A new RGB-D activity dataset focusing on human-object interaction is further contributed, which presents more challenges for RGB-D activity benchmarking. ...
CONCLUSION: We have proposed a new RGB-D method called the joint heterogeneous features learning (JOULE) model to jointly learn heterogeneous features with different numbers of dimensions for RGB-D activity recognition ...
doi:10.1109/tpami.2016.2640292
pmid:28026749
fatcat:peoubp5fcbfodo2khfjwwlofeq
Cooperative Training of Deep Aggregation Networks for RGB-D Action Recognition
[article]
2017
arXiv
pre-print
power of the deeply learned features and weakens the undesired modality discrepancy by jointly optimizing a ranking loss and a softmax loss for both homogeneous and heterogeneous modalities. ...
the two kinds of features for action recognition. ...
One typical challenge in deep learning based action recognition is how a RGB-D sequence could be effectively represented and fed to deep neural networks for recognition. ...
arXiv:1801.01080v1
fatcat:6vvgkqdgjrf3tpm4lttds3qtqm
Cooperative Cross-Stream Network for Discriminative Action Representation
[article]
2019
arXiv
pre-print
features and reduces the undesired modality discrepancy by jointly optimizing a modality ranking constraint and a cross-entropy loss for both homogeneous and heterogeneous modalities. ...
Feature extraction for the joint spatial and temporal stream networks is accomplished in an end-to-end learning manner. ...
To efficiently explore the relation between the RGB stream and the optical flow stream, we propose a cross-modality feature extraction paradigm to jointly learn spatiotemporal features for two heterogeneous modalities ...
arXiv:1908.10136v1
fatcat:hwu2pnudxffmfg3iajq7s3ymsm
Robust object recognition in RGB-D egocentric videos based on Sparse Affine Hull Kernel
2015
2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
In this paper, we propose a novel kernel function for recognizing objects in RGB-D egocentric videos. ...
Our kernel function also allows convenient integration of heterogeneous data modalities beyond RGB and depth. ...
This can be attributed to the benefit of learning proper weights for heterogeneous data integration. Figure 4 plots the learned weights of the RGB and depth channel for a subset of object classes. ...
doi:10.1109/cvprw.2015.7301302
dblp:conf/cvpr/WanA15
fatcat:wyub2j6o55hmhgqk5cehgrpiy4
Viewpoint Invariant Action Recognition using RGB-D Videos
[article]
2018
arXiv
pre-print
We use the complementary RGB and Depth information from the RGB-D cameras to address this problem. ...
The heterogeneous features from the two streams are combined and used as a dictionary to predict the label of the test samples. ...
[8] proposed to learn heterogeneous features for the RGB-D video based action recognition. ...
arXiv:1709.05087v2
fatcat:pok3b2pmubeh3d2rdl3tfwn2lm
Bilinear heterogeneous information machine for RGB-D action recognition
2015
2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
This paper proposes a novel approach to action recognition from RGB-D cameras, in which depth features and RGB visual features are jointly used. ...
Rich heterogeneous RGB and depth data are effectively compressed and projected to a learned shared space, in order to reduce noise and capture useful information for recognition. ...
Bilinear Heterogeneous Information Machine The goal of this work is to utilize heterogeneous features from RGB-D action videos, and learn shared crossmodal features for action recognition. ...
doi:10.1109/cvpr.2015.7298708
dblp:conf/cvpr/KongF15
fatcat:rbik4szu5vdhlor5m4dkken6a4
Audio-Visual Contrastive Learning for Self-supervised Action Recognition
[article]
2022
arXiv
pre-print
In this paper, we present an end-to-end self-supervised framework named Audio-Visual Contrastive Learning (AVCL), to learn discriminative audio-visual representations for action recognition. ...
The underlying correlation between audio and visual modalities within videos can be utilized to learn supervised information for unlabeled videos. ...
Therefore, using only one model to jointly learn the audiovisual is not feasible. ...
arXiv:2204.13386v1
fatcat:tqdppujyardkjj3phuwhgbroe4
PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities
[article]
2018
arXiv
pre-print
For example, the RGB surveillance cameras are often restricted from private spaces, which is in conflict with the need of abnormal activity detection for personal security. ...
and infrared features. ...
, where features of multiple modalities are jointly learned and combined. ...
arXiv:1804.06248v1
fatcat:5sojriw56zfkvass2c47hthuwa
Multi-Modal Human Action Recognition With Sub-Action Exploiting and Class-Privacy Preserved Collaborative Representation Learning
2020
IEEE Access
This paper proposes a segmental architecture to exploit the relations of sub-actions, jointly with heterogeneous information fusion and Class-privacy Preserved Collaborative Representation (CPPCR) for ...
INDEX TERMS Action recognition, feature fusion, class-privacy preserved, sub-action sharing. This work is licensed under a Creative Commons Attribution 4.0 License. ...
In [57], RGB and depth features are fused for RGB-D video-based action recognition. ...
doi:10.1109/access.2020.2976496
fatcat:357i4iqivzhjpitbgttmjix3ri
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
[article]
2021
arXiv
pre-print
Many attempts have been made towards combining RGB and 3D poses for the recognition of Activities of Daily Living (ADL). ...
One is to transfer the Pose knowledge into RGB through a feature-level distillation and the other towards mimicking pose driven attention through an attention-level distillation. ...
ACKNOWLEDGEMENT: We are grateful to INRIA Sophia Antipolis - Mediterranean "NEF" computation cluster for providing resources and support. ...
arXiv:2105.08141v1
fatcat:vtopa4qekbdelez6cyw4xk24t4
Modality Compensation Network: Cross-Modal Adaptation for Action Recognition
[article]
2020
arXiv
pre-print
With the prevalence of RGB-D cameras, multi-modal video data have become more available for human action recognition. ...
, that the network learns to compensate for the loss of skeletons at test time and even at training time. ...
[46] proposed a multi-modal feature learning framework for RGB-D object recognition to learn not only modal-specific patterns but also modal-shared features. ...
arXiv:2001.11657v1
fatcat:xcmed5yqx5bwrjpqzxsyguo7ga
Correlated and Individual Multi-Modal Deep Learning for RGB-D Object Recognition
[article]
2016
arXiv
pre-print
Unlike most conventional RGB-D object recognition methods which extract features from the RGB and depth channels individually, our CIMDL jointly learns feature representations from raw RGB-D data with ...
In this paper, we propose a new correlated and individual multi-modal deep learning (CIMDL) method for RGB-D object recognition. ...
[26] devised a distance metric learning approach [31, 34] to fuse heterogeneous feature representations for RGB-D object recognition. ...
arXiv:1604.01655v3
fatcat:l5sp75obdfcxdirkmhawngngs4
Human action recognition using associated depth and skeleton information
2014
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Each action instance for recognition in RGB representation is then augmented with the borrowed depth and skeleton features. ...
Addressing this problem in this work, we aim at action recognition in RGB videos with the aid of Kinect. ...
., $D = \{(x_i, d_i, s_i)\}_{i=1}^{N}$.
Recognition with augmented features The training data for action recognition have been expanded from D to D . ...
doi:10.1109/icassp.2014.6854475
dblp:conf/icassp/TangLHWL14
fatcat:vp7jn5wcmbhujmdnrxgn2k24si
Deep Dynamic Neural Networks for Multimodal Gesture Segmentation and Recognition
2016
IEEE Transactions on Pattern Analysis and Machine Intelligence
A semi-supervised hierarchical dynamic framework based on a Hidden Markov Model (HMM) is proposed for simultaneous gesture segmentation and recognition where skeleton joint information, depth and RGB images ...
This paper describes a novel method called Deep Dynamic Neural Networks (DDNN) for multimodal gesture recognition. ...
for RGB-D data. ...
doi:10.1109/tpami.2016.2537340
pmid:26955020
fatcat:h3bpphgchfeqlartq4ewgbllfq