Hierarchical Contrastive Motion Learning for Video Action Recognition
[article]
2022
arXiv
pre-print
One central question for video action recognition is how to model motion. ...
In this paper, we present hierarchical contrastive motion learning, a new self-supervised learning framework to extract effective motion representations from raw video frames. ...
Joint Training for Action Recognition Our ultimate goal is to improve video action recognition with the learned hierarchical motion features. ...
arXiv:2007.10321v3
fatcat:vd6n6rluavcbdi3e7k4wge2ije
Deep hierarchical pooling design for cross-granularity action recognition
[article]
2020
arXiv
pre-print
In this paper, we introduce a novel hierarchical aggregation design that captures different levels of temporal granularity in action recognition. ...
Besides being principled and well grounded, the proposed hierarchical pooling is also video-length agnostic and resilient to misalignments in actions. ...
CONCLUSION In this paper, we introduced a hierarchical aggregation design for cross-granularity action recognition. ...
arXiv:2006.04473v1
fatcat:wgn72n7jgbbgvewkjkwv7mwvwq
Hi-EADN: Hierarchical Excitation Aggregation and Disentanglement Frameworks for Action Recognition Based on Videos
2021
Symmetry
Most existing video action recognition methods mainly rely on high-level semantic information from convolutional neural networks (CNNs) but ignore the discrepancies of different information streams. ...
hierarchical disentanglement (SEHD) module. ...
Hierarchical Disentanglement; GAP: Global Average Pooling; SA: multi-head self-guided Attention ...
doi:10.3390/sym13040662
fatcat:6mjobgpymbcgncus6z3za3saam
Video Representation Learning with Visual Tempo Consistency
[article]
2020
arXiv
pre-print
We propose to maximize the mutual information between representations of slow and fast videos via hierarchical contrastive learning (VTHCL). ...
Video representations learned from VTHCL achieve competitive performance under the self-supervision evaluation protocol for action recognition on UCF-101 (82.1%) and HMDB-51 (49.2%). ...
Acknowledgments We thank Zhirong Wu and Yonglong Tian for their public implementation of previous works. ...
arXiv:2006.15489v2
fatcat:dem4jafa6naavksvkgf2b3l6ca
Hierarchical Attention Network for Action Recognition in Videos
[article]
2016
arXiv
pre-print
structures for complex human action understanding. ...
In this paper we propose a novel approach named Hierarchical Attention Network (HAN), which makes it possible to incorporate static spatial information, short-term motion information and long-term video temporal ...
In this paper, we study the problem of video representation learning for action recognition. ...
arXiv:1607.06416v1
fatcat:6rfajs2f7rcidj7fwq3bypl3uq
A REVIEW ON MACHINE LEARNING ALGORITHMS ON HUMAN ACTION RECOGNITION
2017
Asian Journal of Pharmaceutical and Clinical Research
Next, hierarchical recognition approaches for abnormal action states are introduced and looked at. ...
Statistics-based methodologies, syntactic methodologies, and description-based methodologies for hierarchical recognition are examined in the paper. ...
For instance, non-hierarchical single-layer methodologies can be effortlessly used for low-level or nuclear action recognition, for example, motion location. ...
doi:10.22159/ajpcr.2017.v10s1.19977
fatcat:pzroxovz75bkpilavebs2ig6fm
Recent advances in video-based human action recognition using deep learning: A review
2017
2017 International Joint Conference on Neural Networks (IJCNN)
This paper presents a review of various state-of-the-art deep learning-based techniques proposed for human action recognition on the three types of datasets. ...
There are many challenges involved in human action recognition in videos, such as cluttered backgrounds, occlusions, viewpoint variation, execution rate, and camera motion. ...
In contrast, the paper reviews the recent developments in the use of deep learning techniques which have been applied in the human action recognition research area. ...
doi:10.1109/ijcnn.2017.7966210
dblp:conf/ijcnn/WuSB17
fatcat:f35o5nkxozfsrew2sgtaybofla
Action Recognition with Deep Multiple Aggregation Networks
[article]
2020
arXiv
pre-print
In this paper, we introduce a novel hierarchical pooling design that captures different levels of temporal granularity in action recognition. ...
Besides being principled and well grounded, the proposed hierarchical pooling is also video-length and resolution agnostic. ...
CONCLUSION We introduce in this paper a temporal pyramid approach for video action recognition. ...
arXiv:2006.04489v1
fatcat:ybavmgy33ffflnefb4sc6z4cpu
Recognizing activities with cluster-trees of tracklets
2012
Proceedings of the British Machine Vision Conference 2012
Contrary to most approaches based on action decompositions, we propose to use the full hierarchical action structure instead of selecting a small fixed number of parts. ...
We represent a video as a hierarchy of mid-level motion components. This hierarchy is a data-driven decomposition specific to each video. ...
Figure 1), in order to build a hierarchical model of the motion content of a video. This is in contrast to existing approaches [39] that view videos as a bag of clusters. ...
doi:10.5244/c.26.30
dblp:conf/bmvc/GaidonHS12
fatcat:c3tgkymblvecdpvbk77f4n6hiq
Hierarchical Self-supervised Representation Learning for Movie Understanding
[article]
2022
arXiv
pre-print
Most self-supervised video representation learning approaches focus on action recognition. ...
In contrast, in this paper we focus on self-supervised video learning for movie understanding and propose a novel hierarchical self-supervised pretraining strategy that separately pretrains each level ...
For example, they propose models that encourage the learning of short-term appearance and motion cues, as these are the most informative for action recognition. ...
arXiv:2204.03101v1
fatcat:kl2xwoczfzedvd5tx452ecg2le
Action Recognition and Localization by Hierarchical Space-Time Segments
2013
2013 IEEE International Conference on Computer Vision
We propose Hierarchical Space-Time Segments as a new representation for action recognition and localization. This representation has a two level hierarchy. ...
Using simple linear SVM on the resultant bag of hierarchical space-time segments representation, we attain better than, or comparable to, state-of-art action recognition performance on two challenging ...
Video Frame Hierarchical Segmentation For human action recognition, segments in a video frame that contain motion are useful as they may belong to moving body parts. ...
doi:10.1109/iccv.2013.341
dblp:conf/iccv/MaZIS13
fatcat:mdcwpudgqbhvjj375tvspn73tu
HMS: Hierarchical Modality Selection for Efficient Video Recognition
[article]
2021
arXiv
pre-print
This paper introduces Hierarchical Modality Selection (HMS), a simple yet efficient multimodal learning framework for efficient video recognition. ...
Videos are multimodal in nature. Conventional video recognition pipelines typically fuse multimodal features for improved performance. ...
In contrast to conventional video recognition approaches that leverage multimodal features for all samples, we learn what modalities to use on a per-input basis. ...
arXiv:2104.09760v2
fatcat:js2whnimvvbhfp5uzenqu3mlvq
Human Action Recognition Using HDP by Integrating Motion and Location Information
[chapter]
2010
Lecture Notes in Computer Science
The proposed method, unsupervised MI-HDP-LDA, was evaluated on the Weizmann dataset. ...
These are unsupervised learning, but they require the number of latent topics to be set manually. ...
In the experiments of motion learning and recognition on the Weizmann dataset, LDA showed a 61.8% recognition rate using only motion information. ...
doi:10.1007/978-3-642-12304-7_28
fatcat:qtndtwio75gefm7hm36xseq4s4
Language-Motivated Approaches to Action Recognition
[chapter]
2017
Gesture Recognition
In order to obtain statistical insight into the underlying patterns of motions in activities, we develop a dynamic, hierarchical Bayesian model which connects low-level visual features in videos with poses ...
We also introduce a probabilistic framework for detecting and localizing pre-specified activities (or gestures) in a video sequence, analogous to the use of filler models for keyword detection in speech ...
Acknowledgments The authors wish to thank the associate editors and anonymous referees for all their advice about the structure, references, experimental illustration and interpretation of this manuscript ...
doi:10.1007/978-3-319-57021-1_5
fatcat:byt4ayc6nrcyfjh4o2av2btmae
A Hierarchical Representation for Future Action Prediction
[chapter]
2014
Lecture Notes in Computer Science
We develop a max-margin learning framework for future action prediction, integrating a collection of moveme detectors in a hierarchical way. ...
We consider inferring the future actions of people from a still image or a short video clip. ...
We consider at most 3 pose types for each motion segment.
Learning a Collection of Moveme Classifiers Given a hierarchy of movemes, we learn a classifier for each moveme in the hierarchy. ...
doi:10.1007/978-3-319-10578-9_45
fatcat:77eleukzgzbvbe6rkmheipu7ey