UCF-STAR: A Large Scale Still Image Dataset for Understanding Human Actions

Marjaneh Safaei, Pooyan Balouchian, Hassan Foroosh
2020 Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)  
Action recognition in still images poses a great challenge due to (i) the scarcity of available training data and (ii) the absence of temporal information.  ...  UCF-STAR is the largest dataset in the literature for action recognition in still images.  ...  TSSTN shows that predicting the "latent" temporal information in still images improves action recognition performance.  ... 
doi:10.1609/aaai.v34i03.5653 fatcat:52c6soxtuze7zeso3ebwxdnxfu

Human Behavior Analysis: A Survey on Action Recognition

Bruno Degardin, Hugo Proença
2021 Applied Sciences  
The visual recognition and understanding of human actions remain an active research domain in computer vision, having been the focus of numerous research works over the last two decades.  ...  Previous surveys mainly focus on the evolution of this field, from handcrafted features to deep learning architectures.  ...  [13] presented an extensive and complete survey covering not only action recognition but also action prediction, charting the evolution of the state of the art on both problems.  ... 
doi:10.3390/app11188324 fatcat:zenvfhlaubht7ar3qrpil4lgdm

Multi-Modality Video Representation for Action Recognition

Chao Zhu, Yike Wang, Dongbing Pu, Miao Qi, Hui Sun, Lei Tan
2020 Journal on Big Data  
The difference between image recognition and action recognition is that action recognition requires additional modalities to depict an action, such as appearance, motion, and dynamic information  ...  Most current methods define an action by its spatial and motion information.  ...  Moreover, unlike image classification, action recognition needs both spatial and temporal cues.  ... 
doi:10.32604/jbd.2020.010431 fatcat:mkzehr6jl5hf5auapqs3daug7q

Action Recognition Using Single-Pixel Time-of-Flight Detection

Ikechukwu Ofodile, Ahmed Helmi, Albert Clapés, Egils Avots, Kerttu Maria Peensoo, Sandhra-Mirella Valdma, Andreas Valdmann, Heli Valtna-Lukner, Sergey Omelkov, Sergio Escalera, Cagri Ozcinar, Gholamreza Anbarjafari
2019 Entropy  
The data trace recording one action contains a sequence of one-dimensional arrays of voltage values acquired by a single-pixel detector at a 1 GHz repetition rate.  ...  Our proposed method relies only on recording the temporal evolution of light pulses scattered back from the scene.  ...  The spatial properties of the scene are imprinted into the temporal evolution of the recorded trace.  ... 
doi:10.3390/e21040414 pmid:33267128 fatcat:xhujztfdj5b4revpbyemwuyxkq
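The trace format described above (1-D voltage arrays rather than images) lends itself to 1-D convolution over time. A minimal sketch of such a classifier follows; the trace length, class count, and layer sizes are invented for illustration and are not the authors' actual architecture:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: 1024 samples per voltage trace, 10 action classes.
TRACE_LEN, NUM_CLASSES = 1024, 10

class TraceClassifier(nn.Module):
    """1-D CNN over single-pixel voltage traces (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=2, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),           # collapse the time axis
        )
        self.head = nn.Linear(32, NUM_CLASSES)

    def forward(self, x):                      # x: (batch, 1, TRACE_LEN)
        return self.head(self.features(x).squeeze(-1))

model = TraceClassifier()
traces = torch.randn(4, 1, TRACE_LEN)          # 4 synthetic traces
print(model(traces).shape)                     # torch.Size([4, 10])
```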

DenseImage Network: Video Spatial-Temporal Evolution Encoding and Understanding [article]

Xiaokai Chen, Ke Gao
2018 arXiv   pre-print
Extensive experiments on two recent challenging benchmarks demonstrate that our DenseImage Network can accurately capture the common spatial-temporal evolution between similar actions, even with enormous  ...  Many of the leading approaches for video understanding are data-hungry and time-consuming, failing to capture the gist of spatial-temporal evolution in an efficient manner.  ...  The success of static image classification with CNNs has driven the development of video recognition, but how to represent spatial-temporal evolution in videos using CNNs is still a problem.  ... 
arXiv:1805.07550v1 fatcat:tp6q3amuhvf5nm3anghg3ex47m
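The idea sketched in this entry is to encode a video's spatio-temporal evolution into a single compact image that an ordinary 2-D CNN can process. One naive realization of that idea, assuming a shared per-frame encoder whose feature vectors are stacked as columns (the paper's actual encoding may differ), looks like:

```python
import torch
import torch.nn as nn

# Toy setup: T frames, each mapped to a D-dim feature by a shared frame encoder.
T, D, NUM_CLASSES = 16, 64, 10

frame_encoder = nn.Sequential(                 # shared 2-D CNN applied per frame
    nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, D),
)

video = torch.randn(T, 3, 32, 32)              # one synthetic 16-frame clip
dense_image = frame_encoder(video).t()         # (D, T): columns = frames in order

# A small 2-D CNN then reads temporal evolution as horizontal structure.
classifier = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, NUM_CLASSES),
)
logits = classifier(dense_image.unsqueeze(0).unsqueeze(0))  # (1, 1, D, T)
print(logits.shape)                            # torch.Size([1, 10])
```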

Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video [article]

Lionel Pigou, Aäron van den Oord, Sander Dieleman, Mieke Van Herreweghe, Joni Dambre
2016 arXiv   pre-print
Recent studies have demonstrated the power of recurrent neural networks for machine translation, image captioning and speech recognition.  ...  For the task of capturing temporal structure in video, however, there still remain numerous open research questions.  ...  The research leading to these results has received funding from the Agency for Innovation by Science and Technology in Flanders (IWT).  ... 
arXiv:1506.01911v3 fatcat:fgug5fhirrekplsnerrwgvr6be
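A common way to combine the two ingredients named in the title, temporal convolutions for local motion patterns and recurrence for long-range structure, is to run a 1-D convolution over per-frame features and feed the result to a bidirectional LSTM. A minimal sketch with invented dimensions, not the authors' exact model:

```python
import torch
import torch.nn as nn

# Toy dimensions: T frames of D-dim per-frame features, G gesture classes.
T, D, G = 32, 64, 20

class TemporalConvRNN(nn.Module):
    """Temporal convolution followed by a bidirectional LSTM (illustrative)."""
    def __init__(self):
        super().__init__()
        self.tconv = nn.Conv1d(D, D, kernel_size=5, padding=2)  # local motion
        self.rnn = nn.LSTM(D, 32, batch_first=True, bidirectional=True)
        self.head = nn.Linear(64, G)

    def forward(self, x):                      # x: (batch, T, D)
        x = self.tconv(x.transpose(1, 2)).transpose(1, 2).relu()
        out, _ = self.rnn(x)                   # long-range temporal structure
        return self.head(out.mean(dim=1))      # pool over time, then classify

model = TemporalConvRNN()
print(model(torch.randn(2, T, D)).shape)       # torch.Size([2, 20])
```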

Self-Supervised Spatiotemporal Feature Learning via Video Rotation Prediction [article]

Longlong Jing, Xiaodong Yang, Jingen Liu, Yingli Tian
2019 arXiv   pre-print
Our extensive experiments demonstrate the effectiveness of the proposed framework on action recognition, leading to significant improvements over state-of-the-art self-supervised methods  ...  With the self-supervised 3DRotNet pre-trained on large datasets, recognition accuracy is boosted by 20.4% on UCF101 and 16.7% on HMDB51, respectively, compared to models trained from scratch  ...  This material is based upon work supported by the National Science Foundation (NSF) under award number IIS-1400802.  ... 
arXiv:1811.11387v2 fatcat:yhx7laoxv5hnvoqbi4dghkc5ui
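The pretext task behind 3DRotNet can be stated compactly: rotate whole clips by 0, 90, 180, or 270 degrees and train a 3-D network to predict which rotation was applied, so supervision comes for free. A minimal sketch, with a toy backbone standing in for the real 3-D CNN:

```python
import torch
import torch.nn as nn

# Pretext task: rotate every frame of a clip by 0/90/180/270 degrees and
# train a 3-D CNN to predict the rotation (4-way classification, no labels).
clips = torch.randn(8, 3, 16, 32, 32)          # (batch, C, T, H, W) synthetic clips

k = torch.randint(0, 4, (clips.size(0),))      # random rotation label per clip
rotated = torch.stack(                         # rotate in the spatial (H, W) plane
    [torch.rot90(c, int(r), dims=(2, 3)) for c, r in zip(clips, k)])

rotnet = nn.Sequential(                        # tiny stand-in for a 3-D backbone
    nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, 4),
)
loss = nn.functional.cross_entropy(rotnet(rotated), k)
loss.backward()                                # self-supervised: no human labels
```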

TA2N: Two-Stage Action Alignment Network for Few-shot Action Recognition [article]

Shuyuan Li, Huabin Liu, Rui Qian, Yuxi Li, John See, Mengjuan Fei, Xiaoyuan Yu, Weiyao Lin
2021 arXiv   pre-print
Next, the second stage coordinates the query feature to match the spatial-temporal action evolution of the support by performing temporal rearrangement and spatial offset prediction.  ...  The first stage locates the action by learning a temporal affine transform, which warps each video feature to its action duration while discarding action-irrelevant features (e.g., background).  ...  However, the spatial variation of actors, such as their positions, is also critical for action recognition and cannot be modeled by TC.  ... 
arXiv:2107.04782v2 fatcat:qqa4ymll2zhgvnw5qks7qz4tp4
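The first-stage temporal affine transform mentioned above can be pictured as resampling a feature sequence along time with learned scale and shift parameters. A minimal sketch of such a warp (linear interpolation with clamped sampling positions; an illustration, not the paper's implementation):

```python
import torch

def temporal_affine_warp(feats, scale, shift):
    """Warp a (T, D) feature sequence along time with t' = scale * t + shift.

    Illustrative sketch of a temporal affine transform: sampling positions
    outside [0, T-1] are clamped, and features are linearly interpolated.
    """
    T = feats.size(0)
    t = torch.arange(T, dtype=feats.dtype)
    src = (scale * t + shift).clamp(0, T - 1)  # where each output step samples from
    lo = src.floor().long()
    hi = src.ceil().long()
    w = (src - lo.to(feats.dtype)).unsqueeze(1)  # interpolation weight
    return (1 - w) * feats[lo] + w * feats[hi]

feats = torch.randn(16, 64)                    # 16 time steps, 64-dim features
# e.g. zoom into the middle half of the sequence (scale 0.5, shift T/4)
warped = temporal_affine_warp(feats, scale=0.5, shift=4.0)
print(warped.shape)                            # torch.Size([16, 64])
```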

Convolutional Two-Stream Network Fusion for Video Action Recognition

Christoph Feichtenhofer, Axel Pinz, Andrew Zisserman
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
Recent applications of Convolutional Neural Networks (ConvNets) for human action recognition in videos have proposed different solutions for incorporating the appearance and motion information.  ...  We study a number of ways of fusing ConvNet towers both spatially and temporally in order to best take advantage of this spatio-temporal information.  ...  This work was supported by the Austrian Science Fund (FWF) under project P27076, and also by EPSRC Programme Grant Seebibyte EP/M013774/1. The GPUs used for this research were donated by NVIDIA.  ... 
doi:10.1109/cvpr.2016.213 dblp:conf/cvpr/FeichtenhoferPZ16 fatcat:fv3nx5sparhb5c6vcpthy5gply
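The fusion question studied here can be illustrated with a toy version of the setup: an appearance tower over an RGB frame, a motion tower over stacked optical flow, and a 1x1 convolution over the concatenated feature maps to learn cross-stream correspondences. All sizes below are invented; this is a sketch of one fusion variant, not the paper's best configuration:

```python
import torch
import torch.nn as nn

# Toy two-stream setup: an RGB frame plus a stack of 10 optical-flow fields
# (x/y components for 5 frame pairs = 10 channels), fused by concatenation
# followed by a 1x1 convolution.
def tower(in_ch):                              # one small ConvNet tower
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    )

spatial, temporal = tower(3), tower(10)
fuse = nn.Conv2d(128, 64, kernel_size=1)       # learn cross-stream correspondences
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 101))

rgb = torch.randn(2, 3, 64, 64)                # appearance input
flow = torch.randn(2, 10, 64, 64)              # motion input
fused = fuse(torch.cat([spatial(rgb), temporal(flow)], dim=1))
print(head(fused).shape)                       # torch.Size([2, 101]) -- UCF101 classes
```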

Survey On Feature Extraction Approach for Human Action Recognition in Still Images and Videos

Pavan M, Deepika D, Divyashree R, Kavana K, Pooja V Biligi
2022 International Journal of Scientific Research in Computer Science Engineering and Information Technology  
The paper presents a brief overview of features of human actions, categorizing them as still-image-based and video-based.  ...  The main aim of this work is to study the various action recognition techniques for videos and images.  ...  Research on still-image-based action recognition is in its early stages, as it is a relatively new field.  ... 
doi:10.32628/cseit228392 fatcat:ixbasagqvvaahfc3pmilvd7rke

Relevance of Interest Points for Eye Position Prediction on Videos [chapter]

Alain Simac-Lejeune, Sophie Marat, Denis Pellerin, Patrick Lambert, Michèle Rombaut, Nathalie Guyader
2009 Lecture Notes in Computer Science  
We found that, depending on the video sequence, and especially on the motion inside the sequence, the spatial or the space-time interest point detector is more or less relevant for predicting  ...  This paper tests the relevance of interest points for predicting the eye movements of subjects viewing video sequences freely.  ...  He used this detector for the recognition of human actions (walking, running, drinking, etc.) in movies.  ... 
doi:10.1007/978-3-642-04667-4_33 fatcat:24ybhgg2czdepbnk7eak7u43ba

Complex sequential understanding through the awareness of spatial and temporal concepts

Bo Pang, Kaiwen Zha, Hanwen Cao, Jiajun Tang, Minghui Yu, Cewu Lu
2020 Nature Machine Intelligence  
Experiments demonstrate that a Semi-Coupled Structure can successfully and sequentially annotate the outline of an object in images and perform video action recognition.  ...  For sequence-to-sequence problems, a Semi-Coupled Structure can predict future meteorological radar echo images from observed images.  ...  Given a time-ordered sequence of radar echo images, a model needs to predict a sequence of future radar echo images from the previous evolution of CR (see Fig. 5b).  ... 
doi:10.1038/s42256-020-0168-3 fatcat:dyrxp3pqvvb3tmrjv2yig5egru
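The radar echo task in the snippet is a sequence-to-sequence prediction problem: given past echo images, produce future ones. A naive autoregressive baseline (an assumption for illustration, not the paper's Semi-Coupled Structure) stacks the last K frames as channels and rolls a one-step predictor forward:

```python
import torch
import torch.nn as nn

# Naive sequence-to-sequence baseline: stack the last K radar echo frames as
# channels, regress the next frame, then slide the prediction into the window.
K = 4
step = nn.Sequential(                          # one-step predictor
    nn.Conv2d(K, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)

history = torch.rand(1, K, 64, 64)             # last K observed echo images
forecast = []
frames = history
for _ in range(6):                             # predict 6 future frames
    nxt = step(frames)                         # (1, 1, 64, 64)
    forecast.append(nxt)
    frames = torch.cat([frames[:, 1:], nxt], dim=1)  # slide the window forward
print(torch.cat(forecast, dim=1).shape)        # torch.Size([1, 6, 64, 64])
```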

Im2Flow: Motion Hallucination from Static Images for Action Recognition

Ruohan Gao, Bo Xiong, Kristen Grauman
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
We propose an approach that hallucinates the unobserved future motion implied by a single snapshot to help static-image action recognition.  ...  It not only achieves state-of-the-art accuracy for dense optical flow prediction, but also consistently enhances recognition of actions and dynamic scenes.  ...  Acknowledgements: This research was supported in part by an ONR PECASE Award N00014-15-1-2291 and an IBM Faculty Award and IBM Open Collaboration Award.  ... 
doi:10.1109/cvpr.2018.00622 dblp:conf/cvpr/GaoXG18 fatcat:ofat4dobw5aj7apx4ikmjymyqe
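The core of the approach is an image-to-flow network: a single RGB frame in, a dense two-channel (dx, dy) flow field out, trained against flow computed from real video. A tiny stand-in for such a network (a sketch, not the paper's actual encoder-decoder):

```python
import torch
import torch.nn as nn

# Tiny stand-in for a flow-hallucination network: one RGB image in, a dense
# 2-channel (dx, dy) flow field of the same resolution out.
im2flow = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),            # encode
    nn.ConvTranspose2d(16, 16, 4, stride=2, padding=1), nn.ReLU(),  # decode
    nn.Conv2d(16, 2, 3, padding=1),                                 # predict (dx, dy)
)

image = torch.randn(1, 3, 64, 64)              # one static snapshot
pred_flow = im2flow(image)                     # hallucinated motion
print(pred_flow.shape)                         # torch.Size([1, 2, 64, 64])

# Training would minimize a regression loss against flow computed from real
# video, e.g. loss = ((pred_flow - true_flow) ** 2).mean(); the predicted flow
# can then feed a motion stream for still-image action recognition.
```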
