Temporal Dynamic Graph LSTM for Action-driven Video Object Detection [article]

Yuan Yuan, Xiaodan Liang, Xiaolong Wang, Dit-Yan Yeung, Abhinav Gupta
2017 arXiv   pre-print
TD-Graph LSTM enables global temporal reasoning by constructing a dynamic graph that is based on temporal correlations of object proposals and spans the entire video.  ...  To tackle this problem, we propose a novel temporal dynamic graph Long Short-Term Memory network (TD-Graph LSTM).  ...  The proposed TD-Graph LSTM Overview. We establish a fully-differentiable temporal dynamic graph LSTM (TD-Graph LSTM) framework for the action-driven video object detection task.  ... 
arXiv:1708.00666v1 fatcat:dqvgtiouaffjvcsb55cwwkw4z4

OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement [article]

Fangyi Zhu, Jenq-Neng Hwang, Zhanyu Ma, Guang Chen, Jun Guo
2020 arXiv   pre-print
The temporal graph provides useful supplement over previous image-based approaches, allowing to reason the activities from the temporal evolution of visual features and the dynamic movement of spatial  ...  We introduce the video-based object-oriented video captioning network (OVC)-Net via temporal graph and detail enhancement to effectively analyze the activities along time and stably capture the vision-language  ...  Fig. 4. An example of building an object-oriented temporal graph for an object. Actually, we build the temporal graph for each object in the video.  ... 
arXiv:2003.03715v5 fatcat:g5trretzdjauplie7estebze2a

An Attention Enhanced Spatial–Temporal Graph Convolutional LSTM Network for Action Recognition in Karate

Jianping Guo, Hong Liu, Xi Li, Dahong Xu, Yihan Zhang
2021 Applied Sciences  
Then, the performance of our algorithm was compared with spatial temporal graph convolutional networks (ST-GCN) for the karate technique action dataset.  ...  In this paper, a new graph convolution model is proposed.  ...  of the temporal dynamics.  ... 
doi:10.3390/app11188641 fatcat:srngjdvz4rgw5gucaheykwfngy

Literature Review of Action Recognition in the Wild [article]

Asket Kaur, Navya Rao, Tanya Joon
2019 arXiv   pre-print
Action Recognition problem in the untrimmed videos is a challenging task and most of the papers have tackled this problem using hand-crafted features with shallow learning techniques and sophisticated  ...  The literature review presented below on Action Recognition in the wild is the in-depth study of Research Papers.  ...  For action detection, a sequence of skeletal data as a T*N*3 image, it is able to adapt object detection methods to the task.  ... 
arXiv:1911.12249v1 fatcat:46qu4wtyqvhuxcomoymdd5owcm

Guest Editorial Introduction to the Special Section on Intelligent Visual Content Analysis and Understanding

Hongliang Li, Lu Fang, Tianzhu Zhang
2020 IEEE transactions on circuits and systems for video technology (Print)  
For salient object preservation, "Object detection-based video retargeting with spatial-temporal consistency," by Lee et al., adopts object detector and tracker to extract the regions of interest (RoI)  ...  The article "Attention-driven loss for anomaly detection in video surveillance," by Zhou et al., describes a novel video anomaly detection method by introducing an attention mechanism to resolve the imbalance  ... 
doi:10.1109/tcsvt.2020.3031416 fatcat:gpwbmydqbza5lddatxcfcidwcq

Remarkable Skeleton Based Human Action Recognition

Sushma Jaiswal, Tarun Jaiswal
2020 Artificial Intelligence Evolution  
In this paper, we first highlight the need for action recognition and significance of 3D skeleton data and finally, we survey the largest 3D skeleton dataset, i.e.  ...  The performance of the SBHAR is also affected by the various factors such as video frame setting, view variations in motion, different backgrounds and inter-personal differences.  ...  Data-driven tactics enhance the elasticity of the model for graph built up. The considered adaptive graph convolutional layer also increases flexibility via spatial-temporal channel attention phase.  ... 
doi:10.37256/aie.122020562 fatcat:2wdzis5ax5bdfnwdhlzgvoh6xu

Spatio-Temporal Pyramid Graph Convolutions for Human Action Recognition and Postural Assessment [article]

Behnoosh Parsa, Athma Narayanan, Behzad Dariush
2019 arXiv   pre-print
In this paper, we propose a novel Spatio-Temporal Pyramid Graph Convolutional Network (ST-PGN) for online action recognition for ergonomic risk assessment that enables the use of features from all levels  ...  Recognition of human actions and associated interactions with objects and the environment is an important problem in computer vision due to its potential applications in a variety of domains.  ...  Conclusion and Future Work. We proposed a novel Spatio-Temporal Pyramid Graph Convolutional Network (ST-PGN) for online action recognition.  ... 
arXiv:1912.03442v1 fatcat:k2hmpiydezagdgsvcwd6niirhe

Toward human-centric deep video understanding

Wenjun Zeng
2020 APSIPA Transactions on Signal and Information Processing  
We show that semantic models, view-invariant models, and spatial-temporal visual attention mechanisms are important building blocks. We also discuss the future perspectives of video understanding.  ...  In this paper, we share our views on why and how to use a human centric approach to address the challenging video understanding problems.  ...  Since there is a sequence of skeleton joint sets over time in the video, one can use a recurrent neural network such as Long Short-Term Memory (LSTM) network [60] to model the temporal dynamics of the  ... 
doi:10.1017/atsip.2019.26 fatcat:rtrqzokr6bc4lj6vs6megf5xru

Structural-RNN: Deep Learning on Spatio-Temporal Graphs

Ashesh Jain, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
Learning spatio-temporal struc- actions in surveillance videos. 2009. ture from rgb-d videos for human activity detection and an- [9] X. Chen and C. L. Zitnick.  ...  Key object driven multi-category object [17] C. Goller and A. Kuchler.  ... 
doi:10.1109/cvpr.2016.573 dblp:conf/cvpr/JainZSS16 fatcat:n7lshpornvdbxkqbf7gus5pihy

A Comprehensive Review of Group Activity Recognition in Videos

Li-Fang Wu, Qi Wang, Meng Jian, Yu Qiao, Bo-Xuan Zhao
2021 International Journal of Automation and Computing  
First, we provide a summary and comparison of 11 GAR video datasets in this field.  ...  Finally, we outline several challenging issues and possible directions for future research.  ...  [76] proposed a graph LSTM-in-LSTM (GLIL) network which jointly models the person-level actions and the group-level activity.  ... 
doi:10.1007/s11633-020-1258-8 fatcat:ycka4thcy5a6vghpenpthtrndi

A Survey on 3D Skeleton-Based Action Recognition Using Learning Method [article]

Bin Ren, Mengyuan Liu, Runwei Ding, Hong Liu
2020 arXiv   pre-print
are illustrated in a data-driven manner.  ...  However, previous surveys about action recognition mostly focus on the video or RGB data dominated methods, and the scanty existing reviews related to skeleton data mainly indicate the representation of  ...  Sijie and Yuanjun [31] firstly presented a novel model for skeleton-based action recognition, the spatial temporal graph convolutional networks (ST-GCN), which firstly constructed a spatial-temporal graph  ... 
arXiv:2002.05907v1 fatcat:tmnfwxnwtrdo3hjncdoxqvdowy

Skeleton Focused Human Activity Recognition in RGB Video [article]

Bruce X. B. Yu, Yan Liu, Keith C. C. Chan
2020 arXiv   pre-print
The data-driven approach that learns an optimal representation of vision features like skeleton frames or RGB videos is currently a dominant paradigm for activity recognition.  ...  Whereas for the RGB modality, we will use the spatial-temporal region of interest from RGB videos and take the attention features from the skeleton modality to guide the learning process.  ...  Section III introduces the proposed DL architecture of our skeleton-driven attention for RGB video modality.  ... 
arXiv:2004.13979v1 fatcat:bujdjitqgraoldsglfw6yolafq

A Multi-Task Learning Approach for Human Activity Segmentation and Ergonomics Risk Assessment [article]

Behnoosh Parsa, Ashis G. Banerjee
2020 arXiv   pre-print
We propose a new approach to Human Activity Evaluation (HAE) in long videos using graph-based multi-task modeling.  ...  These approaches are insufficient for accurate activity assessment since they only compute an average score over a clip, and do not consider the correlation between the joints and body dynamics.  ...  Our proposed framework comprises a Graph Convolutional Network (GCN) backbone and an Encoder-Decoder Temporal Convolutional Network (ED-TCN) for the action detection head and a Long-Short-Term-Memory (  ... 
arXiv:2008.03014v2 fatcat:r5mb62aszvabfbjlztnvpwwuhi

SCR-Graph: Spatial-Causal Relationships based Graph Reasoning Network for Human Action Prediction [article]

Bo Chen, Decai Li, Yuqing He, Chunsheng Hua
2019 arXiv   pre-print
In temporal dimension, we designed a knowledge graph based causal reasoning module and map the past actions to temporal causal features through Diffusion RNN.  ...  Technologies to predict human actions are extremely important for applications such as human robot cooperation and autonomous driving.  ...  [13] proposed a temporal deep model to better learn activity progression for performing activity detection and early detection tasks.  ... 
arXiv:1912.05003v1 fatcat:2mhe6a5owbeqxnsmq5osogz7qu

Forecasting Action through Contact Representations from First Person Video

Eadom Dessalene, Chinmaya Devaraj, Michael Maynord, Cornelia Fermuller, Yiannis Aloimonos
2021 IEEE Transactions on Pattern Analysis and Machine Intelligence  
On top of the Anticipation Module we apply Egocentric Object Manipulation Graphs (Ego-OMG), a framework for action anticipation and prediction.  ...  Ego-OMG models longer term temporal semantic relations through the use of a graph modeling transitions between contact delineated action states.  ...  A natural formalism for representing temporal relations is a graph.  ... 
doi:10.1109/tpami.2021.3055233 pmid:33507864 fatcat:4eufliynnbbwbkadllrrcprwee