Simplex-Based 3D Spatio-temporal Feature Description for Action Recognition

Hao Zhang, Wenjun Zhou, Christopher Reardon, Lynne E. Parker
2014 2014 IEEE Conference on Computer Vision and Pattern Recognition  
We present a novel feature description algorithm to describe 3D local spatio-temporal features for human action recognition.  ...  In addition, the results show that our SOD descriptor is a superior individual descriptor for action recognition.  ...  Introduction Local spatio-temporal features have shown promising performance for human action recognition in unconstrained scenarios [5, 7, 13, 17, 23, 27, 30].  ... 
doi:10.1109/cvpr.2014.265 dblp:conf/cvpr/ZhangZRP14 fatcat:7wcjevidpjetnnubwl3d6vxrkq

A unified spatio-temporal human body region tracking approach to action recognition

Nouf Al Harbi, Yoshihiko Gotoh
2015 Neurocomputing  
attained for the action classification task.  ...  The LLC coding is employed to optimise the codebook, the coding scheme projecting every one of the spatio-temporal descriptors into a local coordinate representation developed via max pooling.  ...  The first author would like to thank Taibah University, Madinah, Saudi Arabia for funding this work as part of her PhD scholarship program.  ... 
doi:10.1016/j.neucom.2014.11.072 fatcat:jdq62gjcqvbm3pfknmq2rqcuj4

Spatio-Temporal Ranked-Attention Networks for Video Captioning [article]

Anoop Cherian, Jue Wang, Chiori Hori, Tim K. Marks
2020 arXiv   pre-print
We propose a novel LSTM-based temporal ranking function, which we call ranked attention, for the ST model to capture action dynamics. Our entire framework is trained end-to-end.  ...  Generating video descriptions automatically is a challenging task that involves a complex interplay between spatio-temporal visual features and language models.  ...  For the spatial features, we explore the advantages of using 3D CNN features from the recent Inflated 3D (I3D) activity recognition model [8], as well as features from a Fast RCNN object detection model  ... 
arXiv:2001.06127v1 fatcat:utpvajecyvgodjjofcmna5gh2i

Tensor Representations via Kernel Linearization for Action Recognition from 3D Skeletons (Extended Version) [article]

Piotr Koniusz and Anoop Cherian and Fatih Porikli
2016 arXiv   pre-print
We present two different kernels for action recognition, namely (i) a sequence compatibility kernel that captures the spatio-temporal compatibility of joints in one sequence against those in the other,  ...  In this paper, we explore tensor representations that can compactly capture higher-order relationships between skeleton joints for 3D action recognition.  ...  To summarize, the main contributions of this paper are (i) introduction of sequence and the dynamics compatibility kernels for capturing spatio-temporal evolution of body-joints for 3D skeleton based action  ... 
arXiv:1604.00239v2 fatcat:slpwe2reazgjfb3zluvbywdusa

A Review on Video-Based Human Activity Recognition

Shian-Ru Ke, Hoang Thuc, Yong-Jin Lee, Jenq-Neng Hwang, Jang-Hee Yoo, Kyoung-Ho Choi
2013 Computers  
This review article extensively surveys the current progress made toward video-based human activity recognition.  ...  However, the above mentioned feature representations do not fully capture the whole body actions.  ...  [11] applies a spatio-temporal interest point detector to find local region of interest in the cuboids of space and time for activity recognition.  ... 
doi:10.3390/computers2020088 fatcat:zb3wlmwjjvbfne2ck6uyjtffdq

Weakly Supervised Learning of Heterogeneous Concepts in Videos [article]

Sohil Shah, Kuldeep Kulkarni, Arijit Biswas, Ankit Gandhi, Om Deshmukh, Larry Davis
2016 arXiv   pre-print
Typical textual descriptions that accompany online videos are 'weak': i.e., they mention the main concepts in the video but not their corresponding spatio-temporal locations.  ...  The concepts in the description are typically heterogeneous (e.g., objects, persons, actions). Certain location constraints on these concepts can also be inferred from the description.  ...  For each spatio-temporal track, the features extracted for heterogeneous concepts are concatenated while using this approach.  ... 
arXiv:1607.03240v1 fatcat:wco3otmsnbbvzh5ddmvsnme76i

Learning Pullback HMM Distances

2014 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Recent work in action recognition has exposed the limitations of methods which directly classify local features extracted from spatio-temporal video volumes.  ...  Experimental results are presented which show how pullback learning greatly improves action recognition performances with respect to base distances.  ...  Recognition methods which neglect action dynamics (typically extracting spatio-temporal (S/T) [2], [15] features from the 3D volume associated with a video) have delivered good results.  ... 
doi:10.1109/tpami.2013.181 pmid:26353316 fatcat:bd52twbsczhgxo3bkg4cebmhby

An FPGA-based fast two-symbol processing architecture for JPEG 2000 arithmetic coding

Nandini Ramesh Kumar, Wei Xiang, Yafeng Wang
2010 2010 IEEE International Conference on Acoustics, Speech and Signal Processing  
FEATURE EXTRACTION METHOD FOR VIDEO-BASED HUMAN ACTION RECOGNITIONS: EXTENDED OPTICAL FLOW ALGORITHM Ashok Ramadass, Myunghoon Suk, Balakrishnan Prabhakaran,  ...  Cavalcanti, Federal University of Pernambuco, Brazil IVMSP-L8.6: 3D FACE REPRESENTATION AND RECOGNITION BY INTRINSIC SHAPE DESCRIPTION MAPS Zhe Guo, Yanning Zhang  ... 
doi:10.1109/icassp.2010.5495418 dblp:conf/icassp/KumarXW10 fatcat:kxhovzwnzveu5pgdoelok3ox2m

Video understanding for complex activity recognition

Florent Fusier, Valéry Valentin, François Brémond, Monique Thonnat, Mark Borg, David Thirde, James Ferryman
2007 Machine Vision and Applications  
We are interested in complex scenes in terms of actors participating in the activities, large spatio-temporal scale and complex interactions.  ...  Video understanding requires several processing stages from pixel-based video stream analysis up to high level behaviour recognition.  ...  Principle The aim of the Global Tracker is to correct the detected mobile objects wrongly tracked by the previous processes using 3D spatio-temporal analysis.  ... 
doi:10.1007/s00138-006-0054-y fatcat:hbnzhv4s35hcfksag4kas2nzje

Person Re-identification by Video Ranking [chapter]

Taiqing Wang, Shaogang Gong, Xiatian Zhu, Shengjin Wang
2014 Lecture Notes in Computer Science  
simultaneously to learn a video ranking function for person re-id.  ...  image sequence matching and state-of-the-art single-shot/multi-shot based re-id methods.  ...  for applications in action and activity recognition [25].  ... 
doi:10.1007/978-3-319-10593-2_45 fatcat:hbtfqi2kebhctnvolhcxjen47q

SL-DML: Signal Level Deep Metric Learning for Multimodal One-Shot Action Recognition [article]

Raphael Memmesheimer, Nick Theisen, Dietrich Paulus
2020 arXiv   pre-print
It further outperforms the baseline on the large scale NTU RGB+D 120 dataset for the One-Shot action recognition protocol by 5.6%.  ...  The resulting encoder transforms features into an embedding space in which closer distances encode similar actions while higher distances encode different actions.  ...  [19] presented a one-shot approach based on Simplex Hidden Markov Models (SHMM). Improved dense trajectories are used as base features [20].  ... 
arXiv:2004.11085v4 fatcat:rtkrummeyvf4xnxc4vjt63i3py

Learning Neural Textual Representations for Citation Recommendation

Binh Thanh Kieu, Inigo Jauregi Unanue, Son Bao Pham, Hieu Xuan Phan, Massimo Piccardi
2021 2020 25th International Conference on Pattern Recognition (ICPR)  
3D Palmprint Recognition DAY 2 - Jan 13, 2021 Obinata, Yuya; Yamamoto, Takuma 335 Temporal Extension Module for Skeleton-Based Action Recognition DAY 2 - Jan 13, 2021 Liu, Daizong; Zhang, Hongting  ...  Encoding and Hierarchical Temporal Modeling in a Spatio-Temporal Graph Convolutional Network for Action Recognition DAY 2 - Jan 13, 2021 Song, Siyang; Sanchez, Enrique; Shen, Linlin; Valstar, Michel  ... 
doi:10.1109/icpr48806.2021.9412725 fatcat:3vge2tpd2zf7jcv5btcixnaikm

Real-time control of video surveillance systems with program supervision techniques

B. Georis, F. Brémond, M. Thonnat
2007 Machine Vision and Applications  
To validate this platform, we have built and evaluated six video surveillance systems which are featured with three properties: adaptability, reliability and real-time processing.  ...  This platform is composed of three main components: the library of programs, the knowledge base and the control component. The knowledge is either given by experts or learnt by the system.  ...  They contain all the temporal links of the original objects which have been fused together and their 3D features are the weighted mean of the original 3D features.  ... 
doi:10.1007/s00138-006-0053-z fatcat:qy7z5h4rznawloks4726iwfgam

A framework for combining a motion atlas with non-motion information to learn clinically useful biomarkers: Application to cardiac resynchronisation therapy response prediction

Devis Peressutti, Matthew Sinclair, Wenjia Bai, Thomas Jackson, Jacobus Ruijsink, David Nordsletten, Liya Asner, Myrianthi Hadjicharalambous, Christopher A. Rinaldi, Daniel Rueckert, Andrew P. King
2017 Medical Image Analysis  
The atlas represents cardiac cycle motion across a number of subjects in a common space based on rich motion descriptors capturing 3D displacement, velocity, strain and strain rate.  ...  We present a framework for combining a cardiac motion atlas with non-motion data.  ...  Spatio-Temporal atlases have previously been proposed for a range of problems.  ... 
doi:10.1016/ pmid:27770718 fatcat:5lflmudymbexphf5sm73d7jmlm

MIFTel: a multimodal interactive framework based on temporal logic rules

Danilo Avola, Luigi Cinque, Alberto Del Bimbo, Marco Raoul Marini
2020 Multimedia tools and applications  
Different sources and acquisition times can be exploited for improving recognition results.  ...  For increasing the recognition reliability, a predictive model is also associated with the proposed method.  ...  Table 3.1. Skeleton features for gesture recognition.  ... 
doi:10.1007/s11042-019-08590-1 fatcat:mzyv277fcnhhheuptvmrokg7qm