Context-Aware RCNN: A Baseline for Action Detection in Videos [article]

Jianchao Wu, Zhanghui Kuang, Limin Wang, Wayne Zhang, Gangshan Wu
2020 arXiv   pre-print
Thus, we revisit RCNN for actor-centric action recognition via cropping and resizing image patches around actors before feature extraction with I3D deep network.  ...  Video action detection approaches usually conduct actor-centric action recognition over RoI-pooled features following the standard pipeline of Faster-RCNN.  ...  from the clip feature map based on actor boxes.  ... 
arXiv:2007.09861v1 fatcat:cntx4bbblven7jdrwhiwd7hi4y

Scaling New Peaks: A Viewership-centric Approach to Automated Content Curation [article]

Subhabrata Majumdar, Deirdre Paul, Eric Zavesky
2021 arXiv   pre-print
However, as with most automated systems, we assert that the best system would be a combined human-in-the-loop process by supplying our generated seed segments as inputs for a human curator  ...  To begin the comparison process, we went through the V1 highlights video and annotated each clip with short descriptions, then similarly partitioned each external summary.  ... 
arXiv:2108.04187v1 fatcat:vs6ent5l3jdyfjjxkdlsynkz2y

Actor-Centric Relation Network [chapter]

Chen Sun, Abhinav Shrivastava, Carl Vondrick, Kevin Murphy, Rahul Sukthankar, Cordelia Schmid
2018 Lecture Notes in Computer Science  
Our approach is weakly supervised and mines the relevant elements automatically with an actor-centric relational network (ACRN).  ...  ACRN computes and accumulates pair-wise relation information from actor and global scene features, and generates relation features for action classification.  ...  Action recognition has traditionally focused on classifying actions in short video clips.  ... 
doi:10.1007/978-3-030-01252-6_20 fatcat:k5kmk4ihznej5gzmnjvyc6727a

Actor-Centric Relation Network [article]

Chen Sun, Abhinav Shrivastava, Carl Vondrick, Kevin Murphy, Rahul Sukthankar, Cordelia Schmid
2018 arXiv   pre-print
Our approach is weakly supervised and mines the relevant elements automatically with an actor-centric relational network (ACRN).  ...  ACRN computes and accumulates pair-wise relation information from actor and global scene features, and generates relation features for action classification.  ...  Action recognition has traditionally focused on classifying actions in short video clips.  ... 
arXiv:1807.10982v1 fatcat:oa37s7utpjdnlbgjgylhoiaq44

A Multi-stream Bi-directional Recurrent Neural Network for Fine-Grained Action Detection

Bharat Singh, Tim K. Marks, Michael Jones, Oncel Tuzel, Ming Shao
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
To model long-term temporal dynamics within and between actions, the multi-stream CNN is followed by a bi-directional Long Short-Term Memory (LSTM) layer.  ...  Our system uses a tracking algorithm to locate a bounding box around the person, which provides a frame of reference for appearance and motion and also suppresses background noise that is not within the  ...  In action recognition datasets (e.g., UCF 101), video clips are temporally trimmed to start and end at the start and end times of each action, and are generally short in length (e.g., from 2-20 seconds  ... 
doi:10.1109/cvpr.2016.216 dblp:conf/cvpr/SinghMJTS16 fatcat:jz2aip3dg5gszcj7z3cpcj45qy

Target-Specific Action Classification for Automated Assessment of Human Motor Behavior from Video [article]

Behnaz Rezaei, Yiorgos Christakis, Bryan Ho, Kevin Thomas, Kelley Erb, Sarah Ostadabbas, Shyamal Patel
2019 arXiv   pre-print
The cascaded pose tracker achieves an average accuracy of 88% in tracking the target human actor in our video recordings, and overall system achieves average test accuracy of 84% for target-specific  ...  We implement a cascaded pose tracker that uses temporal relationships between detections for short-term tracking and appearance-based tracklet fusion for long-term tracking.  ...  Acknowledgments: The authors would like to acknowledge the BlueSky project team for generating the data that made this work possible.  ... 
arXiv:1909.09566v1 fatcat:prcmfivlzjgidflcpla3xqeopi

A Structured Model For Action Detection [article]

Yubo Zhang, Pavel Tokmakov, Martial Hebert, Cordelia Schmid
2019 arXiv   pre-print
A dominant paradigm for learning-based approaches in computer vision is training generic models, such as ResNet for image recognition, or I3D for video understanding, on large datasets and allowing them  ...  In particular, we augment a standard I3D network with a tracking module to aggregate long term motion patterns, and use a graph convolutional network to reason about interactions between actors and objects  ...  Finally, we will demonstrate how we build the actor-centric graph, and how it is used to generate action predictions.  ... 
arXiv:1812.03544v5 fatcat:jojhizk3c5djrce4kh3rndz46e

Human action recognition from a single clip per action

Weilong Yang, Yang Wang, Greg Mori
2009 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops  
In this paper, we consider the problem of human action recognition from a single clip per action. Each clip contains at most 25 frames.  ...  The transferable distance function learning extracts generic knowledge of patch weighting from previous training sets, and can be applied to videos of new actions without further learning.  ...  Introduction The ability to generalize from a small training set is an important feature of any recognition system.  ... 
doi:10.1109/iccvw.2009.5457663 dblp:conf/iccvw/Yang0M09 fatcat:6ib727kbu5f5ddwsgozikmudhy

Target-Specific Action Classification for Automated Assessment of Human Motor Behavior from Video

Behnaz Rezaei, Yiorgos Christakis, Bryan Ho, Kevin Thomas, Kelley Erb, Sarah Ostadabbas, Shyamal Patel
2019 Sensors  
The cascaded pose tracker achieves an average accuracy of 88% in tracking the target human actor in our video recordings, and overall system achieves average test accuracy of 84% for target-specific action  ...  We implement a cascaded pose tracker that uses temporal relationships between detections for short-term tracking and appearance based tracklet fusion for long-term tracking.  ...  Acknowledgments: The authors would like to acknowledge the BlueSky project team for generating the data that made this work possible.  ... 
doi:10.3390/s19194266 pmid:31581449 pmcid:PMC6806251 fatcat:o4swbs5w4fbxrnqr3gj2kilmpu

Integration of Real-Time UAV Video into the Fire Brigades Crisis Management System [chapter]

Mark van Persie, Menso C. van Sijl, Edwin Wisse, Janio B. Tjoe-Awie, Arnout J. de Jong, Wim Bakker
2012 Lecture Notes in Geoinformation and Cartography  
Net-centric approach  ...  System description: Video handling. Distinguished products: live video stream, video clips, snapshot images, mosaics. Used standards for interoperability:  ...  systems. Conclusions: Usability and user-friendliness of the system are important: quickly deployable; minimal constraints in operation (beyond vision, above people, permission to  ... 
doi:10.1007/978-3-642-33218-0_24 fatcat:dqxmti33tfdtni6ezzqo4l4j3q

Review of Video Predictive Understanding: Early Action Recognition and Future Action Prediction [article]

He Zhao, Richard P. Wildes
2021 arXiv   pre-print
However, these approaches face challenges such as the curse of dimensionality, poor generalization, and constraints from domain-specific knowledge.  ...  Recently, structures that rely on deep convolutional neural networks and recurrent neural networks have been extensively proposed for improving the performance of existing vision tasks, in general, and  ...  Generation of naturalistic future images beyond short time horizons remains challenging.  ... 
arXiv:2107.05140v2 fatcat:f23pi3i5fzhqxlirv3slgkl3wu

Understanding Health Information Seeking from an Actor-Centric Perspective

Simon Batchelor, Linda Waldman, Gerry Bloom, Sabrina Rasheed, Nigel Scott, Tanvir Ahmed, Nazib Khan, Tamanna Sharmin
2015 International Journal of Environmental Research and Public Health  
The paper integrates an actor centric approach with the theory of planned behavior.  ...  In the actor-centric approach outlined below, we model the complexity of the health system, as well as show the intricacies associated with information seeking and decisions about treatment.  ...  A first step in using an actor centric approach is to identify the generic types of actors in a typical health system.  ... 
doi:10.3390/ijerph120708103 pmid:26184275 pmcid:PMC4515711 fatcat:tj2xipwxrjc2jhr7uacdk6xnoq

EduNet: A New Video Dataset for Understanding Human Activity in the Classroom Environment

Vijeta Sharma, Manjari Gupta, Ajai Kumar, Deepti Mishra
2021 Sensors  
The development of a new benchmark dataset for the education domain will benefit future research concerning classroom monitoring systems.  ...  It is also a challenging dataset of actions as it has many clips (and due to the unconstrained nature of the clips).  ...  The dataset proposed by [36] has fewer video clips with only 817 clips. Both datasets are only student-centric.  ... 
doi:10.3390/s21175699 pmid:34502592 fatcat:hmxck7jitfhx3fgbqmizmbuyxe

Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem [article]

John Holler, Risto Vuorio, Zhiwei Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, Jieping Ye
2019 arXiv   pre-print
In addition to treating the drivers as individual agents, we consider the problem from a system-centric perspective, where a central fleet management agent is responsible for decision-making for all drivers  ...  These decisions have short-term effects on the revenue generated by the drivers and driver availability.  ...  The driver-centric approach outperforms the system-centric by a margin of close to 10%.  ... 
arXiv:1911.11260v1 fatcat:kq3vu76co5gujcc6xz4gcs6fqe

MarioQA: Answering Questions by Watching Gameplay Videos [article]

Jonghwan Mun, Paul Hongsuck Seo, Ilchae Jung, Bohyung Han
2017 arXiv   pre-print
To address this objective, we automatically generate a customized synthetic VideoQA dataset using Super Mario Bros. gameplay videos so that it contains events with different levels of reasoning complexity  ...  The datasets are generated by simulating a virtual world given a set of actions by virtual actors and a set of constraints imposed on the actors.  ...  However, attention models may be able to gain more benefit from pretrained models, and GC is a more preferable model for our environment with short video clips.  ... 
arXiv:1612.01669v2 fatcat:ah5fpvhwcjfnrmmn3y3wduvjfe