
Detecting events and key actors in multi-person videos [article]

Vignesh Ramanathan and Jonathan Huang and Sami Abu-El-Haija and Alexander Gorban and Kevin Murphy and Li Fei-Fei
2016 arXiv   pre-print
Multi-person event recognition is a challenging task, often with many people active in the scene but only a small subset contributing to an actual event.  ...  In this paper, we propose a model which learns to detect events in such videos while automatically "attending" to the people responsible for the event.  ...  Conclusion We have introduced a new attention based model for event classification and detection in multi-person videos.  ... 
arXiv:1511.02917v2 fatcat:tx3j5cxkhja7je3gouh2vlnho4

Detecting Events and Key Actors in Multi-person Videos

Vignesh Ramanathan, Jonathan Huang, Sami Abu-El-Haija, Alexander Gorban, Kevin Murphy, Li Fei-Fei
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
Multi-person event recognition is a challenging task, often with many people active in the scene but only a small subset contributing to an actual event.  ...  In this paper, we propose a model which learns to detect events in such videos while automatically "attending" to the people responsible for the event.  ...  Rathod and K. Tang for useful comments. We also thank O. Camburu and N. Johnston for helping with the GPU implementation. This research is partly supported by ONR MURI and Intel ISTC-PC.  ... 
doi:10.1109/cvpr.2016.332 dblp:conf/cvpr/RamanathanHAG0F16 fatcat:eumxyvtr3jcblmglg722fit6ae

Key Event Detection in Video using ASR and Visual Data

Niraj Shrestha, Aparna N. Venkitasubramanian, Marie-Francine Moens
2014 Proceedings of the Third Workshop on Vision and Language  
In this paper, we present preliminary work on key event detection in British royal wedding videos using automatic speech recognition (ASR) and visual data.  ...  The error is only slightly higher when using ASR output in the detection of key events and their participants in the wedding videos compared to the results obtained with subtitles.  ...  Methodology The main objective of this work is to identify and index key events in videos using ASR data along with key actors involved in the event.  ... 
doi:10.3115/v1/w14-5407 dblp:conf/acl-vl/ShresthaVM14 fatcat:gjy447zno5etnjo23zdztlmkzi

Accurate person tracking through changing poses for multi-view action recognition

Pradeep Natarajan, Prithviraj Banerjee, Ram Nevatia
2010 Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing - ICVGIP '10  
Person tracking algorithms often fail under such conditions, since they work by detecting and tracking people in a few known poses (typically standing).  ...  We represent the pose of the person in each track window using a grid-of-centroids model, and recognize the action by matching with a set of keyposes, in each frame.  ...  The camera is static and has a downward tilt of ≈20°. In each video, an actor enters the scene, picks up an item and leaves.  ... 
doi:10.1145/1924559.1924580 dblp:conf/icvgip/NatarajanBN10 fatcat:nrmmqqyvu5aundcbu3yvooo63a

A Comprehensive Review of Group Activity Recognition in Videos

Li-Fang Wu, Qi Wang, Meng Jian, Yu Qiao, Bo-Xuan Zhao
2021 International Journal of Automation and Computing  
First, we provide a summary and comparison of 11 GAR video datasets in this field.  ...  and sports video analysis.  ...  Unified modeling framework Group activity recognition for video usually involves multi-person detection, multi-person tracking and activity recognition.  ... 
doi:10.1007/s11633-020-1258-8 fatcat:ycka4thcy5a6vghpenpthtrndi

Monitoring Activities of Daily Living (ADLs) of Elderly Based on 3D Key Human Postures [chapter]

Nadia Zouba, Bernard Boulay, Francois Bremond, Monique Thonnat
2008 Lecture Notes in Computer Science  
A video analysis component contains person detection, person tracking and human posture recognition.  ...  Using these 3D key human postures, we have modeled thirty-four video events, simple ones such as "a person is standing" and composite ones such as "a person is feeling faint".  ...  We also envisage facilitating the incorporation of new sensors by developing a generic model of an intelligent sensor, and adding data uncertainty and imprecision to the sensor measurement analysis.  ... 
doi:10.1007/978-3-540-92781-5_4 fatcat:b5uj75gqzvd4rnz2gr5bg3inr4

A Vision-Based System for Intelligent Monitoring: Human Behaviour Analysis and Privacy by Context

Alexandros Chaaraoui, José Padilla-López, Francisco Ferrández-Pastor, Mario Nieto-Hidalgo, Francisco Flórez-Revuelta
2014 Sensors  
The experimental results of the behaviour recognition method show outstanding performance, as well as support for multi-view scenarios and real-time execution, which are required in order to provide  ...  Due to progress and demographic change, society is facing a crucial challenge related to increased life expectancy and a higher number of people in situations of dependency.  ...  The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.  ... 
doi:10.3390/s140508895 pmid:24854209 pmcid:PMC4063058 fatcat:24uqyuamyvhrlc5jrypxxwvtpa


Hamid Aghajan, Juan Carlos Augusto, Andrea Prati, Carles Gomez
2016 Journal of Ambient Intelligence and Smart Environments  
These algorithms have been evaluated almost exclusively using brief segments of video data captured in artificial environments, often under optimal imaging conditions, and with falls simulated by actors.  ...  Activity recognition plays a key role in providing activity assistance and care for users in smart homes.  ... 
doi:10.3233/ais-160373 fatcat:laetkeimavgdvpjrayc5ma7e6m

A large-scale benchmark dataset for event recognition in surveillance video

Sangmin Oh, Anthony Hoogs, Amitha Perera, Naresh Cuntoor, Chia-Chih Chen, Jong Taek Lee, Saurajit Mukherjee, J. K. Aggarwal, Hyungtae Lee, Larry Davis, Eran Swears, Xiaoyang Wang (+12 others)
2011 CVPR 2011  
Our dataset consists of many outdoor scenes with actions occurring naturally by non-actors in continuously captured videos of the real world.  ...  We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms with a focus on continuous visual event recognition (CVER) in outdoor areas  ...  Acknowledgments Thanks to Kamil Knuk, Stefano Soatto, Arslan Basharat, and many others for help on this work.  ... 
doi:10.1109/cvpr.2011.5995586 dblp:conf/cvpr/OhHPCCLMALDSWJRSVPRYTSFRD11 fatcat:fkkxv762izfetdrthrhnopjbb4

Emotion recognition from embedded bodily expressions and speech during dyadic interactions

Philipp M. Müller, Sikandar Amin, Prateek Verma, Mykhaylo Andriluka, Andreas Bulling
2015 2015 International Conference on Affective Computing and Intelligent Interaction (ACII)  
We present a human-validated dataset that contains 224 high-resolution, multi-view video clips and audio recordings of emotionally charged interactions between eight couples of actors.  ...  We study the problem of emotion recognition from bodily expressions and speech during dyadic (person-person) interactions in a real kitchen instrumented with ambient cameras and microphones.  ...  ACKNOWLEDGMENTS The authors would like to thank Johannes Tröger for working as a director in the recordings as well as all involved actors.  ... 
doi:10.1109/acii.2015.7344640 dblp:conf/acii/MullerAVAB15 fatcat:zkphthbeezffbkouiogua6sy7y

Entity centric Feature Pooling for Complex Event Detection

Ishani Chakraborty, Hui Cheng, Omar Javed
2014 Proceedings of the 1st ACM International Workshop on Human Centered Event Understanding from Multimedia - HuEvent '14  
In this paper, we propose an entity centric region of interest detection and visual-semantic pooling scheme for complex event detection in YouTube-like videos.  ...  The AoI map is derived from image based saliency cues weighted by the actionable space of the person involved in the event.  ...  In general, objects are key to understanding an event and hence searching for relevant objects could enhance detectability of a video.  ... 
doi:10.1145/2660505.2660506 dblp:conf/mm/ChakrabortyCJ14 fatcat:rjdw3zmyp5hvvpa4fygabsrwj4

Video-based event recognition: activity representation and probabilistic recognition methods

Somboon Hongeng, Ram Nevatia, Francois Bremond
2004 Computer Vision and Image Understanding  
Multi-agent events are recognized by propagating the constraints and likelihood of event threads in a temporal logic network.  ...  A multi-agent event is composed of several action threads related by temporal constraints.  ...  Introduction Automatic event detection in video streams is gaining attention in the computer vision research community due to the needs of many applications such as surveillance for security, video content  ... 
doi:10.1016/j.cviu.2004.02.005 fatcat:5swafug2ufbsljwkt7afaqzoge

Multi-source Multi-modal Activity Recognition in Aerial Video Surveillance

Riad I. Hammoud, Cem S. Sahin, Erik P. Blasch, Bradley J. Rhodes
2014 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops  
We present a multi-source multi-modal activity/event recognition system for surveillance applications, consisting of: (1) detecting and tracking multiple dynamic targets from a moving platform, (2) representing  ...  ) and activity video segments of targets-of-interest (TOIs) (in both pixel and geo-coordinates).  ...  The ideas and opinions expressed here are not official policies of the United States Air Force.  ... 
doi:10.1109/cvprw.2014.44 dblp:conf/cvpr/HammoudSBR14 fatcat:p7pqnfbzefdplpm5eztcgk2lge

View and scale invariant action recognition using multiview shape-flow models

Pradeep Natarajan, Ramakant Nevatia
2008 2008 IEEE Conference on Computer Vision and Pattern Recognition  
We present an approach to simultaneously track and recognize known actions that is robust to such variations, starting from a person detection in the standing pose.  ...  Actions in real world applications typically take place in cluttered environments with large variations in the orientation and scale of the actor.  ...  This research was supported, in part, by the Office of Naval Research under Contract #N00014-06-1-0470 and in part, by the U.S. Government VACE program.  ... 
doi:10.1109/cvpr.2008.4587716 dblp:conf/cvpr/NatarajanN08 fatcat:u7s5ik2klbb2hokoybihpjc3jm

Group activity recognition by using effective multiple modality relation representation with temporal-spatial attention

Dezhong Xu, Heng Fu, Lifang Wu, Meng Jian, Dong Wang, Xu Liu
2020 IEEE Access  
Group activity recognition has received a great deal of interest because of its broader applications in sports analysis, autonomous vehicles, CCTV surveillance systems and video summarization systems.  ...  In this work, a novel group activity recognition technique is proposed, based on multi-modal relation representation with temporal-spatial attention.  ...  We believe that each frame in a video contributes differently to the entire event in the temporal domain.  ... 
doi:10.1109/access.2020.2979742 fatcat:jmy5xrtc5jexnb2d46uxrxinta
Showing results 1 — 15 out of 18,373 results