Collecting and Annotating the Large Continuous Action Dataset [article]

Daniel Paul Barrett and Ran Xu and Haonan Yu and Jeffrey Mark Siskind
2015 arXiv   pre-print
This manuscript serves to describe the novel content and characteristics of the LCA dataset, present the design decisions made when filming the dataset, and document the novel methods employed to annotate  ...  All actions were filmed in the same collection of backgrounds so that background gives little clue as to action class.  ...  The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either express or implied, of the Army Research Laboratory  ... 
arXiv:1511.05914v1 fatcat:miswemypqfffdgdc5l4tytbary

Collecting and annotating the large continuous action dataset

Daniel Paul Barrett, Ran Xu, Haonan Yu, Jeffrey Mark Siskind
2016 Machine Vision and Applications  
This manuscript serves to describe the novel content and characteristics of the LCA dataset, present the design decisions made when filming the dataset, document the novel methods employed to annotate  ...  All actions were filmed in the same collection of backgrounds so that background gives little clue as to action class.  ...  The VIRAT dataset [48] has 12 classes and longer streaming video. Here, we introduce a new dataset called the large continuous action dataset (LCA).  ... 
doi:10.1007/s00138-016-0768-4 fatcat:gudcbkbglnbmji5u46cozpwoey

Detecting activities of daily living in first-person camera views

H. Pirsiavash, D. Ramanan
2012 2012 IEEE Conference on Computer Vision and Pattern Recognition  
The dataset is annotated with activities, object tracks, hand positions, and interaction events.  ...  We have collected a dataset of 1 million frames of dozens of people performing unscripted, everyday activities.  ...  Acknowledgements: We thank Carl Vondrick for help in using his annotation system. Funding for this research was provided by NSF Grant 0954083, ONR-MURI Grant N00014-10-1-0933, and support from Intel.  ... 
doi:10.1109/cvpr.2012.6248010 dblp:conf/cvpr/PirsiavashR12 fatcat:xezbqkct2reeloxonixydnb5ru

UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles [article]

Tianjiao Li and Jun Liu and Wei Zhang and Yun Ni and Wenqian Wang and Zhiheng Li
2021 arXiv   pre-print
Our dataset was collected by a flying UAV in multiple urban and rural districts in both daytime and nighttime over three months, hence covering extensive diversities w.r.t. subjects, backgrounds, illuminations  ...  Experiments show the efficacy of our method on the UAV-Human dataset. The project page:  ...  We also propose a GT-I3D network for distorted fisheye video action recognition. The experimental results show the efficacy of our method.  ... 
arXiv:2104.00946v4 fatcat:ic3t5tbxh5g7tgrgb6lovwpoya

A large-scale benchmark dataset for event recognition in surveillance video

Sangmin Oh, Anthony Hoogs, Amitha Perera, Naresh Cuntoor, Chia-Chih Chen, Jong Taek Lee, Saurajit Mukherjee, J. K. Aggarwal, Hyungtae Lee, Larry Davis, Eran Swears, Xiaoyang Wang (+12 others)
2011 CVPR 2011  
Our dataset consists of many outdoor scenes with actions occurring naturally by non-actors in continuously captured videos of the real world.  ...  We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms with a focus on continuous visual event recognition (CVER) in outdoor areas  ...  The views expressed are those of the authors and do not reflect the position of the U.S. Government.  ... 
doi:10.1109/cvpr.2011.5995586 dblp:conf/cvpr/OhHPCCLMALDSWJRSVPRYTSFRD11 fatcat:fkkxv762izfetdrthrhnopjbb4

RareAct: A video dataset of unusual interactions [article]

Antoine Miech, Jean-Baptiste Alayrac, Ivan Laptev, Josef Sivic, Andrew Zisserman
2020 arXiv   pre-print
This paper introduces a manually annotated video dataset of unusual actions, namely RareAct, including actions such as "blend phone", "cut keyboard" and "microwave shoes".  ...  It contains 122 different actions which were obtained by combining verbs and nouns rarely co-occurring together in the large-scale textual corpus from HowTo100M, but that frequently appear separately.  ...  The taxonomy of RareAct is constructed by collecting rarely co-occurring action verbs and object nouns from the large textual corpus of HowTo100M [8] .  ... 
arXiv:2008.01018v1 fatcat:vqxdprt7wjg5xooemr2wgpsbc4

Technical Report for Valence-Arousal Estimation in ABAW2 Challenge [article]

Hong-Xia Xie, I-Hsuan Li, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng
2021 arXiv   pre-print
Our proposed method achieves Concordance Correlation Coefficient (CCC) of 0.591 and 0.617 for valence and arousal on the validation set of Aff-wild2 dataset.  ...  The competition organizers provide an in-the-wild Aff-Wild2 dataset for participants to analyze affective behavior in real-life settings.  ...  [14] built a large-scale Aff-Wild dataset, collected from Youtube, and proposed deep convolutional and recurrent neural architecture, AffWildNet.  ... 
arXiv:2107.03891v1 fatcat:mt26hciygvfx5hcr44gio434uu

BABEL: Bodies, Action and Behavior with English Labels [article]

Abhinanda R. Punnakkal
2021 arXiv   pre-print
To address this, we present BABEL, a large dataset with language labels describing the actions being performed in mocap sequences.  ...  Existing datasets take one of two approaches. Large-scale video datasets contain many action labels but do not contain ground-truth 3D human motion.  ...  NTU RGB+D 60 [29] and 120 [21] are large, widely used datasets for 3D action recognition.  ... 
arXiv:2106.09696v2 fatcat:rp5tuag7bbdppb7z5qo35jcq44

Learning Actions from the Identity in the Web

Khawla Hussein Ali, Tianjiang Wang
2014 Journal of Computer and Communications  
We present simple experimental evidence that, using action images related to identity collected from the web, annotating identity is possible.  ...  The idea is to use images collected from the web to learn representations of actions related to identity, and use this knowledge to automatically annotate identity in videos.  ...  Acknowledgements We would like to thank the anonymous reviewers for their constructive comments and suggestions that helped to improve the quality of this manuscript.  ... 
doi:10.4236/jcc.2014.29008 fatcat:xoitz4jidje57fajvr2unnbjjy

ActivityNet: A large-scale video benchmark for human activity understanding

Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem, Juan Carlos Niebles
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize  ...  This is in part due to the simplicity of current benchmarks, which mostly focus on simple actions and movements occurring on manually trimmed videos.  ...  Acknowledgments We would like to thank the Stanford Vision Lab for their helpful comments and support.  ... 
doi:10.1109/cvpr.2015.7298698 dblp:conf/cvpr/HeilbronEGN15 fatcat:mwlxj6rbdvay7ior2fs3lb6s54

MorphSet: Augmenting categorical emotion datasets with dimensional affect labels using face morphing [article]

Vassilios Vonikakis, Dexter Neo, Stefan Winkler
2021 arXiv   pre-print
However, dimensional emotion annotations are difficult and expensive to collect, and are therefore not as prevalent in the affective computing community.  ...  To address these issues, we propose a method to generate synthetic images from existing categorical emotion datasets using face morphing, as well as dimensional labels in the circumplex space with full  ...  The annotations for valence and arousal were collected continuously via joystick.  ... 
arXiv:2103.02854v2 fatcat:xlu6qj6cwfbo5a5iaohiichoqi

MineRL: A Large-Scale Dataset of Minecraft Demonstrations [article]

William H. Guss, Brandon Houghton, Nicholay Topin, Phillip Wang, Cayden Codel, Manuela Veloso, Ruslan Salakhutdinov
2019 arXiv   pre-print
The dataset consists of over 60 million automatically annotated state-action pairs across a variety of related tasks in Minecraft, a dynamic, 3D, open-world environment.  ...  As demonstrated in the computer vision and natural language processing communities, large-scale datasets have the capacity to facilitate research by serving as an experimental and benchmarking platform  ...  conversations and support.  ... 
arXiv:1907.13440v1 fatcat:63khufur7nd73fb5b43f5jf5gm

FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding [article]

Dian Shao, Yue Zhao, Bo Dai, Dahua Lin
2020 arXiv   pre-print
where the sub-action in each set will be further annotated with finely defined class labels.  ...  In particular, it provides temporal annotations at both action and sub-action levels with a three-level semantic hierarchy.  ...  Generally, data for large-scale action datasets are mainly collected in two ways, namely crawling from the Internet and self-recording from invited workers.  ... 
arXiv:2004.06704v1 fatcat:35sbtynolze6xhv7bhayqtap5e

ConvSearch: An Open-Domain Conversational Search Behavior Dataset [article]

Zhumin Chu, Zhihong Wang, Yiqun Liu, Yingye Huang, Min Zhang, Shaoping Ma
2022 arXiv   pre-print
The ConvSearch dataset contains 1,131 dialogues together with annotated search results and corresponding search behaviors.  ...  We develop a novel conversational search platform to collect dialogue contents, annotate dialogue quality and candidate search results, and record agent search behaviors. 25 search agents and 51 users are  ...  Since "reveal" in user intent and "answer" in agent action occupy a large proportion of the dataset, we subdivide several subcategories under these two classes.  ... 
arXiv:2204.02659v1 fatcat:bcwzlaliefhabfobcbdq4wu6qm

Human in Events: A Large-Scale Benchmark for Human-centric Video Analysis in Complex Events [article]

Weiyao Lin, Huabin Liu, Shizhan Liu, Yuxi Li, Rui Qian, Tao Wang, Ning Xu, Hongkai Xiong, Guo-Jun Qi, Nicu Sebe
2021 arXiv   pre-print
To this end, we present a new large-scale dataset, named Human-in-Events or HiEve (Human-centric video analysis in complex Events), for the understanding of human motions, poses, and actions in a variety  ...  We expect that the dataset will advance the development of cutting-edge techniques in human-centric analysis and the understanding of complex events.  ...  The HiEve Dataset: Collection and Annotation. Collection: We start by selecting several crowded places with complex and diverse events for video collection.  ... 
arXiv:2005.04490v5 fatcat:4yjayreakney3bztfnxjrc22ru