Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection [article]

Sara Beery, Guanhang Wu, Vivek Rathod, Ronny Votel, Jonathan Huang
2020 arXiv   pre-print
Specifically, we propose an attention-based approach that allows our model, Context R-CNN, to index into a long term memory bank constructed on a per-camera basis and aggregate contextual features from  ...  We apply Context R-CNN to two settings: (1) species detection using camera traps, and (2) vehicle detection in traffic cameras, showing in both settings that Context R-CNN leads to performance gains over  ...  Acknowledgements We would like to thank Pietro Perona, David Ross, Zhichao Lu, Ting Yu, Tanya Birch and the Wildlife Insights Team, Joe Marino, and Oisin MacAodha for their valuable insight.  ... 
arXiv:1912.03538v3 fatcat:ocabhva3azfinlwlypiplbweyu

AU R-CNN: Encoding Expert Prior Knowledge into R-CNN for Action Unit Detection [article]

Chen Ma, Li Chen, Junhai Yong
2018 arXiv   pre-print
This design produces considerably better detection performance than do existing approaches. (2) We also integrate various dynamic models (including convolutional long short-term memory, two stream network  ...  , conditional random field, and temporal action localization network) into AU R-CNN and then investigate and analyze the reason behind the performance of dynamic models.  ...  Object detection networks, such as Fast R-CNN, aim to identify and localize the object.  ... 
arXiv:1812.05788v1 fatcat:hcrbmgtq7ne4dntfdudd3w6ksa

Self-Enhanced R-CNNs for Human Detection with Semi-Supervised Assumptions

Xuexian Chen, Si Wu, Zhiwen Yu
2020 IEEE Access  
To reduce the reliance of detection models on large amounts of labeled data, we modify Faster R-CNN to facilitate semi-supervised human detection.  ...  INDEX TERMS Semi-supervised learning, human detection, reliability analysis, sample selection.  ...  Since context may provide a cue for more accurate detection, Chen et al. [22] proposed multi-order features to capture co-occurrence contexts of objects from different categories. Shen et al.  ... 
doi:10.1109/access.2020.2967414 fatcat:sxukl2ygmnehvb4ye63n2ndjo4

Context-aware CNNs for person head detection [article]

Tuan-Hung Vu, Anton Osokin, Ivan Laptev
2015 arXiv   pre-print
In this work we focus on detecting human heads in natural scenes. Starting from the recent local R-CNN object detector, we extend it with two types of contextual cues.  ...  Person detection is a key problem for many computer vision tasks.  ...  The pipeline of our Local model is similar to the one of R-CNN (see Section 3.1 for details). The use of image context was proposed to support object detection in [34].  ... 
arXiv:1511.07917v1 fatcat:taatpemh7jfefjnwricwogfhrq

Context-Aware CNNs for Person Head Detection

Tuan-Hung Vu, Anton Osokin, Ivan Laptev
2015 2015 IEEE International Conference on Computer Vision (ICCV)  
In this work we focus on detecting human heads in natural scenes. Starting from the recent local R-CNN object detector, we extend it with two types of contextual cues.  ...  Person detection is a key problem for many computer vision tasks.  ...  The pipeline of our Local model is similar to the one of R-CNN (see Section 3.1 for details). The use of image context was proposed to support object detection in [34].  ... 
doi:10.1109/iccv.2015.331 dblp:conf/iccv/VuOL15 fatcat:yymm6nkrtbci5ggiaoi2yhubdi

Mimetics: Towards Understanding Human Actions Out of Context [article]

Philippe Weinzaepfel, Grégory Rogez
2021 arXiv   pre-print
However, they tend to leverage context such as scenes or objects instead of focusing on understanding the human action itself.  ...  The best example of out-of-context actions are mimes, that people can typically recognize despite missing relevant objects and scenes.  ...  To this end, we compare the temporal convolution on LCR pose features (blue curve, 'Pose Feats'), to features extracted from a Faster R-CNN model with ResNet50 backbone trained to classify actions (red  ... 
arXiv:1912.07249v3 fatcat:wio3wwk7indztg737ls5hysfmi

CADP: A Novel Dataset for CCTV Traffic Camera based Accident Analysis [article]

Ankit Shah, Jean Baptiste Lamare, Tuan Nguyen Anh, Alexander Hauptmann
2018 arXiv   pre-print
To this end, we propose to integrate contextual information into conventional Faster R-CNN using Context Mining (CM) and Augmented Context Mining (ACM) to complement the accuracy for small pedestrian detection  ...  Our experiments indicate a considerable improvement in object detection accuracy: +8.51% for CM and +6.20% for ACM.  ...  of Context Mining and Augmented Context Mining within Faster R-CNN to improve the detection of small objects such as person and improve the Faster R-CNN baseline scores.  ... 
arXiv:1809.05782v2 fatcat:rmnohxbjmna5hcycg3v7hhdf3a

Unsupervised Visual Representation Learning by Context Prediction [article]

Carl Doersch and Abhinav Gupta and Alexei A. Efros
2016 arXiv   pre-print
For example, this representation allows us to perform unsupervised visual discovery of objects like cats, people, and even birds from the Pascal VOC 2011 detection dataset.  ...  This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation.  ...  Acknowledgements We thank Xiaolong Wang and Pulkit Agrawal for help with baselines, Berkeley and CMU vision group members for many fruitful discussions, and Jitendra Malik for putting gelato on the line  ... 
arXiv:1505.05192v3 fatcat:qmxkyrgijngizjlhy6yndp7yma

Context-driven Multi-stream LSTM (M-LSTM) for Recognizing Fine-Grained Activity of Drivers [chapter]

Ardhendu Behera, Alexander Keidel, Bappaditya Debnath
2019 Lecture Notes in Computer Science  
In this paper, we present a novel Multi-stream Long Short-Term Memory (M-LSTM) network for recognizing driver activities.  ...  We bring together ideas from recent works on LSTMs, transfer learning for object detection and body pose by exploring the use of deep convolutional neural networks (CNN).  ...  We would like to thank Taylor Smith in State Farm Corporation for providing information about their dataset. The GPU used in this research is generously donated by the NVIDIA Corporation.  ... 
doi:10.1007/978-3-030-12939-2_21 fatcat:k25yj72bf5hadgaumtflaptwfe

Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation [article]

Wenhao Li, Hong Liu, Runwei Ding, Mengyuan Liu, Pichao Wang, Wenming Yang
2022 arXiv   pre-print
The modified VTE is termed as Strided Transformer Encoder (STE), which is built upon the outputs of VTE.  ...  This scheme imposes extra temporal smoothness constraints in conjunction with the single target frame supervision and hence helps produce smoother and more accurate 3D poses.  ...  DETR [46] presented a new Transformer-based design for object detection systems.  ... 
arXiv:2103.14304v8 fatcat:hwkokc4d3fgw5isw7hhkp5wggy

The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose [article]

Yizhak Ben-Shabat, Xin Yu, Fatemeh Sadat Saleh, Dylan Campbell, Cristian Rodriguez-Opazo, Hongdong Li, Stephen Gould
2020 arXiv   pre-print
In the context of understanding human activities, existing public datasets, while large in size, are often limited to a single RGB camera and provide only per-frame or per-clip action annotations.  ...  Additionally, we benchmark prominent methods for video action recognition, object segmentation and human pose estimation tasks on this challenging dataset.  ...  Long-term recurrent convolutional networks for visual recognition and description.  ... 
arXiv:2007.00394v1 fatcat:q7d5gc3le5c6zkfhj6rtg3y2fe

Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation [article]

Weiyao Wang, Matt Feiszli, Heng Wang, Du Tran
2021 arXiv   pre-print
However, many real-world applications require detecting or segmenting novel objects, i.e., object categories never seen during training.  ...  Current state-of-the-art object detection and segmentation methods work well under the closed-world assumption.  ...  We use the popular object detection framework Detectron2 [46] for experiments with class-agnostic Mask R-CNN and MaskTrack R-CNN.  ... 
arXiv:2104.04691v1 fatcat:sylor76rorduzbnppsjoyjs5qq

Detection and Tracking Meet Drones Challenge [article]

Pengfei Zhu, Longyin Wen, Dawei Du, Xiao Bian, Heng Fan, Qinghua Hu, Haibin Ling
2021 arXiv   pre-print
We provide a large-scale drone captured dataset, VisDrone, which includes four tracks, i.e., (1) image object detection, (2) video object detection, (3) single object tracking, and (4) multi-object tracking  ...  In this paper, we first present a thorough review of object detection and tracking datasets and benchmarks, and discuss the challenges of collecting large-scale drone-based object detection and tracking  ...  ACKNOWLEDGEMENTS We would like to thank Jiayu Zheng and Tao Peng for valuable and constructive suggestions to improve the quality of this paper.  ... 
arXiv:2001.06303v3 fatcat:q2nekdwiz5gulaaa4b66o6khhy

Cross-Granularity Graph Inference for Semantic Video Object Segmentation

Huiling Wang, Tinghuai Wang, Ke Chen, Joni-Kristian Kämäräinen
2017 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence  
context cues.  ...  multi-scale contextual information and spatial-temporal relations of video object.  ...  The initial confidence Y is initialized based on the detection confidences of R-CNN.  ... 
doi:10.24963/ijcai.2017/634 dblp:conf/ijcai/WangWCK17 fatcat:yps5tmibrbaydklmejlepjrpyu

Learning a Layout Transfer Network for Context Aware Object Detection

Tao Wang, Xuming He, Yuanzheng Cai, Guobao Xiao
2019 IEEE Transactions on Intelligent Transportation Systems (Print)  
We present a context aware object detection method based on a retrieve-and-transform scene layout model.  ...  The above steps are implemented as a Layout Transfer Network which we integrate into Faster RCNN to allow for joint reasoning of object detection and scene layout estimation.  ...  ACKNOWLEDGMENT We thank the anonymous reviewers for their insightful comments. We also thank Zhiming Luo  ... 
doi:10.1109/tits.2019.2939213 fatcat:mkrgdni2cnalxbnovnmsc2wgxi