Do More Dropouts in Pool5 Feature Maps for Better Object Detection [article]

Zhiqiang Shen, Xiangyang Xue
2014 arXiv   pre-print
The experimental results for classification-based object detection on canonical datasets including VOC 2007 (60.1%), 2010 (56.4%) and 2012 (56.3%) show obvious improvement in mean average precision (mAP  ...  Deep Convolutional Neural Networks (CNNs) have gained great success in image classification and object detection.  ...  In this paper we use selective search for pre-detecting, but if you care more about efficiency, BING will be a better choice.  ... 
arXiv:1409.6911v3 fatcat:g3eh5ulctnhxzklquqk5clorh4

Residual Transfer Learning for Multiple Object Tracking

Juan Diego Gonzales Zuniga, Thi-Lan-Anh Nguyen, Francois Bremond
2018 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)  
Beyond that, our proposed method provides more flexibility in terms of modelling the difference between these two tasks with a four-stage training.  ...  To address the Multiple Object Tracking (MOT) challenge, we propose to enhance the tracklet appearance features, given by a Convolutional Neural Network (CNN), based on the Residual Transfer Learning (  ...  To accomplish this, we split the output of pool5 in 7 parts, one for each horizontal stripe of the feature map.  ... 
doi:10.1109/avss.2018.8639320 dblp:conf/avss/ZunigaNB18 fatcat:5qc4zjgk3jafzeys6mbn2z6krm

Cross Domain Residual Transfer Learning for Person Re-Identification

Furqan Khan, Francois Bremond
2019 2019 IEEE Winter Conference on Applications of Computer Vision (WACV)  
It also argues for hybrid models that use learned (deep) features and statistical metric learning for multi-shot person re-identification when training sets are small.  ...  This is in contrast to popular end-to-end neural network based models or models that use hand-crafted features with adaptive matching models (neural nets or statistical metrics).  ...  More precisely for RTL pool4, we pass the output of pool4 through another max-pool layer with same parameters, giving us the same sized feature maps as pool5.  ... 
doi:10.1109/wacv.2019.00219 dblp:conf/wacv/KhanB19 fatcat:ueeoky2fdjb3jh6l3nrku65ly4

Representation Learning on Large and Small Data [article]

Chun-Nan Chou, Chuen-Kai Shie, Fu-Chieh Chang, Jocelyn Chang, Edward Y. Chang
2017 arXiv   pre-print
In terms of big data, it has been widely accepted in the research community that the more data the better for both representation and classification improvement.  ...  We addressed the first question by presenting CNN model enhancements in the aspects of representation, optimization, and generalization.  ...  For an introduction to the computation of neural network models, please refer to [17].  ... 
arXiv:1707.09873v1 fatcat:lhrqlkdfcrfgtn6rluyotvyn4u

Large-Scale Social Multimedia Analysis [chapter]

Benjamin Bischke, Damian Borth, Andreas Dengel
2019 Big Data Analytics for Large-Scale Multimedia Search  
object detection and image classification on a subset of ImageNet, 1.2 million images over 1000 categories.  ...  For example, the features of an image can be affected by the other images in the CNN (because the structure parameters modified through back-propagation are affected by all training images), but the feature  ...  For an introduction to the computation of neural network models, refer to [71].  ... 
doi:10.1002/9781119376996.ch6 fatcat:dw4rzuqeanbvxmaabtsgrid2ty

Hierarchical Object Detection with Deep Reinforcement Learning [article]

Miriam Bellver, Xavier Giro-i-Nieto, Ferran Marques, Jordi Torres
2016 arXiv   pre-print
We present a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent.  ...  , and a second one that computes the feature maps for the whole image to later generate crops for each region proposal.  ...  We also want to thank all the members of the X-theses group for their advice.  ... 
arXiv:1611.03718v2 fatcat:xguk2woacndbxos2gzjdu5akmu

Random Erasing Data Augmentation [article]

Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, Yi Yang
2017 arXiv   pre-print
, object detection and person re-identification.  ...  In this paper, we introduce Random Erasing, a new data augmentation method for training the convolutional neural network (CNN).  ...  In testing, we extract the output of Pool5 as feature for Market-1501 and DukeMTMC-reID datasets, and the fully connected layer with 128 units as feature for CUHK03.  ... 
arXiv:1708.04896v2 fatcat:7qhtgiwhwbd7rbvkszhcy6eqfq

A Two-Branch Network for Weakly Supervised Object Localization

Chang Sun, Yibo Ai, Sheng Wang, Weidong Zhang
2020 Electronics  
Weakly supervised object localization (WSOL) has attracted intense interest in computer vision for instance level annotations.  ...  To overcome this challenge and to improve the detection performance of feature extracting related WSOL methods, a CNN-based two-branch model was presented in this paper to locate objects using supervised  ...  We applied multi-scale detection to output two-scale features in order to improve the detection performance in localization. 3.  ... 
doi:10.3390/electronics9060955 fatcat:xzktxjwrovbmleyn74q6as5w7q

Discriminative Feature Representation for Person Re-identification by Batch-contrastive Loss

Guopeng Zhang, Jinhua Xu
2018 Asian Conference on Machine Learning  
In this work, we introduce a new auxiliary loss function, called batch-contrastive loss, for person reID to further separate the features of different identities and pull the features of the same identity  ...  The softmax loss function is an important component for learning discriminative features. However, a classifier trained with the softmax loss has difficulty distinguishing hard samples.  ...  We utilize the dropout strategy on pool5 layer because the feature map (1×1) of pool5 is similar to the unit of FC layer, which makes the discard work on the channel. Loss Function Softmax loss.  ... 
dblp:conf/acml/ZhangX18 fatcat:k5q5tq7girfftourvsadfakv24

Rich feature hierarchies for accurate object detection and semantic segmentation [article]

Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
2014 arXiv   pre-print
In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012---achieving a mAP of 53.3%  ...  Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years.  ...  The GPUs used in this research were generously donated by the NVIDIA Corporation.  ... 
arXiv:1311.2524v5 fatcat:aduyci2whfgztjxeaky5boz4ha

Actions and Attributes from Wholes and Parts

Georgia Gkioxari, Ross Girshick, Jitendra Malik
2015 2015 IEEE International Conference on Computer Vision (ICCV)  
detection system.  ...  We develop a part-based approach by leveraging convolutional network features inspired by recent advances in computer vision.  ...  The GPUs used in this research were generously donated by the NVIDIA Corporation  ... 
doi:10.1109/iccv.2015.284 dblp:conf/iccv/GkioxariGM15a fatcat:7vranzelszb53hrhm5qjnutmuu

Actions and Attributes from Wholes and Parts [article]

Georgia Gkioxari, Ross Girshick, Jitendra Malik
2015 arXiv   pre-print
detection system.  ...  We develop a part-based approach by leveraging convolutional network features inspired by recent advances in computer vision.  ...  Indeed, [12, 13] show an impressive jump in object detection performance using pool5 instead of HOG.  ... 
arXiv:1412.2604v2 fatcat:jmfzpjhrnfdhxgzdqeeyxpqtrm

DeepIrisNet2: Learning Deep-IrisCodes from Scratch for Segmentation-Robust Visible Wavelength and Near Infrared Iris Recognition [article]

Abhishek Gangwar, Akanksha Joshi, Padmaja Joshi, R. Raghavendra
2019 arXiv   pre-print
In addition, we present a dual CNN iris segmentation pipeline comprising an iris/pupil bounding-box detection network and a semantic pixel-wise segmentation network.  ...  Since no ground-truth datasets are available for CNN training for iris segmentation, we build large-scale hand-labeled datasets and make them public; i) iris, pupil bounding boxes, ii) labeled iris texture  ...  , b) more discriminative features at lower stages (we tested with features extracted from pool5 stage), c) additional regularization in the network which makes the network more difficult to overfit.  ... 
arXiv:1902.05390v1 fatcat:oat7xzrrevgslfufzk5l2qhedi

TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images [article]

Shubham Paliwal, Vishwanath D, Rohit Rahul, Monika Sharma, Lovekesh Vig
2020 arXiv   pre-print
In this paper, we propose TableNet: a novel end-to-end deep learning model for both table detection and structure recognition.  ...  While some progress has been made in table detection, extracting the table contents is still a challenge since this involves more fine grained table structure(rows & columns) recognition.  ...  The difference is that the tolerance for noise in table/column detection is much smaller than in object detection.  ... 
arXiv:2001.01469v1 fatcat:fbs6to3yonccrchwvef55ybrnu

Reducing Overfitting in Deep Networks by Decorrelating Representations [article]

Michael Cogswell, Faruk Ahmed, Ross Girshick, Larry Zitnick, Dhruv Batra
2016 arXiv   pre-print
In this work, we propose a new regularizer called DeCov which leads to significantly reduced overfitting (as indicated by the difference between train and val performance), and better generalization.  ...  This simple intuition has been explored in a number of past works but surprisingly has never been applied as a regularizer in supervised learning.  ...  This work was supported in part by the following awards to DB: National Science Foundation CAREER award, Army Research Office YIP award, Office of Naval Research grant N00014-14-1-0679, AWS in Education  ... 
arXiv:1511.06068v4 fatcat:pve4x33gyrf4fi6gkp4sfrb3cq