
Places205-VGGNet Models for Scene Recognition [article]

Limin Wang, Sheng Guo, Weilin Huang, Yu Qiao
2015 arXiv   pre-print
However, directly adapting the VGGNet models trained on the ImageNet dataset does not yield good performance for scene recognition.  ...  VGGNets have turned out to be effective for object recognition in still images.  ...  We release our trained Places205-VGGNet models for further research in scene recognition.  ...
arXiv:1508.01667v1 fatcat:24fyvhfqtnbcnf34ifz4reypq4
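
The entry above describes adapting VGGNet backbones to the 205 Places scene categories. Below is a minimal sketch of that kind of adaptation, assuming a torchvision VGG-16 and a generic fine-tuning loop; it is an illustration of the idea, not the authors' released training code.

```python
# Hedged sketch: adapt a VGG-16 backbone to the 205 Places scene categories.
import torch
import torch.nn as nn
from torchvision import models

NUM_SCENE_CLASSES = 205  # Places205

# In practice the backbone would start from ImageNet weights (or be trained on
# Places205 from scratch, as in the paper); here it is left uninitialized to keep
# the sketch self-contained.
model = models.vgg16()
in_features = model.classifier[6].in_features          # 4096 in torchvision's VGG-16
model.classifier[6] = nn.Linear(in_features, NUM_SCENE_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, labels):
    """One SGD step on a batch of (N, 3, 224, 224) images with integer scene labels."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```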

Better Exploiting OS-CNNs for Better Event Recognition in Images

Limin Wang, Zhe Wang, Sheng Guo, Yu Qiao
2015 2015 IEEE International Conference on Computer Vision Workshop (ICCVW)  
OS-CNNs are composed of object nets and scene nets, which transfer the learned representations from the pre-trained models on large-scale object and scene recognition datasets, respectively.  ...  Event recognition from still images is one of the most important problems for image understanding.  ...  ImageNet [4]), and scene nets are based on models learned from the large-scale scene recognition datasets (e.g. Places205 [28]).  ...
doi:10.1109/iccvw.2015.46 dblp:conf/iccvw/WangWG015 fatcat:liczsen3xrh5bptdimd2dyxzwq
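
For the OS-CNN entry above: the object stream and the scene stream each score the same image and their outputs are fused. The sketch below assumes two already-trained PyTorch classifiers and a simple weighted average of softmax scores; the fusion weights are illustrative, not the paper's.

```python
# Late fusion of object-stream and scene-stream predictions (illustrative weights).
import torch
import torch.nn.functional as F

@torch.no_grad()
def os_cnn_predict(object_net, scene_net, image, w_object=0.5, w_scene=0.5):
    """image: (N, 3, H, W) tensor; returns fused event-class probabilities."""
    p_object = F.softmax(object_net(image), dim=1)  # e.g. ImageNet-pretrained backbone + event head
    p_scene = F.softmax(scene_net(image), dim=1)    # e.g. Places205-pretrained backbone + event head
    return w_object * p_object + w_scene * p_scene
```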

WS-AM: Weakly Supervised Attention Map for Scene Recognition

Shifeng Xia, Jiexian Zeng, Lu Leng, Xiang Fu
2019 Electronics  
Compared with traditional hand-crafted features, CNNs can extract more robust and generalized features for scene recognition.  ...  The regions where the local mean and the local center value are both large in the AM correspond to the discriminative regions that are helpful for scene recognition.  ...  The backbone network for Grad-CAM is VGGNet pre-trained on the Places205 dataset, i.e., Places205-VGGNet.  ...
doi:10.3390/electronics8101072 fatcat:3jimldiie5djvczssfzqwpdagy
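
The WS-AM entry above builds its attention maps with Grad-CAM on a Places205-pretrained VGGNet. A rough Grad-CAM sketch is given below, assuming a PyTorch CNN and a chosen convolutional layer; it illustrates the general technique rather than the authors' exact implementation.

```python
# Rough Grad-CAM sketch: channel weights from pooled gradients, ReLU of weighted sum.
import torch
import torch.nn.functional as F

def grad_cam(model, conv_layer, image, class_idx):
    """image: (1, 3, H, W) tensor; returns a normalized (1, h, w) attention map."""
    feats, grads = {}, {}
    fwd = conv_layer.register_forward_hook(lambda m, inp, out: feats.update(a=out))
    bwd = conv_layer.register_full_backward_hook(lambda m, gin, gout: grads.update(a=gout[0]))
    model.zero_grad()
    logits = model(image)
    logits[0, class_idx].backward()
    fwd.remove(); bwd.remove()
    weights = grads["a"].mean(dim=(2, 3), keepdim=True)   # GAP of gradients -> channel weights
    cam = F.relu((weights * feats["a"]).sum(dim=1))        # weighted sum of feature maps
    return cam / (cam.max() + 1e-8)
```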

Better Exploiting OS-CNNs for Better Event Recognition in Images [article]

Limin Wang, Zhe Wang, Sheng Guo, Yu Qiao
2015 arXiv   pre-print
OS-CNNs are composed of object nets and scene nets, which transfer the learned representations from the pre-trained models on large-scale object and scene recognition datasets, respectively.  ...  Event recognition from still images is one of the most important problems for image understanding.  ...  ImageNet [4]), and scene nets are based on models learned from the large-scale scene recognition datasets (e.g. Places205 [28]).  ...
arXiv:1510.03979v1 fatcat:62sug3kacnf6re36k2andeh74i

Collaborative Layer-wise Discriminative Learning in Deep Neural Networks [article]

Xiaojie Jin, Yunpeng Chen, Jian Dong, Jiashi Feng, Shuicheng Yan
2016 arXiv   pre-print
Experiments with multiple popular deep networks, including Network in Network, GoogLeNet and VGGNet, on object classification benchmarks of various scales, including CIFAR100, MNIST and ImageNet, and scene classification benchmarks, including MIT67, SUN397 and Places205, demonstrate the effectiveness of our method.  ...  Specifically, among all Places205-VGGNet models with different depths (# of layers: 11, 13 and 16), Places205-VGGNet-11 and Places205-VGGNet-16 models are used as base models in our method as they have  ...
arXiv:1607.05440v1 fatcat:7fclzpj2ffctdmpi27tj7ci3qa
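
The entry above attaches discriminative supervision to multiple layers of a deep network. The sketch below shows the simpler, non-collaborative version of that idea: auxiliary classifiers on intermediate feature maps whose losses are added to the main objective. The head design and the weighting are assumptions made for illustration.

```python
# Layer-wise auxiliary supervision: extra classifiers on intermediate features.
import torch
import torch.nn as nn

class AuxHead(nn.Module):
    """Small classifier attached to an intermediate (N, C, H, W) feature map."""
    def __init__(self, channels, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, feat):
        return self.fc(self.pool(feat).flatten(1))

def total_loss(main_logits, aux_logits_list, labels, aux_weight=0.3):
    """Main cross-entropy plus a weighted sum of the auxiliary losses."""
    ce = nn.functional.cross_entropy
    loss = ce(main_logits, labels)
    for aux_logits in aux_logits_list:
        loss = loss + aux_weight * ce(aux_logits, labels)
    return loss
```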

Weakly Supervised PatchNets: Describing and Aggregating Local Patches for Scene Recognition

Zhe Wang, Limin Wang, Yali Wang, Bowen Zhang, Yu Qiao
2017 IEEE Transactions on Image Processing  
In this paper, we propose a hybrid representation, which leverages the discriminative capacity of CNNs and the simplicity of descriptor encoding schema for image recognition, with a focus on scene recognition  ...  , SIFT) and recent convolutional neural networks (CNNs) are two classes of successful methods for image recognition.  ...  Places205-VGGNet-16 [52] (arXiv 2015): 66.9; LS-DHM [54] (arXiv 2016): 67.6; Human performance [22] (CVPR 2010): 68.5; Our VSAD: 71.7; Our VSAD+FV: 72.2; Our VSAD+Places205-VGGNet-16: 72.5; Our VSAD+FV+Places205  ...
doi:10.1109/tip.2017.2666739 pmid:28207394 fatcat:3igegqypazcdlircqjyt6t7m4i
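
The PatchNets entry above describes CNN descriptors computed on local patches and then aggregated into one image-level representation. The loose sketch below uses plain averaging as a stand-in for the paper's VSAD encoding, just to make the patch-then-aggregate pipeline concrete; the encoder is a placeholder.

```python
# Patch-then-aggregate pipeline; averaging stands in for the paper's VSAD encoding.
import torch

@torch.no_grad()
def aggregate_patch_descriptors(patch_encoder, patches):
    """patches: (num_patches, 3, h, w) crops from one image -> (d,) image descriptor."""
    descriptors = patch_encoder(patches)   # (num_patches, d) local CNN descriptors
    return descriptors.mean(dim=0)         # simple image-level aggregation
```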

Knowledge Guided Disambiguation for Large-Scale Scene Classification With Multi-Resolution CNNs

Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, Yu Qiao
2017 IEEE Transactions on Image Processing  
We release the code and models at .  ...  This paper focuses on large-scale scene recognition and makes two major contributions to tackle these issues.  ...  Following the original evaluation protocol, we use 80  ...  Accuracy on MIT Indoor67 / SUN397: ImageNet-VGGNet-16 [33] 67.7% / 51.7%; Places205-AlexNet [1] 68.2% / 54.3%; Places205-GoogLeNet [48] 74.0% / 58.8%; DAG-VggNet19  ...
doi:10.1109/tip.2017.2675339 pmid:28252402 fatcat:kja5k2ho65empkmytublsxedue
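
The entry above uses multi-resolution CNNs for large-scale scene classification. Below is a hedged sketch of the basic multi-resolution idea: score the image at two input resolutions and average the predictions. The resolutions and the equal weighting are illustrative assumptions, not the paper's exact settings.

```python
# Average the predictions of a coarse-resolution and a fine-resolution network.
import torch
import torch.nn.functional as F

@torch.no_grad()
def multi_resolution_predict(coarse_net, fine_net, image):
    """image: (N, 3, H, W) tensor; returns averaged class probabilities."""
    coarse = F.interpolate(image, size=(224, 224), mode="bilinear", align_corners=False)
    fine = F.interpolate(image, size=(336, 336), mode="bilinear", align_corners=False)
    return 0.5 * (F.softmax(coarse_net(coarse), dim=1) + F.softmax(fine_net(fine), dim=1))
```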

Unsupervised Feature Learning for Visual Place Recognition in Changing Environments

Dongye Zhao, Bailu Si, Fengzhen Tang
2019 2019 International Joint Conference on Neural Networks (IJCNN)  
Visual place recognition in changing environments is a challenging and critical task for autonomous robot navigation.  ...  The proposed siamese VisNet constitutes a biologically plausible yet efficient method for unsupervised place recognition.  ...  among the siamese VisNet, the VggNet pretrained on the Places dataset with 205 categories (VggNet-Places205) [11], and the CaffeNet pretrained on ImageNet [21].  ...
doi:10.1109/ijcnn.2019.8852466 dblp:conf/ijcnn/ZhaoST19 fatcat:cmaya4wtubfwnmctt4ybhh4qsu
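
For the siamese place-recognition entry above, the sketch below shows the generic matching step: both views pass through a shared encoder and places are compared by descriptor similarity. The encoder, the normalization, and the threshold are placeholder assumptions, not the VisNet architecture itself.

```python
# Siamese-style place matching by cosine similarity of shared-encoder descriptors.
import torch
import torch.nn.functional as F

@torch.no_grad()
def same_place(encoder, image_a, image_b, threshold=0.8):
    """image_a, image_b: (N, 3, H, W) tensors of candidate view pairs."""
    desc_a = F.normalize(encoder(image_a), dim=1)  # the same weights serve both branches
    desc_b = F.normalize(encoder(image_b), dim=1)
    similarity = (desc_a * desc_b).sum(dim=1)      # cosine similarity per pair
    return similarity > threshold
```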

An Indoor Room Classification System for Social Robots via Integration of CNN and ECOC

Kamal Othman, Ahmad Rad
2019 Applied Sciences  
The ability to classify rooms in a home is one of many capabilities desired of social robots.  ...  We also propose and examine a model that combines a CNN with a multi-binary classifier, referred to as an error-correcting output code (ECOC), on the clean data.  ...  The open research objectives are diverse and include, but are not limited to, emotion recognition, perception, pattern recognition (face, object, scene, and voice), and navigation.  ...
doi:10.3390/app9030470 fatcat:rpboxwuew5gh7hgy5rwdbrxmai
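
The indoor-room entry above combines CNN features with an ECOC of binary classifiers. A compact sketch of that pipeline is below, assuming pre-extracted CNN feature vectors and scikit-learn's OutputCodeClassifier over linear SVMs; the code size and the base estimator are assumptions made for illustration.

```python
# CNN features classified by an error-correcting output code over binary SVMs.
from sklearn.multiclass import OutputCodeClassifier
from sklearn.svm import LinearSVC

def train_room_classifier(cnn_features, room_labels):
    """cnn_features: (n_samples, d) array of CNN activations; room_labels: (n_samples,)."""
    ecoc = OutputCodeClassifier(LinearSVC(), code_size=2.0, random_state=0)
    ecoc.fit(cnn_features, room_labels)
    return ecoc

# Usage: predictions = train_room_classifier(X_train, y_train).predict(X_test)
```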

Temporal and Fine-Grained Pedestrian Action Recognition on Driving Recorder Database

Hirokatsu Kataoka, Yutaka Satoh, Yoshimitsu Aoki, Shoko Oikawa, Yasuhiro Matsui
2018 Sensors  
We show how to learn an effective recognition model with only a small-scale database.  ...  It is believed that fine-grained action recognition enables pedestrian intention estimation for helpful advanced driver-assistance systems (ADAS).  ...  We used an ImageNet pre-trained model (ImageNet, ImageNet with VGG-16) [6, 16], a Places205 pre-trained model (Places205) [49], and an ImageNet + Places205 pre-trained model (HybridCNN) [49].  ...
doi:10.3390/s18020627 pmid:29461473 pmcid:PMC5855092 fatcat:qybfwjyuebdpzkoay2x53abrrq

ChaLearn Looking at People 2015: Apparent Age and Cultural Event Recognition Datasets and Results

Sergio Escalera, Junior Fabian, Pablo Pardo, Xavier Baro, Jordi Gonzalez, Hugo J. Escalante, Dusan Misevic, Ulrich Steiner, Isabelle Guyon
2015 2015 IEEE International Conference on Computer Vision Workshop (ICCVW)  
In terms of cultural event recognition, one hundred categories had to be recognized. These tasks involved scene understanding and human body analysis.  ...  previous series on Looking at People (LAP) competitions [14, 13, 11, 12, 2], in 2015 ChaLearn ran two new competitions within the field of Looking at People: (1) age estimation, and (2) cultural event recognition  ...  NU&C Model: CaffeNet based on ImageNet and Places205. Combination of Object CNN stream and Scene CNN stream for prediction. CVL ETHZ Model: VGG-16 based on ImageNet and Places205.  ...
doi:10.1109/iccvw.2015.40 dblp:conf/iccvw/EscaleraFPBGEMS15 fatcat:fhckcio5hvajlomwafenp65bya

Action recognition: From static datasets to moving robots

Fahimeh Rezazadegan, Sareh Shirazi, Ben Upcroft, Michael Milford
2017 2017 IEEE International Conference on Robotics and Automation (ICRA)  
We also validate our action recognition method in an abnormal behavior detection scenario to improve workplace safety.  ...  The results show a higher success rate for our method, owing to its ability to recognize human actions regardless of environment and camera motion.  ...  ACKNOWLEDGMENT This research was supported by a QUTPRA and the Australian Centre of Excellence for Robotic Vision (project number CE140100016).  ...
doi:10.1109/icra.2017.7989361 dblp:conf/icra/RezazadeganSUM17 fatcat:2wy37watljaglgolkojlhde5qq

Seeing with Humans: Gaze-Assisted Neural Image Captioning [article]

Yusuke Sugano, Andreas Bulling
2016 arXiv   pre-print
Previous works demonstrated the potential of gaze for object-centric tasks, such as object localization and recognition, but it remains unclear if gaze can also be beneficial for scene-centric tasks, such  ...  Using a public large-scale gaze dataset, we first assess the relationship between state-of-the-art object and scene recognition models, bottom-up visual saliency, and human gaze.  ...  s pre-trained model [51] on the ILSVRC-2012 dataset [52]. Similarly, Wang et al.'s pre-trained model [53] on the Places205 dataset [54] is used for scene recognition.  ...
arXiv:1608.05203v1 fatcat:7ekgjddrgjc27pvb2u34fjwtui

Deep Learning for Scene Classification: A Survey [article]

Delu Zeng, Minyu Liao, Mohammad Tavakolian, Yulan Guo, Bolei Zhou, Dewen Hu, Matti Pietikäinen, Li Liu
2021 arXiv   pre-print
Scene classification, aiming at classifying a scene image into one of the predefined scene categories by comprehending the entire image, is a longstanding, fundamental and challenging problem in computer  ...  directly from big raw data, have been bringing remarkable progress in the field of scene representation and classification.  ...  ACKNOWLEDGMENTS The authors would like to thank the pioneer researchers in scene classification and other related fields. This work was supported in part by grants from the National Science  ...
arXiv:2101.10531v2 fatcat:hwqw5so46ngxdlnfw7zynmpu6m

Learning From Less Data: Diversified Subset Selection and Active Learning in Image Classification Tasks [article]

Vishal Kaushal, Anurag Sahoo, Khoshrav Doctor, Narasimha Raju, Suyash Shetty, Pankaj Singh, Rishabh Iyer, Ganesh Ramakrishnan
2018 arXiv   pre-print
We do this for a variety of computer vision tasks including Gender Recognition, Scene Recognition and Object Recognition.  ...  In this work we empirically demonstrate the effectiveness of two diversity models, namely the Facility-Location and Disparity-Min models for training-data subset selection and reducing labeling effort.  ...  1, β = 10; Gender Recognition (2a): GenderData, VGGFace/CelebFaces [39], B = 0.12, β = 10; Scene Recognition (2a): MIT-67, GoogleNet/Places205 [16], B = 2, β = 10; Gender Recognition (2b): Adience, VGGFace/  ...
arXiv:1805.11191v1 fatcat:szo6btnj3zaynnhnuzudhozkqa
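
The last entry uses facility-location and disparity-min diversity models to pick training subsets. The sketch below is a hedged illustration of greedy facility-location selection over cosine similarities of feature vectors; the similarity choice and the fixed budget are assumptions, not the paper's exact formulation.

```python
# Greedy facility-location subset selection: pick items that best "cover" the data.
import numpy as np

def facility_location_greedy(features, budget):
    """features: (n, d) array; returns indices of a diverse, representative subset."""
    normed = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    sim = normed @ normed.T                      # pairwise cosine similarity
    selected, best_cover = [], np.zeros(len(features))
    for _ in range(budget):
        # Marginal gain of adding candidate j: how much it raises each point's best similarity.
        gains = np.maximum(sim, best_cover).sum(axis=1) - best_cover.sum()
        gains[selected] = -np.inf                # never reselect an item
        j = int(np.argmax(gains))
        selected.append(j)
        best_cover = np.maximum(best_cover, sim[j])
    return selected
```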
Showing results 1–15 of 25.