242 Hits in 9.0 sec

Knowledge Guided Disambiguation for Large-Scale Scene Classification With Multi-Resolution CNNs

Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, Yu Qiao
2017 IEEE Transactions on Image Processing  
Second, we design two knowledge guided disambiguation techniques to deal with the problem of label ambiguity.  ...  In addition, with the increasing number of scene categories, label ambiguity has become another crucial issue in large-scale classification.  ...  KNOWLEDGE GUIDED DISAMBIGUATION As analyzed above, many scene categories may overlap with others in large-scale datasets, such as Places2 [19].  ... 
doi:10.1109/tip.2017.2675339 pmid:28252402 fatcat:kja5k2ho65empkmytublsxedue

Deep Learning for Scene Classification: A Survey [article]

Delu Zeng, Minyu Liao, Mohammad Tavakolian, Yulan Guo, Bolei Zhou, Dewen Hu, Matti Pietikäinen, Li Liu
2021 arXiv   pre-print
The rise of large-scale datasets, which constitute the corresponding dense sampling of diverse real-world scenes, and the renaissance of deep learning techniques, which learn powerful feature representations  ...  Scene classification, aiming at classifying a scene image to one of the predefined scene categories by comprehending the entire image, is a longstanding, fundamental and challenging problem in computer  ...  ACKNOWLEDGMENTS The authors would like to thank the pioneer researchers in scene classification and other related fields. This work was supported in part by grants from National Science  ... 
arXiv:2101.10531v2 fatcat:hwqw5so46ngxdlnfw7zynmpu6m

Transferring Object-Scene Convolutional Neural Networks for Event Recognition in Still Images [article]

Limin Wang, Zhe Wang, Yu Qiao, Luc Van Gool
2016 arXiv   pre-print
With OS2E-CNN, we design a multi-ratio and multi-scale cropping strategy, and propose an end-to-end event recognition pipeline.  ...  This paper addresses the problem of event recognition by proposing a convolutional neural network that exploits knowledge of objects and scenes for event classification (OS2E-CNN).  ...  Recently, Convolutional Neural Networks (CNNs) (LeCun et al, 1998) have delivered great successes in large-scale image classification, in particular for object recognition (Krizhevsky et al, 2012)  ... 
arXiv:1609.00162v1 fatcat:tn332joywfb7lmi52yb7ywrep4

Scenes-Objects-Actions: A Multi-task, Multi-label Video Dataset [chapter]

Jamie Ray, Heng Wang, Du Tran, Yufei Wang, Matt Feiszli, Lorenzo Torresani, Manohar Paluri
2018 Lecture Notes in Computer Science  
This paper introduces a large-scale, multi-label and multitask video dataset named Scenes-Objects-Actions (SOA).  ...  The final dataset includes 562K videos with 3.64M annotations spanning 49 categories for scenes, 356 for objects, 148 for actions, and naturally captures the long tail distribution of visual concepts in  ...  Multi-task Investigations SOA is uniquely designed for innovation in the large-scale multi-task arena.  ... 
doi:10.1007/978-3-030-01264-9_39 fatcat:rny2zdol7vcndl7ofypkfgxpx4

A Survey on Theories and Applications for Self-Driving Cars Based on Deep Learning Methods

Jianjun Ni, Yinan Chen, Yan Chen, Jinxiu Zhu, Deena Ali, Weidong Cao
2020 Applied Sciences  
Then the main problems in self-driving cars and their solutions based on deep learning methods are analyzed, such as obstacle detection, scene recognition, lane detection, navigation and path planning.  ...  Finally, the future challenges in the applications of deep learning for self-driving cars are given out.  ...  This model includes coarse resolution CNNs and fine resolution CNNs, which are used to capture the visual structures at a large scale and a relatively smaller scale respectively.  ... 
doi:10.3390/app10082749 fatcat:iohm7uqj2vbojmnao6kyhzeliu

Survey of recent progress in semantic image segmentation with CNNs

Qichuan Geng, Zhong Zhou, Xiaochun Cao
2017 Science China Information Sciences  
In recent years, convolutional neural networks (CNNs) are leading the way in many computer vision tasks, such as image classification, object detection, and face recognition.  ...  for the segmentation of specific semantic objects.  ...  This dataset can be used for object classification, detection, segmentation, action classification and a competition on large-scale recognition run by ImageNet.  ... 
doi:10.1007/s11432-017-9189-6 fatcat:chdj4yrzfbfrzjubehdgmo22nm

Unsupervised Foveal Vision Neural Networks with Top-Down Attention [article]

Ryan Burt, Nina N. Thigpen, Andreas Keil, Jose C. Principe
2020 arXiv   pre-print
We also develop a top-down attention mechanism based on the Gamma saliency applied to the top layer of CNNs to improve scene understanding in multi-object images or images with strong background clutter  ...  When we compare the results with human observers in an image dataset of animals occluded in natural scenes, we show that top-down attention is capable of disambiguating object from background and improves  ...  the eye to relevant visual details to disambiguate the scene with respect to the current goal [10].  ... 
arXiv:2010.09103v1 fatcat:uuk3bg3usfajxh6tunxzutqgue

INVESTIGATING THE POTENTIAL OF DEEP NEURAL NETWORKS FOR LARGE-SCALE CLASSIFICATION OF VERY HIGH RESOLUTION SATELLITE IMAGES

T. Postadjian, A. Le Bris, H. Sahbi, C. Mallet
2017 ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences  
Therefore, current architectures are perfectly tailored for urban areas over restricted areas but not designed for large-scale purposes.  ...  This paper presents an end-to-end automatic processing chain, based on DCNNs, that aims at performing large-scale classification of VHR satellite images (here SPOT 6/7).  ...  Both issues are exacerbated when dealing with Very High spatial Resolution (VHR) geospatial images and large-scale classification tasks: (i) Standard hand-crafted features, either spectral or texture-based  ... 
doi:10.5194/isprs-annals-iv-1-w1-183-2017 fatcat:j5jtcy36lfdp5lvnvlzw75t64a

3D Semantic Scene Completion: a Survey [article]

Luis Roldao, Raoul de Charette, Anne Verroust-Blondet
2021 arXiv   pre-print
In the last years following the multiplication of large-scale 3D datasets, SSC has gained significant momentum in the research community because it holds unresolved challenges.  ...  Semantic Scene Completion (SSC) aims to jointly estimate the complete geometry and semantics of a scene, assuming partial sparse input.  ...  contextual information from multiple scales, which enables to disambiguate between similar objects present in the scene.  ... 
arXiv:2103.07466v3 fatcat:swz4azlznre3laziatls6sdrfm

Global Sum Pooling: A Generalization Trick for Object Counting with Small Datasets of Large Images [article]

Shubhra Aich, Ian Stavness
2019 arXiv   pre-print
In this paper, we explore the problem of training one-look regression models for counting objects in datasets comprising a small number of high-resolution, variable-shaped images.  ...  Our GSP models improve upon the state-of-the-art approaches on all four datasets with a simple architecture.  ...  Another approach [44] for multi-scale context aggregation for density map estimation use multi-column networks with different kernel sizes.  ... 
arXiv:1805.11123v2 fatcat:k7scwfkmhrcjbmv7mnzpdyzlwa

A 4D Light-Field Dataset and CNN Architectures for Material Recognition [article]

Ting-Chun Wang, Jun-Yan Zhu, Ebi Hiroaki, Manmohan Chandraker, Alexei A. Efros, Ravi Ramamoorthi
2016 arXiv   pre-print
In our experiments, the best performing CNN architecture achieves a 7% boost compared with 2D image classification (70% to 77%).  ...  To the best of our knowledge, this is the first mid-size dataset for light-field images.  ...  Acknowledgements This work was funded in part by ONR grant N00014152013, NSF grant IIS-1617234, Draper Lab, a Google Research Award, support by Nokia, Samsung and Sony to the UC San Diego Center for Visual  ... 
arXiv:1608.06985v1 fatcat:44tq5gwtnre2nalm422r3hcyqi

A 4D Light-Field Dataset and CNN Architectures for Material Recognition [chapter]

Ting-Chun Wang, Jun-Yan Zhu, Ebi Hiroaki, Manmohan Chandraker, Alexei A. Efros, Ravi Ramamoorthi
2016 Lecture Notes in Computer Science  
In our experiments, the best performing CNN architecture achieves a 7% boost compared with 2D image classification (70% → 77%).  ...  To the best of our knowledge, this is the first mid-size dataset for light-field images.  ...  Acknowledgements This work was funded in part by ONR grant N00014152013, NSF grant IIS-1617234, Draper Lab, a Google Research Award, support by Nokia, Samsung and Sony to the UC San Diego Center for Visual  ... 
doi:10.1007/978-3-319-46487-9_8 fatcat:e2kwcsvasbcadmumyhfbhgrxbu

Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry and Fusion [article]

Yang Wang
2020 arXiv   pre-print
With the development of web technology, multi-modal or multi-view data has surged as a major stream for big data, where each modal/view encodes individual property of data objects.  ...  Most of the existing state-of-the-art focused on how to fuse the energy or information from multi-modal spaces to deliver a superior performance over their counterparts with single modal.  ...  [91] proposed a multi-scale CNN architecture to extract deep features at multiple scales corresponding to visual concepts of different levels.  ... 
arXiv:2006.08159v1 fatcat:g4467zmutndglmy35n3eyfwxku

How can big data and machine learning benefit environment and water management: A survey of methods, applications, and future directions

Alexander Y. Sun, Bridget R Scanlon
2019 Environmental Research Letters  
The authors are grateful to Dr Michael Fienen and an anonymous reviewer for their constructive comments on the original manuscript.  ...  The use of the salience measure enables rapid classification and disambiguation of topics.  ...  DL applications in EWM Earth data classification A large number of existing DL studies in EWM pertain to remote sensing data classification problems, including scene classification, semantic segmentation  ... 
doi:10.1088/1748-9326/ab1b7d fatcat:vx4thuy45vhlnmhu7bk2hwh2g4

Computer Vision for Autonomous Vehicles: Problems, Datasets and State of the Art [article]

Joel Janai, Fatma Güney, Aseem Behl, Andreas Geiger
2021 arXiv   pre-print
understanding, and end-to-end learning for autonomous driving.  ...  As with any rapidly growing field, it becomes increasingly difficult to stay up-to-date or enter the field as a beginner.  ...  Figure 5.5: Multi-scale Deep CNN for Object Detection.  ... 
arXiv:1704.05519v3 fatcat:xiintiarqjbfldheeg2hsydyra