Spatial Sampling Network for Fast Scene Understanding

Davide Mazzini, Raimondo Schettini
2019 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
We propose a network architecture to perform efficient scene understanding.  ...  Finally, we propose a novel efficient network design that includes the new modules and test it against different datasets for outdoor scene understanding.  ...  We gratefully acknowledge the support of NVIDIA Corporation with the donation of a Titan Xp GPU used for this research.  ... 
doi:10.1109/cvprw.2019.00168 dblp:conf/cvpr/MazziniS19 fatcat:m6wohnxilndu7dnwnpntd25b4q

PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN [article]

Yaxu Xie, Jason Rambach, Fangwen Shu, Didier Stricker
2021 arXiv   pre-print
Instance segmentation of planar regions in indoor scenes benefits visual SLAM and other applications such as augmented reality (AR) where scene understanding is required.  ...  We also utilize a Residual Feature Augmentation module in the Feature Pyramid Network (FPN).  ...  INTRODUCTION Detection of 3D geometry features in scenes supports tasks such as 3D scene understanding, robot navigation and Simultaneous Localization and Mapping (SLAM).  ... 
arXiv:2103.15428v1 fatcat:l6cczxtxajh3voxt3oknmqdzsm

Real-time Semantic Segmentation with Fast Attention [article]

Ping Hu, Federico Perazzi, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Kate Saenko, Stan Sclaroff
2020 arXiv   pre-print
Moreover, to efficiently process high-resolution input, we apply an additional spatial reduction to intermediate feature stages of the network with minimal loss in accuracy thanks to the use of the fast  ...  The proposed architecture relies on our fast spatial attention, which is a simple yet efficient modification of the popular self-attention mechanism and captures the same rich spatial context at a small  ...  FAST ATTENTION NETWORK In this section, we describe the Fast Attention Network (FANet) for real-time image semantic segmentation.  ... 
arXiv:2007.03815v2 fatcat:ldrnwzmwsjekvk7zki5b6wtv5m

Spatio-Temporal Context for Action Detection [article]

Manuel Sarmiento Calderó, David Varas, Elisenda Bou-Balust
2021 arXiv   pre-print
Modelling the interactions (either spatial or temporal) between actors and their context has proven to be essential for this task.  ...  Research in action detection has grown in the recent years, as it plays a key role in video understanding.  ...  Backbone Network. The backbone is a SlowFast R50 with input sampling T ×τ = 8×8, without non local blocks and spatial resolution of res 5 increased by 2.  ... 
arXiv:2106.15171v1 fatcat:wukoutuwjvbnhonkf6py6t2e34

DSNet: An Efficient CNN for Road Scene Segmentation [article]

Ping-Rong Chen, Hsueh-Ming Hang, Sheng-Wei Chan, Jing-Jhih Lin
2019 arXiv   pre-print
Road scene understanding is a critical component in an autonomous driving system.  ...  Although the deep learning-based road scene segmentation can achieve very high accuracy, its complexity is also very high for developing real-time applications.  ...  We would like to thank Shao-Yuan Lo for his helpful discussions during the course of this project.  ... 
arXiv:1904.05022v1 fatcat:o3hhp2n7angg3kxjscos3g5c74

Attend, Infer, Repeat: Fast Scene Understanding with Generative Models [article]

S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, Geoffrey E. Hinton
2016 arXiv   pre-print
We achieve this by performing probabilistic inference using a recurrent neural network that attends to scene elements and processes them one at a time.  ...  We present a framework for efficient inference in structured image models that explicitly reason about objects.  ...  The result is fast, feed-forward, interpretable scene understanding trained without supervision.  ... 
arXiv:1603.08575v3 fatcat:xouycxqk5jf57ce4bghtva3doa

Convolutional Patch Networks with Spatial Prior for Road Detection and Urban Scene Understanding

Clemens-Alexander Brust, Sven Sickert, Marcel Simon, Erik Rodner, Joachim Denzler
2015 Proceedings of the 10th International Conference on Computer Vision Theory and Applications  
We also show how to incorporate spatial information of the patch as an input to the network, which allows for learning spatial priors for certain categories jointly with an appearance model.  ...  Classifying single image patches is important in many different applications, such as road detection or scene understanding.  ...  Urban scene understanding For urban scene understanding, each pixel is classified into one of K classes.  ... 
doi:10.5220/0005355105100517 dblp:conf/visapp/BrustSSRD15 fatcat:hmkpme6d55emljdoy4m2bcmlaa

Convolutional Patch Networks with Spatial Prior for Road Detection and Urban Scene Understanding [article]

Clemens-Alexander Brust, Sven Sickert, Marcel Simon, Erik Rodner, Joachim Denzler
2015 arXiv   pre-print
We also show how to incorporate spatial information of the patch as an input to the network, which allows for learning spatial priors for certain categories jointly with an appearance model.  ...  Classifying single image patches is important in many different applications, such as road detection or scene understanding.  ...  Urban scene understanding For urban scene understanding, each pixel is classified into one of K classes.  ... 
arXiv:1502.06344v1 fatcat:y2so3iwhwfe75h43cjsmbhlmyy

Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs [article]

Haithem Turki, Deva Ramanan, Mahadev Satyanarayanan
2022 arXiv   pre-print
To address these challenges, we begin by analyzing visibility statistics for large-scale scenes, motivating a sparse network structure where parameters are specialized to different regions of the scene  ...  , each of which capture only a small subset of the scene, (2) prohibitively large model capacities that make it infeasible to train on a single GPU, and (3) significant challenges for fast rendering that  ...  Acknowledgments This research was supported by the National Science Foundation (NSF) under grant number CNS-2106862, the Defense Science and Technology Agency of Singapore (DSTA), and the CMU Argo AI Center for  ... 
arXiv:2112.10703v2 fatcat:zdiiudegzvewjhmc2unvhhiukm

DSNet: an efficient CNN for road scene segmentation

Ping-Rong Chen, Hsueh-Ming Hang, Sheng-Wei Chan, Jing-Jhih Lin
2020 APSIPA Transactions on Signal and Information Processing  
Road scene understanding is a critical component in an autonomous driving system.  ...  Although the deep learning-based road scene segmentation can achieve very high accuracy, its complexity is also very high for developing real-time applications.  ...  The second dataset is Cityscapes [19], which is a larger dataset for semantic understanding of urban street scenes. All images are at 2048 × 1024 resolution and there are 19 classes for training.  ... 
doi:10.1017/atsip.2020.25 fatcat:7rlk6gqk3jhhdkt3ik35ip5st4

Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models

Daniel Ritchie, Kai Wang, Yu-An Lin
2019 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
We present a new, fast and flexible pipeline for indoor scene synthesis that is based on deep convolutional generative models.  ...  Our method operates on a top-down image-based representation, and inserts objects iteratively into the scene by predicting their category, location, orientation and size with separate neural network modules  ...  Acknowledgments We thank the anonymous reviewers for their helpful suggestions. Scene renderings shown in this paper were created using the Mitsuba physically-based renderer [12].  ... 
doi:10.1109/cvpr.2019.00634 dblp:conf/cvpr/Ritchie0L19 fatcat:yze6miopsbga3g3xaj7lwf4xa4

Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning [article]

Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim
2020 arXiv   pre-print
We propose spatial semantic embedding network (SSEN), a simple, yet efficient algorithm for 3D instance segmentation using deep metric learning.  ...  For high-level intelligent tasks from a large scale scene, 3D instance segmentation recognizes individual instances of objects.  ...  The neural network is trained extensively calculating the similarity for pairs of sampled pixels.  ... 
arXiv:2007.03169v1 fatcat:cms4hzgwjnf6hkqaeclr4eln6y

CP-SSD: Context Information Scene Perception Object Detection Based on SSD

Yun Jiang, Tingting Peng, Ning Tan
2019 Applied Sciences  
CP-SSD promotes the network's understanding of context information by using context information scene perception modules, so as to capture context information for objects of different scales.  ...  In order to alleviate these problems, we propose a single-shot object detection network Context Perception-SSD (CP-SSD).  ...  ability to understand the scene.  ... 
doi:10.3390/app9142785 fatcat:6ly7erl53belbkwumoezqoxf6u

Real-time Semantic Segmentation with Context Aggregation Network [article]

Michael Ying Yang, Saumya Kumaar, Ye Lyu, Francesco Nex
2021 arXiv   pre-print
With the increasing demand of autonomous systems, pixelwise semantic segmentation for visual scene understanding needs to be not only accurate but also efficient for potential real-time applications.  ...  Building upon the existing dual branch architectures for high-speed semantic segmentation, we design a cheap high resolution branch for effective spatial detailing and a context branch with light-weight  ...  [21] developed an approach which exploits light-weight upsampling and lateral connections with a residual network as the main recognition engine for real-time scene understanding.  ... 
arXiv:2011.00993v2 fatcat:qftm2mwbubhi3bgh5si34f5vza

Fast Light Field Reconstruction with Deep Coarse-to-Fine Modeling of Spatial-Angular Clues [chapter]

Henry Wing Fung Yeung, Junhui Hou, Jie Chen, Yuk Ying Chung, Xiaoming Chen
2018 Lecture Notes in Computer Science  
In this paper, we propose a learning based algorithm to reconstruct a densely-sampled LF fast and accurately from a sparsely-sampled LF in one forward pass.  ...  Specifically, our end-to-end model first synthesizes a set of intermediate novel sub-aperture images (SAIs) by exploring the coarse characteristics of the sparsely-sampled LF input with spatial-angular  ...  In this paper, we propose a novel learning based model for fast reconstruction of a densely-sampled LF from a very sparsely-sampled LF.  ... 
doi:10.1007/978-3-030-01231-1_9 fatcat:fe7cg5sv3jfcrj54avrylhwvjq