
Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos [article]

Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
2018 arXiv   pre-print
In this work we address unsupervised learning of scene depth and robot ego-motion where supervision is provided by monocular videos, as cameras are the cheapest, least restrictive and most ubiquitous sensor  ...  Learning to predict scene depth from RGB inputs is a challenging task both for indoor and outdoor robot navigation.  ...  We would like to thank Ayzaan Wahid for helping us with data collection.  ... 
arXiv:1811.06152v1 fatcat:zkrl6iv4wbbrroepf5tkpfynzu

Depth Prediction without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos

Vincent Casser, Soeren Pirk, Reza Mahjourian, Anelia Angelova
2019 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
In this work we address unsupervised learning of scene depth and robot ego-motion where supervision is provided by monocular videos, as cameras are the cheapest, least restrictive and most ubiquitous sensor  ...  Learning to predict scene depth from RGB inputs is a challenging task both for indoor and outdoor robot navigation.  ...  We would like to thank Ayzaan Wahid for helping us with data collection.  ... 
doi:10.1609/aaai.v33i01.33018001 fatcat:4p5kkn52bndubnbrshiz2le62i

Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor

Lu Xiong, Yongkun Wen, Yuyao Huang, Junqiao Zhao, Wei Tian
2020 Sensors  
We propose a completely unsupervised approach to simultaneously estimate scene depth, ego-pose, ground segmentation and ground normal vector from only monocular RGB video sequences.  ...  In our approach, estimation for different scene structures can mutually benefit each other by the joint optimization.  ...  (SS) learning, and using video streams for unsupervised (US) learning.  ... 
doi:10.3390/s20133737 pmid:32635370 pmcid:PMC7374458 fatcat:brff5rdebzdfrbll5x4r5atnzu

A novel no-sensors 3D model reconstruction from monocular video frames for a dynamic environment

Ghada M. Fathy, Hanan A. Hassan, Walaa Sheta, Fatma A. Omara, Emad Nabil
2021 PeerJ Computer Science  
The framework is composed of two main phases. The first uses an unsupervised learning technique to predict scene depth, camera pose, and objects' motion from RGB monocular videos.  ...  reconstruction that overcomes the occlusion problem in a complex dynamic scene without using sensors' data.  ...  The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.  ... 
doi:10.7717/peerj-cs.529 pmid:34084931 pmcid:PMC8157153 fatcat:qfeod4vbrnahlg4to5jlkjushm

Superb Monocular Depth Estimation Based on Transfer Learning and Surface Normal Guidance

Kang Huang, Xingtian Qu, Shouqian Chen, Zhen Chen, Wang Zhang, Haogang Qi, Fengshang Zhao
2020 Sensors  
Specifically, the coarse depth prediction network is designed as pre-trained encoder–decoder architecture for describing the 3D structure.  ...  In this paper, a novel monocular depth estimation method was proposed that primarily utilizes a lighter-weight Convolutional Neural Network (CNN) structure for coarse depth prediction and then refines  ...  Acknowledgments: The authors want to express gratitude to the editor and anonymous reviewers for their professional comments and valuable suggestions.  ... 
doi:10.3390/s20174856 pmid:32867293 pmcid:PMC7506624 fatcat:5etk23xh3jdi7h7wbxxhgonyxm

Semantic Segmentation Leveraging Simultaneous Depth Estimation

Wenbo Sun, Zhi Gao, Jinqiang Cui, Bharath Ramesh, Bin Zhang, Ziyao Li
2021 Sensors  
While rich context information of the input images can be learned from multi-scale receptive fields by convolutions with deep layers, traditional CNNs have great difficulty in learning the geometrical  ...  Specifically, we estimate depth information on RGB images via a depth estimation network, and then feed the depth map into the CNN which is able to guide the semantic segmentation.  ...  [75] put forward an unsupervised learning framework that can jointly predict the depth map and the ego-motion from the monocular video. Vincent et al.  ... 
doi:10.3390/s21030690 pmid:33498358 fatcat:d7g5i2gbsjgxlcjszahmqo7ice

Semi-Supervised Learning with Mutual Distillation for Monocular Depth Estimation [article]

Jongbeom Baek, Gyeongnyeon Kim, Seungryong Kim
2022 arXiv   pre-print
We propose a semi-supervised learning framework for monocular depth estimation.  ...  Compared to existing semi-supervised learning methods, which inherit limitations of both sparse supervised and unsupervised loss functions, we achieve the complementary advantages of both loss functions  ...  This research was supported by the MSIT, Korea (IITP-2022-2020-0-01819, ICT Creative Consilience program), and National Research Foundation of Korea (NRF-2021R1C1C1006897).  ... 
arXiv:2203.09737v1 fatcat:yiobudoacbdtheq6mymnmnkfky

Real-time 3D Perception of Scene with Monocular Camera

Shadi Saleh, Shanmugapriyan Manoharan, Wolfram Hardt
2020 Embedded Selforganising Systems  
This study presents two approaches (unsupervised learning and semi-supervised learning) to learn the depth information using only a single RGB-image.  ...  Depth is a vital prerequisite for the fulfillment of various tasks such as perception, navigation, and planning.  ...  , by leveraging knowledge from both supervised and unsupervised learning.  ... 
doi:10.14464/ess.v7i2.436 fatcat:ndvpqbr3s5axbhajzwlk3r4dae

Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data [article]

Adrian Lopez-Rodriguez and Benjamin Busam and Krystian Mikolajczyk
2020 arXiv   pre-print
We propose a domain adaptation approach for sparse-to-dense depth completion that is trained from synthetic data, without annotations in the real domain or additional sensors.  ...  Depth completion aims to predict a dense depth map from a sparse depth input.  ...  Some monocular depth estimators leverage a pre-computation stage with an SfM pipeline to provide supervision for both camera pose and depth [27, 28] or incorporate hints from stereo algorithms [29].  ... 
arXiv:2008.01034v2 fatcat:gzr5xijq5zfw5gethvddjfppi4

Deep Learning based Monocular Depth Prediction: Datasets, Methods and Applications [article]

Qing Li, Jiasong Zhu, Jun Liu, Rui Cao, Qingquan Li, Sen Jia, Guoping Qiu
2020 arXiv   pre-print
In this survey, we first introduce the datasets for depth estimation, and then give a comprehensive introduction of the methods from three perspectives: supervised learning-based methods, unsupervised  ...  Recently, monocular depth estimation has obtained great progress owing to the rapid development of deep learning techniques.  ...  Wu et al. [110] present a novel SC-GAN network with end-to-end adversarial training for depth estimation from monocular videos without estimating the camera pose and pose change over time.  ... 
arXiv:2011.04123v1 fatcat:by6swdegvvdrxk73ti46k2rj2e

Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video [article]

Jia-Wang Bian, Zhichao Li, Naiyan Wang, Huangying Zhan, Chunhua Shen, Ming-Ming Cheng, Ian Reid
2019 arXiv   pre-print
Recent work has shown that CNN-based depth and ego-motion estimators can be learned using unlabelled monocular videos.  ...  To the best of our knowledge, this is the first work to show that deep networks trained using unlabelled monocular videos can predict globally scale-consistent camera trajectories over a long video sequence  ...  Acknowledgments The work was supported by the Australian Centre for Robotic Vision. Jiawang would also like to thank TuSimple, where he started research in this field.  ... 
arXiv:1908.10553v2 fatcat:m6ext2y33bhqfdbuejonuvtyhu

Occlusion-Aware Unsupervised Learning of Monocular Depth, Optical Flow and Camera Pose with Geometric Constraints

Qianru Teng, Yimin Chen, Chen Huang
2018 Future Internet  
We present an occlusion-aware unsupervised neural network for jointly learning three low-level vision tasks from monocular videos: depth, optical flow, and camera motion.  ...  Empirical evaluation on the KITTI dataset demonstrates the effectiveness and improvement of our approach: (1) monocular depth estimation outperforms state-of-the-art unsupervised methods and is comparable  ...  In this paper, we propose a jointly learning network in an utterly unsupervised manner to predict depth, camera pose and optical flow from monocular video sequences with no labeling data or ground truth  ... 
doi:10.3390/fi10100092 fatcat:5jb6dyteuvf2dji4eiuqu6tiym

Unsupervised Monocular Depth Learning with Integrated Intrinsics and Spatio-Temporal Constraints [article]

Kenny Chen, Alexandra Pogue, Brett T. Lopez, Ali-akbar Agha-mohammadi, Ankur Mehta
2021 arXiv   pre-print
To this end, this work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion, in addition to camera intrinsics, from a sequence of monocular images via a  ...  Monocular depth inference has gained tremendous attention from researchers in recent years and remains as a promising replacement for expensive time-of-flight sensors, but issues with scale acquisition  ...  To this end, we propose an unsupervised, single-network monocular depth inference approach that considers both spatial and temporal geometric constraints to resolve the scale of a predicted depth map.  ... 
arXiv:2011.01354v3 fatcat:vcchczj3dne5ra56iwxx5eqkgm

Generative Adversarial Networks for Unsupervised Monocular Depth Prediction [chapter]

Filippo Aleotti, Fabio Tosi, Matteo Poggi, Stefano Mattoccia
2019 Lecture Notes in Computer Science  
The generator network learns to infer depth from the reference image to generate a warped target image.  ...  Inspired by these works and compelling results achieved by Generative Adversarial Network (GAN) on image reconstruction and generation tasks, in this paper we propose to cast unsupervised monocular depth  ...  Acknowledgement We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research.  ... 
doi:10.1007/978-3-030-11009-3_20 fatcat:fglhrxid5vcl5fdyiyc3u5exvm

Learning Structure-from-Motion from Motion [chapter]

Clément Pinard, Laure Chevalley, Antoine Manzanera, David Filliat
2019 Lecture Notes in Computer Science  
To overcome these limitations, we propose to learn in the same unsupervised manner a depth map inference system from monocular videos that takes a pair of images as input.  ...  learning of depth from videos.  ...  Single Frame Prediction vs Reality As already mentioned, in the current state of the art of learning from monocular footage, depth is always inferred by the network using a single image.  ... 
doi:10.1007/978-3-030-11015-4_27 fatcat:tbioucxpwzeuvpszrkit7ze6l4