3D Bounding Box Estimation Using Deep Learning and Geometry [article]

Arsalan Mousavian, Dragomir Anguelov, John Flynn, Jana Kosecka
2017 arXiv   pre-print
These estimates, combined with the geometric constraints on translation imposed by the 2D bounding box, enable us to recover a stable and accurate 3D object pose.  ...  combines these estimates with geometric constraints provided by a 2D object bounding box to produce a complete 3D bounding box.  ...  In summary, the main contributions of our paper include: 1) A method to estimate an object's full 3D pose and dimensions from a 2D bounding box using the constraints provided by projective geometry and  ... 
arXiv:1612.00496v2 fatcat:uhlm3n6xxfbk5pzen3msvmmzm4

Holistic 3D Scene Understanding from a Single Image with Implicit Representation [article]

Cheng Zhang, Zhaopeng Cui, Yinda Zhang, Bing Zeng, Marc Pollefeys, Shuaicheng Liu
2021 arXiv   pre-print
Extensive experiments demonstrate that our method outperforms the state-of-the-art methods in terms of object shape, scene layout estimation, and 3D object detection.  ...  We not only propose an image-based local structured implicit network to improve the object shape estimation, but also refine the 3D object pose and scene layout via a novel implicit scene graph neural  ...  The input image is also fed into a Layout Estimation Network (LEN) to produce a 3D layout bounding box and relative camera pose.  ... 
arXiv:2103.06422v3 fatcat:i5mmoev4gzfo5h7jepbti35oq4

Geometry Uncertainty Projection Network for Monocular 3D Object Detection [article]

Yan Lu, Xinzhu Ma, Lei Yang, Tianzhu Zhang, Yating Liu, Qi Chu, Junjie Yan, Wanli Ouyang
2021 arXiv   pre-print
Geometry Projection is a powerful depth estimation method in monocular 3D object detection. It estimates depth dependent on heights, which introduces mathematical priors into the deep model.  ...  The overall model can infer more reliable object depth than existing methods and outperforms the state-of-the-art image-based monocular 3D detectors by 3.74% and 4.7% AP40 of the car and pedestrian categories  ...  Thus we use this confidence as the conditional 3D bounding box scores p_{3d|2d} in the testing.  ... 
arXiv:2107.13774v2 fatcat:gkqcrxw4dvaqtdsaykrcdcxozi

GenScan: A Generative Method for Populating Parametric 3D Scan Datasets [article]

Mohammad Keshavarzi, Oladapo Afolabi, Luisa Caldas, Allen Y. Yang, Avideh Zakhor
2020 arXiv   pre-print
We believe our system would facilitate data augmentation to expand the currently limited 3D geometry datasets commonly used in 3D computer vision, generative design, and general 3D deep learning tasks.  ...  The availability of rich 3D datasets corresponding to the geometrical complexity of the built environments is considered an ongoing challenge for 3D deep learning methodologies.  ...  We use the estimated floorplan layout and door sizes to construct threshold bounding boxes centered on each parametric line.  ... 
arXiv:2012.03998v1 fatcat:ecmbnamsunhdnpzhhybddcp3mm

Full 3D layout reconstruction from one single 360º image

Clara Fernández Labrador, Alejandro Pérez Yus, Gonzalo López Nicolás, José Jesús Guerrero Campo
2018 Jornada de Jóvenes Investigadores del I3A  
We exploit deep learning combined with geometry to obtain structural lines, and thus structural corners, from which we generate final layout models assuming Manhattan world.  ...  We propose an entire pipeline which receives as input a 360º panorama and returns a closed, 3D reconstruction of the room faithful to its actual shape.  ...  Layouts from Panoramic Images with Geometry and Deep Learning. Submitted 2018. Figure 2. Layout estimations handling different geometries. Figure 1. Proposed pipeline.  ... 
doi:10.26754/jji-i3a.201802835 fatcat:6gl2ttekdzd4xk64z7mhlxacre

Autonomous Driving with Deep Learning: A Survey of State-of-Art Technologies [article]

Yu Huang, Yue Chen
2020 arXiv   pre-print
Almost at the same time, deep learning achieved breakthroughs through several pioneers; three of them (also called the fathers of deep learning), Hinton, Bengio and LeCun, won the ACM Turing Award in 2019.  ...  Due to the limited space, we focus the analysis on several key areas, i.e. 2D and 3D object detection in perception, depth estimation from cameras, multiple sensor fusion on the data, feature and task  ...  3D bounding box.  ... 
arXiv:2006.06091v3 fatcat:nhdgivmtrzcarp463xzqvnxlwq

IM2CAD [article]

Hamid Izadinia, Qi Shan, Steven M. Seitz
2017 arXiv   pre-print
Our approach iteratively optimizes the placement and scale of objects in the room to best match scene renderings to the input photo, using image comparison metrics trained via deep convolutional neural  ...  Given a single photo of a room and a large database of furniture CAD models, our goal is to reconstruct a scene that is as similar as possible to the scene depicted in the photograph, and composed of objects  ...  Acknowledgements This work was supported by funding from National Science Foundation grant IIS-1250793, Google, and the UW Animation Research Labs.  ... 
arXiv:1608.05137v2 fatcat:p2mvia5vlrdr5a2j4akm5ntece

PointPoseNet: Point Pose Network for Robust 6D Object Pose Estimation

Wei Chen, Jinming Duan, Hector Basevi, Hyung Jin Chang, Ales Leonardis
2020 2020 IEEE Winter Conference on Applications of Computer Vision (WACV)  
Specifically, our method takes the point cloud as input and regresses the point-wise unit vectors pointing to the 3D keypoints.  ...  We then use these vectors to generate keypoint hypotheses from which the 6D object pose hypotheses are computed.  ...  that learning in 3D space better exploits the geometric and topological structure of 3D space, which is useful for pose estimation.  ... 
doi:10.1109/wacv45572.2020.9093272 dblp:conf/wacv/Chen0BCL20 fatcat:5n4r4dskvjgmfpyehuz3u3unby

Deep MANTA: A Coarse-to-Fine Many-Task Network for Joint 2D and 3D Vehicle Analysis from Monocular Image

Florian Chabot, Mohamed Chaouch, Jaonary Rabarisoa, Celine Teuliere, Thierry Chateau
2017 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
In the inference, the network's outputs are used by a real time robust pose estimation algorithm for fine orientation estimation and 3D vehicle localization.  ...  A robust convolutional network is introduced for simultaneous vehicle detection, part localization, visibility characterization and 3D dimension estimation.  ...  First, the input image is passed through the Deep MANTA network that outputs 2D scored bounding boxes, associated vehicle geometry (vehicle part Using these outputs, the inference step allows to choose  ... 
doi:10.1109/cvpr.2017.198 dblp:conf/cvpr/ChabotCRTC17 fatcat:6vu2eukgtjcxnhsasnts4kms4u

Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image [article]

Florian Chabot, Mohamed Chaouch, Jaonary Rabarisoa, Céline Teulière, Thierry Chateau
2017 arXiv   pre-print
In the inference, the network's outputs are used by a real time robust pose estimation algorithm for fine orientation estimation and 3D vehicle localization.  ...  A robust convolutional network is introduced for simultaneous vehicle detection, part localization, visibility characterization and 3D dimension estimation.  ...  First, the input image is passed through the Deep MANTA network that outputs 2D scored bounding boxes, associated vehicle geometry (vehicle part Using these outputs, the inference step allows to choose  ... 
arXiv:1703.07570v1 fatcat:26i65dkp3fcm7dcu5day2c6ncq

Geometry-based Distance Decomposition for Monocular 3D Object Detection [article]

Xuepeng Shi, Qi Ye, Xiaozhi Chen, Chuangrong Chen, Zhixiang Chen, Tae-Kyun Kim
2022 arXiv   pre-print
Our method directly predicts 3D bounding boxes from RGB images with a compact architecture, making the training and inference simple and efficient.  ...  The decomposition also enables us to trace the causes of the distance uncertainty for different scenarios. Such decomposition makes the distance prediction interpretable, accurate, and robust.  ...  Uncertainty Estimation There are two seminal works [20, 21] exploring uncertainties in deep learning for computer vision.  ... 
arXiv:2104.03775v3 fatcat:yrbzdikqcbb2ffd43jf2cwde5q

Deep Hough Voting for 3D Object Detection in Point Clouds [article]

Charles R. Qi, Or Litany, Kaiming He, Leonidas J. Guibas
2019 arXiv   pre-print
3D boxes.  ...  However, due to the sparse nature of the data -- samples from 2D manifolds in 3D space -- we face a major challenge when directly predicting bounding box parameters from scene points: a 3D object centroid  ...  We thank Daniel Huber, Justin Johnson, Georgia Gkioxari and Jitendra Malik for valuable discussions and feedback.  ... 
arXiv:1904.09664v2 fatcat:7lwvrvnklvetpf6ama326ieh3i

Scene Recomposition by Learning-based ICP [article]

Hamid Izadinia, Steven M. Seitz
2020 arXiv   pre-print
While LICP is trained on synthetic data and without 3D real scene annotations, it outperforms both learned local deep feature matching and geometric based alignment methods in real scenes.  ...  In addition to the fully automatic system, the key technical contribution is a novel approach for aligning CAD models to 3D scans, based on deep reinforcement learning.  ...  Acknowledgments This work was supported in part by the University of Washington Animation Research Labs and Google.  ... 
arXiv:1812.05583v2 fatcat:r54bqxvb7zahpn4gtpzzaw3uga

Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation [article]

Siyuan Huang, Siyuan Qi, Yinxue Xiao, Yixin Zhu, Ying Nian Wu, Song-Chun Zhu
2019 arXiv   pre-print
We employ three cooperative losses for 3D bounding boxes, 2D projections, and physical constraints to estimate a geometrically consistent and physically plausible 3D scene.  ...  Holistic 3D indoor scene understanding refers to jointly recovering the i) object bounding boxes, ii) room layout, and iii) camera pose, all in 3D.  ...  Hongjing Lu from the UCLA Psychology Department for useful discussions on the motivation of this work, and three anonymous reviewers for their constructive comments.  ... 
arXiv:1810.13049v2 fatcat:x5robvibl5hb3p4v6javs72kfa

Scene Recomposition by Learning-Based ICP

Hamid Izadinia, Steven M. Seitz
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
While a fused reconstruction (top) contains holes and noisy geometry, our recomposition (bottom) models the scene as a set of high quality 3D shapes from CAD databases.  ...  Given an RGBD sequence from a moving camera, we produce a 3D CAD recomposition of the scene.  ...  Acknowledgments This work was supported in part by the University of Washington Animation Research Labs and Google.  ... 
doi:10.1109/cvpr42600.2020.00101 dblp:conf/cvpr/IzadiniaS20 fatcat:yxu3unto2rbfnbxfdq6clmu3by