Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
[article]
2018
arXiv
pre-print
The goal of this paper is to take a single 2D image of a scene and recover the 3D structure in terms of a small set of factors: a layout representing the enclosing surfaces as well as a set of objects ...
represented in terms of shape and pose. ...
We gratefully acknowledge NVIDIA corporation for the donation of Tesla GPUs used for this research. ...
arXiv:1712.01812v2
fatcat:cwpfek42s5bvzdvkh5iciy6gk4
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
2018
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
The goal of this paper is to take a single 2D image of a scene and recover the 3D structure in terms of a small set of factors: a layout representing the enclosing surfaces as well as a set of objects ...
represented in terms of shape and pose. ...
We gratefully acknowledge NVIDIA corporation for the donation of Tesla GPUs used for this research. ...
doi:10.1109/cvpr.2018.00039
dblp:conf/cvpr/TulsianiGFEM18
fatcat:roavoix2mfej5gnfku6ettj35q
Holistic 3D Scene Understanding from a Single Image with Implicit Representation
[article]
2021
arXiv
pre-print
We present a new pipeline for holistic 3D scene understanding from a single image, which could predict object shapes, object poses, and scene layout. ...
We not only propose an image-based local structured implicit network to improve the object shape estimation, but also refine the 3D object pose and scene layout via a novel implicit scene graph neural ...
The input image is also fed into a Layout Estimation Network (LEN) to produce a 3D layout bounding box and relative camera pose. ...
arXiv:2103.06422v3
fatcat:i5mmoev4gzfo5h7jepbti35oq4
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization
[article]
2021
arXiv
pre-print
In this paper, we propose a novel method for panoramic 3D scene understanding which recovers the 3D room layout and the shape, pose, position, and semantic category for each object from a single full-view ...
Panorama images have a much larger field-of-view thus naturally encode enriched scene context information compared to standard perspective images, which however is not well exploited in the previous scene ...
understanding from a single full-view panorama image, which recovers the 3D room layout and the shape, pose, position, and semantic category of each object in the scene. ...
arXiv:2108.10743v1
fatcat:knc6p65etvfihcb6rey3oihrfu
Pano2CAD: Room Layout From A Single Panorama Image
[article]
2016
arXiv
pre-print
This paper presents a method of estimating the geometry of a room and the 3D pose of objects from a single 360-degree panorama image. ...
The method combines surface normal estimation, 2D object detection and 3D object pose estimation. ...
From these we obtain a first scene layout up to an unknown scale. Next, objects are detected using a trained detector and initial 3D poses are estimated using a library of 3D models. ...
arXiv:1609.09270v2
fatcat:sm42hneuurgzzaccvzau5mk2bu
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image
[article]
2018
arXiv
pre-print
Specifically, we introduce a Holistic Scene Grammar (HSG) to represent the 3D scene structure, which characterizes a joint distribution over the functional and geometric space of indoor scenes. ...
We propose a computational framework to jointly parse a single RGB image and reconstruct a holistic 3D configuration composed by a set of CAD models using a stochastic grammar model. ...
and the 3D geometric structure of an indoor scene from a single RGB image. ...
arXiv:1808.02201v1
fatcat:ulqbp66cnfbnxmqnu4np6enxji
360-DFPE: Leveraging Monocular 360-Layouts for Direct Floor Plan Estimation
[article]
2022
arXiv
pre-print
Since our task is to sequentially capture the floor plan using monocular images, the entire scene structure, room instances, and room shapes are unknown. ...
Our approach leverages a loosely coupled integration between a monocular visual SLAM solution and a monocular 360-room layout approach, which estimate camera poses and layout geometries, respectively. ...
Acknowledgments: This work is supported by the MOST Joint Research Center for AI Technology and All Vista Healthcare, Taiwan Computing Cloud, and MOST 110-2634-F-007-016. ...
arXiv:2112.06180v3
fatcat:xt3t7w2xoretnmfsfbuxqhx7vy
3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
2013
2013 IEEE Conference on Computer Vision and Pattern Recognition
The proposed method uses 2D face locations from a single image to estimate the camera pose and the spatial arrangement of people in 3D.
Figure 3: Taxonomy of attributes for Visual Proxemics. ...
Our 3D shape descriptors are invariant to camera pose variations often seen in web images and videos. The proposed approach also estimates camera pose and uses it to capture the intent of the photo. ...
doi:10.1109/cvpr.2013.437
dblp:conf/cvpr/ChakrabortyCJ13
fatcat:j535v2bljjazdgccjobmhi4nea
Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild
[article]
2020
arXiv
pre-print
We present a method that infers spatial arrangements and shapes of humans and objects in a globally consistent 3D scene, all from a single image in-the-wild captured in an uncontrolled environment. ...
In particular, we introduce a scale loss that learns the distribution of object size from data; an occlusion-aware silhouette re-projection loss to optimize object pose; and a human-object interaction ...
This work was funded in part by the CMU Argo AI Center for Autonomous Vehicle Research. ...
arXiv:2007.15649v2
fatcat:yccc5eaccncava3knmyw5dyzbe
IM2CAD
[article]
2017
arXiv
pre-print
Given a single photo of a room and a large database of furniture CAD models, our goal is to reconstruct a scene that is as similar as possible to the scene depicted in the photograph, and composed of objects ...
Our approach iteratively optimizes the placement and scale of objects in the room to best match scene renderings to the input photo, using image comparison metrics trained via deep convolutional neural ...
Acknowledgements: This work was supported by funding from National Science Foundation grant IIS-1250793, Google, and the UW Animation Research Labs. ...
arXiv:1608.05137v2
fatcat:p2mvia5vlrdr5a2j4akm5ntece
Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes
[article]
2022
arXiv
pre-print
We present a new framework to reconstruct holistic 3D indoor scenes including both room background and indoor objects from single-view images. ...
Existing methods can only produce 3D shapes of indoor objects with limited geometry quality because of the heavy occlusion of indoor scenes. ...
From left to right: input image, the scene reconstructed by our method, results of Total3D [34], Im3D [57] and our method in a different camera pose. ...
arXiv:2207.08656v2
fatcat:5yrv6mxk6fgjbc3d3b4ukm4n7i
Joint 3D Object and Layout Inference from a Single RGB-D Image
[chapter]
2015
Lecture Notes in Computer Science
Inferring 3D objects and the layout of indoor scenes from a single RGB-D image captured with a Kinect camera is a challenging task. ...
Towards this goal, we propose a high-order graphical model and jointly reason about the layout, objects and superpixels in the image. ...
In particular, we reason about the type, semantic class, 3D pose and 3D shape of each object and layout element. ...
doi:10.1007/978-3-319-24947-6_15
fatcat:yg4mq7f2vnajnoc2r6gf3bbqcu
Scene shape from texture of objects
2011
CVPR 2011
Tests against ground truth obtained from stereo images demonstrate that we can coarsely reconstruct a 3D model of the scene from a single image, without learning the layout of common scene surfaces, as ...
We present an approach to: (1) detecting distinct textures of objects in a scene, (2) reconstructing the 3D shape of detected texture surfaces, and (3) combining object detections and shape-from-texture ...
Reconstructing 3D Scene Layout: Deformations of texture elements from the known canonical pose can be used to estimate the underlying 3D shape of the texture surface. ...
doi:10.1109/cvpr.2011.5995326
dblp:conf/cvpr/PayetT11
fatcat:tje3cu6bg5fypchszwsoan3jde
Complete 3D Scene Parsing from an RGBD Image
[article]
2018
arXiv
pre-print
One major goal of vision is to infer physical models of objects, surfaces, and their layout from sensors. In this paper, we aim to interpret indoor scenes from one RGBD image. ...
Our representation encodes the layout of orthogonal walls and the extent of objects, modeled with CAD-like 3D shapes. ...
We thank David Forsyth for insightful comments and discussion and Saurabh Singh, Kevin Shih and Tanmay Gupta for their comments on an earlier version of the manuscript. ...
arXiv:1710.09490v2
fatcat:mctuctwslvgdnkt2l5shoruolq
Marr Revisited: 2D-3D Alignment via Surface Normal Prediction
2016
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
We use the image and predicted surface normals to retrieve a 3D model from a large library of object CAD models. ...
When using the predicted surface normals, our two-stream network matches prior work using surface normals computed from RGB-D images on the task of pose prediction, and achieves state of the art when using ...
We thank Saining Xie for discussion on skip-network architectures, David Fouhey for providing code to compute normals from Kinect data, and Saurabh Gupta for help with the pose estimation evaluation setup ...
doi:10.1109/cvpr.2016.642
dblp:conf/cvpr/BansalRG16
fatcat:oz54dqxt4feh5njx7xiaru3iki