Showing results 1–15 of 329

Novel View Synthesis of Dynamic Scenes with Globally Coherent Depths from a Monocular Camera [article]

Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, Jan Kautz
2020 arXiv   pre-print
A key challenge for novel view synthesis arises from dynamic scene reconstruction, where epipolar geometry does not apply to the local motion of dynamic content.  ...  Our insight is that although its scale and quality are inconsistent with other views, the depth estimated from a single view can be used to reason about the globally coherent geometry of dynamic content.  ...  A more recent approach, which predicts depth from a single view with human-specific priors, realizes view synthesis of dynamic scenes of moving people from a monocular camera [24].  ... 
arXiv:2004.01294v1 fatcat:nl5ntfmmznbtfcbpip5o6k6hki
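
The snippet above notes that single-view depth has a scale inconsistent with other views. For orientation, a minimal sketch of one standard way to reconcile the two before reasoning about globally coherent geometry, median scaling against sparse multi-view depth (function and argument names are illustrative assumptions, not the paper's code):

```python
# Hedged sketch: rescale single-view depth so it agrees with sparse
# multi-view (e.g., SfM-triangulated) depth at static scene points.
import numpy as np

def align_depth_scale(mono_depth, sfm_depth, sfm_mask):
    """mono_depth: (H, W) single-view depth with arbitrary scale.
    sfm_depth: (H, W) triangulated depth, valid where sfm_mask is True."""
    scale = np.median(sfm_depth[sfm_mask]) / np.median(mono_depth[sfm_mask])
    return mono_depth * scale
```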

Learning Dynamic View Synthesis With Few RGBD Cameras [article]

Shengze Wang, YoungJoong Kwon, Yuan Shen, Qian Zhang, Andrei State, Jia-Bin Huang, Henry Fuchs
2022 arXiv   pre-print
There have been significant advancements in dynamic novel view synthesis in recent years.  ...  We propose to utilize RGBD cameras to remove these limitations and synthesize free-viewpoint videos of dynamic indoor scenes.  ...  Introduction: Dynamic novel view synthesis is the task of using a set of input video frames to synthesize videos of the dynamic scene from novel viewpoints.  ... 
arXiv:2204.10477v2 fatcat:tfortvxrwrcthkcrnbxjdslf7y

Unsupervised Learning of Depth, Camera Pose and Optical Flow from Monocular Video [article]

Dipan Mandal, Abhilash Jain, Sreenivas Subramoney
2022 arXiv   pre-print
We propose DFPNet -- an unsupervised, joint learning system for monocular Depth, Optical Flow and egomotion (Camera Pose) estimation from monocular image sequences.  ...  Evaluation on the KITTI and Cityscapes driving datasets reveals that our model achieves results comparable to the state of the art in all three tasks, even with a significantly smaller model size.  ...  Depth & Pose Supervision: The key supervision signal for our depth and pose training comes from the task of novel view synthesis: given one input view of a scene, synthesize a new image of the scene seen  ... 
arXiv:2205.09821v1 fatcat:yepxjozka5b2hko6tab2ev4p7u

Unsupervised Video Depth Estimation Based on Ego-motion and Disparity Consensus [article]

Lingtao Zhou, Jiaojiao Fang, Guizhong Liu
2019 arXiv   pre-print
In this paper, we propose a novel unsupervised monocular video depth estimation method for natural scenes by taking advantage of the state-of-the-art method of Zhou et al., which jointly estimates depth  ...  and camera motion.  ...  Novel view synthesis based on both spatial and temporal geometry optics: The key supervision signal for our CNN model comes from the task of novel view synthesis: given one input view of a scene, synthesize  ... 
arXiv:1909.01028v1 fatcat:3sgsalnd6jaaln7wp2iemixawy

Video Extrapolation in Space and Time [article]

Yunzhi Zhang, Jiajun Wu
2022 arXiv   pre-print
However, they can both be seen as ways to observe the spatial-temporal world: NVS aims to synthesize a scene from a new point of view, while VP aims to see a scene from a new point of time.  ...  These two tasks provide complementary signals to obtain a scene representation, as viewpoint changes from spatial observations inform depth, and temporal observations inform the motion of cameras and individual  ...  This work is in part supported by the Stanford Institute for Human-Centered AI (HAI), the Stanford Center for Integrated Facility Engineering (CIFE), the Samsung Global Research Outreach (GRO) Program,  ... 
arXiv:2205.02084v2 fatcat:cunyunwudrcnpmocgwvmh3lcga

TöRF: Time-of-Flight Radiance Fields for Dynamic Scene View Synthesis [article]

Benjamin Attal, Eliot Laidlaw, Aaron Gokaslan, Changil Kim, Christian Richardt, James Tompkin, Matthew O'Toole
2021 arXiv   pre-print
We replace these priors with measurements from a time-of-flight (ToF) camera, and introduce a neural representation based on an image formation model for continuous-wave ToF cameras.  ...  Several works extend these to dynamic scenes captured with monocular video, with promising performance.  ...  Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera. In CVPR, 2020.  ... 
arXiv:2109.15271v2 fatcat:bxm73ltkobaivjrxc4yv2izsoy
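
The abstract references an image formation model for continuous-wave ToF cameras. For context, a hedged sketch of the standard four-bucket model inverted for depth; the phase convention and names are assumptions, not TöRF's actual code:

```python
# Illustrative sketch: recover depth from four correlation measurements
# taken at quadrature phase offsets (0/90/180/270 degrees).
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def cw_tof_depth(m0, m90, m180, m270, f_mod):
    """m0..m270: (H, W) correlation images; f_mod: modulation freq in Hz."""
    phase = np.arctan2(m270 - m90, m0 - m180)  # wrapped phase (convention varies)
    phase = np.mod(phase, 2 * np.pi)           # wrap to [0, 2*pi)
    return C * phase / (4 * np.pi * f_mod)     # ambiguity interval: C / (2 * f_mod)
```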

GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose [article]

Zhichao Yin, Jianping Shi
2018 arXiv   pre-print
We propose GeoNet, a jointly unsupervised learning framework for monocular depth, optical flow and ego-motion estimation from videos.  ...  Specifically, geometric relationships are extracted over the predictions of individual modules and then combined as an image reconstruction loss, reasoning about static and dynamic scene parts separately  ...  The core supervision typically comes from a view synthesis objective based on geometric inferences.  ... 
arXiv:1803.02276v2 fatcat:rcqfxt53qzfehb2uiqoav7w42e
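
The image reconstruction loss described here couples the depth and pose modules through the optical flow that camera motion alone induces on a static scene. A minimal NumPy sketch of that rigid flow, with assumed notation (not the authors' implementation):

```python
# Sketch: optical flow induced by camera motion over a static scene,
# p' = K T D(p) K^{-1} p, flow = p' - p.
import numpy as np

def rigid_flow(depth, K, T):
    """depth: (H, W); K: (3, 3) intrinsics; T: (4, 4) relative camera pose."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], 0).reshape(3, -1)  # homogeneous pixels
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)        # backproject
    cam = np.vstack([cam, np.ones((1, H * W))])
    proj = K @ (T @ cam)[:3]                                   # transform, project
    proj = proj[:2] / proj[2:3]
    return (proj - pix[:2]).reshape(2, H, W)                   # flow field
```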

GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose

Zhichao Yin, Jianping Shi
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
We propose GeoNet, a jointly unsupervised learning framework for monocular depth, optical flow and egomotion estimation from videos.  ...  Specifically, geometric relationships are extracted over the predictions of individual modules and then combined as an image reconstruction loss, reasoning about static and dynamic scene parts separately  ...  The core supervision typically comes from a view synthesis objective based on geometric inferences.  ... 
doi:10.1109/cvpr.2018.00212 dblp:conf/cvpr/YinS18 fatcat:osotve7hgvetdgp6d5hvv2u32q

Unsupervised Learning of Depth and Ego-Motion from Video

Tinghui Zhou, Matthew Brown, Noah Snavely, David G. Lowe
2017 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
We present an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences.  ...  Our method uses single-view depth and multi-view pose networks, with a loss based on warping nearby views to the target using the computed depth and pose.  ...  view synthesis: given one input view of a scene, synthesize a new image of the scene seen from a different camera pose.  ... 
doi:10.1109/cvpr.2017.700 dblp:conf/cvpr/ZhouBSL17 fatcat:eiztmmmcgvdtjchgbcf623woqi

Unsupervised Learning of Depth and Ego-Motion from Video [article]

Tinghui Zhou, Matthew Brown, Noah Snavely, David G. Lowe
2017 arXiv   pre-print
We present an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences.  ...  We achieve this by simultaneously training depth and camera pose estimation networks using the task of view synthesis as the supervisory signal.  ...  Approach View synthesis as supervision The key supervision signal for our depth and pose prediction CNNs comes from the task of novel view synthesis: given one input view of a scene, synthesize a new  ... 
arXiv:1704.07813v2 fatcat:clwvfmnmorh65ifi65b7fh3fuu
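
Both records of this paper describe the same supervisory signal: warp a nearby source frame into the target view using the predicted depth and pose, then penalize the photometric difference. A minimal PyTorch sketch under those assumptions (names are illustrative, not the authors' code):

```python
# Sketch: view synthesis as supervision via differentiable inverse warping.
import torch
import torch.nn.functional as F

def photometric_loss(target, source, proj_uv):
    """target, source: (1, 3, H, W) images; proj_uv: (1, H, W, 2) float
    pixel coordinates of each target pixel reprojected into the source view
    (computed from predicted depth and pose, as in the rigid-flow sketch)."""
    _, _, H, W = target.shape
    grid = proj_uv.clone()                  # normalize to [-1, 1] for grid_sample
    grid[..., 0] = 2.0 * grid[..., 0] / (W - 1) - 1.0
    grid[..., 1] = 2.0 * grid[..., 1] / (H - 1) - 1.0
    warped = F.grid_sample(source, grid, align_corners=True)
    return (target - warped).abs().mean()   # L1; papers often add SSIM terms
```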

Human View Synthesis using a Single Sparse RGB-D Input [article]

Phong Nguyen, Nikolaos Sarafianos, Christoph Lassner, Janne Heikkila, Tony Tung
2021 arXiv   pre-print
Aiming to address these limitations, we present a novel view synthesis framework to generate realistic renders from unseen views of any human captured from a single-view sensor with sparse RGB-D, similar to a low-cost depth camera, and without actor-specific models.  ...  Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera.  ... 
arXiv:2112.13889v2 fatcat:u6e2uuinxra2lnrknjmrcsd6yq

Foreground-aware Dense Depth Estimation for 360 Images

Qi Feng, Hubert P. H. Shum, Ryo Shimamura, Shigeo Morishima
2020 Journal of WSCG  
Figure 1: A demonstration of incorrect representations of dynamic objects (e.g. a running person) captured with an omnidirectional RGB-D scanning device.  ...  A new local depth loss considers small regions of interest and ensures that their depth estimates are not smoothed out during global gradient optimization.  ...  Lately, numerous strategies have been proposed to achieve more coherent and accurate monocular depth estimation.  ... 
doi:10.24132/jwscg.2020.28.10 fatcat:myjvc7kabrgivjs2ljrjms7p7q
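
The local depth loss is described only at a high level; one plausible reading is an extra L1 term restricted to small foreground regions so the global term cannot smooth them away. A hedged sketch (the loss form, masks, and weighting are assumptions, not the paper's definition):

```python
# Sketch: per-region depth loss that keeps small dynamic objects sharp.
import torch

def local_depth_loss(pred, gt, region_masks, weight=1.0):
    """pred, gt: (H, W) depth tensors; region_masks: list of (H, W) bool masks."""
    loss = pred.new_zeros(())
    for m in region_masks:
        if m.any():
            loss = loss + (pred[m] - gt[m]).abs().mean()
    return weight * loss / max(len(region_masks), 1)
```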

Deep 3D Pan via adaptive "t-shaped" convolutions with global and local adaptive dilations [article]

Juan Luis Gonzalez Bello, Munchurl Kim
2019 arXiv   pre-print
We propose a novel network architecture to perform stereoscopic view synthesis at arbitrary camera positions along the X-axis, or Deep 3D Pan, with "t-shaped" adaptive kernels equipped with globally and  ...  Our proposed network architecture, the monster-net, is devised with a novel "t-shaped" adaptive kernel with globally and locally adaptive dilation, which can efficiently incorporate global camera shift  ...  Novel view synthesis is the task of generating a new view seen from a different camera position, given a single or multiple input images, and finds many applications in robotics, navigation, virtual and  ... 
arXiv:1910.01089v3 fatcat:thfflqsambcl7bz5in2gxu6msy

Consistent Video Depth Estimation [article]

Xuan Luo, Jia-Bin Huang, Richard Szeliski, Kevin Matzen, Johannes Kopf
2020 arXiv   pre-print
Our algorithm is able to handle challenging hand-held captured input videos with a moderate degree of dynamic motion.  ...  We present an algorithm for reconstructing dense, geometrically consistent depth for all pixels in a monocular video.  ...  Dense depth estimation facilitates a wide variety of visual effects such as synthetic depth-of-field, novel view synthesis [Hedman et al. 2017; Hedman and Kopf 2018; Shih et al. 2020], and occlusion-aware  ... 
arXiv:2004.15021v2 fatcat:k67ia5ez4rdzxc6phce4uo4hhy
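
For intuition about what "geometrically consistent" means here, a simplified sketch of a pairwise consistency check: depth predicted in frame i, reprojected into frame j, should agree with frame j's depth (form and names assumed, not the paper's exact loss):

```python
# Sketch: pairwise depth-reprojection consistency between video frames.
import numpy as np

def depth_consistency(depth_i, depth_j, K, T_ij):
    """depth_i, depth_j: (H, W); K: (3, 3); T_ij: (4, 4) pose of frame i in j."""
    H, W = depth_i.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], 0).reshape(3, -1)
    cam_i = np.linalg.inv(K) @ pix * depth_i.reshape(1, -1)       # backproject in i
    cam_j = (T_ij @ np.vstack([cam_i, np.ones((1, H * W))]))[:3]  # move to frame j
    proj = K @ cam_j
    uj = np.clip(np.round(proj[0] / proj[2]).astype(int), 0, W - 1)
    vj = np.clip(np.round(proj[1] / proj[2]).astype(int), 0, H - 1)
    return np.abs(cam_j[2] - depth_j[vj, uj]).mean()              # mean disagreement
```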

Multi-view Monocular Depth and Uncertainty Prediction with Deep SfM in Dynamic Environments [article]

Christian Homeyer, Oliver Lange, Christoph Schnörr
2022 arXiv   pre-print
3D reconstruction of depth and motion from monocular video in dynamic environments is a highly ill-posed problem due to scale ambiguities when projecting to the 2D image domain.  ...  This results in cleaner reconstructions both on static and dynamic parts of the scene.  ...  Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera.  ... 
arXiv:2201.08633v1 fatcat:fnangw6thvafhpyuuycvy4ldiu