11 Hits in 5.9 sec

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose [article]

Shih-Yang Su, Frank Yu, Michael Zollhoefer, Helge Rhodin
2021 arXiv   pre-print
This enables learning volumetric body shape and appearance from scratch while jointly refining the articulated pose, all without ground-truth labels for appearance, pose, or 3D shape on the input videos  ...  While deep learning reshaped the classical motion capture pipeline with feed-forward networks, generative models are required to recover fine alignment via iterative refinement.  ...  Appendix C, Additional Pose Refinement Results for A-NeRF: We show additional pose refinement results in Figure A3. A-NeRF estimates human poses that align better with the training images.  ... 
arXiv:2102.06199v3 fatcat:usxlnp6wfrazdp66xrsvfqmdne
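
The snippet above describes jointly refining articulated pose while learning shape and appearance from photometric supervision alone. A minimal PyTorch sketch of that setup (my construction, not the authors' code; TinyRadianceField and render_rays are hypothetical stand-ins for the real model and volume renderer):

import torch
import torch.nn as nn

class TinyRadianceField(nn.Module):
    def __init__(self, in_dim=3, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),        # RGB + density
        )

    def forward(self, x):
        return self.mlp(x)

field = TinyRadianceField()
pose = nn.Parameter(torch.zeros(24, 3))   # per-frame joint angles, refined jointly with the field
opt = torch.optim.Adam(list(field.parameters()) + [pose], lr=5e-4)

def render_rays(field, pose, rays):
    # Placeholder renderer: a real implementation would warp samples into
    # bone-relative coordinates using `pose` before querying the field,
    # then alpha-composite along each ray.
    pts = rays[:, :3] + 0.0 * pose.sum()   # ties `pose` into the graph (toy stand-in for the warp)
    return torch.sigmoid(field(pts)[:, :3])

rays = torch.randn(1024, 6)          # ray origins + directions (dummy batch)
target_rgb = torch.rand(1024, 3)     # observed pixel colours
pred_rgb = render_rays(field, pose, rays)
loss = ((pred_rgb - target_rgb) ** 2).mean()   # photometric loss only; no pose/shape labels
loss.backward()
opt.step()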

DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks [article]

Shih-Yang Su, Timur Bagautdinov, Helge Rhodin
2022 arXiv   pre-print
Deep learning greatly improved the realism of animatable human models by learning geometry and appearance from collections of 3D scans, template meshes, and multi-view imagery.  ...  First, we model correlations of body parts explicitly with a graph neural network.  ...  DANBO shows better robustness and generalization than the surface-free approach A-NeRF.  ... 
arXiv:2205.01666v1 fatcat:bwihm5aawfhbdmqusrxxlwy7wi
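
A minimal sketch of the "correlations of body parts with a graph neural network" idea from the snippet, assuming a SMPL-style 24-joint kinematic tree and a single mean-aggregation message-passing layer (PartGNNLayer is a hypothetical name, not DANBO's module):

import torch
import torch.nn as nn

num_parts, feat_dim = 24, 32
# parent[i] = index of the parent joint; -1 marks the root (SMPL-style tree assumed).
parent = [-1, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 12, 13, 14, 16, 17, 18, 19, 20, 21]

# Symmetric adjacency over the kinematic tree, with self-loops.
adj = torch.eye(num_parts)
for child, par in enumerate(parent):
    if par >= 0:
        adj[child, par] = adj[par, child] = 1.0
deg = adj.sum(dim=1, keepdim=True)

class PartGNNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, adj, deg):
        # Mean aggregation over neighbouring body parts, then a shared linear + nonlinearity.
        return torch.relu(self.lin(adj @ x / deg))

x = torch.randn(num_parts, feat_dim)      # per-part pose/appearance features
x = PartGNNLayer(feat_dim)(x, adj, deg)
print(x.shape)                            # torch.Size([24, 32])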

Advances in Neural Rendering [article]

Ayush Tewari, Justus Thies, Ben Mildenhall, Pratul Srinivasan, Edgar Tretschk, Yifan Wang, Christoph Lassner, Vincent Sitzmann, Ricardo Martin-Brualla, Stephen Lombardi, Tomas Simon, Christian Theobalt (+5 others)
2022 arXiv   pre-print
This state-of-the-art report on advances in neural rendering focuses on methods that combine classical rendering principles with learned 3D scene representations, often now referred to as neural scene  ...  (e.g., from a CT scan), or implicit surface functions (e.g., truncated signed distance fields).  ...  The reconstructed animatable human models can be used for free-viewpoint rendering and re-rendering under novel poses.  ... 
arXiv:2111.05849v2 fatcat:nbvkfg2bjvgqdopdqwl33rt4ii
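
One of the representations the report lists, truncated signed distance fields, can be illustrated with a tiny helper (an assumption for illustration, not code from the report): distances to the surface are clamped to a narrow band around the zero crossing and normalised.

import torch

def tsdf(signed_distance, truncation=0.05):
    """Clamp signed distances to [-truncation, truncation] and normalise to [-1, 1]."""
    return torch.clamp(signed_distance, -truncation, truncation) / truncation

d = torch.tensor([-0.2, -0.01, 0.0, 0.03, 0.4])
print(tsdf(d))   # tensor([-1.0000, -0.2000,  0.0000,  0.6000,  1.0000])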

GNeRF: GAN-based Neural Radiance Field without Posed Camera [article]

Quan Meng, Anpei Chen, Haimin Luo, Minye Wu, Hao Su, Lan Xu, Xuming He, Jingyi Yu
2021 arXiv   pre-print
The first phase brings GANs into the new realm of jointly optimizing coarse camera poses and radiance fields, while the second phase refines them with an additional photometric loss.  ...  Acknowledgements: We would like to thank the anonymous reviewers for their detailed and constructive comments, which were helpful in refining the paper.  ... 
arXiv:2103.15606v3 fatcat:rvxbsvjxynaapckg63sgdxs2wq
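
A toy sketch of the two-phase structure the abstract describes, with camera poses kept as free variables in an adversarial phase and then refined photometrically. The generator and discriminator below are deliberately trivial placeholders, not GNeRF's architecture:

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 3))   # pose -> patch colour (toy generator)
D = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1))   # patch colour -> real/fake logit
poses = nn.Parameter(torch.randn(8, 6))        # 8 unknown camera poses (toy se(3)-style vectors)
real_patches = torch.rand(8, 3)                # observed image patches (dummy)
bce = nn.BCEWithLogitsLoss()

# Phase 1: adversarial training with jointly optimised poses.
opt_g = torch.optim.Adam(list(G.parameters()) + [poses], lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
for _ in range(100):
    fake = G(poses)
    d_loss = bce(D(real_patches), torch.ones(8, 1)) + bce(D(fake.detach()), torch.zeros(8, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    g_loss = bce(D(G(poses)), torch.ones(8, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Phase 2: photometric refinement of the coarse poses and the field.
opt_r = torch.optim.Adam(list(G.parameters()) + [poses], lr=1e-4)
for _ in range(100):
    photo_loss = ((G(poses) - real_patches) ** 2).mean()
    opt_r.zero_grad(); photo_loss.backward(); opt_r.step()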

H-NeRF: Neural Radiance Fields for Rendering and Temporal Reconstruction of Humans in Motion [article]

Hongyi Xu, Thiemo Alldieck, Cristian Sminchisescu
2021 arXiv   pre-print
We present neural radiance fields for rendering and temporal (4D) reconstruction of humans in motion (H-NeRF), as captured by a sparse set of cameras or even from a monocular video.  ...  Our approach combines ideas from neural scene representation, novel-view synthesis, and implicit statistical geometric human representations, coupled using novel loss functions.  ...  Related to our approach, some methods integrate human body models to fuse information over time. A-NeRF [48] uses a skeleton to rigidly transform NeRF features to refine estimated 3D poses.  ... 
arXiv:2110.13746v2 fatcat:4fhzhoqd4ncohj5i4qf6d3lpb4
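
The snippet's remark that A-NeRF uses a skeleton to rigidly transform NeRF features can be sketched as mapping a world-space query point into each bone's local frame before the field is queried (an assumed formulation; to_bone_local is a hypothetical helper):

import torch

def to_bone_local(points, bone_rotations, bone_translations):
    """points: (N, 3); bone_rotations: (B, 3, 3); bone_translations: (B, 3).
    Returns (N, B, 3) bone-local coordinates R^T (p - t) for every bone."""
    rel = points[:, None, :] - bone_translations[None, :, :]                 # (N, B, 3)
    return torch.einsum('bij,nbj->nbi', bone_rotations.transpose(1, 2), rel)

pts = torch.randn(4, 3)
R = torch.eye(3).expand(24, 3, 3)      # identity rotations (dummy rest pose)
t = torch.zeros(24, 3)
print(to_bone_local(pts, R, t).shape)  # torch.Size([4, 24, 3])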

BANMo: Building Animatable 3D Neural Models from Many Casual Videos [article]

Gengshan Yang, Minh Vo, Natalia Neverova, Deva Ramanan, Andrea Vedaldi, Hanbyul Joo
2021 arXiv   pre-print
On real and synthetic datasets, BANMo shows higher-fidelity 3D reconstructions than prior works for humans and animals, with the ability to render realistic images from novel viewpoints and poses.  ...  BANMo builds high-fidelity, articulated 3D models (including shape and animatable skinning weights) from many monocular casual videos in a differentiable rendering framework.  ...  Neural free-view synthesis of human actors with pose control.  ... 
arXiv:2112.12761v2 fatcat:creiz2vswzdozoghhury7g5aza
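
As a rough illustration of "shape and animatable skinning weights" inside a differentiable pipeline, here is a plain linear-blend-skinning step with learnable per-point skinning logits (an assumption for illustration, not BANMo's actual warping model):

import torch

def lbs(points, skin_logits, bone_R, bone_t):
    """points: (N, 3), skin_logits: (N, B), bone_R: (B, 3, 3), bone_t: (B, 3) -> posed points (N, 3)."""
    w = torch.softmax(skin_logits, dim=-1)                                          # (N, B), rows sum to 1
    posed_per_bone = torch.einsum('bij,nj->nbi', bone_R, points) + bone_t[None]     # (N, B, 3)
    return (w[..., None] * posed_per_bone).sum(dim=1)                               # blend over bones

N, B = 5, 25
out = lbs(torch.randn(N, 3), torch.randn(N, B), torch.eye(3).expand(B, 3, 3), torch.zeros(B, 3))
print(out.shape)   # torch.Size([5, 3])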

NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild [article]

Jason Y. Zhang, Gengshan Yang, Shubham Tulsiani, Deva Ramanan
2021 arXiv   pre-print
We demonstrate that surface-based neural reconstructions enable learning from such data, outperforming volumetric neural rendering-based reconstructions.  ...  But because the vast majority of real-world scenes are composed of well-defined surfaces, we introduce a surface analog of such implicit models called Neural Reflectance Surfaces (NeRS).  ...  Neural Surface Representation We represent object shape via a deformation of a unit sphere.  ... 
arXiv:2110.07604v3 fatcat:ykr3j5o6mng6lnu7o6e4u6oulu
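
The snippet's "deformation of a unit sphere" can be sketched as an MLP that offsets points sampled on the sphere to form the surface (a hypothetical minimal version; the NeRS parameterisation additionally models reflectance):

import torch
import torch.nn as nn

deform = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 3))

def sample_sphere(n):
    x = torch.randn(n, 3)
    return x / x.norm(dim=-1, keepdim=True)   # points on the unit sphere

uv = sample_sphere(1024)
surface = uv + deform(uv)    # deformed surface points
print(surface.shape)         # torch.Size([1024, 3])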

NeuVV: Neural Volumetric Videos with Immersive Rendering and Editing [article]

Jiakai Zhang, Liao Wang, Xinhang Liu, Fuqiang Zhao, Minzhang Li, Haizhao Dai, Boyuan Zhang, Wei Yang, Lan Xu, Jingyi Yu
2022 arXiv   pre-print
The core of NeuVV is to efficiently encode a dynamic neural radiance field (NeRF) into renderable and editable primitives.  ...  In this paper, we present a neural volumography technique called neural volumetric video or NeuVV to support immersive, interactive, and spatial-temporal rendering of volumetric video contents with photo-realism  ...  rendered by querying respective rays from the camera via neural inference.  ... 
arXiv:2202.06088v1 fatcat:23qn5ffx6raglmp363hz5iizne
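
The final fragment, "rendered by querying respective rays from the camera via neural inference", boils down to standard volume-rendering compositing along a ray. A minimal sketch under that assumption (composite is a hypothetical helper, not NeuVV's renderer):

import torch

def composite(rgb, sigma, deltas):
    """rgb: (S, 3), sigma: (S,), deltas: (S,) sample spacings -> one pixel colour (3,)."""
    alpha = 1.0 - torch.exp(-sigma * deltas)                                       # per-sample opacity
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)  # transmittance
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(dim=0)

S = 64
print(composite(torch.rand(S, 3), torch.rand(S), torch.full((S,), 0.02)))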

Tracking and Planning with Spatial World Models [article]

Baris Kayalibay, Atanas Mirchev, Patrick van der Smagt, Justin Bayer
2022 arXiv   pre-print
We do this by planning in a learned 3D spatial world model, combined with a pose estimation algorithm previously used in the context of TSDF fusion, but now tailored to our setting and improved to incorporate  ...  We evaluate over six simulated environments based on complex human-designed floor plans and provide quantitative results.  ...  We also note the major runtime difference when rendering from a NeRF map (Mildenhall et al., 2020) at 2.72s (0.4Hz) and from a voxel map (Mirchev et al., 2021) at 0.05s (20Hz) per image (fig. 4(c))  ... 
arXiv:2201.10335v1 fatcat:rfty3jxmqzfenntfinsx2pvfo4
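
The reported frame rates follow directly from the per-image render times; a quick check:

nerf_seconds, voxel_seconds = 2.72, 0.05
print(1 / nerf_seconds)               # ~0.37 Hz, reported as ~0.4 Hz
print(1 / voxel_seconds)              # 20.0 Hz
print(nerf_seconds / voxel_seconds)   # ~54x slower per rendered image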

Multimodal Image Synthesis and Editing: A Survey [article]

Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Lingjie Liu, Adam Kortylewski, Christian Theobalt, Eric Xing
2022 arXiv   pre-print
We then describe multimodal image synthesis and editing approaches extensively, with detailed frameworks including Generative Adversarial Networks (GANs), Auto-regressive models, Diffusion models, Neural  ...  Alongside the new task of constructing a NeRF conditioned on a semantic mask, Chen et al.  ...  A pre-trained CLIP is exploited to supervise the neural human generation, including 3D geometry, texture, and animation.  ... 
arXiv:2112.13592v3 fatcat:46twjhz3hbe6rpm33k6ilnisga
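
The CLIP-supervision sentence can be sketched as pulling rendered images toward a text prompt in a shared embedding space via cosine similarity; the encoder and prompt embedding below are placeholders, not the actual CLIP model:

import torch
import torch.nn as nn

image_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512))  # stand-in for CLIP's image tower
text_embedding = torch.randn(1, 512)                                      # stand-in for an encoded text prompt

def clip_style_loss(rendered, text_emb):
    img_emb = image_encoder(rendered)
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    txt_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    return 1.0 - (img_emb * txt_emb).sum(dim=-1).mean()   # 1 - cosine similarity

rendered = torch.rand(4, 3, 32, 32)   # batch of renders from the generator
print(clip_style_loss(rendered, text_embedding))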

Human pose and stride length estimation

Eric Hedlin
2021
Through analysis of our refiner, we show a flaw inherent in the human body model (the inaccuracy in the typical shape-to-pose regressor, i.e., the joint regressor) for a standard human pose dataset, and show that  ...  We also describe work done in improving the state of the art in human pose estimation. We first propose a pose refinement method that enhances state-of-the-art methods.  ...  Recent work in this area includes A-NeRF [40], which optimizes an initial 3D pose estimate at the same time as a NeRF [27] network variant whose positional encoding is relative to the estimated joints  ... 
doi:10.14288/1.0401772 fatcat:ud7r3ou67va2ze5yonbgu2y7li
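
A minimal sketch of a positional encoding "relative to the estimated joints", assuming the usual NeRF-style sinusoidal features applied to point-to-joint offsets (joint_relative_encoding is a hypothetical helper, not the thesis' exact formulation):

import math
import torch

def joint_relative_encoding(points, joints, n_freqs=4):
    """points: (N, 3), joints: (J, 3) -> (N, J*3*2*n_freqs) Fourier features."""
    rel = points[:, None, :] - joints[None, :, :]                         # (N, J, 3): offsets to each joint
    freqs = (2.0 ** torch.arange(n_freqs, dtype=torch.float32)) * math.pi  # (F,)
    scaled = rel[..., None] * freqs                                        # (N, J, 3, F)
    feats = torch.cat([torch.sin(scaled), torch.cos(scaled)], dim=-1)
    return feats.flatten(start_dim=1)

print(joint_relative_encoding(torch.randn(2, 3), torch.zeros(24, 3)).shape)
# torch.Size([2, 576]) = 2 points x (24 joints * 3 dims * 2 * 4 frequencies)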