8,143 Hits in 6.6 sec

Learning to Regress Bodies from Images using Differentiable Semantic Rendering [article]

Sai Kumar Dwivedi, Nikos Athanasiou, Muhammed Kocabas, Michael J. Black
2022 arXiv   pre-print
To do so, we train a body regressor using a novel Differentiable Semantic Rendering (DSR) loss. ... Learning to regress 3D human body shape and pose (e.g. ... Learning to Regress Bodies from Images using Differentiable Semantic Rendering: Supplementary Material A. ...
arXiv:2110.03480v2 fatcat:6oi25y5pw5bgdbhzunz47rh2ny

Human Parsing Based Texture Transfer from Single Image to 3D Human via Cross-View Consistency

Fang Zhao, Shengcai Liao, Kaihao Zhang, Ling Shao
2020 Neural Information Processing Systems  
This paper proposes a human parsing based texture transfer model via cross-view consistency learning to generate the texture of a 3D human body from a single image. ... We use the semantic parsing of the human body as input, providing both shape and pose information to reduce the appearance variation of the human image and preserve the spatial distribution of semantic ... Finally, we adopt NMR [19], a differentiable renderer, to render the human body mesh M with the predicted texture image I_uv to obtain the rendered image: I_rend = R(M, I_uv) (Eq. 2). Model Learning via ...
dblp:conf/nips/ZhaoLZ020 fatcat:kxbs5ykhzrbf3eafeynmsdwr64
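The rendering step quoted in this abstract, I_rend = R(M, I_uv), reduces to a texture lookup once the rasterizer has mapped mesh M to per-pixel UV coordinates. A minimal numpy sketch of that bilinear UV lookup follows; the function name and interface are illustrative, not NMR's actual API:

```python
import numpy as np

def sample_texture(texture, uv):
    """Bilinearly sample an (H, W, 3) texture image at continuous
    UV coordinates in [0, 1]. This is the texture-lookup half of
    I_rend = R(M, I_uv): the rasterizer supplies per-pixel UVs,
    and the lookup fills in colors."""
    h, w, _ = texture.shape
    # Continuous pixel coordinates of each UV sample.
    x = uv[..., 0] * (w - 1)
    y = uv[..., 1] * (h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    # Bilinear weights keep the lookup differentiable in uv.
    wx, wy = x - x0, y - y0
    top = (1 - wx)[..., None] * texture[y0, x0] + wx[..., None] * texture[y0, x1]
    bot = (1 - wx)[..., None] * texture[y1, x0] + wx[..., None] * texture[y1, x1]
    return (1 - wy)[..., None] * top + wy[..., None] * bot
```

Because the output is a smooth function of the UV coordinates, gradients can flow back through the lookup, which is what makes the cross-view consistency training in the paper possible.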

DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare [article]

Yuanlu Xu, Song-Chun Zhu, Tony Tung
2019 arXiv   pre-print
We present DenseRaC, a novel end-to-end framework for jointly estimating 3D human pose and body shape from a monocular RGB image.  ...  Our model jointly learns to represent the 3D human body from hybrid datasets, mitigating the problem of unpaired training data.  ...  We would like to thank Tengyu Liu and Elan Markowitz for helping with data collection, Tuur Jan M Stuyck and Aaron Ferguson for cloth simulation, Natalia Neverova and colleagues at FRL, FAIR and UCLA for  ... 
arXiv:1910.00116v2 fatcat:u67kndqjqneufc6rkatpgshkhq

DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare

Yuanlu Xu, Song-Chun Zhu, Tony Tung
2019 IEEE/CVF International Conference on Computer Vision (ICCV)
We present DenseRaC, a novel end-to-end framework for jointly estimating 3D human pose and body shape from a monocular RGB image.  ...  Our model jointly learns to represent the 3D human body from hybrid datasets, mitigating the problem of unpaired training data.  ...  We would like to thank Tengyu Liu and Elan Markowitz for helping with data collection, Tuur Jan M Stuyck and Aaron Ferguson for cloth simulation, Natalia Neverova and colleagues at FRL, FAIR and UCLA for  ... 
doi:10.1109/iccv.2019.00785 dblp:conf/iccv/XuZT19 fatcat:u3c3lz62ezcyhev3fcbooo6lem

Learning Motion-Dependent Appearance for High-Fidelity Rendering of Dynamic Humans from a Single Camera [article]

Jae Shin Yoon, Duygu Ceylan, Tuanfeng Y. Wang, Jingwan Lu, Jimei Yang, Zhixin Shu, Hyun Soo Park
2022 arXiv   pre-print
We model an equivariant encoder that can generate the generalizable representation from the spatial and temporal derivatives of the 3D body surface. ... This learned representation is decoded by a compositional multi-task decoder that renders high-fidelity, time-varying appearance. ... Acknowledgement We would like to thank Julien Philip for providing useful feedback on our paper draft. Jae Shin Yoon is supported by a Doctoral Dissertation Fellowship from the University of Minnesota. ...
arXiv:2203.12780v1 fatcat:i2o7bdrgrrav5ajox6i6nzj3xe

MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

Ayush Tewari, Michael Zollhofer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Perez, Christian Theobalt
2017 IEEE International Conference on Computer Vision Workshops (ICCVW)
Due to this new way of combining CNN-based with model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image.  ...  The trained encoder predicts these parameters from a single monocular image, all at once.  ...  We also thank Anh Tuan Tran and colleagues for making their source code publicly available, and Elad Richardson for running his approach on our images.  ... 
doi:10.1109/iccvw.2017.153 dblp:conf/iccvw/TewariZK0BPT17 fatcat:fduddae62zfi7l4ptey6ao5ase

Semantics-Aligned Representation Learning for Person Re-identification [article]

Xin Jin, Cuiling Lan, Wenjun Zeng, Guoqiang Wei, Zhibo Chen
2020 arXiv   pre-print
This is a challenging task, as the images to be matched are generally semantically misaligned due to the diversity of human poses and capture viewpoints, incompleteness of the visible bodies (due to occlusion  ...  In this paper, we propose a framework that drives the reID network to learn semantics-aligned feature representation through delicate supervision designs.  ...  To encourage the SA-Enc to learn semantically aligned features, the SA-Dec is introduced and used to regress/generate the densely semantically aligned full texture image (also referred to as texture image  ... 
arXiv:1905.13143v3 fatcat:sezzgtm32rhilixol3yamffwnm

MoFA: Model-Based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction

Ayush Tewari, Michael Zollhofer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Perez, Christian Theobalt
2017 IEEE International Conference on Computer Vision (ICCV)
Due to this new way of combining CNN-based with model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image.  ...  The trained encoder predicts these parameters from a single monocular image, all at once.  ...  Our model-based decoder is fully differentiable and allows end-to-end learning of our network.  ... 
doi:10.1109/iccv.2017.401 dblp:conf/iccv/TewariZK0BPT17 fatcat:sxazzew3uva2jgmnvydkx5fwim

Semantics-Aligned Representation Learning for Person Re-Identification

Xin Jin, Cuiling Lan, Wenjun Zeng, Guoqiang Wei, Zhibo Chen
2020 Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
This is a challenging task, as the images to be matched are generally semantically misaligned due to the diversity of human poses and capture viewpoints, incompleteness of the visible bodies (due to occlusion  ...  In this paper, we propose a framework that drives the reID network to learn semantics-aligned feature representation through delicate supervision designs.  ...  To encourage the SA-Enc to learn semantically aligned features, the SA-Dec is introduced and used to regress/generate the densely semantically aligned full texture image (also referred to as texture image  ... 
doi:10.1609/aaai.v34i07.6775 fatcat:aivysxbepzcafieuwnwxrmmoyu

MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction [article]

Ayush Tewari, Michael Zollhöfer, Hyeongwoo Kim, Pablo Garrido, Florian Bernard, Patrick Pérez, Christian Theobalt
2017 arXiv   pre-print
Due to this new way of combining CNN-based with model-based face reconstruction, the CNN-based encoder learns to extract semantically meaningful parameters from a single monocular input image.  ...  The core innovation is our new differentiable parametric decoder that encapsulates image formation analytically based on a generative model.  ...  Our model-based decoder is fully differentiable and allows end-to-end learning of our network.  ... 
arXiv:1703.10580v2 fatcat:cneur2vcw5hkfefi3dukcpk4xm

State of the Art on Neural Rendering [article]

Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello (+7 others)
2020 arXiv   pre-print
Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering  ...  This state-of-the-art report is focused on the many important use cases for the described algorithms such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting,  ...  Other approaches reconstruct a learned representation of the scene from the observations, learning it end-to-end with a differentiable renderer.  ... 
arXiv:2004.03805v1 fatcat:6qs7ddftkfbotdlfd4ks7llovq

Neural Descent for Visual 3D Human Pose and Shape [article]

Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu
2021 arXiv   pre-print
expensive state gradient descent in order to accurately minimize a semantic differentiable rendering loss at test time. ... Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids both second-order differentiation when training the model parameters, and ... [28] uses a neural network to directly regress the pose and shape parameters of a 3D body model from predicted body semantic segmentation. ...
arXiv:2008.06910v2 fatcat:w6iefov325bz5cfjmvi77vzgme

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows [article]

Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, Cristian Sminchisescu
2020 arXiv   pre-print
Our formulation is based on kinematic latent normalizing flow representations and dynamics, as well as differentiable, semantic body part alignment loss functions that support self-supervised learning. ... Monocular 3D human pose and shape estimation is challenging due to the many degrees of freedom of the human body and the difficulty of acquiring training data for large-scale supervised learning in complex ... Differentiable Semantic Alignment Loss In order to be able to efficiently learn using weak supervision (e.g. just images of people), one needs a measure of prediction quality during the different phases ...
arXiv:2003.10350v2 fatcat:c6gh2fxydve6hktgemewiutl4e
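One common reading of a differentiable semantic body-part alignment loss is a soft intersection-over-union between a rendered part probability map and a target segmentation mask. The sketch below illustrates that idea in plain numpy; it is an assumption-laden stand-in, not this paper's actual loss:

```python
import numpy as np

def soft_iou_loss(pred, target, eps=1e-6):
    """Soft IoU between a predicted part probability map `pred`
    (values in [0, 1], e.g. from a differentiable renderer) and a
    binary part mask `target`. As a ratio of sums of products, it
    is differentiable in `pred`, so it can drive self-supervised
    alignment of a rendered body against a 2D segmentation."""
    inter = (pred * target).sum()
    union = (pred + target - pred * target).sum()
    return 1.0 - inter / (union + eps)

def semantic_alignment_loss(pred_parts, target_parts):
    """Average the soft IoU over body parts (one map per part)."""
    return np.mean([soft_iou_loss(p, t)
                    for p, t in zip(pred_parts, target_parts)])
```

A perfectly aligned prediction drives the loss toward 0, while disjoint masks drive it toward 1, which is the gradient signal weak supervision from images alone relies on.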

EllipBody: A Light-weight and Part-based Representation for Human Pose and Shape Recovery [article]

Min Wang, Feng Qiu, Wentao Liu, Chen Qian, Xiaowei Zhou, Lizhuang Ma
2020 arXiv   pre-print
To further improve the efficiency of the task, we propose a light-weight body model called EllipBody, which uses ellipsoids to represent each body part. ... To better utilize the 3D information contained in part segmentation, we propose a part-level differentiable renderer which models occlusion between parts explicitly. ... Body part segmentation, which provides critical semantic information, is rarely mentioned. The reason is that rendering a 3D model to a 2D image is hard to make differentiable. ...
arXiv:2003.10873v1 fatcat:bwp3nfnjyfflvgav37onhvhahe
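The ellipsoid-per-part representation described above rests on a simple primitive: a point-in-ellipsoid test. A short numpy illustration under stated assumptions (names and interface are hypothetical, not the paper's code):

```python
import numpy as np

def ellipsoid_inside(points, center, radii, rot=None):
    """Test whether 3D points lie inside an ellipsoid with
    semi-axis lengths `radii` centered at `center`; `rot` is an
    optional 3x3 rotation from world to part frame. An EllipBody-
    style model attaches one such ellipsoid to each body part."""
    local = points - center
    if rot is not None:
        # Rotate each (row-vector) point into the part frame.
        local = local @ rot.T
    # Normalized quadratic form: <= 1 means inside the ellipsoid.
    return np.sum((local / radii) ** 2, axis=-1) <= 1.0
```

Replacing the hard <= test with a smooth sigmoid of the quadratic form is one standard way to make such a part occupancy function differentiable for rendering-based losses.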

Learning to Reconstruct People in Clothing From a Single RGB Camera

Thiemo Alldieck, Marcus Magnor, Bharat Lal Bhatnagar, Christian Theobalt, Gerard Pons-Moll
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Figure 1: We present a deep-learning-based approach to estimate personalized body shape, including hair and clothing, using a single RGB camera. ... Learning relies only on synthetic 3D data. Once learned, Octopus can take a variable number of frames as input, and is able to reconstruct shapes even from a single image with an accuracy of 5mm. ... We would like to thank Twindom for providing us with the scan data. Thanks also to Verica Lazova for great help in data processing. ...
doi:10.1109/cvpr.2019.00127 dblp:conf/cvpr/AlldieckMBTP19 fatcat:ehowolzbcbh5zonhlcvrsppwme
Showing results 1-15 of 8,143