Synthetic Depth Transfer for Monocular 3D Object Pose Estimation in the Wild

Yueying Kao, Weiming Li, Qiang Wang, Zhouchen Lin, Wooshik Kim, Sunghoon Hong
2020 Proceedings of the AAAI Conference on Artificial Intelligence  
In this paper, we aim at extracting RGB and depth features from a single RGB image with the help of synthetic RGB-depth image pairs for object pose estimation.  ...  Monocular object pose estimation is an important yet challenging computer vision problem. Depth features can provide useful information for pose estimation.  ...  the metrics for 3D pose estimation, owing to the fusion of RGB and transferred depth features from synthetic data.  ... 
doi:10.1609/aaai.v34i07.6781 fatcat:qk6cvllr5rg7zctlb4rh63fytq

Two-hand Global 3D Pose Estimation Using Monocular RGB [article]

Fanqing Lin, Connor Wilhelm, Tony Martinez
2020 arXiv   pre-print
To train the CNNs for this new task, we introduce a large-scale synthetic 3D hand pose dataset.  ...  We tackle the challenging task of estimating global 3D joint locations for both hands via only monocular RGB input images.  ...  Note that 3D pose estimation using RGB data is much more challenging due to the lack of depth information and the additional noise from images in the wild.  ... 
arXiv:2006.01320v4 fatcat:axg2q7zwqjdw5ku47km3kyu4hm

Leaving Flatland: Advances in 3D behavioral measurement [article]

Jesse D. Marshall, Tianqing Li, Joshua H. Wu, Timothy W. Dunn
2021 arXiv   pre-print
Existing 3D measurement techniques draw on specialized hardware, such as motion capture or depth cameras, as well as deep multi-view and monocular computer vision.  ...  Animals move in three dimensions (3D). Thus, 3D measurement is necessary to report the true kinematics of animal movement.  ...  Transfer learning refers to when a network pretrained on one dataset for a specific task, for instance human 3D pose estimation, performs better on a different task, such as rat 3D pose estimation  ... 
arXiv:2112.01987v1 fatcat:z5dqj47s3fbt3d4t75bnsp2l6q

Customer Gaze Estimation in Retail Using Deep Learning

Shashimal Senarath, Primesh Pathirana, Dulani Meedeniya, Sampath Jayarathna
2022 IEEE Access  
In the first stage, we design a mechanism to estimate the 3D gaze of the subject using image data and monocular depth estimation.  ...  The second stage presents a novel three-attention mechanism to estimate the gaze in the wild from field-of-view, depth range, and object channel attentions.  ...  DEPTH BASED Face3D MODEL As shown in Fig. 5, the depth based Face3D Model is a depth-based approach that uses monocular depth estimation and 3D gaze estimation for gaze estimation.  ... 
doi:10.1109/access.2022.3183357 fatcat:5gokugxk3vfqzorddawtvr4uy4

RealMonoDepth: Self-Supervised Monocular Depth Estimation for General Scenes [article]

Mertalp Ocal, Armin Mustafa
2020 arXiv   pre-print
In this paper, we introduce RealMonoDepth, a self-supervised monocular depth estimation approach which learns to estimate the real scene depth for a diverse range of indoor and outdoor scenes.  ...  Existing supervised methods for monocular depth estimation require accurate depth measurements for training.  ...  Hence these supervised methods require ground-truth depth or synthetic data for monocular depth estimation on real scenes.  ... 
arXiv:2004.06267v1 fatcat:aljxoq7zpbg33ihkbau4uqmwwq

Deep Learning Methods for 3D Human Pose Estimation under Different Supervision Paradigms: A Survey

Dejun Zhang, Yiqi Wu, Mingyue Guo, Yilin Chen
2021 Electronics  
Inspired by the remarkable achievements in learning-based 2D human pose estimation, numerous research studies are devoted to the topic of 3D human pose estimation via deep learning methods.  ...  The literature is reviewed, along with the general pipeline of 3D human pose estimation, which consists of human body modeling, learning-based pose estimation, and regularization for refinement.  ...  This method realizes transferring 3D annotation from indoor images to in-the-wild images that could be a big step for investigating weakly-supervised 3D human pose estimation. Kanazawa et al.  ... 
doi:10.3390/electronics10182267 fatcat:ajnizu776ncpto3jvyh3zye2si

Monocular Depth Estimation Based On Deep Learning: An Overview [article]

Chaoqiang Zhao, Qiyu Sun, Chongzhen Zhang, Yang Tang, Feng Qian
2020 arXiv   pre-print
Finally, we discuss the challenges and provide some ideas for future research in monocular depth estimation.  ...  Meanwhile, the predicted depth maps are sparse. Inferring depth information from a single image (monocular depth estimation) is an ill-posed problem.  ...  Similarly, conditional GAN is also used in [79] for monocular depth estimation.  ... 
arXiv:2003.06620v1 fatcat:l5ei3ognova6xkyppflef5nqsq

Monocular 3D Human Pose Estimation In The Wild Using Improved CNN Supervision [article]

Dushyant Mehta, Helge Rhodin, Dan Casas, Pascal Fua, Oleksandr Sotnychenko, Weipeng Xu, Christian Theobalt
2017 arXiv   pre-print
All in all, we argue that the use of transfer learning of representations in tandem with algorithmic and data contributions is crucial for general 3D body pose estimation.  ...  Using only the existing 3D pose data and 2D pose data, we show state-of-the-art performance on established benchmarks through transfer of learned features, while also generalizing to in-the-wild scenes  ...  Due to the similarity of the tasks, features learned for 2D pose estimation on in-the-wild MPII and LSP training sets can be transferred to 3D pose estimation.  ... 
arXiv:1611.09813v5 fatcat:uka37jvzbzefzow7td73phrfly

Single-Shot 3D Mesh Estimation via Adversarial Domain Adaptation

Arthita Ghosh, Rama Chellappa
2019 SN Computer Science  
We propose a novel deep architecture for 3D pose estimation and leverage the variations in pose, body shape and background in the synthetic datasets to train our network.  ...  Existing datasets of in-the-wild images of humans have limited availability of 3D ground truth.  ...  Acknowledgements This project was supported by the Intelligence  ... 
doi:10.1007/s42979-019-0025-9 fatcat:qnwrim2htvbkjdk27nroz3rvma

In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations [article]

Ikhsanul Habibie, Weipeng Xu, Dushyant Mehta, Gerard Pons-Moll, Christian Theobalt
2019 arXiv   pre-print
We therefore propose a new deep learning based method for monocular 3D human pose estimation that shows high accuracy and generalizes better to in-the-wild scenes.  ...  Convolutional Neural Network based approaches for monocular 3D human pose estimation usually require a large amount of training images with 3D pose annotations.  ...  In this way, networks carry over features useful for in-the-wild 2D for better 3D pose estimation in out-of-studio settings.  ... 
arXiv:1904.03289v1 fatcat:myvn2qwj5fc6tmjk5wg2bqyi34

In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations

Ikhsanul Habibie, Weipeng Xu, Dushyant Mehta, Gerard Pons-Moll, Christian Theobalt
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
We therefore propose a new deep learning based method for monocular 3D human pose estimation that shows high accuracy and generalizes better to in-the-wild scenes.  ...  Convolutional Neural Network based approaches for monocular 3D human pose estimation usually require a large amount of training images with 3D pose annotations.  ...  In this way, networks carry over features useful for in-the-wild 2D for better 3D pose estimation in out-of-studio settings.  ... 
doi:10.1109/cvpr.2019.01116 dblp:conf/cvpr/HabibieXMPT19 fatcat:5h5n5puhlvfz7p52duegjtvaz4

Self-supervised Learning with Geometric Constraints in Monocular Video: Connecting Flow, Depth, and Camera [article]

Yuhua Chen, Cordelia Schmid, Cristian Sminchisescu
2019 arXiv   pre-print
We present GLNet, a self-supervised framework for learning depth, optical flow, camera pose and intrinsic parameters from monocular video - addressing the difficulty of acquiring realistic ground-truth  ...  We also show good generalization for transfer learning in YouTube videos.  ...  Methodology Overview Our Geometric Learning Network (GLNet), for which an overview is presented in fig. 1, solves the inter-related tasks of monocular depth prediction, optical flow, camera pose and  ... 
arXiv:1907.05820v2 fatcat:6ithess5zbdwlcqjga3ko25eru

FML: Face Model Learning From Videos

Ayush Tewari, Florian Bernard, Pablo Garrido, Gaurav Bharaj, Mohamed Elgharib, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Since image data is a 2D projection of a 3D face, the resulting depth ambiguity makes the problem ill-posed.  ...  We propose multi-frame self-supervised training of a deep network based on in-the-wild video data for jointly learning a face model and 3D face reconstruction.  ...  Acknowledgements We thank True-VisionSolutions Pty Ltd for providing the 2D face tracker, and the authors of [12, 48, 52, 62] for the comparisons.  ... 
doi:10.1109/cvpr.2019.01107 dblp:conf/cvpr/TewariB0BESPZT19 fatcat:6gf5b75bkzbldhzbyqnun4okzm

FML: Face Model Learning from Videos [article]

Ayush Tewari, Florian Bernard, Pablo Garrido, Gaurav Bharaj, Mohamed Elgharib, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt
2019 arXiv   pre-print
Since image data is a 2D projection of a 3D face, the resulting depth ambiguity makes the problem ill-posed.  ...  Our face model is learned using only corpora of in-the-wild video clips collected from the Internet. This virtually endless source of training data enables learning of a highly general 3D face model.  ...  Acknowledgements: We thank True-VisionSolutions Pty Ltd for providing the 2D face tracker, and the authors of [12, 48, 52, 62] for the comparisons.  ... 
arXiv:1812.07603v2 fatcat:mdnemyu7xjf5lbhszg5i3e53fu

Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation [article]

Jogendra Nath Kundu, Siddharth Seth, Rahul M V, Mugalodi Rakesh, R. Venkatesh Babu, Anirban Chakraborty
2020 arXiv   pre-print
Estimation of 3D human pose from monocular image has gained considerable attention, as a key step to several human-centric applications.  ...  However, generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable, as these models often perform unsatisfactorily on unseen in-the-wild  ...  This work was supported by a Wipro PhD Fellowship (Jogendra) and in part by DST, Govt. of India (DST/INT/UK/P-179/2017).  ... 
arXiv:2006.14107v1 fatcat:r3c3m6ugx5bxhfrepsake366t4