2,409 Hits in 5.0 sec

Monocular 3D human pose estimation using sparse motion features

Ben Daubney, David Gibson, Neill Campbell
2009 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops  
In this paper we demonstrate that the motion of a sparse set of tracked features can be used to extract 3D pose from a single viewpoint.  ...  We use low-level part detectors consisting of 3D motion models; these describe probabilistically how well the observed motion of a tracked feature fits each model.  ...  In this paper we describe a method to estimate 3D pose from a monocular camera using sparse and noisy motion features; we believe this to be the first attempt to do so and that it represents a significant step  ... 
doi:10.1109/iccvw.2009.5457586 dblp:conf/iccvw/DaubneyGC09 fatcat:chb5tz47bbhclkd2ijhpfd2ehy

MonoPerfCap: Human Performance Capture from Monocular Video [article]

Weipeng Xu, Avishek Chatterjee, Michael Zollhöfer, Helge Rhodin, Dushyant Mehta, Hans-Peter Seidel, Christian Theobalt
2018 arXiv   pre-print
We tackle these challenges by a novel approach that employs sparse 2D and 3D human pose detections from a convolutional neural network using a batch-based pose estimation strategy.  ...  We present the first marker-less approach for temporally coherent 3D performance capture of a human with general clothing from monocular video.  ... 
arXiv:1708.02136v2 fatcat:toewmmbynnbppmxsop43d4xk3e

DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes [article]

Dongki Jung, Jaehoon Choi, Yonghan Lee, Deokhwa Kim, Changick Kim, Dinesh Manocha, Donghwan Lee
2021 arXiv   pre-print
Our network leverages RGB images and sparse depth maps generated from traditional 3D reconstruction methods to estimate dense depth maps.  ...  We use two constraints to handle depth for non-rigidly moving people without tracking their motion explicitly.  ...  We observe that a joint training framework of pose and depth from a monocular video is extremely difficult because pose networks often fail to estimate proper camera ego-motion in complex and crowded  ... 
arXiv:2108.05615v1 fatcat:xqr72w6pnfhzzcfqp2wciqhsne

CamOdoCal: Automatic intrinsic and extrinsic calibration of a rig with multiple generic cameras and odometry

Lionel Heng, Bo Li, Marc Pollefeys
2013 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems  
For robustness, vision applications tend to use wheel odometry as a strong prior for camera pose estimation, and in these cases, an accurate extrinsic calibration is required in addition to an accurate  ...  The extrinsic calibration is unsupervised, uses natural features, and only requires the vehicle to be driven around for a short time.  ...  Monocular VO We run monocular VO for each camera in order to obtain a set of camera motions, which is required for the subsequent step of computing an initial estimate of the extrinsics.  ... 
doi:10.1109/iros.2013.6696592 dblp:conf/iros/HengLP13 fatcat:crjxrd7cuzefblfqek3mr6np3e

PL-SVO: Semi-direct Monocular Visual Odometry by combining points and line segments

Ruben Gomez-Ojeda, Jesus Briales, Javier Gonzalez-Jimenez
2016 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)  
Most approaches to visual odometry estimate the camera motion based on point features; consequently, their performance deteriorates in low-textured scenes where it is difficult to find a reliable set of  ...  This paper extends a popular semi-direct approach to monocular visual odometry known as SVO [1] to work with line segments, hence obtaining a more robust system capable of dealing with both textured and  ...  using the 3D warping provided by the known 3D features.  ... 
doi:10.1109/iros.2016.7759620 dblp:conf/iros/Gomez-OjedaBJ16 fatcat:fiksk6rg6ndojoldf6jkio27aq

Lightweight Multi-person Total Motion Capture Using Sparse Multi-view Cameras [article]

Yuxiang Zhang, Zhe Li, Liang An, Mengcheng Li, Tao Yu, Yebin Liu
2021 arXiv   pre-print
To overcome the challenges above, we contribute a lightweight total motion capture system for multi-person interactive scenarios using only sparse multi-view cameras.  ...  The results and experiments show that our method achieves more accurate results than existing methods under sparse-view setups.  ...  Even though these methods are able to capture 3D human poses using skeletons, they cannot reconstruct full body behaviours, i.e., facial expressions, hand motions, and body surfaces. 3D Hand Reconstruction  ... 
arXiv:2108.10378v1 fatcat:3sqrnawevrcwfozjjyuidlz42u

Self-Supervised 3D Keypoint Learning for Ego-motion Estimation [article]

Jiexiong Tang, Rares Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Patric Jensfelt, Adrien Gaidon
2020 arXiv   pre-print
We jointly learn keypoint and depth estimation networks by combining appearance and geometric matching via a differentiable structure-from-motion module based on Procrustean residual pose correction.  ...  We describe how our self-supervised keypoints can be integrated into state-of-the-art visual odometry frameworks for robust and accurate ego-motion estimation of autonomous vehicles in real-world conditions  ...  To alleviate the limitation of traditional PnP and allow end-to-end learning, we show how the initial pose estimate can be used to derive a 3D loss based on 3D-3D correspondences.  ... 
arXiv:1912.03426v3 fatcat:sxfvfyh75beivbiuzvb7prx2gq
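The "Procrustean residual pose correction" this entry mentions refers to recovering a rigid transform from 3D-3D correspondences, which has a well-known closed-form solution (the Kabsch/Procrustes algorithm). A minimal NumPy sketch of that general technique, not the paper's actual implementation (the function name and interface are illustrative):

```python
import numpy as np

def procrustes_pose(X, Y):
    """Closed-form rigid transform (R, t) minimizing ||R @ X + t - Y||
    over 3D-3D correspondences (Kabsch algorithm).

    X, Y: (3, N) arrays of corresponding 3D points.
    """
    # Center both point sets on their centroids.
    cx, cy = X.mean(axis=1, keepdims=True), Y.mean(axis=1, keepdims=True)
    # SVD of the cross-covariance matrix gives the optimal rotation.
    H = (Y - cy) @ (X - cx).T
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt))  # guard against a reflection
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    t = cy - R @ cx
    return R, t
```

In a keypoint pipeline like the one described, such a solver would refine an initial pose using matched 3D points from the depth network.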

Mobile Robot Simultaneous Localization and Mapping Based on a Monocular Camera

Songmin Jia, Ke Wang, Xiuzhi Li
2016 Journal of Robotics  
In the tracking thread, a ground-feature-based pose estimation method is employed to initialize the algorithm for the constrained motion of the mobile robot.  ...  This paper proposes a novel monocular vision-based SLAM (Simultaneous Localization and Mapping) algorithm for a mobile robot.  ...  It is worth noting that the projection of a 3D point in the sparse map is generally inconsistent with the corresponding feature point due to inaccuracies in feature finding, pose estimation, and so  ... 
doi:10.1155/2016/7630340 fatcat:udba7l2qm5c45drafo2vr2w5ma

Trajectory planning for monocular SLAM based exploration system

Sarthak Upadhyay, Ayush Dewan, Arun Kumar Singh, Madhava Krishna
2015 Proceedings of the 2015 Conference on Advances In Robotics - AIR '15  
In VSLAM, the objective is to estimate the trajectory of the camera and simultaneously identify 3D feature points and build a map, using the camera as a depth sensor.  ...  As a consequence of this motion planning framework, we are able to automate SLAM and generate automated monocular SLAM maps of an indoor lab area.  ...  The camera pose, estimated using a motion model, is updated by the Tracker, and the subsequent triangulation of 3D map points depends on the pose of the camera.  ... 
doi:10.1145/2783449.2783476 dblp:conf/air/UpadhyayDSK15 fatcat:ghvo57pwzncwfoghlypvylvuhm
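The triangulation of 3D map points from known camera poses referred to in this entry is commonly solved with the linear DLT method. A minimal NumPy sketch of that standard technique, assuming normalized image coordinates (names are illustrative, not from the paper):

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: (3, 4) camera projection matrices.
    x1, x2: 2D image points (normalized coordinates).
    Returns the 3D point in non-homogeneous coordinates.
    """
    # Each view contributes two linear constraints on the homogeneous point.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

In a monocular SLAM tracker, such a routine would be called on feature matches between keyframes once their relative pose is known.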

Multi-Sensor Fusion Self-Supervised Deep Odometry and Depth Estimation

Yingcai Wan, Qiankun Zhao, Cheng Guo, Chenlong Xu, Lijing Fang
2022 Remote Sensing  
We then join deep visual-inertial odometry (DeepVIO) with depth estimation by using the sparse depth and pose from the DeepVIO pipeline to align the scale of the depth prediction with the triangulated point  ...  to produce the sparse depth and pose with absolute scale.  ...  Perspective-n-Point (PnP) is used to solve the camera pose given 3D-2D correspondences when the camera motion is pure rotation or the camera translation is tiny.  ... 
doi:10.3390/rs14051228 fatcat:srqcx7oo4fhztjq4qrutqosaau
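Aligning the scale of a monocular depth prediction with triangulated sparse depth, as this entry describes, is often done with a simple median-ratio heuristic. A hedged NumPy sketch of that common approach, not necessarily the paper's exact procedure:

```python
import numpy as np

def align_scale(pred_depth, sparse_depth):
    """Rescale a scale-ambiguous depth prediction to metric sparse depth.

    pred_depth:   dense predicted depth map (any consistent scale).
    sparse_depth: same shape, metric depth at triangulated points, 0 elsewhere.
    Uses the median ratio over valid pixels for robustness to outliers.
    """
    mask = sparse_depth > 0
    scale = np.median(sparse_depth[mask] / pred_depth[mask])
    return scale * pred_depth
```

The median (rather than a least-squares fit) keeps a few bad triangulations from skewing the recovered global scale.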

EventCap: Monocular 3D Capture of High-Speed Human Motions using an Event Camera [article]

Lan Xu, Weipeng Xu, Vladislav Golyanik, Marc Habermann, Lu Fang and Christian Theobalt
2019 arXiv   pre-print
In this paper, we propose EventCap --- the first approach for 3D capturing of high-speed human motions using a single event camera.  ...  Our method combines model-based optimization and CNN-based human pose detection to capture high-frequency motion details and to reduce the drifting in the tracking.  ...  Second, we estimate the 3D motion of the human actor using a batch-based optimization algorithm.  ... 
arXiv:1908.11505v1 fatcat:wr36edgfuncpxkl3spr4pbiqcu

EventCap: Monocular 3D Capture of High-Speed Human Motions Using an Event Camera

Lan Xu, Weipeng Xu, Vladislav Golyanik, Marc Habermann, Lu Fang, Christian Theobalt
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
In this paper, we propose EventCap, the first approach for 3D capturing of high-speed human motions using a single event camera.  ...  Our method combines model-based optimization and CNN-based human pose detection to capture high-frequency motion details and to reduce the drifting in the tracking.  ...  Second, we estimate the 3D motion of the human actor using a batch-based optimization algorithm.  ... 
doi:10.1109/cvpr42600.2020.00502 dblp:conf/cvpr/XuXGHFT20 fatcat:v3j6xqdmcjei5gd6tm7mov7kje

Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [article]

Yana Hasson, Bugra Tekin, Federica Bogo, Ivan Laptev, Marc Pollefeys, Cordelia Schmid
2020 arXiv   pre-print
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy by leveraging information from neighboring  ...  Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.  ...  Then, we focus on methods using motion and photometric cues for self-supervision, in particular in the context of human body pose estimation. Hand and object pose estimation.  ... 
arXiv:2004.13449v1 fatcat:gky7kjfxkfa4plfoh7be6icn6y

Leveraging Photometric Consistency Over Time for Sparsely Supervised Hand-Object Reconstruction

Yana Hasson, Bugra Tekin, Federica Bogo, Ivan Laptev, Marc Pollefeys, Cordelia Schmid
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy by leveraging information from neighboring  ...  Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.  ...  Then, we focus on methods using motion and photometric cues for self-supervision, in particular in the context of human body pose estimation. Hand and object pose estimation.  ... 
doi:10.1109/cvpr42600.2020.00065 dblp:conf/cvpr/HassonTBLPS20 fatcat:unt2mamyovbibml27ors23p4ru

Joint Spatio-temporal Depth Features Fusion Framework for 3D Structure Estimation in Urban Environment [chapter]

Mohamad Motasem Nawaf, Alain Trémeau
2012 Lecture Notes in Computer Science  
Our idea is to introduce the monocular depth cues that exist in a single image and add time constraints on the estimated 3D structure.  ...  Temporal depth information is obtained via a sparse optical-flow-based structure-from-motion approach, which decreases the estimation ambiguity by imposing constraints on camera motion.  ...  Temporal depth features are obtained using a sparse optical-flow-based structure-from-motion technique.  ... 
doi:10.1007/978-3-642-33885-4_53 fatcat:ffpodhncbfejhbxyhlkjzxairq
Showing results 1-15 out of 2,409 results