2,097 Hits in 9.3 sec

3D Convolution on RGB-D Point Clouds for Accurate Model-free Object Pose Estimation [article]

Zhongang Cai, Cunjun Yu, Quang-Cuong Pham
2018 arXiv   pre-print
The conventional pose estimation of a 3D object usually requires the knowledge of the 3D model of the object.  ...  In this paper, we propose a two-stage pipeline that takes in raw colored point cloud data and estimates an object's translation and rotation by running 3D convolutions on voxels.  ...  The authors would like to thank Nanyang Technological University for financial support.  ... 
arXiv:1812.11284v1 fatcat:3c65bom3ubhyrezkqwljl3ubby
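The two-stage pipeline described in this entry rests on a simple primitive: rasterize the raw point cloud into a voxel occupancy grid, then run 3D convolutions over it. A minimal sketch of that primitive — the grid size, bounds, and all-ones kernel are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def voxelize(points, grid_size=8, bounds=(0.0, 1.0)):
    """Rasterize an (N, 3) point cloud into a binary occupancy grid."""
    lo, hi = bounds
    idx = ((points - lo) / (hi - lo) * grid_size).astype(int)
    idx = np.clip(idx, 0, grid_size - 1)
    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

def conv3d(grid, kernel):
    """Naive valid-mode 3D convolution (cross-correlation) over the grid."""
    k = kernel.shape[0]
    n = grid.shape[0] - k + 1
    out = np.zeros((n, n, n), dtype=np.float32)
    for x in range(n):
        for y in range(n):
            for z in range(n):
                out[x, y, z] = np.sum(grid[x:x + k, y:y + k, z:z + k] * kernel)
    return out

rng = np.random.default_rng(0)
cloud = rng.uniform(0.0, 1.0, size=(500, 3))
occ = voxelize(cloud)                                # (8, 8, 8) occupancy
feat = conv3d(occ, np.ones((3, 3, 3), np.float32))   # (6, 6, 6) response map
```

A real network would stack learned kernels (e.g. `nn.Conv3d` in PyTorch) and regress translation and rotation from the resulting feature volume.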

A Comprehensive Review on 3D Object Detection and 6D Pose Estimation with Deep Learning

Sabera Hoque, MD. Yasir Arafat, Shuxiang Xu, Ananda Maiti, Yuchen Wei
2021 IEEE Access  
., [194] offered a concept called "Frustum PointNets" based on RGB-D data in a point cloud, which predicts a semantic class for each point in that point cloud.  ...  A number of 3D point cloud networks can be replaced directly by the PointNet network [82] for potential improvement in accurate 3D object detection and 6DoF pose estimation.  ... 
doi:10.1109/access.2021.3114399 fatcat:kvdwsslqxff3lkh27tsdsciqma

Vision-based Robotic Grasp Detection From Object Localization, Object Pose Estimation To Grasp Estimation: A Review [article]

Guoguang Du, Kai Wang, Shiguo Lian, Kaiyong Zhao
2020 arXiv   pre-print
All the above subtasks are reviewed with both traditional methods and the latest deep learning-based methods that use RGB-D image inputs.  ...  Some object pose estimation methods do not require separate object localization; they conduct object localization and object pose estimation jointly.  ...  ., 2019] estimates accurate 3D geometry of transparent objects from a single RGB-D image for robotic manipulation.  ... 
arXiv:1905.06658v2 fatcat:6u3k2ltwifaanjpp2nkayyj2f4

Multi-view Self-supervised Deep Learning for 6D Pose Estimation in the Amazon Picking Challenge [article]

Andy Zeng, Kuan-Ting Yu, Shuran Song, Daniel Suo, Ed Walker Jr., Alberto Rodriguez, Jianxiong Xiao
2017 arXiv   pre-print
In the proposed approach, we segment and label multiple views of a scene with a fully convolutional neural network, and then fit pre-scanned 3D object models to the resulting segmentation to get the 6D object pose.  ...  There are two primary approaches for estimating the 6D pose of an object. The first aligns 3D CAD models to 3D point clouds with algorithms such as iterative closest point [9].  ... 
arXiv:1609.09475v3 fatcat:tpv5alfmpjf6fbtfrlktlor72u
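The snippet's first approach — aligning 3D CAD models to point clouds with iterative closest point — alternates two steps: match each point to its nearest neighbour, then solve for the best rigid transform in closed form (Kabsch/SVD). A minimal point-to-point sketch with brute-force matching; real systems use k-d trees and outlier rejection:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Kabsch/SVD: least-squares R, t such that dst ~ src @ R.T + t."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    """Point-to-point ICP with brute-force nearest-neighbour matching."""
    cur = src.copy()
    for _ in range(iters):
        nn = np.linalg.norm(cur[:, None] - dst[None], axis=2).argmin(axis=1)
        R, t = best_rigid_transform(cur, dst[nn])
        cur = cur @ R.T + t
    return best_rigid_transform(src, cur)   # total transform src -> aligned

# Example: recover a known small rotation + translation (no noise).
rng = np.random.default_rng(1)
src = rng.uniform(0.0, 1.0, size=(200, 3))
th = np.deg2rad(3.0)
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
dst = src @ R_true.T + np.array([0.01, -0.005, 0.02])
R, t = icp(src, dst)   # R ~ R_true
```

Like all local methods, this only converges from a reasonable initial guess, which is why the entry's pipeline first segments the scene before fitting the models.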

SSL-Net: Point-cloud generation network with self-supervised learning

Ran Sun, Yongbin Gao, Zhijun Fang, Anjie Wang, Cengsi Zhong
2019 IEEE Access  
In addition, a pose estimation network is integrated into the 3D point cloud generation network to eliminate the pose ambiguity of the input image, and the estimated pose is also used for rendering the 2D image with the same pose as the input image from the 3D point clouds.  ... 
doi:10.1109/access.2019.2923842 fatcat:s53kttnwk5dw3jzy2aqe6deaji
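The rendering step described here — producing a 2D image with the same pose as the input from the generated 3D point cloud — boils down to a rigid transform followed by pinhole projection. A sketch of that geometry; the intrinsics and pose below are made-up values, and a real renderer also handles visibility and splatting:

```python
import numpy as np

def project_points(points, R, t, fx, fy, cx, cy):
    """Transform an (N, 3) world-frame cloud into the camera frame and
    project it with the pinhole model: u = fx*X/Z + cx, v = fy*Y/Z + cy."""
    cam = points @ R.T + t              # world -> camera coordinates
    u = fx * cam[:, 0] / cam[:, 2] + cx
    v = fy * cam[:, 1] / cam[:, 2] + cy
    return np.stack([u, v], axis=-1)

# A point on the optical axis lands at the principal point (cx, cy).
pts = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
uv = project_points(pts, np.eye(3), np.array([0.0, 0.0, 2.0]),
                    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```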

Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview [article]

Zhaoxin Fan, Yazhi Zhu, Yulin He, Qi Sun, Hongyan Liu, Jun He
2022 arXiv   pre-print
Among methods for object pose detection and tracking, deep learning is the most promising, having shown better performance than the alternatives.  ...  detection, category-level monocular object pose detection, and monocular object pose tracking.  ...  While existing RGB-D-based methods are all evaluated on datasets with dense point clouds generated from depth maps, their performance on sparse point clouds is under-explored, and this has caused a  ... 
arXiv:2105.14291v2 fatcat:2kxd4owthvf7tbcbnlqlqu4r3m

3D Semantic Scene Perception using Distributed Smart Edge Sensors [article]

Simon Bultmann, Sven Behnke
2022 arXiv   pre-print
Efficient vision CNN models for object detection, semantic segmentation, and human pose estimation run on-device in real time. 2D human keypoint estimations, augmented with the RGB-D depth estimate, as well as semantically annotated point clouds are streamed from the sensors to a central backend, where multiple viewpoints are fused into an allocentric 3D semantic scene model.  ...  Multi-Modal Semantic Point Cloud Fusion: We obtain a geometric point cloud by projecting the RGB-D range image into 3D.  ... 
arXiv:2205.01460v1 fatcat:fpe5lpemz5al5p5l4fdiltraoa
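The fusion step quoted above starts by back-projecting the RGB-D range image into 3D. With pinhole intrinsics fx, fy, cx, cy (the values used below are placeholders), a pixel (u, v) with depth Z maps to X = (u - cx)*Z/fx and Y = (v - cy)*Z/fy:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# A flat 4x4 depth map at 2 m yields a planar cloud at Z = 2.
cloud = depth_to_point_cloud(np.full((4, 4), 2.0),
                             fx=100.0, fy=100.0, cx=1.5, cy=1.5)
```

Each back-projected point can then carry its RGB colour and semantic label, giving the semantically annotated point cloud the sensors stream to the backend.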

3D Object Proposals using Stereo Imagery for Accurate Object Class Detection [article]

Xiaozhi Chen and Kaustav Kundu and Yukun Zhu and Huimin Ma and Sanja Fidler and Raquel Urtasun
2017 arXiv   pre-print
Our experiments show significant performance gains over existing RGB and RGB-D object proposal methods on the challenging KITTI benchmark.  ...  , point cloud densities and distance to the ground.  ...  We would like to thank NVIDIA for supporting our research by donating GPUs.  ... 
arXiv:1608.07711v2 fatcat:3xhhpw2o7nggnjfmf24w5wy6ty

RGB-D-Based Pose Estimation of Workpieces with Semantic Segmentation and Point Cloud Registration

Hui Xu, Guodong Chen, Zhenhua Wang, Lining Sun, Fan Su
2019 Sensors  
By using RGB-D (color and depth) information, we propose an efficient and practical solution that fuses the approaches of semantic segmentation and point cloud registration to perform object recognition  ...  Then, we determine the point cloud of the workpieces by incorporating the depth information to estimate the real-time pose of the workpieces.  ...  To address the problem of estimating the free-form 3D objects' poses in point clouds, Drost et al. created a global model description and locally matched the model by using a fast voting scheme [29] .  ... 
doi:10.3390/s19081873 fatcat:z6gmc7eldzgpdhrwrjnlgkpcy4

Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation [article]

Dengsheng Chen and Jun Li and Zheng Wang and Kai Xu
2021 arXiv   pre-print
We train a variational auto-encoder (VAE) for generating 3D point clouds in the canonical space from an RGBD image.  ...  Since the 3D point cloud is generated in normalized pose (with actual size), the encoder of the VAE learns view-factorized RGBD embedding.  ...  Acknowledgement We thank the anonymous reviewers for the valuable suggestions. We are grateful to Chen Wang, one of the authors of DenseFusion, for the help and discussion.  ... 
arXiv:2001.09322v3 fatcat:pyehmt5pu5fklgkcb4bgdcz4k4

Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation

Dengsheng Chen, Jun Li, Zheng Wang, Kai Xu
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
We train a variational auto-encoder (VAE) for generating 3D point clouds in the canonical space from an RGBD image.  ...  Since the 3D point cloud is generated in normalized pose (with actual size), the encoder of the VAE learns view-factorized RGBD embedding.  ...  Acknowledgement We thank the anonymous reviewers for the valuable suggestions. We are grateful to Chen Wang, one of the authors of DenseFusion, for the help and discussion.  ... 
doi:10.1109/cvpr42600.2020.01199 dblp:conf/cvpr/ChenLWX20 fatcat:xwmysf6sezelrl3fxxv5uzyaj4

Learning Depth-Guided Convolutions for Monocular 3D Object Detection [article]

Mingyu Ding, Yuqi Huo, Hongwei Yi, Zhe Wang, Jianping Shi, Zhiwu Lu, Ping Luo
2019 arXiv   pre-print
D^4LCN overcomes the limitation of conventional 2D convolutions and narrows the gap between image representation and 3D point cloud representation.  ...  To better represent 3D structure, prior arts typically transform depth maps estimated from 2D images into a pseudo-LiDAR representation, and then apply existing 3D point-cloud based object detectors.  ...  single monocular image. (2) We carefully design a single-stage 3D object detection framework based on D^4LCN to learn better 3D representation for reducing the gap between 2D convolutions and 3D point  ... 
arXiv:1912.04799v2 fatcat:5gx5lr6o45dkhe3scq5jgb2ypa

Augmenting ViSP's 3D Model-Based Tracker with RGB-D SLAM for 3D Pose Estimation in Indoor Environments

J. Li-Chee-Ming, C. Armenakis
2016 The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences  
This paper explores the integration of ViSP with RGB-D SLAM.  ...  This paper presents a novel application of the Visual Servoing Platform (ViSP) for pose estimation in indoor and GPS-denied outdoor environments.  ...  for providing the 3D model of the Bergeron Centre of Excellence in Engineering.  ... 
doi:10.5194/isprs-archives-xli-b1-925-2016 fatcat:lpa6sad2ivbsjcgake4nv2es5u

Fruit Detection and Pose Estimation for Grape Cluster–Harvesting Robot Using Binocular Imagery Based on Deep Neural Networks

Wei Yin, Hanjin Wen, Zhengtong Ning, Jian Ye, Zhiqiang Dong, Lufeng Luo
2021 Frontiers in Robotics and AI  
Finally, the accurate grape point cloud was used with the RANSAC algorithm for grape cylinder model fitting, and the axis of the cylinder model was used to estimate the pose of the grape.  ...  The pose of fruits is crucial to guide robots to approach target fruits for collision-free picking.  ...  AUTHOR CONTRIBUTION All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.  ... 
doi:10.3389/frobt.2021.626989 fatcat:yfvdadqzobax5daoq7y5prqz3i
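The pose step in this entry is a standard RANSAC loop: repeatedly fit a minimal model to a random sample of points and keep the hypothesis with the most inliers. A cylinder model needs more machinery than fits a short sketch, so the loop below is shown for the simpler plane case; the threshold and iteration count are illustrative:

```python
import numpy as np

def ransac_plane(points, iters=200, thresh=0.01, seed=0):
    """Fit a plane (unit normal n, point p0) to (N, 3) points by RANSAC."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers, best_count = None, None, -1
    for _ in range(iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-12:       # degenerate (collinear) sample
            continue
        n = n / np.linalg.norm(n)
        inliers = np.abs((points - p0) @ n) < thresh  # point-to-plane distance
        if inliers.sum() > best_count:
            best_model, best_inliers, best_count = (n, p0), inliers, inliers.sum()
    return best_model, best_inliers

# 100 points on the plane z = 0 plus 20 far-away outliers.
rng = np.random.default_rng(42)
plane_pts = np.c_[rng.uniform(-1, 1, (100, 2)), np.zeros(100)]
outliers = rng.uniform(0.5, 2.0, (20, 3))
(n, p0), inliers = ransac_plane(np.vstack([plane_pts, outliers]))
```

The paper applies the same sample-score-keep structure with a cylinder as the model, then reads the grape pose off the fitted cylinder's axis.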

MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

Kentaro Wada, Edgar Sucar, Stephen James, Daniel Lenton, Andrew J. Davison
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Our approach makes 3D object pose proposals from single RGB-D views, accumulates pose estimates and non-parametric occupancy information from multiple views as the camera moves, and performs joint optimization to estimate consistent, non-intersecting poses for multiple objects in contact.  ...  Recent RGB-D-based systems are PointFusion [31] and DenseFusion [28], which individually process the two sensor modalities (CNNs for RGB, PointNet [23] for the point cloud), and then fuse them to extract  ... 
doi:10.1109/cvpr42600.2020.01455 dblp:conf/cvpr/WadaSJLD20 fatcat:uzgyxyoh35ct7k5abdknr7pq4a
Showing results 1 — 15 out of 2,097 results