566 Hits in 6.8 sec

The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation [article]

Christopher Xie, Yu Xiang, Arsalan Mousavian, Dieter Fox
2020 arXiv   pre-print
We propose a novel method that separately leverages synthetic RGB and synthetic depth for unseen object instance segmentation.  ...  We also show that our method can segment unseen objects for robot grasping. Code, models and video can be found at https://rse-lab.cs.washington.edu/projects/unseen-object-instance-segmentation/.  ...  Conclusion We proposed a framework that separately leverages RGB and depth to provide sharp and accurate masks for unseen object instance segmentation.  ... 
arXiv:1907.13236v2 fatcat:qp6k5p5sovamrpjxnp5teztw3m

Unseen Object Instance Segmentation for Robotic Environments [article]

Christopher Xie, Yu Xiang, Arsalan Mousavian, Dieter Fox
2021 arXiv   pre-print
Our proposed method, UOIS-Net, separately leverages synthetic RGB and synthetic depth for unseen object instance segmentation.  ...  We show that our method can produce sharp and accurate segmentation masks, outperforming state-of-the-art methods on unseen object instance segmentation.  ...  CONCLUSION We proposed a deep network, UOIS-Net, that separately leverages RGB and depth to provide sharp and accurate masks for unseen object instance segmentation.  ... 
arXiv:2007.08073v2 fatcat:urn5ojcgw5gfxjeuuoqwcq3ezm

Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [article]

Ning Zhang, Jingen Liu, Ke Wang, Dan Zeng, Tao Mei
2020 arXiv   pre-print
To the best of our knowledge, TS-RCN is the first end-to-end trainable two-stream visual tracking system, which makes full use of both appearance and motion features of the target.  ...  tracking, which successfully exploits both appearance and motion features for model update.  ...  Due to the lack of background appearance (e.g., distractor objects), however, the Siamese approaches are inferior to deal with unseen objects and distractor objects.  ... 
arXiv:2005.06536v1 fatcat:5mhnd5pb7zgupp4mzd72mjf5sm

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo [article]

Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan, Mark Tjersland
2021 arXiv   pre-print
In unknown object grasping experiments, the predictions from the baseline RGB-D network and SimNet enable successful grasps of most of the easy objects.  ...  By inferring grasp positions using the OBB and keypoint predictions, SimNet can be used to perform end-to-end manipulation of unknown objects in both easy and hard scenarios using our fleet of Toyota HSR  ...  We observe that predicting coarse disparity for the monocular, depth, and RGB-D networks results in very little difference in performance.  ... 
arXiv:2106.16118v1 fatcat:7rofj6dpfnbunpibndqvkhf5be

Semi-Supervised Deep Learning Approach for Transportation Mode Identification Using GPS Trajectory Data

Sina Dabiri, Chang-Tien Lu, Kevin Heaslip, Chandan K Reddy
2019 IEEE Transactions on Knowledge and Data Engineering  
The two components are simultaneously trained using both labeled and unlabeled GPS segments, which have already been converted into an efficient representation for the convolutional operation.  ...  Therefore, the unlabeled GPS data are also leveraged by developing a novel deep-learning architecture that is capable of extracting information from both labeled and unlabeled data.  ...  The SECA model was capable of leveraging both unlabeled and labeled GPS trajectories for predicting transportation modes.  ... 
doi:10.1109/tkde.2019.2896985 fatcat:32wehszwsvhd3fp3oqji4g4enu

Semantic Implicit Neural Scene Representations With Semi-Supervised Training [article]

Amit Kohli, Vincent Sitzmann, Gordon Wetzstein
2021 arXiv   pre-print
The recent success of implicit neural scene representations has presented a viable new method for how we capture and store 3D scenes.  ...  We take the next step and demonstrate that an existing implicit representation (SRNs) is actually multi-modal; it can be further leveraged to perform per-point semantic segmentation while retaining its  ...  , given a single posed RGB image and/or label mask of an instance unseen at training time, we infer the latent code of the novel object. (4) Subsequently, we may render multi-view consistent novel RGB  ... 
arXiv:2003.12673v2 fatcat:pcijezws6vhznmz4ih5buuytva

Practical Imitation Learning in the Real World via Task Consistency Loss [article]

Mohi Khansari, Daniel Ho, Yuqing Du, Armando Fuentes, Matthew Bennice, Nicolas Sievers, Sean Kirmani, Yunfei Bai, Eric Jang
2022 arXiv   pre-print
The policy performs control from RGB and depth images and generalizes to doors not encountered in training data.  ...  Such approaches are expensive both because they require large amounts of real world training demonstrations and because identifying the best model to deploy in the real world requires time-consuming real-world  ...  Thus we leverage observations collected in both sim and reality for not just IL, but also for domain adaptation.  ... 
arXiv:2202.01862v2 fatcat:beqcnc6vsbbnhelzkgslbdkfky

UnrealROX: an extremely photorealistic virtual reality environment for robotics simulations and synthetic data generation

Pablo Martinez-Gonzalez, Sergiu Oprea, Alberto Garcia-Garcia, Alvaro Jover-Alvarez, Sergio Orts-Escolano, Jose Garcia-Rodriguez
2019 Virtual Reality  
segmentation, object detection, depth estimation, visual grasping, and navigation.  ...  This virtual reality environment enables robotic vision researchers to generate realistic and visually plausible data with full ground truth for a wide variety of problems such as class and instance semantic  ...  This work has also been supported by three Spanish national grants for PhD studies (FPU15/04516, FPU17/00166, and ACIF/2018/197), by the University of Alicante project GRE16-19, and by the Valencian Government  ... 
doi:10.1007/s10055-019-00399-5 fatcat:jkrft36t5zcqnke4r7qst7xpha

UnrealROX: An eXtremely Photorealistic Virtual Reality Environment for Robotics Simulations and Synthetic Data Generation [article]

Pablo Martinez-Gonzalez, Sergiu Oprea, Alberto Garcia-Garcia, Alvaro Jover-Alvarez, Sergio Orts-Escolano, Jose Garcia-Rodriguez
2019 arXiv   pre-print
segmentation, object detection, depth estimation, visual grasping, and navigation.  ...  This virtual reality environment enables robotic vision researchers to generate realistic and visually plausible data with full ground truth for a wide variety of problems such as class and instance semantic  ...  This work has also been supported by three Spanish national grants for PhD studies (FPU15/04516, FPU17/00166, and ACIF/2018/197), by the University of Alicante project GRE16-19, and by the Valencian Government  ... 
arXiv:1810.06936v2 fatcat:3e4vionszbhfjetr2zzrbd4uxu

LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation

Keunhong Park, Arsalan Mousavian, Yu Xiang, Dieter Fox
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
We use this pipeline to perform pose estimation on unseen objects using simple gradient updates in a render-and-compare fashion.  ...  Figure 1: We present an end-to-end differentiable reconstruction and rendering pipeline.  ...  Acknowledgements We thank Xinke Deng for helpful discussions.  ... 
doi:10.1109/cvpr42600.2020.01072 dblp:conf/cvpr/ParkMXF20 fatcat:ch7mvs3x5fhcbgstipasmkaquu

LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation [article]

Keunhong Park, Arsalan Mousavian, Yu Xiang, Dieter Fox
2020 arXiv   pre-print
We present a new dataset for unseen object pose estimation--MOPED. We evaluate the performance of our method for unseen object pose estimation on MOPED as well as the ModelNet and LINEMOD datasets.  ...  As a result, they are difficult to scale to a large number of objects and cannot be directly applied to unseen objects. We propose a novel framework for 6D pose estimation of unseen objects.  ...  Acknowledgements We thank Xinke Deng for helpful discussions.  ... 
arXiv:1912.00416v3 fatcat:oj54cc5xqff7ngsw5dixusdjna

Single-Stage Keypoint-Based Category-Level Object Pose Estimation from an RGB Image [article]

Yunzhi Lin, Jonathan Tremblay, Stephen Tyree, Patricio A. Vela, Stan Birchfield
2022 arXiv   pre-print
These quantities are estimated in a sequential fashion, leveraging the recent idea of convGRU for propagating information from easier tasks to those that are more difficult.  ...  In this work, we propose a single-stage, keypoint-based approach for category-level object pose estimation that operates on unknown object instances within a known category using a single RGB image as  ...  CONCLUSION We have presented a single-stage method for category-level 6-DoF pose prediction of previously unseen object instances from RGB input.  ... 
arXiv:2109.06161v2 fatcat:y3ok3uacwvhv3pdblzjpytrneu

Semantic 3D Models from Real World Scene Recordings for Traffic Accident Simulation

Ludwig Mohr, Martin Öttl, Michael Haberl, Matthias Rüther, Horst Bischof
2018 Zenodo  
By adjusting the desired level of detail, these objects are suitable for both direct integration into the 3D scene reconstruction for use in the accident simulation software PC-Crash, as well as for fine  ...  In this course, the effect of Advanced Driver Assistance Systems (ADAS) can be simulated, as well as the visibility of objects from people's perspectives.  ...  Acknowledgements This study is part of the project IMPROVE and is financially supported by the FFG, the Austrian Research Promotion Agency of the Austrian Federal Ministry for Transport, Innovation and  ... 
doi:10.5281/zenodo.1487620 fatcat:mniny6y4fvgybgm2d2mkjn4egi

Self-Supervised Object-in-Gripper Segmentation from Robotic Motions [article]

Wout Boerdijk, Martin Sundermeyer, Maximilian Durner, Rudolph Triebel
2020 arXiv   pre-print
Accurate object segmentation is a crucial task in the context of robotic manipulation.  ...  The object masks and views are shown to be suitable training data for segmentation networks that generalize to novel environments and also allow for watertight 3D reconstruction.  ...  Acknowledgments We thank the reviewers for their useful comments. This work was partially supported by the DLR-internal project "Factory of the Future".  ... 
arXiv:2002.04487v3 fatcat:wrfcp2rqyfbn3koacn7ocaiht4

RICE: Refining Instance Masks in Cluttered Environments with Graph Neural Networks [article]

Christopher Xie, Arsalan Mousavian, Yu Xiang, Dieter Fox
2021 arXiv   pre-print
Segmenting unseen object instances in cluttered environments is an important capability that robots need when functioning in unstructured environments.  ...  We train deep networks capable of sampling smart perturbations to the segmentations, and a graph neural network, which can encode relations between objects, to evaluate the perturbed segmentations.  ...  [30] proposed to separate the processing of depth and RGB in order to generalize their method from sim-to-real settings and provide sharp masks.  ... 
arXiv:2106.15711v1 fatcat:5ty3udy3gbcadclxdjbgx5qvfe