11,889 Hits in 4.2 sec

Learning 3D Semantic Segmentation with only 2D Image Supervision [article]

Kyle Genova, Xiaoqi Yin, Abhijit Kundu, Caroline Pantofaru, Forrester Cole, Avneesh Sud, Brian Brewington, Brian Shucker, Thomas Funkhouser
2021 arXiv   pre-print
In this paper, we investigate how to use only those labeled 2D image collections to supervise the training of 3D semantic segmentation models.  ...  Our approach is to train a 3D model from pseudo-labels derived from 2D semantic image segmentations using multiview fusion.  ...  The result is a generalizable 3D model with supervision only from unpaired 2D images.  ... 
arXiv:2110.11325v1 fatcat:oz23efciznhhxa7wgo64bfkahi
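The multiview-fusion step this abstract mentions can be sketched roughly as follows. This is a minimal illustration, not the paper's actual pipeline: the function name, the pinhole-camera convention (`K`, `R`, `t` per view), and the per-point majority vote are all assumptions made for the sketch.

```python
import numpy as np

def multiview_pseudo_labels(points, cams, seg_maps, num_classes):
    """Assign each 3D point a pseudo-label by majority vote over the
    2D semantic segmentations of the views that observe it (sketch).

    points: (N, 3) world coordinates.
    cams: list of (K, R, t) intrinsics/extrinsics per view.
    seg_maps: list of (H, W) integer label maps, one per view.
    """
    votes = np.zeros((len(points), num_classes), dtype=np.int64)
    for (K, R, t), seg in zip(cams, seg_maps):
        cam = points @ R.T + t                       # world -> camera frame
        front = cam[:, 2] > 1e-6                     # keep points ahead of camera
        pix = cam[front] @ K.T                       # pinhole projection
        uv = np.round(pix[:, :2] / pix[:, 2:3]).astype(int)
        h, w = seg.shape
        ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        pt_idx = np.flatnonzero(front)[ok]
        votes[pt_idx, seg[uv[ok, 1], uv[ok, 0]]] += 1  # tally the 2D label
    seen = votes.sum(axis=1) > 0
    return np.where(seen, votes.argmax(axis=1), -1)  # -1 = never observed
```

Points never seen by any view get label -1 so a downstream 3D model can mask them out of the loss; the real system would also need occlusion reasoning, which this sketch omits.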

Semantic Implicit Neural Scene Representations With Semi-Supervised Training [article]

Amit Kohli, Vincent Sitzmann, Gordon Wetzstein
2021 arXiv   pre-print
Our method is simple, general, and only requires a few tens of labeled 2D segmentation masks in order to achieve dense 3D semantic segmentation.  ...  We explore two novel applications for this semantically aware implicit neural scene representation: 3D novel view and semantic label synthesis given only a single input RGB image or 2D label mask, as well  ...  We take the latent 3D feature representation of SRNs, learned in an unsupervised manner given only posed 2D RGB images, and map them to a set of labeled semantic segmentation maps.  ... 
arXiv:2003.12673v2 fatcat:pcijezws6vhznmz4ih5buuytva

Weakly Supervised Semantic Segmentation in 3D Graph-Structured Point Clouds of Wild Scenes [article]

Haiyan Wang, Xuejian Rong, Liang Yang, Jinglun Feng, Jizhong Xiao, Yingli Tian
2020 arXiv   pre-print
3D semantic segmentation models of natural scene point clouds while not explicitly capturing their inherent structures, even with only a single view per training sample.  ...  During the projection process, perspective rendering and semantic fusion modules are proposed to provide refined 2D supervision signals for training along with a 2D-3D joint optimization strategy.  ...  CONCLUSION In this paper, we have proposed a novel deep graph convolutional model for large-scale semantic scene segmentation in 3D point clouds of wild scenes with only 2D supervision.  ... 
arXiv:2004.12498v2 fatcat:5mr6uuli6baixai7bjud24heda

Learning Semantics-enriched Representation via Self-discovery, Self-classification, and Self-restoration [article]

Fatemeh Haghighi, Mohammad Reza Hosseinzadeh Taher, Zongwei Zhou, Michael B. Gotway, Jianming Liang
2020 arXiv   pre-print
We examine our Semantic Genesis with all the publicly available pre-trained models, by either self-supervision or full supervision, on the six distinct target tasks, covering both classification and segmentation  ...  Our extensive experiments demonstrate that Semantic Genesis significantly exceeds all of its 3D counterparts as well as the de facto ImageNet-based transfer learning in 2D.  ...  Moreover, as an ablation study, we examine our Semantic Genesis 2D with Models Genesis 2D (self-supervised) and ImageNet models (fully supervised) in four target tasks, covering classification and segmentation  ... 
arXiv:2007.06959v1 fatcat:aikw2jksvrbnvjigjucjgk7nya

Towards Weakly Supervised Semantic Segmentation in 3D Graph-Structured Point Clouds of Wild Scenes

Haiyan Wang, Xuejian Rong, Liang Yang, Shuihua Wang, Yingli Tian
2019 British Machine Vision Conference  
3D semantic segmentation model of natural scene point clouds while not explicitly capturing their inherent structures, even with only a single view per sample.  ...  to provide refined 2D supervision signals for training along with a 2D-3D joint optimization strategy.  ...  Instead of using 3D ground-truth labels as supervision, only the 2D ground-truth segmentation map is adopted to optimize the training process together with the predicted 2D segmentation map in 2D by direct  ... 
dblp:conf/bmvc/WangRYWT19 fatcat:apyspv5qgvdpzgxhltrvg3bgpq

Pri3D: Can 3D Priors Help 2D Representation Learning? [article]

Ji Hou, Saining Xie, Benjamin Graham, Angela Dai, Matthias Nießner
2021 arXiv   pre-print
This results not only in improvement over 2D-only representation learning on the image-based tasks of semantic segmentation, instance segmentation, and object detection on real-world in-door datasets,  ...  Inspired by these advances in geometric understanding, we aim to imbue image-based perception with representations learned under geometric constraints.  ...  Acknowledgments This work was supported by a TUM-IAS Rudolf Moßbauer Fellowship, the ERC Starting Grant Scan2CAD (804724), the German Research Foundation (DFG) Grant Making Machine Learning on Static and  ... 
arXiv:2104.11225v3 fatcat:nwgaipmbszg37k7xggnryr43du

3D Guided Weakly Supervised Semantic Segmentation [article]

Weixuan Sun, Jing Zhang, Nick Barnes
2020 arXiv   pre-print
In this paper, we propose a weakly supervised 2D semantic segmentation model by incorporating sparse bounding box labels with available 3D information, which is much easier to obtain with advanced sensors  ...  We manually labeled a subset of the 2D-3D Semantics (2D-3D-S) dataset with bounding boxes, and introduce our 2D-3D inference module to generate accurate pixel-wise segment proposal masks.  ...  -Our 3D weakly supervised semantic segmentation model learns an initial classifier from segment proposals, then uses the 2D-3D inference to transductively generate new segment proposals, resulting in further  ... 
arXiv:2012.00242v1 fatcat:ruki4paggzdxzawncsl2gk62fy

Self-supervised Single-view 3D Reconstruction via Semantic Consistency [article]

Xueting Li, Sifei Liu, Kihwan Kim, Shalini De Mello, Varun Jampani, Ming-Hsuan Yang, Jan Kautz
2020 arXiv   pre-print
We learn a self-supervised, single-view 3D reconstruction model that predicts the 3D mesh shape, texture and camera pose of a target object with a collection of 2D images and silhouettes.  ...  with supervision.  ...  (b) Semantic part segmentation for each image learned via self-supervision.  ... 
arXiv:2003.06473v1 fatcat:kf4djdg7d5ddzb2wibdlzxfltu

Self-Supervised Image Representation Learning with Geometric Set Consistency [article]

Nenglun Chen, Lei Chu, Hao Pan, Yan Lu, Wenping Wang
2022 arXiv   pre-print
We propose a method for self-supervised image representation learning under the guidance of 3D geometric consistency.  ...  of 2D image representations without semantic labels.  ...  Compared with 2D images, 3D data has complementary advantages for learning discriminative image features.  ... 
arXiv:2203.15361v1 fatcat:jnbjimrgcfbh3mdcfw2jnktkge

Learning from 2D: Contrastive Pixel-to-Point Knowledge Transfer for 3D Pretraining [article]

Yueh-Cheng Liu, Yu-Kai Huang, Hung-Yueh Chiang, Hung-Ting Su, Zhe-Yu Liu, Chin-Tang Chen, Ching-Yu Tseng, Winston H. Hsu
2021 arXiv   pre-print
In this paper, we present a novel 3D pretraining method by leveraging 2D networks learned from rich 2D datasets.  ...  With a pretrained 2D network, the proposed pretraining process requires no additional 2D or 3D labeled data, further alleviating the expensive 3D data annotation cost.  ...  Related Work Cross-modal 2D-3D Learning In the field of 3D computer vision, many works leverage 2D information fused with 3D data for 3D semantic segmentation [13, 10] , 3D object detection [9, 35,  ... 
arXiv:2104.04687v3 fatcat:zfmpvxsv6vevlfcualoorajmkm

xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation

Maximilian Jaritz, Tuan-Hung Vu, Raoul de Charette, Emilie Wirbel, Patrick Perez
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
On this 3D semantic segmentation example, the UDA Baseline [16] prediction from the 2D camera image does not detect the car on the right due to the day/night domain shift.  ...  With xMUDA, 2D learns the appearance of cars in the dark from information exchange with the 3D LiDAR point cloud, and 3D learns to reduce false predictions.  ...  Particularly, it should be beneficial for supervised learning and other modalities than image and point cloud.  ... 
doi:10.1109/cvpr42600.2020.01262 dblp:conf/cvpr/JaritzVCWP20 fatcat:ekb6uwe7qfaf5fnaqmpkz5cs74

xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation [article]

Maximilian Jaritz, Tuan-Hung Vu, Raoul de Charette, Émilie Wirbel, Patrick Pérez
2020 arXiv   pre-print
In this work, we explore how to learn from multi-modality and propose cross-modal UDA (xMUDA) where we assume the presence of 2D images and 3D point clouds for 3D semantic segmentation.  ...  In xMUDA, modalities learn from each other through mutual mimicking, disentangled from the segmentation objective, to prevent the stronger modality from adopting false predictions from the weaker one.  ...  The SemanticKITTI dataset [1] provides 3D point cloud labels for the Odometry dataset of KITTI [6], which features a large-angle front camera and a 64-layer LiDAR.  ... 
arXiv:1911.12676v2 fatcat:yaguklx3jvdcnmbbpt54k32ioe
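The "mutual mimicking" idea in this abstract, where each modality's prediction is pulled toward the other's class distribution, can be sketched as a symmetric KL term between the 2D and 3D per-point softmax outputs. This is an illustrative simplification: the actual xMUDA method disentangles mimicry from the segmentation objective via auxiliary heads and treats the target distribution as fixed, details this NumPy sketch omits.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_modal_mimicry_loss(logits_2d, logits_3d):
    """Symmetric per-point KL between the 2D and 3D class distributions:
    KL(P_3d || P_2d) pulls 2D toward 3D, and vice versa (sketch).

    logits_2d, logits_3d: (N_points, num_classes) raw scores,
    where the 2D scores have been sampled at each point's projection.
    """
    p2, p3 = softmax(logits_2d), softmax(logits_3d)
    eps = 1e-12                                   # guard against log(0)
    kl_3_to_2 = (p3 * (np.log(p3 + eps) - np.log(p2 + eps))).sum(axis=1)
    kl_2_to_3 = (p2 * (np.log(p2 + eps) - np.log(p3 + eps))).sum(axis=1)
    return float((kl_3_to_2 + kl_2_to_3).mean())
```

The loss is zero when both modalities already agree and grows as their class distributions diverge, which is what lets the stronger modality transfer its confidence to the weaker one under domain shift.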

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows [article]

Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, Cristian Sminchisescu
2020 arXiv   pre-print
Our formulation is based on kinematic latent normalizing flow representations and dynamics, as well as differentiable, semantic body part alignment loss functions that support self-supervised learning.  ...  Monocular 3D human pose and shape estimation is challenging due to the many degrees of freedom of the human body and the difficulty of acquiring training data for large-scale supervised learning in complex  ...  CMU, in order to construct kinematic priors, but without the corresponding images. Additionally we also rely on images in the wild, with only 2D body joint or semantic segmentation maps as ground truth.  ... 
arXiv:2003.10350v2 fatcat:c6gh2fxydve6hktgemewiutl4e

Learning to Regress Bodies from Images using Differentiable Semantic Rendering [article]

Sai Kumar Dwivedi, Nikos Athanasiou, Muhammed Kocabas, Michael J. Black
2022 arXiv   pre-print
Learning to regress 3D human body shape and pose (e.g. SMPL parameters) from monocular images typically exploits losses on 2D keypoints, silhouettes, and/or part-segmentation when 3D training data is not available.  ...  We supervise 3D body regression training with clothed and minimal-clothed regions differently using our novel DSR loss and our learned semantic prior.  ... 
arXiv:2110.03480v2 fatcat:6oi25y5pw5bgdbhzunz47rh2ny

Data Efficient 3D Learner via Knowledge Transferred from 2D Model [article]

Ping-Chung Yu, Cheng Sun, Min Sun
2022 arXiv   pre-print
Specifically, we utilize a strong and well-trained semantic segmentation model for 2D images to augment RGB-D images with pseudo-label. The augmented dataset can then be used to pre-train 3D models.  ...  In this work, we deal with the data scarcity challenge of 3D tasks by transferring knowledge from strong 2D models via RGB-D images.  ...  Specifically, we employ a 2D semantic segmentation model, which is trained on a large and diverse scene parsing dataset, to augment the RGB-D images with pseudo-labels.  ... 
arXiv:2203.08479v2 fatcat:4xhrrwld7ngs3kz4b6ry6ba364
Showing results 1 — 15 out of 11,889 results