Mesh R-CNN [article]

Georgia Gkioxari, Jitendra Malik, Justin Johnson
2020 arXiv   pre-print
We then deploy our full Mesh R-CNN system on Pix3D, where we jointly detect objects and predict their 3D shapes.  ...  Our system, called Mesh R-CNN, augments Mask R-CNN with a mesh prediction branch that outputs meshes with varying topological structure by first predicting coarse voxel representations which are converted  ...  Mask R-CNN is an end-to-end region-based object detector. It inputs a single RGB image and outputs a bounding box, category label, and segmentation mask for each detected object.  ... 
arXiv:1906.02739v2 fatcat:4gk4jyeqhrffdga65wfcxcjje4

Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph [article]

Honghui Yang, Zili Liu, Xiaopei Wu, Wenxiao Wang, Wei Qian, Xiaofei He, Deng Cai
2022 arXiv   pre-print
Based on these modules, we construct our Graph R-CNN as the second stage, which can be applied to existing one-stage detectors to consistently improve the detection performance.  ...  Two-stage detectors have gained much popularity in 3D object detection.  ...  Voxel R-CNN [7] proposes a voxel RoI pooling to extract RoI features directly from voxel features to refine proposals in the second stage.  ... 
arXiv:2208.03624v1 fatcat:62i22yxxyjf2xasawpqni3tzqa

Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection [article]

Jiageng Mao and Minzhe Niu and Haoyue Bai and Xiaodan Liang and Hang Xu and Chunjing Xu
2021 arXiv   pre-print
We present a flexible and high-performance framework, named Pyramid R-CNN, for two-stage 3D object detection from point clouds.  ...  Extensive experiments show that Pyramid R-CNN outperforms the state-of-the-art 3D detection models by a large margin on both the KITTI dataset and the Waymo Open dataset.  ...  In this paper, we propose a general two-stage 3D detection framework, named Pyramid R-CNN, which can be applied on multiple 3D backbones to enhance the detection adaptability and performance.  ... 
arXiv:2109.02499v1 fatcat:iqbuhwa45zcmxiyaqj72bspm5e

SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds [article]

Qingdong He, Zhengning Wang, Hao Zeng, Yi Zeng, Yijun Liu
2021 arXiv   pre-print
Accurate 3D object detection from point clouds has become a crucial component in autonomous driving.  ...  Experiments on KITTI detection benchmark demonstrate the efficiency of extending the graph representation to 3D object detection and the proposed SVGA-Net can achieve decent detection accuracy.  ...  In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems  ...  Faster R-CNN: Towards real-time object detection with region proposal networks.  ... 
arXiv:2006.04043v2 fatcat:p4jzhx74pnclpdgkh4uo7hrbwy

Improved-Mask R-CNN: Towards an Accurate Generic MSK MRI instance segmentation platform (Data from the Osteoarthritis Initiative) [article]

Banafshe Felfeliyan, Abhilash Hareendranathan, Gregor Kuntze, Jacob L. Jaremko, Janet L. Ronsky
2021 arXiv   pre-print
(0.21), indicating a high agreement between the human readers and both Mask R-CNN and iMaskRCNN.  ...  The CoV values for effusion detection between Reader1 and Mask R-CNN (0.33), Reader1 and iMaskRCNN (0.34), Reader2 and Mask R-CNN (0.22), Reader2 and iMaskRCNN (0.29) are close to CoV between two readers  ...  This algorithm is based on region proposal-based object detection and is an extended version of Fast R-CNN (Girshick, 2015) and Faster R-CNN (Ren et al., 2017).  ... 
arXiv:2107.12889v1 fatcat:onyo2dfyevdadfu3ai244rjvea

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video [article]

Yancheng Wang, Yang Xiao, Fu Xiong, Wenxiang Jiang, Zhiguo Cao, Joey Tianyi Zhou, Junsong Yuan
2020 arXiv   pre-print
To facilitate depth-based 3D action recognition, 3D dynamic voxel (3DV) is proposed as a novel 3D motion representation.  ...  With 3D space voxelization, the key idea of 3DV is to encode 3D motion information within depth video into a regular voxel set (i.e., 3DV) compactly, via temporal rank pooling.  ...  First YOLOv3-Tiny [30] is used for human detection instead of Faster R-CNN [31], concerning running speed. Meanwhile, human and background are separated by depth thresholding.  ... 
arXiv:2005.05501v1 fatcat:ug6bplg7zzh2rpwz4aixpvftpq

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video

Yancheng Wang, Yang Xiao, Fu Xiong, Wenxiang Jiang, Zhiguo Cao, Joey Tianyi Zhou, Junsong Yuan
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
To facilitate depth-based 3D action recognition, 3D dynamic voxel (3DV) is proposed as a novel 3D motion representation.  ...  With 3D space voxelization, the key idea of 3DV is to encode 3D motion information within depth video into a regular voxel set (i.e., 3DV) compactly, via temporal rank pooling.  ...  First YOLOv3-Tiny [30] is used for human detection instead of Faster R-CNN [31], concerning running speed. Meanwhile, human and background are separated by depth thresholding.  ... 
doi:10.1109/cvpr42600.2020.00059 dblp:conf/cvpr/Wang0XJCZY20 fatcat:wng7nvryfrh7dcqvjc5375nug4

3D Aggregated Faster R-CNN for General Lesion Detection [article]

Ning Zhang, Yu Cao, Benyuan Liu, Yan Luo
2020 arXiv   pre-print
The local area of lesions can be very confusing, leading the region-based classifier branch of Faster R-CNN to fail easily.  ...  In this paper, we enforce an end-to-end 3D Aggregated Faster R-CNN solution by stacking an "aggregated classifier branch" on the backbone of RPN.  ...  Our major contribution lies in that we present the first successful approach to enforce an end-to-end full 3D Faster R-CNN solution for the small and sparse lesion detection task.  ... 
arXiv:2001.11071v1 fatcat:vrdjs5smz5hhxei4herpbfuoq4

Building Footprint Extraction in Dense Area from LiDAR Data using Mask R-CNN

Sayed A. Mohamed, Amira S. Mahmoud, Marwa S. Moustafa, Ashraf K. Helmy, Ayman H. Nasr
2022 International Journal of Advanced Computer Science and Applications  
The mask R-CNN object detection framework used to effectively extract building in dense areas sometimes fails to provide an adequate building boundary result due to urban edge intersections and unstructured  ...  Thus, we introduced a modified workflow to train ensemble of the mask R-CNN using two backbones ResNet (34, 101).  ...  CNN-based object detectors are single and two-stage. Fast R-CNN, faster R-CNN [20], and mask R-CNN are widely identified as two-stage detectors.  ... 
doi:10.14569/ijacsa.2022.0130643 fatcat:7zxq5xcnxnb6vdp6iutofydabq

Learning 3D Scene Semantics and Structure from a Single Depth Image

Bo Yang, Zihang Lai, Xiaoxuan Lu, Shuyu Lin, Hongkai Wen, Andrew Markham, Niki Trigoni
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
Recent deep neural networks based methods aim to simultaneously learn object class labels and infer the 3D shape of a scene represented by a large voxel grid.  ...  Our key idea is to deeply fuse an efficient 3D shape estimator with existing recognition (e.g., ResNets) and segmentation (e.g., Mask R-CNN) techniques.  ...  Building on R-CNN [5] and Fast R-CNN [4], Faster R-CNN [16] applies attention mechanism with a Region Proposal Network (RPN) and then achieves leading performance in object detection.  ... 
doi:10.1109/cvprw.2018.00069 dblp:conf/cvpr/YangLLLWMT18 fatcat:caqpxygknzcfrcswpe5fxfqr3i

3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans [article]

Ji Hou, Angela Dai, Matthias Nießner
2019 arXiv   pre-print
Our network leverages high-resolution RGB input by associating 2D images with the volumetric grid based on the pose alignment of the 3D reconstruction.  ...  This combination of 2D and 3D feature learning allows significantly higher accuracy object detection and instance segmentation than state-of-the-art alternatives.  ...  In Table 5 , we quantitatively evaluate our object detection against Deep Sliding Shapes and Frustum PointNet, which operate on RGB-D frame, as well as Mask R-CNN [12] projected to 3D.  ... 
arXiv:1812.07003v3 fatcat:dsqjflgqbnaexbxxiagmin7bk4

Deep End-to-end 3D Person Detection from Camera and Lidar

Markus Roth, Dominik Jargot, Dariu M. Gavrila
2019 2019 IEEE Intelligent Transportation Systems Conference (ITSC)  
Experiments on the validation set of the KITTI 3D object detection benchmark [2] show that the proposed method outperforms state-of-the-art methods with an average precision (AP) of 47.06% on moderate  ...  We present a method for 3D person detection from camera images and lidar point clouds in automotive scenes.  ...  Popular methods adopting this approach are Region-based Convolutional Neural Networks (R-CNN) [5], Fast R-CNN [6], Faster R-CNN [7] and Region-based Fully Convolutional Network (RFCN) [8]. 2) Single-stage  ... 
doi:10.1109/itsc.2019.8917366 dblp:conf/itsc/RothJG19 fatcat:2kubr3qfg5hz7jyj3ntgdrzylq

Incremental Instance-Oriented 3D Semantic Mapping via RGB-D Cameras for Unknown Indoor Scene

Wei Li, Junhua Gu, Benwen Chen, Jungong Han
2020 Discrete Dynamics in Nature and Society  
To ensure an efficient reconstruction of 3D objects with semantic and instance IDs, the input RGB images are operated by a real-time deep-learned object detector.  ...  Finally, a map integration strategy fuses information about their 3D shapes, locations, and instance IDs in a faster way.  ...  Beginning with the object detection [3, 28] in RGB images, soon afterwards, Mask R-CNN came out which is further able to predict a per-pixel semantically annotated mask for each of the detected instances  ... 
doi:10.1155/2020/2528954 fatcat:7qpcxw5bebaqzfvmhnfyensoa4

Detection, Segmentation, and Model Fitting of Individual Tree Stems from Airborne Laser Scanning of Forests Using Deep Learning

Lloyd Windrim, Mitch Bryson
2020 Remote Sensing  
In this paper, we develop new approaches to automated tree detection, segmentation and stem reconstruction using algorithms based on deep supervised machine learning which are designed for use with aerially  ...  acquired high-resolution LiDAR pointclouds.  ...  (a) 2D bounding box detection; (b) 3D cuboid detection. Figure 5. Graphical comparison of the voxel and point-based segmentation approaches. (a) Voxel-based 3D-FCN approach.  ... 
doi:10.3390/rs12091469 fatcat:ep6p2wrps5a27ifdn6agdjpxiy

3D Object Detection for Autonomous Driving: A Review and New Outlooks [article]

Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
2022 arXiv   pre-print
Second, we conduct a comprehensive survey of the progress in 3D object detection from the aspects of models and sensory inputs, including LiDAR-based, camera-based, and multi-modal detection approaches  ...  Finally, we conduct a performance analysis of the 3D object detection approaches, and we further summarize the research trends over the years and prospect the future directions of this area.  ...  - - 76.30 69.04 LiDAR R-CNN [134] LiDAR Point-Voxel 2021 - - - - - - 76.0 68.3 Pyramid R-CNN [171] LiDAR Point-Voxel 2021 - 88.39 82.08 77.49 - - 76.30 67.23 LaserNet [178] LiDAR Range Image 2019 30 -  ... 
arXiv:2206.09474v1 fatcat:3skws77uqngjtpo6mycpo4dhny