M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers
[article]
Tianrui Guan, Jun Wang, Shiyi Lan, Rohan Chandra, Zuxuan Wu, Larry Davis, Dinesh Manocha
2021
arXiv
pre-print
We present a novel architecture for 3D object detection, M3DeTR, which combines different point cloud representations (raw, voxels, bird's-eye view) with different feature scales based on multi-scale feature ...
In particular, our approach ranks 1st on the well-known KITTI 3D Detection Benchmark for both car and cyclist classes, and ranks 1st on Waymo Open Dataset with single frame point cloud input. ...
Adopted from PointNet++ [36] and PV-RCNN [39], the Set Abstraction and Voxel Set Abstraction (VSA) modules take raw point coordinates P and the 3D voxel-based features f_voxel, respectively, to generate ...
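As a rough illustration of the PointNet++-style Set Abstraction step mentioned in the excerpt above, the following is a minimal NumPy sketch and not the authors' implementation. It assumes farthest point sampling to choose keypoints, a fixed-radius ball query to group neighbours, and a plain max-pool over each group; the learned shared MLP that PointNet++ applies before pooling is omitted. The function names, the `radius` parameter, and the toy inputs are hypothetical.

```python
import numpy as np


def farthest_point_sampling(points, num_samples):
    """Greedily pick indices so each new point is farthest from those already chosen."""
    n = points.shape[0]
    chosen = np.zeros(num_samples, dtype=np.int64)
    min_dist = np.full(n, np.inf)  # squared distance to the nearest chosen point
    chosen[0] = 0  # assumption: start deterministically from the first point
    for i in range(1, num_samples):
        diff = points - points[chosen[i - 1]]
        dist = np.einsum("ij,ij->i", diff, diff)
        min_dist = np.minimum(min_dist, dist)
        chosen[i] = int(np.argmax(min_dist))
    return chosen


def set_abstraction(points, features, num_keypoints, radius):
    """Sample keypoints, group neighbours within `radius`, and max-pool their features.

    points:   (N, 3) raw point coordinates
    features: (N, C) per-point features
    Returns (num_keypoints, 3) keypoints and (num_keypoints, C) pooled features.
    """
    idx = farthest_point_sampling(points, num_keypoints)
    keypoints = points[idx]
    pooled = np.zeros((num_keypoints, features.shape[1]), dtype=features.dtype)
    for k, center in enumerate(keypoints):
        # Ball query: the keypoint itself is always inside, so the group is never empty.
        mask = np.linalg.norm(points - center, axis=1) <= radius
        pooled[k] = features[mask].max(axis=0)
    return keypoints, pooled


# Toy input: two well-separated clusters of two points each.
toy_points = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 0.0, 0.0], [5.1, 0.0, 0.0]])
toy_features = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 4.0]])
toy_keypoints, toy_pooled = set_abstraction(toy_points, toy_features, num_keypoints=2, radius=0.5)
```

With the toy input, one keypoint is selected per cluster and each pooled row is the element-wise maximum of the features in that cluster. A voxel-based variant would apply the same grouping and pooling to voxel centres and voxel features instead of raw points.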
arXiv:2104.11896v3
fatcat:dytvkxn6bjaozljmvsfvqz4tuy