Unifying Voxel-based Representation with Transformer for 3D Object Detection
[article]
Yanwei Li, Yilun Chen, Xiaojuan Qi, Zeming Li, Jian Sun, Jiaya Jia
2022
arXiv
pre-print
It surpasses previous work in both single- and multi-modality entries and achieves leading performance on the nuScenes test set, with 69.7%, 55.1%, and 71.1% NDS for LiDAR, camera, and multi-modality inputs, ...
Different from previous work, our approach preserves the voxel space without height compression to alleviate semantic ambiguity and enable spatial interactions. ...
However, in the short term, the current technique cannot solve all corner cases and extreme situations, which may pose risks to the decision process in real-world autonomous systems. ...
arXiv:2206.00630v1
fatcat:uz6ej6qxzvaaxieqg2jkyjyfae