CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection [article]

Yanan Zhang, Jiaxin Chen, Di Huang
2022 arXiv pre-print
To address this issue, we propose a novel framework, namely Contrastively Augmented Transformer for multi-modal 3D object Detection (CAT-Det).  ...  In autonomous driving, LiDAR point-clouds and RGB images are two major data modalities with complementary cues for 3D object detection.  ...
arXiv:2204.00325v2 fatcat:cijmrrtrtjhnjl5ofjn3yhullm

3D Vision with Transformers: A Survey [article]

Jean Lahoud, Jiale Cao, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Ming-Hsuan Yang
2022 arXiv pre-print
In computer vision, the 3D field has also witnessed an increase in employing the transformer for 3D convolution neural networks and multi-layer perceptron networks.  ...  In this work, we present a systematic and thorough review of more than 100 transformer methods for different 3D vision tasks, including classification, segmentation, detection, completion, pose estimation  ...  CAT-Det [89] combines a Pointformer applied on a point cloud with an Imageformer applied on RGB images.  ...
arXiv:2208.04309v1 fatcat:h7xk3hydevhwhatx5lil3ftnri

TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving [article]

Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, Andreas Geiger
2022 arXiv pre-print
How should we integrate representations from complementary sensors for autonomous driving? Geometry-based fusion has shown promise for perception (e.g. object detection, motion forecasting).  ...  Our approach uses transformer modules at multiple resolutions to fuse perspective view and bird's eye view feature maps.  ...  3D object detection.  ... 
arXiv:2205.15997v1 fatcat:nyteapdbr5dqbadxliletwnzcy

Revision-Based Generation of Natural Language Summaries Providing Historical Background: Corpus-Based Analysis, Design, Implementation and Evaluation

Jacques Robin, Columbia University. Computer Science
This model requires a new type of linguistic knowledge: revision operations, which specify the various ways a draft can be transformed in order to concisely accommodate a new piece of information.  ...  The second evaluation demonstrates that the revision operations acquired during the corpus analysis and implemented in STREAK are, for the most part, portable to at least one other quantitative domain  ...  It is quite impressive that even usage of the revision tools involving the most complex transformations was detectable in the stock market corpus.  ...
doi:10.7916/d83203zf fatcat:nuk3xohy6nd3xgavmigfx2w7a4