The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes [article]

Abhishek Patil, Srikanth Malla, Haiming Gang, Yi-Ting Chen
2019 arXiv   pre-print
3D multi-object detection and tracking are crucial for traffic scene understanding.  ...  With unique dataset size, rich annotations, and complex scenes, H3D is gathered to stimulate research on full-surround 3D multi-object detection and tracking.  ...  Acknowledgement: We are grateful to our colleagues Behzad Dariush, Kalyani Polagani, Kenji Nakai, Athma Narayanan, and Wei Zhan for their valuable input.  ... 
arXiv:1903.01568v1 fatcat:7hvhnqsb6fbktprsmuirhh2jha

3D-SiamRPN: An End-to-end Learning Method for Real-time 3D Single Object Tracking using Raw Point Cloud

Zheng Fang, Sifan Zhou, Yubo Cui, Sebastian Scherer
2020 IEEE Sensors Journal  
3D single object tracking is a key issue for autonomous following robot, where the robot should robustly track and accurately localize the target for efficient following.  ...  Additionally, experimental results on H3D dataset demonstrate that our method also has good generalization ability and could achieve good tracking performance in a new scene without re-training.  ...  ACKNOWLEDGMENT The authors would also like to thank J. Shan, W. Qiao and M. Zhou for their help.  ... 
doi:10.1109/jsen.2020.3033034 fatcat:uqd5gowmlrci5nrqwusbyn6s6i

IPS300+: a Challenging Multimodal Dataset for Intersection Perception System [article]

Huanan Wang, Xinyu Zhang, Jun Li, Zhiwei Li, Lei Yang, Shuyue Pan, Yongqiang Deng
2021 arXiv   pre-print
Due to the high complexity and occlusion, insufficient perception in the crowded urban intersection can be a serious safety risk for both human drivers and autonomous algorithms, whereas CVIS (Cooperative  ...  The first batch of open-source data includes 14198 frames, and each frame has an average of 319.84 labels, which is 9.6 times larger than the most crowded dataset (H3D dataset in 2019) by now.  ...  After the ID checks for targets are completed, multiple ID documents in 5Hz, 20s fragments of the published data will be released for 3D multi-target tracking task.  ... 
arXiv:2106.02781v1 fatcat:wsemjp6tbrbhfi7nhwnt2fr7wm

nuScenes: A multimodal dataset for autonomous driving [article]

Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, Oscar Beijbom
2020 arXiv   pre-print
Image based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the environment.  ...  Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology.  ...  The nuScenes dataset was annotated by Scale.ai and we thank Alexandr Wang and Dave Morse for their support.  ... 
arXiv:1903.11027v5 fatcat:ha265rjm4bbndnnugq5ususime

Multi-Modal 3D Object Detection in Autonomous Driving: a Survey [article]

Yingjie Wang, Qiuyu Mao, Hanqi Zhu, Yu Zhang, Jianmin Ji, Yanyong Zhang
2021 arXiv   pre-print
Next, we discuss some popular datasets for multi-modal 3D object detection, with a special focus on the sensor data included in each dataset.  ...  We hope that our detailed review can help researchers to embark investigations in the area of multi-modal 3D object detection.  ...  It first calculates mAP of each class and then averaging over the 3 classes as the final detection result. -H3D (Patil et al., 2019) focuses on crowded traffic scenes in urban.  ... 
arXiv:2106.12735v2 fatcat:5twzbk4yhrcfzddp7zghnsivna

TITAN: Future Forecast using Action Priors [article]

Srikanth Malla, Behzad Dariush, Chiho Choi
2020 arXiv   pre-print
In the absence of an appropriate dataset for this task, we created the TITAN dataset that consists of 700 labeled video-clips (with odometry) captured from a moving vehicle on highly interactive urban  ...  traffic scenes in Tokyo.  ...  Acknowledgement We thank Akira Kanehara for supporting our data collection and Yuji Yasui, Rei Sakai, and Isht Dwivedi for insightful discussions.  ... 
arXiv:2003.13886v3 fatcat:djx7cu6blrddlahtodtekom3am

Mid-level Representation for Visual Recognition [article]

Moin Nabi
2015 arXiv   pre-print
In the case of image understanding, we focus on object detection/recognition task.  ...  We, additionally, study the outcomes provided by employing the subcategory-based models for undoing dataset bias.  ...  For instance, Zhao and Nevatia [162] used 3D human models to detect persons in the observed scene as well as a probabilistic framework for tracking extracted features from the persons.  ... 
arXiv:1512.07314v1 fatcat:knmhkwxqk5aczis7ce6g2sv2wm