390 Hits in 6.7 sec

Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [article]

Li Wang, Liang Du, Xiaoqing Ye, Yanwei Fu, Guodong Guo, Xiangyang Xue, Jianfeng Feng, Li Zhang
2021 arXiv   pre-print
The objective of this paper is to learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.  ...  We make following contributions: (i) rather than appealing to the complicated pseudo-LiDAR based approach, we propose a depth-conditioned dynamic message propagation (DDMP) network to effectively integrate  ...  Conclusion We have presented a depth-conditioned dynamic message propagation (DDMP-3D) network, a novel graphbased approach that learns context-and depth-aware feature representation for 3D object detection  ... 
arXiv:2103.16470v1 fatcat:3zikeoajn5fwjf3xz6oipoopre

Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection [article]

Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao
2022 arXiv   pre-print
3D object detection from multiple image views is a fundamental and challenging task for visual scene understanding.  ...  Recently, DETR3D introduces a novel 3D-2D query paradigm in aggregating multi-view images for 3D object detection and achieves state-of-the-art performance.  ...  images only, which can be grouped into two categories: monocular-based and multi-view based. 2.1.1 Monocular 3D object detection.  ... 
arXiv:2204.11582v2 fatcat:ubebw3psb5f6zkvizc76zinyci

Human Pose Tracking Using Multi-level Structured Models [chapter]

Mun Wai Lee, Ram Nevatia
2006 Lecture Notes in Computer Science  
In the second stage, parts such as face, shoulders and limbs are estimated and estimates are combined by grid-based belief propagation to infer 2D joint positions.  ...  Tracking body poses of multiple persons in monocular video is a challenging problem due to the high dimensionality of the state space and issues such as inter-occlusion of the persons' bodies.  ...  In [15] , a mixture density propagation approach is used to overcome the depth ambiguities of articulated joints seen in monocular view.  ... 
doi:10.1007/11744078_29 fatcat:swcdjya6mbeyjaqxu3f625heai

"The Pedestrian next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping [article]

Avishkar Saha, Oscar Mendez, Chris Russell, Richard Bowden
2022 arXiv   pre-print
Our approach sets a new state-of-the-art in BEV estimation from monocular images across three large-scale datasets, including a 50% relative improvement for objects on nuScenes.  ...  We propose a graph neural network which predicts BEV objects from a monocular image by spatially reasoning about an object within the context of other objects.  ...  In contrast, the field of monocular 3D detection displays far greater object localization accuracy by taking an object-based approach.  ... 
arXiv:2204.02944v1 fatcat:5l3ttonp3nhc3h2imakuzjwcgy

Real-time 3D Pose Estimation with a Monocular Camera Using Deep Learning and Object Priors On an Autonomous Racecar [article]

Ankit Dhall
2018 arXiv   pre-print
Further, a priori 3D information about the object is used to match 2D-3D correspondences and accurately estimate object positions up to a distance of 15m.  ...  We propose a complete pipeline that allows object detection and simultaneously estimate the pose of these multiple object instances using just a single image.  ...  The object detection which is the first sub-module is able to detect cones in diverse conditions including both lighting and weather.  ... 
arXiv:1809.10548v1 fatcat:5takrrvz5zfc3bjvkd3bvgbtaa

Fast 3D Semantic Mapping in Road Scenes

Xuanpeng Li, Dong Wang, Huanxuan Ao, Rachid Belaroussi, Dominique Gruyer
2019 Applied Sciences  
In this work, we propose a fast 3D semantic mapping system based on the monocular vision by fusion of localization, mapping, and scene parsing.  ...  From visual sequences, it can estimate the camera pose, calculate the depth, predict the semantic segmentation, and finally realize the 3D semantic mapping.  ...  Several recent research try to make a balance between the computing cost and the accuracy of object detection, classification, and 2D pixel labeling [6, 7] .  ... 
doi:10.3390/app9040631 fatcat:3iqebmhknnbfphezkanwbbuex4

Human Pose Tracking in Monocular Sequence Using Multilevel Structured Models

Mun Wai Lee, R. Nevatia
2009 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Index Terms-3D human pose estimation, multiple-human tracking, belief propagation, data driven Markov chain Monte Carlo, generative models.  ...  In the second stage, parts such as face, shoulders, and limbs are detected using various cues, and the results are combined by a grid-based belief propagation algorithm to infer 2D joint positions.  ...  We present a three-stage approach for 3D pose estimation and the tracking of multiple people from a monocular sequence (see Fig. 1 ).  ... 
doi:10.1109/tpami.2008.35 pmid:19029544 fatcat:tprs2vm2i5bbtoqd24hp6nxb6i

Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection [article]

Xianpeng Liu, Nan Xue, Tianfu Wu
2021 arXiv   pre-print
Monocular 3D object detection aims to localize 3D bounding boxes in an input single 2D image.  ...  This paper proposes a simple yet effective formulation for monocular 3D object detection without exploiting any extra information.  ...  Depth-conditioned Dynamic Message Ouyang, W. 2021. Delving into Localization Errors for Propagation for Monocular 3D Object Detection. In CVPR, Monocular 3D Object Detection.  ... 
arXiv:2112.04628v1 fatcat:e5ev2xesvjgmpe5cfm57cscu6i

Sensor and Sensor Fusion Technology in Autonomous Vehicles: A Review

De Jong Yeong, Gustavo Velasco-Hernandez, John Barry, Joseph Walsh
2021 Sensors  
The current paper, therefore, provides an end-to-end review of the hardware and software methods required for sensor fusion object detection.  ...  We also summarize the three main approaches to sensor fusion and review current state-of-the-art multi-sensor fusion techniques and algorithms for object detection in autonomous driving applications.  ...  3D object detection.  ... 
doi:10.3390/s21062140 pmid:33803889 pmcid:PMC8003231 fatcat:j52leqrvwnhu5brd7lxgozvwya

Multi-Robot Collaborative Perception with Graph Neural Networks [article]

Yang Zhou, Jiuhong Xiao, Yue Zhou, Giuseppe Loianno
2022 arXiv   pre-print
We show that the proposed framework can address multi-view visual perception problems such as monocular depth estimation and semantic segmentation.  ...  Several experiments both using photo-realistic and real data gathered from multiple aerial robots' viewpoints show the effectiveness of the proposed approach in challenging inference conditions including  ...  We thank Jatin Palchuri and Rundong Ge for help on dataset collection. Digital Object Identifier (DOI) 10.1109/LRA.2022.3141661  ... 
arXiv:2201.01760v1 fatcat:tx3w52t5crdidiaf7s5hwepiom

Fixing the root node: Efficient tracking and detection of 3D human pose through local solutions

Ben Daubney, Xianghua Xie, Jingjing Deng, Neil Mac Parthaláin, Reyer Zwiggelaar
2016 Image and Vision Computing  
We apply this approach to two problems: The first is single frame monocular 3D pose estimation, where we propose a method to directly extract 3D pose without first extracting any intermediate 2D representation  ...  These are to locally estimate pose conditioned on a fixed root node state, which defines the global position and orientation of the person.  ...  Note, commonly for pose estimation, we impose a much stricter 470 criteria for positive detection compared to object detection.  ... 
doi:10.1016/j.imavis.2016.05.010 fatcat:uip5nl7k55c55bfucacfl5545i

Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective

Wu Liu, Tao Mei
2022 ACM Computing Surveys  
Although there have been some works to summarize different approaches, it still remains challenging for researchers to have an in-depth view of how these approaches work from 2D to 3D.  ...  Especially, we provide insightful analyses for the intrinsic connections and methods evolution from 2D to 3D pose estimation.  ...  On one hand, the 3D recovery of general objects from a monocular image has not been well solved. More statistical models for general objects would bring a significant boost.  ... 
doi:10.1145/3524497 fatcat:4pbvntngrnfp7lqhcpjmy7p2fq

Monocular Object and Plane SLAM in Structured Environments [article]

Shichao Yang, Sebastian Scherer
2018 arXiv   pre-print
We present a monocular Simultaneous Localization and Mapping (SLAM) using high level object and plane landmarks, in addition to points.  ...  We first propose a high order graphical model to jointly infer the 3D object and layout planes from single image considering occlusions and semantic constraints.  ...  For example in autonomous driving, vehicles need to be detected in 3D space to keep safety and in AR application, 3D objects and layout planes also need to be localized for more realistic physical interactions  ... 
arXiv:1809.03415v1 fatcat:eveldwolp5gfjgmmvyq3qznzqq

Autonomous Driving with Deep Learning: A Survey of State-of-Art Technologies [article]

Yu Huang, Yue Chen
2020 arXiv   pre-print
Due to the limited space, we focus the analysis on several key areas, i.e. 2D and 3D object detection in perception, depth estimation from cameras, multiple sensor fusion on the data, feature and task  ...  [143] first run foreground-background separated monocular depth estimation (ForeSeE) and then apply depth of foreground objects (like LiDAR) in 3D object detection and localization.  ...  after depth completion/colorization and 3D object detection.  ... 
arXiv:2006.06091v3 fatcat:nhdgivmtrzcarp463xzqvnxlwq

Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [article]

Wu Liu, Qian Bao, Yu Sun, Tao Mei
2021 arXiv   pre-print
Although there have been some works to summarize the different approaches, it still remains challenging for researchers to have an in-depth view of how these approaches work.  ...  Recently, benefited from the deep learning technologies, a significant amount of research efforts have greatly advanced the monocular human pose estimation both in 2D and 3D areas.  ...  We live in a dynamic 3D world where people and objects interact with the environment.  ... 
arXiv:2104.11536v1 fatcat:tdag2jq2vjdrjekwukm5nu7l6a
« Previous Showing results 1 — 15 out of 390 results