Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection
[article]
2021
arXiv
pre-print
The objective of this paper is to learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection. ...
We make the following contributions: (i) rather than appealing to the complicated pseudo-LiDAR based approach, we propose a depth-conditioned dynamic message propagation (DDMP) network to effectively integrate ...
Conclusion: We have presented a depth-conditioned dynamic message propagation (DDMP-3D) network, a novel graph-based approach that learns context- and depth-aware feature representation for 3D object detection ...
arXiv:2103.16470v1
fatcat:3zikeoajn5fwjf3xz6oipoopre
Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection
[article]
2022
arXiv
pre-print
3D object detection from multiple image views is a fundamental and challenging task for visual scene understanding. ...
Recently, DETR3D introduces a novel 3D-2D query paradigm in aggregating multi-view images for 3D object detection and achieves state-of-the-art performance. ...
images only, which can be grouped into two categories: monocular-based and multi-view based. 2.1.1 Monocular 3D object detection. ...
arXiv:2204.11582v2
fatcat:ubebw3psb5f6zkvizc76zinyci
Human Pose Tracking Using Multi-level Structured Models
[chapter]
2006
Lecture Notes in Computer Science
In the second stage, parts such as face, shoulders and limbs are estimated and estimates are combined by grid-based belief propagation to infer 2D joint positions. ...
Tracking body poses of multiple persons in monocular video is a challenging problem due to the high dimensionality of the state space and issues such as inter-occlusion of the persons' bodies. ...
In [15], a mixture density propagation approach is used to overcome the depth ambiguities of articulated joints seen in monocular view. ...
doi:10.1007/11744078_29
fatcat:swcdjya6mbeyjaqxu3f625heai
"The Pedestrian next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping
[article]
2022
arXiv
pre-print
Our approach sets a new state-of-the-art in BEV estimation from monocular images across three large-scale datasets, including a 50% relative improvement for objects on nuScenes. ...
We propose a graph neural network which predicts BEV objects from a monocular image by spatially reasoning about an object within the context of other objects. ...
In contrast, the field of monocular 3D detection displays far greater object localization accuracy by taking an object-based approach. ...
arXiv:2204.02944v1
fatcat:5l3ttonp3nhc3h2imakuzjwcgy
Real-time 3D Pose Estimation with a Monocular Camera Using Deep Learning and Object Priors On an Autonomous Racecar
[article]
2018
arXiv
pre-print
Further, a priori 3D information about the object is used to match 2D-3D correspondences and accurately estimate object positions up to a distance of 15m. ...
We propose a complete pipeline that detects objects and simultaneously estimates the pose of these multiple object instances using just a single image. ...
The object detection, which is the first sub-module, is able to detect cones in diverse conditions, including varied lighting and weather. ...
arXiv:1809.10548v1
fatcat:5takrrvz5zfc3bjvkd3bvgbtaa
Fast 3D Semantic Mapping in Road Scenes
2019
Applied Sciences
In this work, we propose a fast 3D semantic mapping system based on monocular vision that fuses localization, mapping, and scene parsing. ...
From visual sequences, it can estimate the camera pose, calculate the depth, predict the semantic segmentation, and finally realize the 3D semantic mapping. ...
Several recent studies try to strike a balance between the computing cost and the accuracy of object detection, classification, and 2D pixel labeling [6, 7]. ...
doi:10.3390/app9040631
fatcat:3iqebmhknnbfphezkanwbbuex4
Human Pose Tracking in Monocular Sequence Using Multilevel Structured Models
2009
IEEE Transactions on Pattern Analysis and Machine Intelligence
Index Terms: 3D human pose estimation, multiple-human tracking, belief propagation, data driven Markov chain Monte Carlo, generative models. ...
In the second stage, parts such as face, shoulders, and limbs are detected using various cues, and the results are combined by a grid-based belief propagation algorithm to infer 2D joint positions. ...
We present a three-stage approach for 3D pose estimation and the tracking of multiple people from a monocular sequence (see Fig. 1). ...
doi:10.1109/tpami.2008.35
pmid:19029544
fatcat:tprs2vm2i5bbtoqd24hp6nxb6i
Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
[article]
2021
arXiv
pre-print
Monocular 3D object detection aims to localize 3D bounding boxes in an input single 2D image. ...
This paper proposes a simple yet effective formulation for monocular 3D object detection without exploiting any extra information. ...
Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection. In CVPR, ...
Ouyang, W. 2021. Delving into Localization Errors for Monocular 3D Object Detection. ...
arXiv:2112.04628v1
fatcat:e5ev2xesvjgmpe5cfm57cscu6i
Sensor and Sensor Fusion Technology in Autonomous Vehicles: A Review
2021
Sensors
The current paper, therefore, provides an end-to-end review of the hardware and software methods required for sensor fusion object detection. ...
We also summarize the three main approaches to sensor fusion and review current state-of-the-art multi-sensor fusion techniques and algorithms for object detection in autonomous driving applications. ...
3D object detection. ...
doi:10.3390/s21062140
pmid:33803889
pmcid:PMC8003231
fatcat:j52leqrvwnhu5brd7lxgozvwya
Multi-Robot Collaborative Perception with Graph Neural Networks
[article]
2022
arXiv
pre-print
We show that the proposed framework can address multi-view visual perception problems such as monocular depth estimation and semantic segmentation. ...
Several experiments both using photo-realistic and real data gathered from multiple aerial robots' viewpoints show the effectiveness of the proposed approach in challenging inference conditions including ...
We thank Jatin Palchuri and Rundong Ge for help on dataset collection. Digital Object Identifier (DOI) 10.1109/LRA.2022.3141661 ...
arXiv:2201.01760v1
fatcat:tx3w52t5crdidiaf7s5hwepiom
Fixing the root node: Efficient tracking and detection of 3D human pose through local solutions
2016
Image and Vision Computing
We apply this approach to two problems: The first is single frame monocular 3D pose estimation, where we propose a method to directly extract 3D pose without first extracting any intermediate 2D representation ...
These are to locally estimate pose conditioned on a fixed root node state, which defines the global position and orientation of the person. ...
Note, commonly for pose estimation, we impose a much stricter criterion for positive detection compared to object detection. ...
doi:10.1016/j.imavis.2016.05.010
fatcat:uip5nl7k55c55bfucacfl5545i
Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective
2022
ACM Computing Surveys
Although there have been some works to summarize different approaches, it still remains challenging for researchers to have an in-depth view of how these approaches work from 2D to 3D. ...
Especially, we provide insightful analyses for the intrinsic connections and methods evolution from 2D to 3D pose estimation. ...
On one hand, the 3D recovery of general objects from a monocular image has not been well solved. More statistical models for general objects would bring a significant boost. ...
doi:10.1145/3524497
fatcat:4pbvntngrnfp7lqhcpjmy7p2fq
Monocular Object and Plane SLAM in Structured Environments
[article]
2018
arXiv
pre-print
We present a monocular Simultaneous Localization and Mapping (SLAM) using high level object and plane landmarks, in addition to points. ...
We first propose a high order graphical model to jointly infer the 3D object and layout planes from a single image considering occlusions and semantic constraints. ...
For example, in autonomous driving, vehicles need to be detected in 3D space to maintain safety, and in AR applications, 3D objects and layout planes also need to be localized for more realistic physical interactions ...
arXiv:1809.03415v1
fatcat:eveldwolp5gfjgmmvyq3qznzqq
Autonomous Driving with Deep Learning: A Survey of State-of-Art Technologies
[article]
2020
arXiv
pre-print
Due to the limited space, we focus the analysis on several key areas, i.e. 2D and 3D object detection in perception, depth estimation from cameras, multiple sensor fusion on the data, feature and task ...
[143] first runs foreground-background separated monocular depth estimation (ForeSeE) and then applies the depth of foreground objects (like LiDAR) in 3D object detection and localization. ...
after depth completion/colorization and 3D object detection. ...
arXiv:2006.06091v3
fatcat:nhdgivmtrzcarp463xzqvnxlwq
Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective
[article]
2021
arXiv
pre-print
Although there have been some works to summarize the different approaches, it still remains challenging for researchers to have an in-depth view of how these approaches work. ...
Recently, thanks to deep learning technologies, a significant amount of research effort has greatly advanced monocular human pose estimation in both 2D and 3D. ...
We live in a dynamic 3D world where people and objects interact with the environment. ...
arXiv:2104.11536v1
fatcat:tdag2jq2vjdrjekwukm5nu7l6a