Dense tracking, mapping and scene labeling using a depth camera

Andrés Alejandro Díaz-Toro, Lina María Paz-Pérez, Pedro Piniés-Rodríguez, Eduardo Francisco Caicedo-Bravo
2018 Revista Facultad de Ingeniería  
KEYWORDS: dense reconstruction, camera localization, depth sensor, volumetric representation, object detection, multiple-instance labeling

ABSTRACT: We present a system for dense tracking, 3D reconstruction, and object detection of desktop-like environments using a depth camera (the Kinect sensor). The camera is moved by hand while its pose is estimated, and a dense model of the scene, with evolving color information, is constructed. Alternatively, the user can enable the object detection module (YOLO: You Only Look Once [1]) to detect and propagate to the model information about categories of objects commonly found on desktops, such as monitors, keyboards, books, cups, and laptops, yielding a model whose color encodes object categories. The camera pose is estimated with a model-to-frame technique using a coarse-to-fine iterative closest point (ICP) algorithm, achieving a drift-free trajectory and robustness to fast camera motion and variable lighting conditions. Simultaneously, the depth maps are fused into the volumetric structure from the estimated camera poses. For visualization, an explicit representation of the scene is extracted with the marching cubes algorithm. The tracking, fusion, marching cubes, and object detection processes were implemented on commodity graphics hardware to improve the system's performance. We achieve accurate camera pose estimates, high-quality model color and geometry, stable colors from the detection module (robustness to wrong detections), and successful management of multiple instances of the same category.
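The ICP-based model-to-frame tracking mentioned in the abstract is commonly built on a point-to-plane error metric solved with a small-angle linearization. The sketch below shows one Gauss-Newton step of that formulation for already-matched correspondences; it is an illustrative reconstruction of the standard technique, not the authors' GPU implementation, and all names (`src`, `dst`, `dst_normals`) are assumptions.

```python
import numpy as np

def icp_point_to_plane_step(src, dst, dst_normals):
    """One Gauss-Newton step of point-to-plane ICP.

    src, dst: matched (N, 3) point sets (source and model points).
    dst_normals: (N, 3) unit normals at the model points.
    Returns a 4x4 incremental rigid transform aligning src toward dst.
    """
    # Residual r_i = n_i . (R s_i + t - d_i), linearized with R ~ I + [w]_x,
    # giving rows A_i = [s_i x n_i, n_i] and b_i = n_i . (d_i - s_i).
    A = np.hstack([np.cross(src, dst_normals), dst_normals])  # (N, 6)
    b = np.einsum('ij,ij->i', dst_normals, dst - src)         # (N,)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)                 # x = [w | t]
    wx, wy, wz, tx, ty, tz = x
    T = np.eye(4)
    # I + [w]_x : first-order approximation of the rotation
    T[:3, :3] = np.array([[1, -wz,  wy],
                          [wz,  1, -wx],
                          [-wy, wx,  1]])
    T[:3, 3] = [tx, ty, tz]
    return T
```

In a coarse-to-fine scheme this step is iterated at each pyramid level, with the coarse-level estimate initializing the finer levels, which is what gives robustness to fast camera motion.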
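The volumetric fusion of depth maps from estimated camera poses is typically done KinectFusion-style, with each voxel storing a truncated signed distance updated as a weighted running average. The following is a minimal CPU sketch of that update, assuming a dense voxel grid, a pinhole intrinsic matrix `K`, and a camera-to-world pose `T_wc`; all names are illustrative, and the paper's GPU version would differ.

```python
import numpy as np

def fuse_depth_map(tsdf, weights, depth, K, T_wc, voxel_size, origin, trunc):
    """Fuse one depth map into a TSDF volume via a weighted running average.

    tsdf, weights: (X, Y, Z) arrays updated in place.
    depth: (H, W) depth image in meters; K: 3x3 intrinsics;
    T_wc: 4x4 camera-to-world pose; origin: world position of voxel (0,0,0).
    """
    # World coordinates of every voxel center
    idx = np.indices(tsdf.shape).reshape(3, -1).T
    pts_w = origin + (idx + 0.5) * voxel_size
    # Transform voxel centers into the camera frame and project
    T_cw = np.linalg.inv(T_wc)
    pts_c = (T_cw[:3, :3] @ pts_w.T).T + T_cw[:3, 3]
    z = pts_c[:, 2]
    z_safe = np.where(z > 0, z, 1.0)  # avoid division by zero behind camera
    u = np.round(K[0, 0] * pts_c[:, 0] / z_safe + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts_c[:, 1] / z_safe + K[1, 2]).astype(int)
    h, w = depth.shape
    ok = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = np.where(ok, depth[v.clip(0, h - 1), u.clip(0, w - 1)], 0.0)
    ok &= d > 0
    # Truncated signed distance, normalized to [-1, 1]: + in front of surface
    sdf = np.clip(d - z, -trunc, trunc) / trunc
    # Weighted running average per observed voxel
    i, j, k = idx[ok].T
    old, wt = tsdf[i, j, k], weights[i, j, k]
    tsdf[i, j, k] = (old * wt + sdf[ok]) / (wt + 1)
    weights[i, j, k] = wt + 1
```

After fusion, the zero crossing of the TSDF is the reconstructed surface, which is what the marching cubes step extracts for visualization.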
doi:10.17533/udea.redin.n86a07