Joint Attention Mechanisms for Monocular Depth Estimation with Multi-Scale Convolutions and Adaptive Weight Adjustment

Peng Liu, Zonghua Zhang, Zhaozong Meng, Nan Gao
2020 IEEE Access  
Monocular depth estimation is a fundamental problem for many vision applications and is therefore attracting increasing attention in computer vision. Although deep convolutional neural networks have brought great improvements, estimating the depth of objects at finer details remains unsatisfactory, especially in complex scenes that have rich structural information. In this paper, we propose a deep end-to-end learning framework that combines multi-scale convolutions and joint attention mechanisms to tackle this challenge. Specifically, we first design a lightweight up-convolution to generate multi-scale feature maps. We then introduce an attention-based residual block to aggregate the different feature maps jointly along the channel and spatial dimensions, which enhances the discriminative ability of feature fusion at finer details. Furthermore, we explore an effective adaptive weight adjustment strategy for the loss function, which adjusts the weight of each loss term during training without additional hyperparameters and further improves performance. The proposed framework was evaluated on the challenging NYU Depth v2 and KITTI datasets. Experimental results demonstrate that the proposed approach is superior to most state-of-the-art methods.
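The abstract does not state the rule used for adaptive weight adjustment, so the following is only a minimal sketch of one hyperparameter-free scheme, not the authors' method. It assumes each loss term is weighted by its current share of the total loss, so that larger terms receive more emphasis. The names `adaptive_weights` and `combined_loss` are hypothetical, and the example loss values are made up for illustration.

```python
def adaptive_weights(loss_values, eps=1e-12):
    """Return one weight per loss term, proportional to its share of the total.

    Assumption for illustration only: the source does not specify this rule.
    """
    total = sum(loss_values)
    if total <= eps:
        # All terms are (near) zero: fall back to uniform weights.
        n = len(loss_values)
        return [1.0 / n] * n
    return [v / total for v in loss_values]


def combined_loss(loss_values):
    """Weighted sum of the loss terms using the weights computed above."""
    weights = adaptive_weights(loss_values)
    return sum(w * v for w, v in zip(weights, loss_values))


# Hypothetical values for three loss terms at one training step.
step_losses = [2.0, 1.0, 1.0]
weights = adaptive_weights(step_losses)
total_loss = combined_loss(step_losses)
```

In a real training loop the weights would typically be treated as constants when backpropagating, so that gradients flow only through the loss terms themselves.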
doi:10.1109/access.2020.3030097 fatcat:yaei4lacpbaldfqyignca63rv4