Deep Residual Autoencoder with Multiscaling for Semantic Segmentation of Land-Use Images

Lianfa Li
2019 Remote Sensing  
Semantic segmentation is a fundamental means of extracting information from remotely sensed images at the pixel level. Deep learning has enabled considerable improvements in efficiency and accuracy of semantic segmentation of general images. Typical models range from benchmarks such as fully convolutional networks, U-Net, Micro-Net, and dilated residual networks to the more recently developed DeepLab 3+. However, many of these models were originally developed for segmentation of general or
more » ... al images and videos, and are not directly relevant to remotely sensed images. The studies of deep learning for semantic segmentation of remotely sensed images are limited. This paper presents a novel flexible autoencoder-based architecture of deep learning that makes extensive use of residual learning and multiscaling for robust semantic segmentation of remotely sensed land-use images. In this architecture, a deep residual autoencoder is generalized to a fully convolutional network in which residual connections are implemented within and between all encoding and decoding layers. Compared with the concatenated shortcuts in U-Net, these residual connections reduce the number of trainable parameters and improve the learning efficiency by enabling extensive backpropagation of errors. In addition, resizing or atrous spatial pyramid pooling (ASPP) can be leveraged to capture multiscale information from the input images to enhance the robustness to scale variations. The residual learning and multiscaling strategies improve the trained model's generalizability, as demonstrated in the semantic segmentation of land-use types in two real-world datasets of remotely sensed images. Compared with U-Net, the proposed method improves the Jaccard index (JI) or the mean intersection over union (MIoU) by 4-11% in the training phase and by 3-9% in the validation and testing phases. With its flexible deep learning architecture, the proposed approach can be easily applied for and transferred to semantic segmentation of land-use variables and other surface variables of remotely sensed images.
doi:10.3390/rs11182142 fatcat:fcwid63pqfampp5ryrvlsfdbde