Residential land extraction from high spatial resolution optical images using multifeature hierarchical method

Zhongliang Fu, Xiaoli Liang
2019 Journal of Applied Remote Sensing  
Abstract. Residential land (RL), as a typical kind of urban functional zone, plays an important role in urban planning and land census. Recent years have witnessed frequent changes in RL via the process of urbanization. The extraction of RL from high spatial resolution optical images can reflect the status quo of land use/land cover to a certain extent, which is of great significance
to land census and urban planning. We adopt a scene classification strategy to extract RL and mainly focus on the extraction of four common types of RL in China: old-style village, low-density high-rise, medium-density low-rise, and low-density low-rise. We design a multifeature hierarchical (MFH) algorithm for RL extraction. First, RL is extracted based on the gray-level co-occurrence matrix and a fuzzy classification algorithm. Then an improved bag-of-visual-words algorithm is introduced to further extract RL. The effectiveness of our proposed method is analyzed with a sample dataset and large images. We also analyze the separability among different kinds of RL. We compare the experimental results with those of three other algorithms, and the results demonstrate that the MFH algorithm performs better in terms of the accuracy and efficiency of RL extraction. The results can provide services for land surveying and urban planning, and the technological processes and experimental design in the algorithm can provide a reference for research in related fields.

© The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.

Recently, numerous efforts have been made to extract proper features from scene images and build effective classification models. In terms of feature extraction, visual features, including spectral, textural, and geometrical features, are usually used to characterize the scene images. [26-34] However, these studies used only visual features and ignored the semantic features that represent specific geographic information. Therefore, these methods are effective only in classifying simple scenes, rather than heterogeneous scenes with diverse kinds of objects.
[35-37] To solve this issue, scale-invariant feature transform (SIFT) features were introduced. [38] Unlike some visual features that vary under affine transformations, SIFT features overcome variability in scale and affine distortion and are widely used in image classification, scene recognition, and target detection. [39-41] In terms of classification methods, existing studies mainly measure the feature similarity between scene images and label scenes using various classifiers, such as the K-nearest neighbor, maximum likelihood, support vector machine (SVM), artificial neural network, and random forest classifiers. [42] However, these classifiers can handle only visual features and are easily affected by feature changes. Therefore, more effective models, such as latent Dirichlet allocation [43] and bag of words, [44] are gradually being introduced to improve the classification accuracy.

Nevertheless, the previous methods cannot tackle the urban RL recognition and classification task using only simple features and classification models, as residential scenes are often heterogeneous, with complex components and various semantic categories. In addition, other issues, such as the classification of RL, the universality of the dataset, and the generalization of the method, also impede RL extraction in Chinese urban areas. The morphological structures and geographical distributions of different types of RL differ greatly, so subdividing RL further is of practical significance. However, in the study by Yang and Newsam, [45] RL was not subdivided, and Xia et al. [46] divided RL into dense residential and rural residential. These classification systems are too general to distinguish different types of RL. In other relevant studies, [47,48] the housing types in RL are quite different from those in China.
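The bag-of-words representation mentioned above can be illustrated with a minimal sketch. It is not the improved bag-of-visual-words algorithm of this paper: it assumes plain Lloyd's k-means for the codebook, uses synthetic 8-dimensional vectors as stand-ins for real SIFT descriptors, and all function names (`build_codebook`, `bovw_histogram`, `synthetic_descriptors`) are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the codebook entry (visual word) closest to a descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def build_codebook(descriptors, k, iterations=20, seed=0):
    """Cluster descriptors with plain Lloyd's k-means (an assumption of this
    sketch); the k centroids act as the visual words."""
    rng = random.Random(seed)
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bovw_histogram(descriptors, codebook):
    """L1-normalized histogram of visual-word counts for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def synthetic_descriptors(center, n, rng, dim=8, spread=0.05):
    """Hypothetical stand-in for local descriptors extracted from an image."""
    return [[center + rng.gauss(0, spread) for _ in range(dim)]
            for _ in range(n)]


# Two synthetic "scenes" that mix the same two kinds of local patterns
# in different proportions.
rng = random.Random(42)
image_a = synthetic_descriptors(0.2, 30, rng) + synthetic_descriptors(0.8, 10, rng)
image_b = synthetic_descriptors(0.2, 10, rng) + synthetic_descriptors(0.8, 30, rng)

codebook = build_codebook(image_a + image_b, k=2)
hist_a = bovw_histogram(image_a, codebook)
hist_b = bovw_histogram(image_b, codebook)
print(hist_a, hist_b)
```

Each scene is reduced to a fixed-length histogram regardless of how many descriptors it contains, which is what allows a standard classifier such as an SVM to be trained on heterogeneous scenes.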
In Chinese towns, residential areas are dominated by high-rise housing, whereas in the United States, residential areas mainly include single-family housing, multifamily residential housing, and mobile homes. These three residential types also appear in the frequently used scene classification dataset, the UC Merced Land Use dataset, which can be downloaded from the United States Geological Survey National Map. [49] Other commonly used datasets such as the SIRI-WHU [50] and WHU-RS19 [51] datasets are selected from Chinese areas, and RL is taken as a single class in the scene classification without being further subdivided. Moreover, the images and classification schemes of these datasets were built according to land use scenes, not functional zones. Therefore, these datasets are not suitable for the extraction of RL in China. In addition, these studies only classify the scene images in the dataset without considering applicability to large-scale remote sensing images.

In summary, this study aims to address four key issues: the classification scheme, the dataset, the features and models, and the applicability to large-area images. This study built a classification scheme in line with the current status of land use/land cover (LULC) in China and subdivided RL into four types [old-style village (OSV), low-density high-rise (LDHR), medium-density low-rise (MDLR), and low-density low-rise (LDLR)] according to the characteristics of Chinese residential buildings, such as the morphological structure, distribution location, floor height, and floor spacing. In our study, the high spatial resolution optical images (HSROIs) provided by Google Earth were used to collect the samples and build the dataset. Although the Google Earth images have been preprocessed using RGB renderings from the original optical aerial images, there is no significant difference between the Google Earth images and the real optical aerial images, even in pixel-level LULC mapping.
[52] Thus Google Earth images can also be used as aerial images for scene classification. Many datasets, such as the AID, [45] SIRI-WHU, [50] WHU-RS19, [51] and RSSCN7 [53] datasets, are collected from Google Earth images. For the features and models, we proposed a multifeature hierarchical (MFH) method to extract RL, and the validity of our algorithm was verified on large-area images. At present, many works on information extraction adopt deep learning, but it requires ...
doi:10.1117/1.jrs.13.026515 fatcat:sieu5bd3svb63gwz2pwt66zdd4