Assessing Different Feature Sets' Effects on Land Cover Classification in Complex Surface-Mined Landscapes by ZiYuan-3 Satellite Imagery

Weitao Chen, Xianju Li, Haixia He, Lizhe Wang
2017 Remote Sensing  
Land cover classification (LCC) in complex surface-mined landscapes has become very important for understanding the influence of mining activities on the regional geo-environment. Three characteristics of complex surface-mined areas limit LCC: significant three-dimensional terrain, strong temporal-spatial variability of surface cover, and spectral-spatial homogeneity. Thus, determining effective feature sets as input datasets is very important for improving the detailed extent of
more » ... ation schemes and classification accuracy. In this study, data such as various feature sets derived from ZiYuan-3 stereo satellite imagery, a feature subset resulting from a feature selection (FS) procedure, training data polygons, and test sample sets were first obtained; then, the effects of the feature sets on classification accuracy were assessed based on different feature set combination schemes, an FS procedure, and the random forest algorithm. The following conclusions were drawn. (1) The importance of the feature sets could be divided into three grades: the vegetation index (VI), principal component bands (PCs), mean filters (Mean), standard deviation filters (StDev), texture measures (Textures), and topographic variables (TVs) were important; the Gaussian low-pass filters (GLP) were just positive; and none were useless. The descending order of their importance was TVs, StDev, Textures, Mean, PCs, VI, and GLP. (2) TVs and StDev both significantly outperformed VI, PCs, GLP, and Mean; Mean outperformed GLP; the remaining pairs of feature sets showed no difference. In general, the study assessed the effects of different feature sets on LCC in complex surface-mined landscapes.

Keywords: remote sensing; land cover classification; importance of feature set; complex landscape; surface mining

Introduction

Land cover datasets are basic components for global change studies and various applications [1,2]. Currently, researchers are mainly focusing on land cover classification (LCC) at fine scales [3-5] in complex landscapes such as agricultural [6-9], surface-mined [10-14], and Mediterranean [15] landscapes by using high spatial resolution satellite imagery. In general, surface-mined areas also contain other landscapes, such as agricultural land, forest, and cities. Thus, together they can be considered a complex surface-mined landscape for LCC. LCC in surface-mined landscapes (LCCSML) can help with the planning and management of mines.
Classification technology based on machine learning algorithms and high spatial resolution imagery has achieved more accurate results for urban environments, precision agriculture, transportation, forestry surveys, and so on. However, LCCSML differs from other fields in three specific characteristics: significant three-dimensional terrain, strong temporal-spatial variability of surface cover, and spectral-spatial homogeneity. These characteristics increase the difficulty of obtaining high-accuracy results for LCCSML [5,14]. As a result, besides a powerful classification algorithm, one of the key solutions is to derive beneficial feature sets from suitable satellite sensors. The importance of single features has been examined in our former study [14]. However, the importance of different feature sets for LCCSML has not been investigated. Some studies attempted to find the most effective features for classification by assessing the importance of single features. For example, some studies used an FS procedure, as in [14], for tasks such as landslide identification [16-18], LCC in arid regions [19], and object-based image analysis LCC [20]. Others have used different feature combinations, including or excluding specific features, to assess the effects of a single feature, e.g., the red-edge band for land-use classification [21]; classifying insect defoliation levels [22]; classification of paddy rice crops [23]; LCC in arid regions [19]; and the normalized difference vegetation index (NDVI) for classification of tea and hazelnut plantation areas [24]. However, determining effective feature sets is more beneficial than determining single features. As a result, some studies also used the feature combination method to evaluate the importance of feature sets. For example, Fassnacht et al. [25] aimed to find out which spectral regions were consistently effective for classifying tree species.
Akar and Güngör [24] evaluated the contribution of the gray level co-occurrence matrix and Gabor filter texture sets for detecting tea and hazelnut plantation areas. Aguilar et al. [26] grouped different object feature sets such as spectral information, elevation data, band index data and ratios, textures, and shape geometry into 10 strategies for greenhouse extraction and assessed their importance. Wright and Gallant [27] investigated the addition of image texture and digital elevation model-derived terrain variables to Landsat Thematic Mapper variables for wetland discrimination. Similarly, for agricultural and surface-mined landscapes, Hurni et al. [7] assessed the inclusion of texture measures for the delineation of shifting cultivation landscapes. Okubo et al. [8] explored the effectiveness of gray level co-occurrence matrix texture measures for land-use/cover classification in a complex agricultural landscape. Maxwell and Warner [11] investigated the use of multi-temporal terrain data for differentiating mine-reclaimed grasslands from non-mining grasslands. Maxwell et al. [12] assessed RapidEye image- and light detection and ranging (LiDAR)-derived variable sets for geographic object-based image analysis classification of mining and mine reclamation. Maxwell et al. [13] examined the incorporation of LiDAR-derived data for mapping mining and mine reclamation areas by comparison with data derived using only RapidEye imagery bands. However, those studies only examined whether the feature sets were effective. There is little research that grades and ranks the importance of feature sets, which might be more beneficial than that of single features for LCCSML. Only a few studies have analyzed the relative importance of different feature sets, e.g., the comparison of co-occurrence-, Gabor-, and Markov random fields-based textures for sea-ice classification [28]. Similarly, there is little research that grades the relative importance.
As shown in [14], the random forest (RF) algorithm is easy to implement and can significantly outperform support vector machine and artificial neural network algorithms for LCCSML. Furthermore, the RF algorithm is known to be less sensitive to the proposed feature set than other algorithms, such as support vector machine [14,18]. Thus, using RF to rank and grade the importance of feature sets is more reliable than using other algorithms. The objective of this study is to reveal how different feature sets affect the accuracy of LCCSML and to rank the importance of the feature sets. First, based on our former study [14], the feature sets derived from ZiYuan-3 stereo satellite imagery (ZY-3), the feature subset resulting from an FS procedure, the training data polygons, and the test sample sets were directly obtained. Then, three types of feature set combination schemes were evaluated by combining FS and the RF algorithm.
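The three combination schemes are not detailed in this excerpt, so the following is only a minimal sketch of one plausible scheme: drop one feature set at a time and rank the sets by the resulting loss in accuracy. The function name `rank_by_accuracy_drop`, the placeholder set names, and the toy scorer are all assumptions made for illustration. In practice `score` would train and test a classifier such as RF on the given sets; here a made-up additive scorer stands in so the sketch runs without external libraries, and its numbers are not results from the study.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Scorer = Callable[[FrozenSet[str]], int]


def rank_by_accuracy_drop(
    feature_sets: List[str], score: Scorer
) -> Tuple[List[str], Dict[str, int]]:
    """Rank feature sets by the accuracy lost when each one is removed.

    `score` maps a combination of feature sets to an accuracy value
    (hypothetical interface; e.g. overall accuracy of a trained classifier).
    """
    baseline = score(frozenset(feature_sets))  # all sets combined
    drops: Dict[str, int] = {}
    for removed in feature_sets:
        remaining = frozenset(s for s in feature_sets if s != removed)
        drops[removed] = baseline - score(remaining)
    # Largest drop first: removing that set hurts accuracy the most.
    ranking = sorted(drops, key=lambda name: drops[name], reverse=True)
    return ranking, drops


# Toy stand-in for a classifier: invented gains in percentage points,
# NOT values from the study. Placeholder set names are also invented.
TOY_GAIN = {"set_a": 5, "set_b": 2, "set_c": 10}


def toy_score(subset: FrozenSet[str]) -> int:
    """Pretend accuracy: a fixed base plus an invented gain per set."""
    return 70 + sum(TOY_GAIN[name] for name in subset)


if __name__ == "__main__":
    ranking, drops = rank_by_accuracy_drop(list(TOY_GAIN), toy_score)
    print(ranking)
    print(drops)
```

Because the toy scorer is additive, each drop equals the invented gain of the removed set. With a real classifier the drops would also reflect redundancy between feature sets, which is one reason a single leave-one-out scheme may not be sufficient on its own.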
doi:10.3390/rs10010023 fatcat:csmpf4udznggdh7famdx5kqfc4