Land Cover Classification in an Ecuadorian Mountain Geosystem Using a Random Forest Classifier, Spectral Vegetation Indices, and Ancillary Geographic Data

Johanna Ayala-Izurieta, Carmen Márquez, Víctor García, Celso Recalde-Moreno, Marcos Rodríguez-Llerena, Diego Damián-Carrión
2017 Geosciences  
We present a methodology to accurately classify mountainous regions in the tropics. These landscapes are complex in terms of their geology, ecosystems, climate, and land use, and obtaining accurate maps to assess land cover change is essential. The objectives of this study were to (1) map vegetation using the Random Forest Classifier (RFC), spectral vegetation indices (SVIs), and ancillary geographic data; (2) identify important variables that help differentiate vegetation cover; and (3) assess the accuracy of the vegetation cover classification in a hard-to-reach Ecuadorian mountain region. We used Landsat 7 ETM+ satellite images of the entire scene, an RFC algorithm, and stratified random sampling. Altitude and the two-band enhanced vegetation index (EVI2) provided more information on vegetation cover than the normalized difference vegetation index (NDVI), the traditional and often-used index in other settings. We classified the vegetation cover of mountainous areas within the 1016 km² study area at 30 m spatial resolution using the RFC, which yielded a land cover map with an overall accuracy of 95%. The user's accuracy and the half-width of the 95% confidence interval for the basic map units, forest (FOR), páramo (PAR), crop (CRO), and pasture (PAS), were 95.85% ± 2.86%, 97.64% ± 1.24%, 91.53% ± 3.35%, and 82.82% ± 7.74%, respectively. The overall disagreement was 4.47%, the sum of 0.43% quantity disagreement and 4.04% allocation disagreement. The methodological framework presented in this paper and the combined use of SVIs, ancillary geographic data, and the RFC allowed the accurate mapping of hard-to-reach mountain landscapes and uncovered the underlying factors that help differentiate vegetation cover in the Ecuadorian mountain geosystem.

Geosciences 2017, 7, 34 2 of 21

... and biological aspects. Thus, vegetation cover can be a measurable indicator of the functionalities of a mountain ecosystem. Changes in vegetation cover reflect alterations in the natural factors that affect vegetation growth and performance, as well as the occurrence of anthropogenic factors [1]. The information obtained from monitoring changes in vegetation cover and land use can quantify the effects of primary sources of soil degradation, such as deforestation, and the dynamic alteration and transformation of land use over time.
Also, this information can serve as essential input to an early warning system for the possible occurrence of irreversible changes in the functionalities of a mountain ecosystem. The ability to remotely map the vegetation cover of mountain geosystems makes it possible to perform environmental analyses that cannot be easily conducted in the field while monitoring changes in land use. In this context, the additional information provided by geographical information systems and the SVIs is a fundamental requirement for understanding both natural and human-induced changes to vegetation cover and their implications [2].

Vegetation cover analysis is based on defining a classification scheme and a categorization method that allows the identification of primary units. Categorization is a standard application of satellite images. Current attention is shifting from parametric statistical classification methods to methods based on machine learning (ML) or other non-parametric methods, because ML methods make no assumptions about the data distribution and can handle complex feature spaces and non-normal data. Existing literature suggests that the RFC offers great potential and achieves better results in the categorization of complex scenes [3]. In general, the RFC makes very precise classifications, provides information about the importance of the predictors (variables), classifies outliers, estimates missing data, and gives an estimate of the error rate associated with the prediction. Accuracy estimates and predictor importance are generated automatically, and overfitting is generally not a problem. The RFC is also insensitive to outlier data values and has a set of parameters that is easy to initialize.
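The RFC properties listed above (bootstrap-trained trees, a random choice of predictors at each node, majority voting, and a built-in error estimate from the samples each tree never saw) can be illustrated with a deliberately tiny random forest written from scratch. This is a sketch under stated assumptions, not the authors' implementation: the function names, the parameter defaults, and the training pixels are all hypothetical, and predictor importance is left out for brevity.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces Gini impurity, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0                      # midpoint threshold
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, rng, n_try, depth=0, max_depth=5):
    """Grow one tree; each node considers a random subset of n_try features."""
    features = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, features) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree(*zip(*left), rng, n_try, depth + 1, max_depth),
            grow_tree(*zip(*right), rng, n_try, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, n_try=1, seed=0):
    """Train trees on bootstrap samples; return (trees, out-of-bag error)."""
    rng = random.Random(seed)
    n = len(rows)
    trees, oob_votes = [], [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        tree = grow_tree([rows[i] for i in idx], [labels[i] for i in idx], rng, n_try)
        trees.append(tree)
        for i in set(range(n)) - set(idx):           # samples this tree never saw
            oob_votes[i][predict_tree(tree, rows[i])] += 1
    scored = [(v.most_common(1)[0][0], labels[i]) for i, v in enumerate(oob_votes) if v]
    oob_error = sum(p != y for p, y in scored) / len(scored) if scored else None
    return trees, oob_error

def predict_forest(trees, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]

# Hypothetical training pixels: (altitude in m, EVI2). Values are invented.
cro = [(2800 + 25 * i, 0.45 + 0.01 * i) for i in range(20)]   # crop
par = [(3600 + 30 * i, 0.20 + 0.01 * i) for i in range(20)]   # páramo
rows = cro + par
labels = ["CRO"] * 20 + ["PAR"] * 20

trees, oob_error = fit_forest(rows, labels)
print(predict_forest(trees, (3900, 0.25)), predict_forest(trees, (3000, 0.60)), oob_error)
```

In practice a maintained library implementation would normally be used instead; the point here is only that the out-of-bag error comes for free from the bootstrap, which is why no separate validation split is needed to get a first error estimate.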
Images from the Landsat satellites are useful for detecting vegetation cover changes, with sufficient detail to provide relevant information for understanding processes and supporting decision-making if appropriate categorization schemes and classifiers are used. Unfortunately, mapping vast areas in complex landscapes is difficult because of abrupt changes in humidity, altitude, temperature, and topography. Páramos typically dominate Ecuadorian mountain geosystems, and anthropogenic factors are the primary source of degradation. Cultivation rapidly reduces the functionality of páramos [4]. This accelerated soil degradation causes a rapid advancement of the agricultural frontier and an accelerated and "irreversible" deterioration of the proper functioning of the páramo soils. It is, therefore, necessary to evaluate vegetation cover in Ecuadorian mountain geosystems efficiently and rapidly to gain a better understanding of the underlying factors that influence vegetation cover, monitor vegetation cover changes, and improve our ability to address emerging incidents promptly.

The objectives of this study were to (1) map vegetation using the RFC, spectral vegetation indices, and ancillary geographic data; (2) identify important variables that help differentiate vegetation cover; and (3) assess the accuracy of the vegetation cover classification in a hard-to-reach Ecuadorian mountain region. To achieve these objectives, we combined spectral vegetation indices derived from Landsat 7 ETM+ satellite images with ancillary geographic data to configure a subset of data to train the RFC algorithm. We subsequently mapped the spatial distribution of the vegetative land cover.

Materials and Methods

The land cover assessment is from March 2011, and land cover changes are not estimated.
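As a minimal sketch of how the two spectral vegetation indices named in this paper are commonly computed, the block below uses the standard two-band formulas for NDVI and EVI2 from red and near-infrared (NIR) reflectance; on Landsat 7 ETM+ these are bands 3 and 4. The reflectance values, the altitude, and the `features` dictionary are hypothetical and only show how index values and an ancillary variable might be placed side by side as predictors; the source does not describe the authors' actual processing chain.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

def evi2(nir, red):
    """Two-band enhanced vegetation index (coefficients 2.5, 2.4 and 1)."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)

# Hypothetical surface reflectance (0-1 scale) for one vegetated pixel.
nir, red = 0.45, 0.05

# Hypothetical predictor vector: two indices plus one ancillary variable.
features = {
    "NDVI": ndvi(nir, red),
    "EVI2": evi2(nir, red),
    "altitude_m": 3500.0,   # invented value, e.g. read from an elevation model
}
print(features)
```

Note that `ndvi` divides by zero when both reflectances are zero, so real rasters usually need a no-data mask before the indices are computed.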
Post-classification accuracy assessment was conducted by generating stratified random points within each vegetation class; each point was assigned its actual land cover type through expert opinion and its mapped class by intersection with the classified image.

Studied Area

The studied area is a mountainous, rugged area with very irregular topography in the Ecuadorian Andes, situated in the Achupallas parish in the southwest of Sangay National Park, province of Chimborazo, Ecuador. It is located 300 km south of the city of Quito and covers an area of 1016 km², which lies within the rectangle defined by the UTM coordinates x = 743,089.8; y = 9,760,133.5 and x = 782,504.2; y = 9,715,844.
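The accuracy measures reported for the classification (user's accuracy, and overall disagreement split into quantity and allocation components) can all be derived from a cross-tabulation of mapped class against reference class. The sketch below assumes the matrix already expresses area proportions or equal-weight counts; the function name and the two-class matrix are hypothetical, and any stratum weighting the authors applied to their sample points is not reproduced here.

```python
def disagreement(matrix):
    """Summarise a confusion matrix.

    matrix[i][j] is the amount mapped as class i whose reference class is j.
    Returns (overall, quantity, allocation, users_accuracy_per_class).
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    p = [[v / total for v in row] for row in matrix]          # proportions
    mapped = [sum(p[i]) for i in range(k)]                    # row totals
    reference = [sum(p[i][j] for i in range(k)) for j in range(k)]  # column totals
    overall = 1.0 - sum(p[g][g] for g in range(k))
    # Quantity disagreement: mismatch in class proportions, map versus reference.
    quantity = 0.5 * sum(abs(mapped[g] - reference[g]) for g in range(k))
    # Allocation disagreement: the remainder, caused by spatial misplacement.
    allocation = overall - quantity
    users = [p[g][g] / mapped[g] for g in range(k)]
    return overall, quantity, allocation, users

# Hypothetical two-class cross-tabulation (rows: map, columns: reference).
example = [[45, 5],
           [3, 47]]
overall, quantity, allocation, users = disagreement(example)
print(overall, quantity, allocation, users)
```

By construction the two components add up to the overall disagreement, which is the same relationship as the 0.43% + 4.04% = 4.47% reported in the abstract.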
doi:10.3390/geosciences7020034 fatcat:s26wdxggenfvpc7chnqnhv4c7a