A Combined Random Forest and OBIA Classification Scheme for Mapping Smallholder Agriculture at Different Nomenclature Levels Using Multisource Data (Simulated Sentinel-2 Time Series, VHRS and DEM)

Valentine Lebourgeois, Stéphane Dupuy, Élodie Vintrou, Maël Ameline, Suzanne Butler, Agnès Bégué
2017 Remote Sensing  
Sentinel-2 images are expected to improve global crop monitoring even in challenging tropical small agricultural systems that are characterized by high intra- and inter-field spatial variability and where satellite observations are disturbed by the presence of clouds. To overcome these constraints, we analyzed and optimized the performance of a combined Random Forest (RF) classifier/object-based approach and applied it to multisource satellite data to produce land use maps of a smallholder agricultural zone in Madagascar at five different nomenclature levels. The RF classifier was first optimized by reducing the number of input variables. Experiments were then carried out to (i) test cropland masking prior to the classification of more detailed nomenclature levels, (ii) analyze the importance of each data source (a high spatial resolution (HSR) time series, a very high spatial resolution (VHSR) coverage and a digital elevation model (DEM)) and data type (spectral, textural or other), and (iii) quantify their contributions to classification accuracy levels. The results show that RF classifier optimization allowed for a reduction in the number of variables by 1.5- to 6-fold (depending on the classification level) and thus a reduction in the data processing time. Classification results were improved via the hierarchical approach at all classification levels, achieving overall accuracies of 91.7% and 64.4% for the cropland and crop subclass levels, respectively. Spectral variables derived from an HSR time series were shown to be the most discriminating, with spectral indices scoring better than reflectances. VHSR data were only found to be essential when implementing the segmentation of the area into objects and not for the spectral or textural features they can provide during classification.

of staple food production in developing countries [2]. A sustainable improvement of food security for these farmers and populations requires better monitoring of agricultural systems and of their production at regional and global scales. However, satellite observations of these systems are subject to several constraints such as small field sizes, landscape fragmentation, high heterogeneity within plots and in cultivation practices, cloudy conditions, synchronized agro-system and ecosystem phenologies related to rainfall, etc.
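As a rough illustration of experiment (i), masking cropland before classifying more detailed nomenclature levels, the sketch below applies a first classifier to every object and a second one only to the objects kept as cropland. It is a minimal sketch and not the authors' implementation: the function names, the two threshold rules standing in for trained RF models, and the class names "rice" and "maize" are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]
Classifier = Callable[[Features], str]


def classify_hierarchically(
    objects: Sequence[Features],
    cropland_clf: Classifier,
    crop_clf: Classifier,
    non_crop_label: str = "non-crop",
) -> List[Tuple[str, str]]:
    """Return (level-1 label, detailed label) for each object.

    The detailed classifier is only run on objects that the first
    classifier labels as cropland; other objects keep the non-crop label.
    """
    labels = []
    for feats in objects:
        level1 = cropland_clf(feats)
        if level1 == "cropland":
            labels.append((level1, crop_clf(feats)))
        else:
            labels.append((level1, non_crop_label))
    return labels


# Hypothetical stand-ins for trained classifiers (simple threshold rules).
def toy_cropland_clf(feats: Features) -> str:
    return "cropland" if feats[0] > 0.4 else "non-crop"


def toy_crop_clf(feats: Features) -> str:
    return "rice" if feats[1] > 0.5 else "maize"


# Each object is described by two made-up feature values.
objects = [(0.7, 0.8), (0.2, 0.9), (0.6, 0.1)]
result = classify_hierarchically(objects, toy_cropland_clf, toy_crop_clf)
```

The second object has a high value for the second feature, yet it never reaches the crop-type classifier because it was masked out at the first level. That is the point of the hierarchy: non-crop objects cannot be confused with crop subclasses.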
Many attempts have been made to use remote sensing to objectively characterize and monitor agricultural systems at different scales. Remote sensing approaches have been developed to identify cropping systems and practices, such as crop type and cropping intensity, across large spatial and temporal scales [3] [4] [5] [6]. Given their ability to observe cultivated areas on a uniform timescale and cover large areas, time series composed of low spatial resolution (LSR) satellite images that record phenological changes in crop reflectance characteristics have been identified as a particularly appropriate source of information for the estimation of such data [7]. However, existing satellite sources may not be appropriate for mapping cropping practices of smallholder farms, in which fields are typically smaller than the spatial resolution of readily available LSR satellite data, such as MODIS (Moderate Resolution Imaging Spectroradiometer, 250 m resolution), and even of medium spatial resolution (MSR) Landsat data (30 m resolution). Using the Pareto Boundary method [8], one study analyzed the optimal accuracy of cropland maps that could theoretically be reached for a broad range of West African agricultural systems [9]. The authors quantified the expected accuracy at different spatial resolutions (from 500 m to 10 m) and showed that a resolution of 10 m allows one to produce very accurate cropland maps, even in smallholder agriculture regions. After the launch of its second satellite, Sentinel-2B, in March 2017, the Sentinel-2 mission proposed by the European Space Agency will provide significant improvements over existing Landsat-type sensors, with an unprecedented combination of spectral (13 bands), spatial (from 10 to 60 m) and temporal (five-day) resolutions over a swath of 290 km (see [10] for more information on the mission).
The mission may spur major advances in the mapping of agricultural systems, particularly when methods also use Landsat 8 to increase the acquisition frequency, which may be needed in countries characterized by frequent cloud coverage. However, such high spatial resolution time series with multiple bands and possible derivations constitute a large volume of data that remains a significant challenge for the automated mapping of agricultural land [11]. Machine learning techniques based on ensemble methods (e.g., neural network ensembles, random forests, bagging and boosting) are currently receiving increasing interest [12]. Ensemble classifiers are based on the theory that a set of classifiers gives a more robust outcome than an individual classifier [12]. The ensemble learning technique referred to as Random Forests (RF) [13] is increasingly being applied in land-cover classification using multispectral and hyperspectral satellite sensor imagery [14] [15] [16] [17] [18] [19] [20]. The approach presents many advantages in its application to remote sensing: it is non-parametric, it can manage a large volume of data and variables (even those that are highly correlated), it can measure degrees of variable importance, etc. [12]. Preparatory work for the Sentinel-2 mission based on such a technique was carried out for the crop type classification of 12 contrasting agricultural sites of the JECAM network (Joint Experiment for Crop Assessment and Monitoring, www.jecam.org), including the Antsirabe site in Madagascar [21, 22]. The project was based on a Random Forest analysis of a set of reflectances and spectral indices (Normalized Difference Vegetation Index (NDVI), Normalized Difference Water Index (NDWI) and brightness) extracted at the pixel level, mainly from HSR SPOT4 time series (from the SPOT4 Take5 experiment [23] simulating Sentinel-2). That study highlighted the difficulties associated with mapping smallholder agriculture.
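The three spectral indices named above can be computed per pixel from band reflectances, as in the sketch below. The NDVI formula is the standard one. The text does not say which NDWI formulation or brightness definition was used, so the choices here are assumptions: NDWI is computed from the near-infrared and shortwave-infrared bands, and brightness is taken as the Euclidean norm of the reflectances. The reflectance values are invented for illustration.

```python
import math
from typing import Sequence


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom else 0.0


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndwi(nir: float, swir: float) -> float:
    """NDWI, assumed here to be the NIR/SWIR form: (NIR - SWIR) / (NIR + SWIR)."""
    return normalized_difference(nir, swir)


def brightness(bands: Sequence[float]) -> float:
    """Brightness, assumed here to be the Euclidean norm of the reflectances."""
    return math.sqrt(sum(b * b for b in bands))


# Invented reflectances for one vegetated pixel (unitless, 0 to 1).
pixel = {"green": 0.08, "red": 0.05, "nir": 0.45, "swir": 0.20}

pixel_ndvi = ndvi(pixel["nir"], pixel["red"])
pixel_ndwi = ndwi(pixel["nir"], pixel["swir"])
pixel_brightness = brightness(list(pixel.values()))
```

Computing these indices for every date of a time series gives one feature per index and per date, which is one reason the number of candidate variables grows quickly.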
For sites of intensive farming (France, China, Argentina, Ukraine, etc.), the overall accuracies of the main crop type classifications were always higher than 80%, whereas for sites characterized by smallholder agriculture (JECAM sites in Burkina Faso and Madagascar), the overall accuracies were found to be approximately 50%.
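For readers unfamiliar with the metric, overall accuracy is the share of validation samples that are classified correctly, that is, the sum of the confusion matrix diagonal divided by the total number of samples. The sketch below shows the computation on an invented two-class matrix; the numbers are illustrative only and are not taken from the study.

```python
from typing import Sequence


def overall_accuracy(confusion: Sequence[Sequence[int]]) -> float:
    """Return the diagonal sum divided by the total count.

    Rows are reference classes and columns are predicted classes.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Invented counts for a two-class (cropland / non-crop) validation set.
confusion = [
    [50, 10],  # reference cropland: 50 correct, 10 missed
    [5, 35],   # reference non-crop: 5 false alarms, 35 correct
]
oa = overall_accuracy(confusion)
```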
doi:10.3390/rs9030259 fatcat:7pgrnwnejbhb5kvjs3gjzxj7y4