Estimating total leaf chlorophyll content of Gannan navel orange leaves using hyperspectral data based on partial least squares regression

Zhongzheng Peng, Lixin Guan, Yubo Liao, Suyun Lian
2019 IEEE Access  
The goal of this study was to model the total leaf chlorophyll content (LCC_tot) of Gannan navel orange leaves using a field imaging spectroscopy system in the visible and near-infrared domain. The spectral range from 400 to 1000 nm was sampled with either 176 wavebands (a wavelength interval of 3.41 nm) or 360 wavebands (a wavelength interval of 1.67 nm), labeled "Datasets_3.41" and "Datasets_1.67", respectively. Although different spectral data types were used, the better LCC_tot prediction results were obtained with Datasets_1.67. Several prediction models of LCC_tot were built based on partial least squares regression (PLSR), artificial neural networks (ANN), ordinary least squares regression (OLSR), and stepwise linear regression (SLR), using full-spectrum and effective wavelength (EW) data in three forms: raw spectral (RS), first derivative spectral (FDS), and second derivative spectral (SDS) data. The coefficient of determination (R²), the root mean square error (RMSE), and the residual predictive deviation (RPD) were used to evaluate the reliability and accuracy of the predicted LCC_tot values. In total, 14 wavebands (7 from Datasets_1.67, 7 from Datasets_3.41), 39 wavebands (21 from Datasets_1.67, 18 from Datasets_3.41), and 50 wavebands (27 from Datasets_1.67, 23 from Datasets_3.41) were selected from the RS, FDS, and SDS data, respectively, as the EWs for LCC_tot prediction of navel orange leaves. PLSR and ANN predictive models were then established using the full spectra, and OLSR and SLR predictive models were built using the selected EWs. The experimental results ranked the useful regression methods for estimating LCC_tot in the following order: PLSR models established using full spectra from RS data (F-RS-PLSR) > PLSR models established using full spectra from SDS data (F-SDS-PLSR) > PLSR models established using full spectra from FDS data (F-FDS-PLSR) > SLR models established using EWs from RS data (EWs-RS-SLR). However, the models built with ANN and OLSR had RPD values below 3 and were therefore considered inaccurate.
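The three evaluation statistics named above can be illustrated with a minimal sketch. It assumes the common definitions: R² as 1 minus the ratio of residual to total sum of squares, RMSE as the root of the mean squared residual, and RPD as the sample standard deviation of the observed values divided by the RMSE. The abstract does not state which exact formulas the authors used, and the function name `regression_metrics` and the sample values are hypothetical, not data from the study.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, RPD) for paired observed and predicted values.

    Assumed definitions (not confirmed by the abstract):
      R2   = 1 - SS_res / SS_tot
      RMSE = sqrt(SS_res / n)
      RPD  = sample standard deviation of observed / RMSE
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(observed)
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # A perfect fit has RMSE = 0, so RPD is unbounded in that case.
    rpd = math.inf if rmse == 0 else statistics.stdev(observed) / rmse
    return r2, rmse, rpd


# Hypothetical chlorophyll values in mg/g, made up for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

r2, rmse, rpd = regression_metrics(observed, predicted)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.3f} mg/g, RPD = {rpd:.2f}")
```

Under the criterion used in the abstract, a model with an RPD below 3 would be treated as inaccurate, so the RPD is the statistic that separates the PLSR and SLR models from the ANN and OLSR ones.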
Finally, the F-RS-PLSR model exhibited the best performance for LCC_tot estimation. With 5 principal components (PCs), this model provided high R² values for calibration (C-R² = 0.92) and validation (V-R² = 0.96), small RMSE values for calibration (C-RMSE = 0.05 mg/g) and validation (V-RMSE = 0.19 mg/g), and sufficiently high RPD values for calibration (C-RPD = 17.00) and validation (V-RPD = 3.63). Overall, PLSR was the best modeling method, which demonstrates its applicability for assessing chlorophyll content in navel orange leaves.
INDEX TERMS Chlorophyll, hyperspectral data, navel oranges, partial least squares.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/ VOLUME 7, 2019
doi:10.1109/access.2019.2949866 fatcat:ggktqxy2vbaofaj2n22pypyqhm