Hyperspectral remote sensing of plant biochemistry using Bayesian model averaging with variable and band selection

Kaiguang Zhao, Denis Valle, Sorin Popescu, Xuesong Zhang, Bani Mallick
2013 Remote Sensing of Environment  
Keywords: Nitrogen, Carbon, Band selection, Bayesian model averaging, MCMC, Model misspecification, Model selection, Model uncertainty

Model specification remains challenging in spectroscopy of plant biochemistry, as exemplified by the availability of various spectral indices or band combinations for estimating the same biochemical. This lack of consensus in model choice across applications argues for a paradigm shift in hyperspectral methods to address model uncertainty and misspecification. We demonstrated one such method using Bayesian model averaging (BMA), which performs variable/band selection and quantifies the relative merits of many candidate models to synthesize a weighted average model with improved predictive performance. The utility of BMA was examined using a portfolio of 27 foliage spectral-chemical datasets representing over 80 species across the globe to estimate multiple biochemical properties, including nitrogen, hydrogen, carbon, cellulose, lignin, chlorophyll (a or b), carotenoid, polar and nonpolar extractives, leaf mass per area, and equivalent water thickness. We also compared BMA with partial least squares (PLS) and stepwise multiple regression (SMR). Results showed that all the biochemicals except carotenoid were accurately estimated from hyperspectral data, with R² values > 0.80. Compared to PLS and SMR, BMA substantially reduced overfitting and enhanced model generalization; BMA also yielded error estimates more indicative of the true uncertainties in predictions, when evaluated using a statistic called "prediction interval coverage probability". The relative band importance, which was quantified by band selection probability, differed markedly between BMA and SMR, cautioning against the use of SMR for band selection. Computationally, model calibration with datasets of moderate sizes (>100) was faster for BMA via a hybrid reversible-jump Markov chain Monte Carlo (MCMC) sampler than for PLS via literal optimization of a cross-validation criterion. Our BMA scheme also provides a generic hierarchical Bayesian framework to assimilate prior knowledge of diverse forms, as illustrated by its use to account for nonlinearity in spectral-chemical relationships. We emphasize that BMA is a competitive, paradigm-shifting alternative to conventional statistical methods and that it will find wide use as the virtue of Bayesian inference is increasingly appreciated by the remote sensing community.
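The three quantities named in the abstract (model weights, band selection probability, and prediction interval coverage probability) can be illustrated with a minimal Python sketch. This is not the authors' sampler: instead of the hybrid reversible-jump MCMC scheme, it assumes a small number of bands, enumerates every band subset up to a fixed size, and approximates posterior model weights with BIC. All function names and the `max_bands` parameter are hypothetical and chosen for illustration only.

```python
import itertools
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients and residual sum of squares."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return beta, rss


def bma_enumerate(X, y, max_bands=2):
    """Fit every band subset of size 1..max_bands and weight the models by BIC.

    Assumption: exp(-BIC/2) stands in for the marginal likelihood, with a
    uniform prior over the enumerated models.
    """
    n, p = X.shape
    models = []
    for k in range(1, max_bands + 1):
        for subset in itertools.combinations(range(p), k):
            cols = list(subset)
            beta, rss = fit_ols(X[:, cols], y)
            bic = n * np.log(rss / n) + (k + 1) * np.log(n)
            models.append((cols, beta, bic))
    bics = np.array([m[2] for m in models])
    weights = np.exp(-0.5 * (bics - bics.min()))  # subtract the minimum for numerical stability
    weights /= weights.sum()
    return models, weights


def bma_predict(models, weights, X_new):
    """Weighted average of the candidate models' predictions."""
    pred = np.zeros(X_new.shape[0])
    for (cols, beta, _), w in zip(models, weights):
        A = np.column_stack([np.ones(X_new.shape[0]), X_new[:, cols]])
        pred += w * (A @ beta)
    return pred


def band_selection_probability(models, weights, n_bands):
    """Sum of the weights of all models that include each band."""
    prob = np.zeros(n_bands)
    for (cols, _, _), w in zip(models, weights):
        prob[cols] += w
    return prob


def picp(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction intervals."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


# Synthetic example: 6 "bands", of which only bands 1 and 4 drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(scale=0.1, size=80)

models, weights = bma_enumerate(X, y, max_bands=2)
print(band_selection_probability(models, weights, n_bands=6))
print(bma_predict(models, weights, X[:3]))
```

Exhaustive enumeration is only feasible for a handful of bands; with hundreds of hyperspectral bands the model space is far too large, which is why a stochastic sampler such as the one described in the abstract is used in practice.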
doi:10.1016/j.rse.2012.12.026 fatcat:nfib3obnuzdevitawjnman4oti