Investigation of the random forest framework for classification of hyperspectral data

J. Ham, Yangchi Chen, M.M. Crawford, J. Ghosh
2005 IEEE Transactions on Geoscience and Remote Sensing  
Statistical classification of hyperspectral data is challenging because the inputs are high in dimension and represent multiple classes that are sometimes quite mixed, while the amount and quality of ground truth in the form of labeled data is typically limited. The resulting classifiers are often unstable and generalize poorly. This paper investigates two approaches based on the concept of random forests of classifiers implemented within a binary hierarchical multiclassifier system, with the goal of achieving improved generalization of the classifier in analysis of hyperspectral data, particularly when the quantity of training data is limited. A new classifier is proposed that incorporates bagging of training samples and adaptive random subspace feature selection within a binary hierarchical classifier (BHC), such that the number of features selected at each node of the tree depends on the quantity of associated training data. Results are compared to a random forest implementation based on the framework of classification and regression trees. For both methods, classification results obtained from experiments on data acquired by the National Aeronautics and Space Administration (NASA) Airborne Visible/Infrared Imaging Spectrometer instrument over the Kennedy Space Center, Florida, and by Hyperion on the NASA Earth Observing 1 satellite over the Okavango Delta of Botswana are superior to those from the original best basis BHC algorithm and a random subspace extension of the BHC.
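
The combination of bagging and adaptive random subspace selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the rule that ties the number of features to the amount of training data, so the `samples_per_feature` ratio below is a hypothetical stand-in, and all function names are illustrative rather than taken from the paper.

```python
import random


def bootstrap_sample(n_samples, rng):
    """Bagging: draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def adaptive_subspace_size(n_train, n_features, samples_per_feature=5):
    """Assumed rule (not from the paper): allow one feature per
    `samples_per_feature` training samples, clamped to [1, n_features]."""
    return max(1, min(n_features, n_train // samples_per_feature))


def random_subspace(n_features, size, rng):
    """Random subspace: pick `size` distinct feature indices."""
    return sorted(rng.sample(range(n_features), size))


def draw_node_inputs(n_train, n_features, rng, samples_per_feature=5):
    """Select the rows and feature columns one tree node would train on."""
    rows = bootstrap_sample(n_train, rng)
    size = adaptive_subspace_size(n_train, n_features, samples_per_feature)
    cols = random_subspace(n_features, size, rng)
    return rows, cols


if __name__ == "__main__":
    rng = random.Random(0)
    # A node with few labeled samples gets a small feature subset.
    rows, cols = draw_node_inputs(n_train=40, n_features=100, rng=rng)
    print(len(rows), len(cols))
```

Under this assumed rule, a node with 40 training samples and 100 candidate bands trains on 8 features, while a node with ample data would use all of them. The intent is only to show how the subspace size can shrink as the labeled data at a node becomes scarce.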
doi:10.1109/tgrs.2004.842481 fatcat:t6gxpls2srfxfmmhzm5yuv7lda