Feature-free Explainable Data Mining in SAR Images Using Latent Dirichlet Allocation

Chandrabali Karmakar, Corneliu Octavian Dumitru, Gottfried Schwarz, Mihai Datcu
2020 IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing  
In this article, we propose a promising approach for the application-oriented content classification of spaceborne radar imagery that presents an interesting alternative to popular current machine learning algorithms. In the following, we consider the problem of unsupervised feature-free satellite image classification with already known classes as an explainable data mining problem for regions with no prior information. Three important issues are addressed here: explainability, feature ... nce, and unsupervision. There is an increasing demand toward explainable machine learning models as they strive to meet the "right to explanation." The importance of feature-free classification stems from the problem that different classification outcomes are obtained from using different features and the complexity of computing sophisticated image primitive features. Developing unsupervised discovery techniques helps overcome the limitations in object discovery due to the lack of labeled data and the dependence on features. In this article, we demonstrate the applicability of a latent Dirichlet allocation (LDA) model, one of the most established unsupervised probabilistic methods, in discovering the latent structure of synthetic aperture radar data. The idea is to use LDA as an explainable data mining tool to discover scientifically explainable semantic relations. The suitability of the approach as an explainable model is discussed and interpretable topic representation maps are produced, which practically demonstrate the idea of "interpretability" in an explainable machine learning paradigm. LDA discovers the latent structures in the data as a set of topics. We create the interpretable visualizations of the data utilizing these topics and compute the topic distributions for each land-cover class. Our results show that each class has a distinct topic distribution that represents that particular class. Then these classes can be grouped based on their similarity of topic composition. Both the topic composition and grouping are explainable by domain experts.
Index Terms: Bag of words technique, discovery, explainable machine learning, interpretability, latent Dirichlet allocation (LDA), synthetic aperture radar (SAR), unsupervised image classification.
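The pipeline outlined in the abstract (bag of words, LDA topics, one topic distribution per land-cover class, grouping classes by similar topic composition) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the words are quantized pixel amplitudes, inference uses a small collapsed Gibbs sampler, class similarity is measured with the Hellinger distance, and the patches are synthetic stand-ins for SAR data. All function names and parameter values are hypothetical.

```python
import math
import random

def patch_to_words(patch, levels=8, max_value=256):
    """Bag of words for one patch: quantize each pixel value in
    [0, max_value) into one of `levels` bins (assumed word definition)."""
    return [v * levels // max_value for row in patch for v in row]

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=100, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not the authors' code).
    Returns (theta, phi): per-document topic and per-topic word distributions."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    for d, doc in enumerate(docs):          # random initial topic assignments
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]            # remove the current assignment
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha) * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z            # record the resampled topic
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
             for row, doc in zip(doc_topic, docs)]
    phi = [[(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
           for k in range(n_topics)]
    return theta, phi

def class_topic_distributions(theta, labels):
    """Average the topic distributions of all patches sharing a class label."""
    out = {}
    for label in set(labels):
        rows = [t for t, lab in zip(theta, labels) if lab == label]
        out[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return out

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

# Synthetic stand-in data: dark patches and bright patches (hypothetical classes).
data_rng = random.Random(1)

def synthetic_patch(lo, hi, size=6):
    return [[data_rng.randrange(lo, hi) for _ in range(size)] for _ in range(size)]

patches = [synthetic_patch(0, 64) for _ in range(10)] + \
          [synthetic_patch(160, 256) for _ in range(10)]
labels = ["dark"] * 10 + ["bright"] * 10

docs = [patch_to_words(p) for p in patches]
theta, phi = lda_gibbs(docs, n_topics=2, vocab_size=8)
class_dist = class_topic_distributions(theta, labels)
distance = hellinger(class_dist["dark"], class_dist["bright"])
print(class_dist)
print(distance)
```

In this sketch, `phi` plays the role of the interpretable part of the model: each topic is a distribution over quantized amplitude levels that a domain expert can inspect. Classes whose averaged topic distributions lie close together under the chosen distance would be candidates for grouping.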
doi:10.1109/jstars.2020.3039012 fatcat:eynukzau7jgebnxxquh35c4axa