RFEX: Simple Random Forest Model and Sample Explainer for non-Machine Learning experts [article]

Dragutin Petkovic, Ali Alavi, DanDan Cai, Jizhou Yang, Sabiha Barlaskar
2019 bioRxiv pre-print
Machine Learning (ML) is becoming an increasingly critical technology in many areas. However, its complexity and frequent non-transparency create significant challenges, especially in the biomedical and health areas. One of the critical components in addressing these challenges is the explainability or transparency of ML systems, which comprises model explainability (related to the whole data) and sample explainability (related to specific samples). Our research focuses on both model and sample explainability of Random Forest (RF) classifiers. Our RF explainer, RFEX, is designed from the ground up with non-ML experts in mind, emphasizing simplicity and familiarity, e.g. by providing a one-page tabular output and measures familiar to most users. In this paper we present significant improvements to the RFEX Model explainer compared to the previously published version; a new RFEX Sample explainer that explains how the RF classifies a particular data sample and is designed to relate directly to the RFEX Model explainer; and an RFEX Model and Sample explainer case study from our collaboration with the J. Craig Venter Institute (JCVI). We show that our approach offers a simple yet powerful means of explaining RF classification at the model and sample levels, and in some cases even points to areas of new investigation. RFEX is easy to implement using available RF tools, and its tabular format offers easy-to-understand representations for non-experts, enabling them to better leverage RF technology. Keywords: Machine Learning; Random Forest; explainability; user-in-the-loop
doi:10.1101/819078 fatcat:lcg5lvq4fna2bmu6zck6by2adq