Deep learning generates custom-made logistic regression models for explaining how breast cancer subtypes are classified [article]

Takuma Shibahara, Chisa Wada, Yasuho Yamashita, Kazuhiro Fujita, Masamichi Sato, Atsushi Okamoto, Yoshimasa Ono
2021 bioRxiv pre-print
Breast cancer is the most frequently diagnosed cancer in women and the one most often subjected to genetic analysis. Nonetheless, it still causes the largest number of cancer-related deaths among women. PAM50, the intrinsic subtype assay for breast cancer, is beneficial for diagnosis and stratified treatment but does not explain each subtype's mechanism. Deep learning can now predict the subtypes from genetic information more accurately than conventional statistical methods. However, previous studies did not directly use deep learning to examine which genes are associated with the subtypes. Ours is the first study to use a deep-learning approach to reveal the mechanisms embedded in the PAM50-classified subtypes. We developed an explainable deep learning model called a point-wise linear model, which uses a meta-learning approach to generate a custom-made logistic regression model for each sample. Logistic regression is familiar to physicians and medical informatics researchers, and we can use it to analyze which genes are important for subtype prediction. For each subtype, the custom-made logistic regression models generated by the point-wise linear model relied on genes largely different from those selected by the conventional logistic regression model: the overlap ratio is less than twenty percent. Analyzing the point-wise linear model's inner state, we also found that it used genes relevant to cell cycle-related pathways. The results of this study suggest that our explainable deep learning approach has the potential to play a vital role in cancer treatment.
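The idea of generating a custom-made logistic regression model for each sample can be illustrated with a minimal sketch. The abstract does not describe the actual architecture, so everything below is an assumption made for illustration: a one-hidden-layer network with hypothetical parameters `W1`, `b1`, `W2`, `b2`, random untrained weights, synthetic input standing in for gene expression, and a single binary (one-vs-rest) subtype output. The network maps each sample `x` to its own weight vector `w(x)`, and the prediction is an ordinary logistic regression using those weights.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping logits to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def pointwise_linear_predict(x, W1, b1, W2, b2):
    """Return (probabilities, per-sample weights) for a batch x of shape (n, d).

    Hypothetical architecture: a small network emits one weight vector per
    sample, which is then used as that sample's logistic regression weights.
    """
    h = np.tanh(x @ W1 + b1)       # hidden representation, shape (n, k)
    w = h @ W2 + b2                # sample-specific weights, shape (n, d)
    logit = np.sum(w * x, axis=1)  # per-sample linear model w(x) . x
    return sigmoid(logit), w


# Synthetic stand-in data: n samples, d "genes", k hidden units (all assumed).
rng = np.random.default_rng(0)
n, d, k = 4, 6, 8
x = rng.normal(size=(n, d))
W1 = rng.normal(scale=0.1, size=(d, k))
b1 = np.zeros(k)
W2 = rng.normal(scale=0.1, size=(k, d))
b2 = np.zeros(d)

prob, w = pointwise_linear_predict(x, W1, b1, W2, b2)

# Each entry is one gene's contribution to one sample's logit, which is
# what makes the per-sample model readable like a logistic regression.
contrib = w * x
```

Because each row of `contrib` sums to that sample's logit, ranking its entries by magnitude gives a per-sample view of which input features drive the prediction, in the same way coefficients are read in a conventional logistic regression.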
doi:10.1101/2021.05.10.443518 fatcat:cdkyetmyqvgpzbaqdu5ftrsula