Deep learning-based pan-cancer classification model reveals cancer-specific gene expression signatures [article]

Mayur Divate, Aayush Tyagi, Derek J. Richard, Prathosh A Prasad, Harsha Gowda, Shivashankar H Nagaraj
2021 bioRxiv pre-print
The identification of cancer-specific biomarkers and therapeutic targets is one of the primary goals of cancer genomics. Thousands of cancer genomes, exomes, and transcriptomes have been sequenced to date. In this study, we conducted a pan-cancer analysis of transcriptome datasets from 37 cancer types provided by The Cancer Genome Atlas (TCGA) in an effort to identify cancer-specific gene expression signatures. We employed deep neural networks to train a model on the transcriptome profile
... ts for all cancer types. The model was validated, and its predictive accuracy was determined using an independent dataset, achieving >97% prediction accuracy across cancer types. This strongly suggests that there are distinct gene expression signatures associated with various cancer types. We interpreted the model using SHapley Additive exPlanations (SHAP) to identify specific gene signatures that significantly contributed to the classification of cancer types. In addition to known biomarkers, we identified several novel biomarkers in different cancer types. These cancer-specific gene signatures are valuable candidates for future studies of their potential utility as cancer biomarkers and putative therapeutic targets.
doi:10.1101/2021.03.15.435283 fatcat:stasu6xmm5hv3m55mttrtjpwcy
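
The abstract's interpretation step rests on Shapley values, which split a model's output for one sample into per-feature (here, per-gene) contributions. The sketch below illustrates the idea only: it computes exact Shapley values by enumerating gene subsets for a tiny hand-written scoring function, whereas the SHAP library approximates these values for real networks. The gene names, the `toy_score` function, and the sample and baseline values are all hypothetical and are not taken from the paper or its model.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, sample, baseline):
    """Exact Shapley values of predict(sample) relative to predict(baseline).

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only feasible for a handful of features.
    """
    n = len(sample)

    def value(coalition):
        x = [sample[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical stand-in for one class score of a trained classifier.
GENES = ["GENE_A", "GENE_B", "GENE_C"]


def toy_score(x):
    # Linear terms plus one interaction between GENE_A and GENE_C.
    return 2.0 * x[0] + 0.5 * x[1] + 1.0 * x[0] * x[2]


sample = [3.0, 1.0, 2.0]    # assumed expression values for one tumor sample
baseline = [1.0, 1.0, 1.0]  # assumed reference (background) expression

phi = shapley_values(toy_score, sample, baseline)

# Rank genes by absolute contribution, as one would to shortlist signature genes.
ranked = sorted(zip(GENES, phi), key=lambda kv: abs(kv[1]), reverse=True)
for gene, contribution in ranked:
    print(f"{gene}: {contribution:+.3f}")
```

In this toy case the contributions sum to the difference between the score at the sample and the score at the baseline, and GENE_B receives no credit because its value equals its baseline. Ranking genes by the magnitude of their contribution per class is the general pattern by which such attributions can suggest candidate signature genes.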