ELMO: An Efficient Logistic Regression-based Multi-Omic Integrated Analysis Method for Breast Cancer Intrinsic Subtypes

Yexian Zhang, Ruoyao Shi, Chaorong Chen, Meiyu Duan, Shuai Liu, Yanjiao Ren, Lan Huang, Xiaofeng Dai, Fengfeng Zhou
2019 IEEE Access  
Breast cancer is one of the most frequently occurring cancers in women and a major cause of death among women worldwide. It is heterogeneous in both molecular characteristics and clinical outcomes across its molecular subtypes. High-throughput technologies have facilitated the rapid accumulation of multi-Omic data for cancer patients. These data sources pose a computational challenge for efficient integrated multi-Omic analysis. The existing studies investigated differential representation or machine learning problems using a single type of Omic data. This study hypothesized that different Omic types contribute complementary information to each other, and that their integrated analysis may improve on single-Omic models. An efficient logistic regression-based multi-Omic integrated analysis method (ELMO) was proposed to integrate RNA-seq and DNA methylation data to detect the breast cancer intrinsic subtypes. ELMO achieved the highest accuracy with a smaller number of features compared with the existing filter and wrapper feature selection methods in this study. The experimental data supported our hypothesis that multi-Omic models outperform the single-Omic ones.
INDEX TERMS Breast cancer, intrinsic subtypes, multi-omics, feature selection.
doi:10.1109/access.2019.2960373 fatcat:mwpzs2g7wjfzvc4j4r732du4o4