Multi-platform Genomic Data Analysis Using Multimodal Autoencoder

2017 DEStech Transactions on Computer Science and Engineering  
Analyzing genomic expression profiles is an important way to identify cancer subtypes, which can reveal insights into cancer pathogenesis and thus improve the prediction of patients' survival time. Many approaches have been developed to analyze the genomic data of different cancer subtypes; most of them, however, can only analyze genomic data from a single platform, e.g., gene expression, miRNA expression, or DNA methylation. In this paper, we propose an unsupervised deep learning method based on the multimodal autoencoder (MMAE) that is capable of analyzing cross-platform genomic data for cancer subtype identification. The method starts by applying entropy information to the ultra-high-dimensional genomic data to select genomic variables with good classification capability. The MMAE networks are then introduced to extract and fuse the features from the different genomic platforms, so that they capture both the intra-structure of the genomic data from a single platform and the inter-correlations among different platforms. Finally, K-means clustering is performed to group the patients into subtypes. Experiments on the ovarian (OV) cancer patient dataset show that the proposed method effectively extracts the latent features of the genomic data and successfully clusters the patients into different subtypes with distinct survival characteristics.
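The three stages named in the abstract (entropy-based variable selection, feature fusion with a multimodal autoencoder, and K-means clustering) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the selection rule, the network architecture, or the training procedure, so the histogram-based Shannon entropy, the "keep the k highest-entropy variables" rule, the layer sizes, the plain gradient-descent training, and the synthetic two-platform data (`expr`, `meth`) are all assumptions made for illustration.

```python
import numpy as np

def shannon_entropy(column, bins=10):
    # Shannon entropy (in bits) of one variable after histogram binning.
    counts, _ = np.histogram(column, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def select_by_entropy(X, k, bins=10):
    # ASSUMPTION: keep the k highest-entropy columns; the paper's rule may differ.
    scores = np.array([shannon_entropy(X[:, j], bins) for j in range(X.shape[1])])
    keep = np.sort(np.argsort(scores)[-k:])
    return X[:, keep], keep

def standardize(X):
    # Zero mean, unit variance per column.
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def train_mmae(Xa, Xb, hidden=4, shared=2, lr=0.01, epochs=500, seed=0):
    # ASSUMED architecture: one tanh encoder per platform, one shared tanh
    # layer that fuses them, and one linear decoder per platform.
    rng = np.random.default_rng(seed)
    n = Xa.shape[0]
    Wa = 0.3 * rng.standard_normal((Xa.shape[1], hidden))
    Wb = 0.3 * rng.standard_normal((Xb.shape[1], hidden))
    Ws = 0.3 * rng.standard_normal((2 * hidden, shared))
    Va = 0.3 * rng.standard_normal((shared, Xa.shape[1]))
    Vb = 0.3 * rng.standard_normal((shared, Xb.shape[1]))
    losses = []
    for _ in range(epochs):
        H = np.tanh(np.hstack([Xa @ Wa, Xb @ Wb]))  # per-platform features
        Z = np.tanh(H @ Ws)                         # fused representation
        Ea, Eb = Z @ Va - Xa, Z @ Vb - Xb           # reconstruction errors
        losses.append(0.5 * ((Ea ** 2).sum() + (Eb ** 2).sum()) / n)
        # Backpropagation of the mean squared reconstruction loss.
        dZ = (Ea @ Va.T + Eb @ Vb.T) / n * (1 - Z ** 2)
        dH = (dZ @ Ws.T) * (1 - H ** 2)
        grads = [
            (Va, Z.T @ Ea / n), (Vb, Z.T @ Eb / n), (Ws, H.T @ dZ),
            (Wa, Xa.T @ dH[:, :hidden]), (Wb, Xb.T @ dH[:, hidden:]),
        ]
        for W, g in grads:
            W -= lr * g                             # in-place gradient step
    H = np.tanh(np.hstack([Xa @ Wa, Xb @ Wb]))
    return np.tanh(H @ Ws), losses

def kmeans(X, k, iters=100, seed=0):
    # Plain Lloyd's algorithm with random data points as initial centers.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Synthetic stand-in for two platforms measured on the same 60 patients.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 30)                      # hypothetical subtypes
expr = rng.standard_normal((60, 50))
expr[:, :10] += 3.0 * groups[:, None]
meth = rng.standard_normal((60, 40))
meth[:, :10] -= 3.0 * groups[:, None]

Xa, keep_a = select_by_entropy(expr, k=8)
Xb, keep_b = select_by_entropy(meth, k=8)
Z, losses = train_mmae(standardize(Xa), standardize(Xb))
labels, centers = kmeans(Z, k=2)
```

Because the shared layer `Z` is computed from the concatenated per-platform features and must reconstruct both platforms, it is pushed to encode structure common to them, which is the role the abstract assigns to the MMAE. On real data the cluster labels would then be compared through survival analysis, which this sketch does not attempt.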
doi:10.12783/dtcse/cib2015/16153 fatcat:tobzn25rozalpdntu4nlwikqyu