Deep Principal Correlated Auto-encoders With Application to Imaging and Genomics Data Integration

Gang Li, Chao Wang, De-Peng Han, Yi-Pu Zhang, Peng Peng, Vince D. Calhoun, Yu-Ping Wang
2020 IEEE Access  
For complex diseases such as schizophrenia, a growing number of studies treat genetic variants and brain imaging phenotypes as important factors. In this paper, an optimization model is developed to overcome the weaknesses of deep canonical correlation analysis (DCCA). The model consists of principal component analysis (PCA) for learning multi-modality linear features and multilayer belief networks for learning multi-modality nonlinear features. To obtain better correlation analysis and classification results, the output nodes of the multilayer belief network are used for back propagation (BP) network training. Previous work on canonical correlation analysis (CCA) has proposed several models based on deep neural networks or regularization, typically involving either some form of norm or auto-encoders with a reconstruction objective. Many existing advanced models have been developed to find the maximal correlation in multi-modality data. However, such data tend to have more feature dimensions than samples. Unlike these models, our proposed model is applied to a real multi-modality data set and compared experimentally with several previous models on fMRI imaging and SNP genomics data. The results show that our model, deep principal correlated auto-encoders (DPCAE), learns features with higher correlation and better classification performance than the previous models. The classification accuracy of DPCAE on the datasets exceeds 90%, compared with about 65% for the CCA-based model and about 80% for the DNN-based model. In the clustering performance evaluation, DPCAE further confirmed its superior classification performance, with an average normalized mutual information index of 93.75% and an average classification error rate index of 3.8%. In maximal correlation analysis, the model outperforms other advanced models with a maximal correlation of 0.926, showing excellent performance in high-dimensional data analysis.
INDEX TERMS Classification, data fusion, dimensionality reduction, belief network, optimization algorithm, principal component analysis.
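The abstract notes that multi-modality data often have more feature dimensions than samples. A minimal NumPy sketch of why this matters for correlation analysis follows. It is not the authors' DPCAE: it swaps in plain linear PCA followed by classical CCA on synthetic data, and every name and size in it (pca_reduce, canonical_correlations, n, p_img, p_snp, k) is a hypothetical choice made for illustration.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X (samples) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U S Vt, so the PCA scores are U[:, :k] * S[:k].
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]

def canonical_correlations(A, B):
    """Classical (linear) canonical correlations between two views."""
    Qa, _ = np.linalg.qr(A - A.mean(axis=0))
    Qb, _ = np.linalg.qr(B - B.mean(axis=0))
    # The singular values of Qa^T Qb are the canonical correlations.
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic stand-ins for the two modalities (assumed sizes, not the paper's data):
# n samples, with far more features than samples in each view.
rng = np.random.default_rng(0)
n, p_img, p_snp, k = 100, 500, 800, 5
z = rng.standard_normal((n, 2))  # shared latent signal
X = z @ rng.standard_normal((2, p_img)) + 0.1 * rng.standard_normal((n, p_img))
Y = z @ rng.standard_normal((2, p_snp)) + 0.1 * rng.standard_normal((n, p_snp))

# With more features than samples, plain CCA is degenerate:
# every canonical correlation comes out at (numerically) 1.
raw = canonical_correlations(X, Y)

# After PCA to k components per view, only the shared directions stay highly correlated.
reduced = canonical_correlations(pca_reduce(X, k), pca_reduce(Y, k))

print("raw CCA, smallest correlation:", raw.min())
print("PCA + CCA correlations:", np.round(reduced, 3))
```

In this sketch the unreduced correlations are uninformative because each view can fit any direction in sample space, which is one motivation for a dimensionality-reduction stage before correlation analysis. The nonlinear belief-network and BP stages described in the abstract are not represented here.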
doi:10.1109/access.2020.2968634 fatcat:pcvqm2edkbhkfmz6qhbnbxwkai