Similarity Measures between Proteomic and Transcriptomic Data as a Tool to Highlight Phenotypical Differences in 33 Glioma Stem Cell Lines
MOJ Proteomics & Bioinformatics
With advances in high-throughput genome/transcriptome sequencing technologies and mass spectrometry (MS)-based proteomics, thousands of gene-protein pairs can be matched and merged in a single experiment. It is of interest to perform a correlative analysis of gene and protein expression data and to investigate the nature of their similarity or dissimilarity, as these could harbour potential biomarkers or drug targets. Manual determination of data points of interest quickly becomes a very complex and laborious process. Thus, there is a high demand for automated 'omics' data integration tools that can not only routinely match and combine gene and protein expression values but also provide a measure that highlights meaningful biological insights. In this work, we applied a fast and easy approach to integrate large proteomic and transcriptomic data sets derived from the deep analysis of glioma cancer stem cells (GSCs). The proposed algorithm provides a mathematical distance between two data sets and assigns a direction to their interrelation based on the abundances. We distinguished three types of data correlation: concordant, anticoncordant where protein abundance was higher than that of the corresponding RNA, and anticoncordant where protein abundance was lower. We investigated the nature of the observed discordances and were able to separate different, phenotypically divergent classes of GSC lines.
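The three-way classification described above can be illustrated with a minimal sketch. The abstract does not specify the distance measure, so everything here is an assumption made for illustration: abundances are z-scored within each data set, the signed difference between the protein and RNA z-scores serves as the distance, and a hypothetical cutoff `threshold` separates concordant from anticoncordant pairs. The function names and label strings are likewise invented for this example.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardise a list of abundances to zero mean and unit sample SD."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def classify_pairs(genes, protein, rna, threshold=1.0):
    """Return {gene: (signed_distance, label)} for matched gene-protein pairs.

    Assumption: the distance is the difference of z-scores (protein - RNA);
    the method in the paper may use a different measure.
    """
    z_protein = zscores(protein)
    z_rna = zscores(rna)
    result = {}
    for gene, zp, zr in zip(genes, z_protein, z_rna):
        distance = zp - zr
        if abs(distance) <= threshold:
            label = "concordant"
        elif distance > 0:
            label = "anticoncordant_protein_higher"
        else:
            label = "anticoncordant_protein_lower"
        result[gene] = (distance, label)
    return result


# Toy values only; these are not data from the study.
example = classify_pairs(
    genes=["A", "B", "C", "D"],
    protein=[1.0, 2.0, 3.0, 10.0],
    rna=[1.0, 2.0, 10.0, 3.0],
)
for gene, (distance, label) in example.items():
    print(gene, round(distance, 3), label)
```

In this toy input, genes A and B have identical relative abundance at both levels and fall in the concordant class, while C and D land in the two anticoncordant classes with opposite signs. The sign of the distance carries the direction of the interrelation, and its magnitude carries the degree of discordance.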