Integrating multi-attribute similarity networks for robust representation of the protein space

Orhan Çamoğlu, Tolga Can, Ambuj K. Singh
2006 Computer applications in the biosciences : CABIOS  
Motivation: A global view of the protein space is essential for functional and evolutionary analysis of proteins. In order to achieve this, a similarity network can be built using pairwise relationships among proteins. However, existing similarity networks employ a single similarity measure and therefore their utility depends highly on the quality of the selected measure. A more robust representation of the protein space can be realized if multiple sources of information are used. Results: We propose a novel approach for analyzing multi-attribute similarity networks by combining random walks on graphs with Bayesian theory. A multi-attribute network is created by combining sequence and structure based similarity measures. For each attribute of the similarity network, one can compute a measure of affinity from a given protein to every other protein in the network using random walks. This process makes use of the implicit clustering information of the similarity network, and we show that it is superior to naive, local ranking methods. We then combine the computed affinities using a Bayesian framework. In particular, when we train a Bayesian model for automated classification of a novel protein, we achieve high classification accuracy and outperform single attribute networks. In addition, we demonstrate the effectiveness of our technique by comparison with a competing kernel-based information integration approach. Availability: Source code is available upon request from the primary
doi:10.1093/bioinformatics/btl130 pmid:16595556 fatcat:z2w24bmub5bshczaspxbtbglxy
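
Illustrative sketch (not the authors' code): the abstract describes two steps, a random-walk affinity computed per attribute network and a Bayesian combination of those affinities. The Python below is a minimal sketch of that idea under stated assumptions. The random walk is a standard random walk with restart; the restart probability, the per-class summation of affinities, and the naive Bayes style product are all assumptions chosen for illustration, since the abstract does not specify the model. The function names and the toy five-protein networks are hypothetical.

```python
def random_walk_affinity(weights, source, restart=0.15, iterations=200):
    """Affinity from `source` to every node via random walk with restart.

    `weights` is a square list-of-lists of non-negative similarities.
    Assumption: `restart` and `iterations` are illustrative defaults.
    """
    n = len(weights)
    # Row-normalize similarities into transition probabilities.
    transition = []
    for row in weights:
        total = sum(row)
        transition.append([w / total for w in row] if total > 0 else None)
    p = [0.0] * n
    p[source] = 1.0
    for _ in range(iterations):
        nxt = [0.0] * n
        for i in range(n):
            if transition[i] is None:
                # Isolated node: send its walking mass back to the source.
                nxt[source] += (1.0 - restart) * p[i]
                continue
            for j in range(n):
                nxt[j] += (1.0 - restart) * p[i] * transition[i][j]
        nxt[source] += restart  # restart step
        p = nxt
    return p


def class_scores(affinity, labels, query):
    """Sum affinities over labeled proteins of each class, then normalize."""
    scores = {}
    for node, label in enumerate(labels):
        if node == query or label is None:
            continue
        scores[label] = scores.get(label, 0.0) + affinity[node]
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()} if total > 0 else scores


def combine_naive_bayes(per_attribute_scores, priors, eps=1e-9):
    """Assumed combination rule: prior times product of per-attribute scores."""
    posterior = {}
    for c, prior in priors.items():
        value = prior
        for scores in per_attribute_scores:
            value *= scores.get(c, 0.0) + eps  # eps avoids zeroing a class
        posterior[c] = value
    total = sum(posterior.values())
    return {c: v / total for c, v in posterior.items()}


# Toy data (hypothetical): proteins 0,1 are class "A", 2,3 are class "B",
# protein 4 is the unlabeled query.
labels = ["A", "A", "B", "B", None]
sequence = [  # sequence similarity: query is equally close to both classes
    [0.0, 1.0, 0.0, 0.0, 0.5],
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
]
structure = [  # structure similarity: query is much closer to class "A"
    [0.0, 1.0, 0.0, 0.0, 0.9],
    [1.0, 0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0, 1.0, 0.1],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.9, 0.8, 0.1, 0.0, 0.0],
]
query = 4
seq_aff = random_walk_affinity(sequence, query)
str_aff = random_walk_affinity(structure, query)
posterior = combine_naive_bayes(
    [class_scores(seq_aff, labels, query), class_scores(str_aff, labels, query)],
    priors={"A": 0.5, "B": 0.5},
)
predicted = max(posterior, key=posterior.get)
```

In this toy setup the sequence network alone cannot separate the two classes, while the structure network can; multiplying the per-attribute scores lets the informative attribute decide, which is the intuition behind integrating several similarity measures.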