Spectral affinity in protein networks

Konstantin Voevodski, Shang-Hua Teng, Yu Xia
2009 BMC Systems Biology  
Background: Protein-protein interaction (PPI) networks enable us to better understand the functional organization of the proteome. We can learn a lot about a particular protein by querying its neighborhood in a PPI network to find proteins with similar function. A spectral approach that considers random walks between nodes of interest is particularly useful in evaluating closeness in PPI networks. Spectral measures of closeness are more robust to noise in the data and are more precise than simpler methods based on edge density and shortest path length.

Results: We develop a novel affinity measure for pairs of proteins in PPI networks, which uses personalized PageRank, a random-walk-based method used in context-sensitive search on the Web. Our measure of closeness, which we call PageRank Affinity, is proportional to the number of times the smaller-degree protein is visited in a random walk that restarts at the larger-degree protein. Because PageRank considers paths of all lengths in a network, PageRank Affinity is a precise measure that is robust to noise in the data. PageRank Affinity is also provably related to cluster co-membership, making it a meaningful measure. In our experiments on protein networks we find that our measure is better at predicting co-complex membership and finding functionally related proteins than other commonly used measures of closeness. Moreover, our experiments indicate that PageRank Affinity is very resilient to noise in the network. In addition, based on our method we build a tool that quickly finds the nodes closest to a queried protein in any protein network and easily scales to much larger biological networks.

Conclusion: We define a meaningful way to assess the closeness of two proteins in a PPI network, and show that our closeness measure is more biologically significant than other commonly used methods. We also develop a tool, accessible at http://xialab.bu.edu/resources/pnns, that allows the user to quickly find the nodes closest to a queried vertex in any protein network available from BioGRID or specified by the user.

Background
Networks are often used to represent a system where the nodes are a set of agents, and the edges are the relationships or interactions between those agents. We can then use the network topology to find out more about the nodes and the relationships between them. For example, we can find vertices central to the network, which is useful for biological [1,2] and social networks [3].
In addition, we can use the network topology to find communities, in the context of the Internet [4-6], and social and biological net-
doi:10.1186/1752-0509-3-112 pmid:19943959 pmcid:PMC2797010 fatcat:dmfgkdxnonecjfpv3nmj4ceecm
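
The PageRank Affinity idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an undirected network with no isolated nodes, uses plain power iteration for personalized PageRank, and picks a restart probability of 0.15 arbitrarily. The function names, the parameter `alpha`, and the toy network (two triangles joined by one edge) are hypothetical and chosen only for illustration.

```python
def personalized_pagerank(adj, source, alpha=0.15, iters=200):
    """Stationary visit probabilities of a random walk that jumps back to
    `source` with probability `alpha` at every step (power iteration).
    `adj` maps each node to a list of its neighbors (undirected, no isolated nodes)."""
    rank = {node: 0.0 for node in adj}
    rank[source] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        nxt[source] += alpha  # restart mass; total mass stays 1
        for node, mass in rank.items():
            # The remaining mass moves to a uniformly chosen neighbor.
            share = (1.0 - alpha) * mass / len(adj[node])
            for neighbor in adj[node]:
                nxt[neighbor] += share
        rank = nxt
    return rank


def pagerank_affinity(adj, u, v, alpha=0.15):
    """Visit probability of the smaller-degree node in a walk that restarts
    at the larger-degree node (assumed reading of the abstract's description)."""
    hi, lo = (u, v) if len(adj[u]) >= len(adj[v]) else (v, u)
    return personalized_pagerank(adj, hi, alpha)[lo]


# Hypothetical toy network: triangles A-B-C and D-E-F joined by the edge C-D.
toy_network = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}

same_cluster = pagerank_affinity(toy_network, "A", "B")
across_bridge = pagerank_affinity(toy_network, "A", "E")
print(f"A-B affinity: {same_cluster:.4f}")
print(f"A-E affinity: {across_bridge:.4f}")
```

In this toy network the pair inside one triangle scores higher than the pair separated by the bridge, which matches the intuition that the measure reflects cluster co-membership. Restarting at the larger-degree node makes the score independent of argument order.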