Modeling cellular machinery through biological network comparison

Roded Sharan, Trey Ideker
2006 Nature Biotechnology  
Molecular networks represent the backbone of molecular activity within the cell. Recent studies have taken a comparative approach toward interpreting these networks, contrasting networks of different species and molecular types, and under varying conditions. In this review, we survey the field of comparative biological network analysis and describe its applications to elucidate cellular machinery and to predict protein function and interaction. We highlight the open problems in the field as well as propose some initial mathematical formulations for addressing them. Many of the methodological and conceptual advances that were important for sequence comparison will likely also be important at the network level, including improved search algorithms, techniques for multiple alignment, evolutionary models for similarity scoring and better integration with public databases.

Data on molecular interactions are increasing exponentially. Just five years ago, no more than several hundred molecular interactions had been measured for any organism. Nowadays, spurred on by advances in technologies such as mass spectrometry 1,2, genome-wide chromatin immunoprecipitation 3,4, yeast two-hybrid assays 5-8, combinatorial reverse genetic screens 9 and rapid literature mining techniques 10,11, data on thousands of interactions in humans and most model species have become available. This flood of information parallels that seen for genome sequencing efforts in the recent past, and presents exciting new opportunities for understanding cellular biology and disease in the future.

Given this landscape, the challenge is to develop new strategies and theoretical frameworks to filter, interpret and organize interaction data into models of cellular function. As with biological sequence analysis, a comparative or evolutionary view provides a powerful base from which to address this challenge. However, although sequence comparison has long been a staple of biological research, the development of a similar toolbox for comparing biological networks is still in its infancy. Nonetheless, a number of recent advances have made it possible to begin to define this field in terms of the computational methodology it requires and the biological questions it may be able to answer.

Conceptually, network comparison is the process of contrasting two or more interaction networks, representing different species, conditions, interaction types or time points.
This process aims to answer a number of fundamental biological questions: which proteins, protein interactions and groups of interactions are likely to have equivalent functions across species? Based on these similarities, can we predict new functional information about proteins and interactions that are poorly characterized? What do these relationships tell us about the evolution of proteins, networks and whole species? A final question relates to noise. Given that systematic screens for protein interactions may report large numbers of false-positive measurements 12, which interactions represent true binding events? On the one hand, confidence measures on interactions can and should be taken into account before network comparison 13-17. On the other hand, because a false-positive interaction is unlikely to be reproduced across the interaction maps of multiple species, network comparison itself increases confidence in the set of molecular interactions found to be conserved.

Such questions have motivated three types, or modes, of comparative methods (Table 1). Network alignment is the process of globally comparing two networks, identifying regions of similarity and dissimilarity. Network alignment is commonly applied to detect subnetworks that are conserved across species and, hence, likely to represent true functional modules 18. Network integration is the process of combining several networks, encompassing interactions of different types over the same set of elements, to study their interrelations. Network integration can assist in predicting protein interactions 19 and uncovering protein modules that are supported by interactions of different types 20,21. The main conceptual difference from network alignment is that the integrated networks are defined on the same set of elements. The final mode of comparison is network querying, in which a given network is searched for subnetworks that are similar to a subnetwork query of interest 18.
This basic database search operation is aimed at transferring biological knowledge within and across species.

In this review, we survey the key analytical techniques that have served to define each mode of analysis along with the open problems they present. We then describe one possible road ahead, inspired by analogous developments in the history of sequence comparison.

Pairwise network alignment

In basic pairwise network alignment, homologous pairs of interactions, one from each of two molecular interaction networks, are identified. Studies by Matthews et al. 22 and Yu et al. 23 compared protein-protein interaction networks and regulatory networks across species, identifying pairs of interactions, called interologs and regulogs respectively,
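
The basic idea of pairing homologous interactions across two networks can be illustrated with a minimal Python sketch. It is not the procedure used in the cited studies: it assumes a precomputed homology mapping between the proteins of the two species, and the function name `find_interologs`, the input format and the toy protein names are all hypothetical choices made for illustration.

```python
from itertools import product


def find_interologs(edges_a, edges_b, homologs):
    """Pair each interaction in network A with homologous interactions in B.

    edges_a, edges_b: lists of 2-tuples of protein names (undirected edges).
    homologs: dict mapping a protein in A to a set of homologous proteins
              in B (assumed to be given, e.g. from sequence comparison).
    Returns a list of ((u, v), (hu, hv)) pairs, where (u, v) is an edge of A
    and (hu, hv) is an edge of B whose endpoints are homologs of u and v.
    """
    # Store B's edges as frozensets so that lookup ignores edge direction.
    net_b = {frozenset(edge) for edge in edges_b}
    found = []
    for u, v in edges_a:
        # Try every combination of a homolog of u with a homolog of v.
        candidates = product(sorted(homologs.get(u, ())),
                             sorted(homologs.get(v, ())))
        for hu, hv in candidates:
            if frozenset((hu, hv)) in net_b:
                found.append(((u, v), (hu, hv)))
    return found


# Toy data (hypothetical names): "y*" proteins in species A, "w*" in species B.
edges_a = [("yA", "yB"), ("yB", "yC"), ("yC", "yD")]
edges_b = [("wB", "wA"), ("wC", "wX")]
homologs = {"yA": {"wA"}, "yB": {"wB"}, "yC": {"wC"}}

conserved = find_interologs(edges_a, edges_b, homologs)
print(conserved)
```

In this toy input only the yA-yB interaction has a counterpart in the second network (wA-wB), so it is the only pair reported. The yC-yD interaction is skipped because yD has no listed homolog.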
doi:10.1038/nbt1196 pmid:16601728 fatcat:4dlkuaeapvcgfjlbaf2rie5jza