From digital genetics to knowledge discovery: Perspectives in genetic network understanding

Guillaume Beslon, David P. Parsons, José-María Peña, Christophe Rigotti, Yolanda Sanchez-Dehesa, Evgenii Evgenii
2010 Intelligent Data Analysis  
In this paper, we propose an original computational approach to assist knowledge discovery in complex biological networks. First, we present an integrated model of the evolution of regulation networks that can be used to uncover organization principles of such networks. Then, we propose to use the results of our model as a benchmark for knowledge discovery algorithms. We describe a first experiment of such benchmarking by using gene knock-out data generated from the modeled organisms.
... behavior of specific organisms under some specific conditions?" (e.g., carbon starvation response in E. coli [36]). On the other hand, one can raise the question of general principles of genetic regulation: does the complex gene network structure correspond to a particular (i.e., integrated) way of controlling the metabolic activity of biological organisms? Various hypotheses have been proposed to explain the structure of these networks in terms of evolutionary forces: mutational patterns (i.e., neutral evolution), direct selective forces, or indirect selection for robustness or evolvability. To cite but a few, it was proposed (i) that the power-law connectivity of regulation networks could be a consequence of gene duplication-divergence [28], (ii) that the modularity of regulation networks could be selected for because it allows fast adaptation to variations of the environment [20], or (iii) that overrepresented feed-forward loop motifs are selected for because they allow fine tuning of the response delays in regulation networks [4].

Every engineer knows how to design large systems by building a hierarchy of modules in order to reduce the inter-dependence between modules and to rationalize the system's design. Yet the modular organization of our engineered systems is a consequence of our intellectual limits as system designers. But evolution is not an engineer. It acts by trial and error, keeping the structures that are most effective, whatever their complexity and intricacy, ignoring our attempts to understand its product.
Systems biology is often considered as a reverse engineering process applied to biological entities: given observations of the behavior of the biological system (and possibly its response to man-made perturbations [18]), biologists are to decipher the organizational principles of the system. However, in the case of reverse engineering, it can be assumed that the observed system was conceived by an engineer, following reasonable design rules and reasoning. In the case of biological systems, we cannot suppose evolution to be rational at all (at least in our common, i.e., human, sense). Yet this does not mean that there are no rules in evolution: under some specific conditions (e.g., cyclic environments), it has been shown that evolution can behave as an engineer and produce organized systems [2, 20]. The existence of other general laws that could govern the organization of biological networks depending on external conditions is an open question. Our grand challenge is hence to identify the "language" that evolution has created for regulation networks and how it can be translated from a structural description (i.e., the set of weighted links, motifs and modules) to a functional description (the cell behavior) [41].

These questions are very difficult to tackle with real organisms, either because they require long and complex experimental setups or because results are difficult to analyze given the little knowledge available. In particular, it is difficult to trace changes in genomes and to identify selected traits in real organisms. A possible solution is then to develop in silico models of evolution (i.e., digital genetic models [1]) and to test how, within these models, organisms evolve depending on the environmental conditions. Such models have already proved useful for studying how organisms adapt their structure and complexity, resulting in increased robustness [40].
As far as regulation networks are concerned, computational evolution has been used to investigate the evolvability of networks [9] or the development of modular structures under cyclic environmental conditions [20]. One of the best-known models, namely the GRN model, was proposed by Wolfgang Banzhaf [5] and used to investigate the emergence of specific topological properties in regulatory networks [28]. More recently, Claudio Mattiussi and Dario Floreano proposed the "Analog Genetic Encoding" framework [32], which was later used to investigate the modular structure of regulation networks [31]. Other authors have used computational evolution to evolve small networks performing predefined functions such as oscillators [12, 29], clocks [23] or switches [12].

In this paper, we propose an original computational approach to assist the understanding of complex biological networks. First, we describe the RAevol model, an integrated model of the evolution of ...
doi:10.3233/ida-2010-0415 fatcat:uwhtdxcklrg7pm3aqk56ytkuqy