Mining information for functional genomics

L. Bernardi, E. Ratsch, R. Kania, J. Saric, I. Rojas, J.H. Park, B.R. Schatz, C. Blaschke, A. Valencia, C. Nedellec
2002 IEEE Intelligent Systems  
Functional genomics is a relatively new field of molecular biology that studies how genomic information defines the functions of proteins in living organisms. The complexity of the domain that functional genomics examines and the amount of data this field produces demand that we adopt powerful computational techniques for data analysis. Advanced natural language processing (NLP) and data mining methods, enriched by ontologies that specify the domain's semantics, seem to be a promising approach.
However, interdisciplinary work will play a key role in this field's development.

Functional genomics

The human genome project was the catalyst for developing several high-throughput technologies, making it possible to map and sequence complex genomes. Nevertheless, knowing an organism's genome sequence is only an initial step in the quest to understand life's essential processes. The research paradigm is shifting from genome sequencing and mapping toward the description of genomic and proteomic functions. The genome describes an organism's genetic content, whereas the proteome defines the totality of a cell's proteins. Proteomics includes identifying proteins, characterizing their physicochemical properties, and describing their functions. Functional genomics combines high-throughput experimental methodologies with statistical and computational analysis of the results to study genes or proteins in a systematic and systemic fashion [1]. The innovation in functional genomics is to extend the approach from studying single genes and proteins to examining, in a systemic way, the functional networks that genes and proteins form in a cell. Using high-throughput techniques, functional genomics generates large amounts of data on how genes are expressed, ...

As a grad student, you are told, "One day in the library saves you one month in the lab." But what if the books are written in a language you don't understand? What if the literature on decryption constitutes a huge library of its own? This is exactly the situation you now find in genomics. We have found the letters for the human genome, but we have yet to understand their meaning. Functional genomics tries to uncover the meaning of the language in which the human genome is written.
For this, you (of course) must exploit your lab experiments, but taking advantage of computational models and of the related work described in the literature becomes an enormous, important task. Mining information from genome sequences, interaction networks, or, to a very large extent, texts about functional genomics becomes preeminent. In this installment, contributors describe some of the principal challenges in this area and pathways to approaching them.
doi:10.1109/mis.2002.1005634 fatcat:bpvtgxfojre5xc2ndy62ho5df4