Computational Identification of Operons in Microbial Genomes

Yu Zheng, Joseph D. Szustakowski, Lance Fortnow, Richard J. Roberts, Simon Kasif
2002 Genome Research
By applying graph representations to biochemical pathways, a new computational pipeline is proposed to find potential operons in microbial genomes. The algorithm relies on the fact that enzyme genes in operons tend to catalyze successive reactions in metabolic pathways. We applied this algorithm to 42 microbial genomes to identify putative operon structures. The predicted operons from Escherichia coli were compared with a selected metabolism-related operon dataset from the RegulonDB database, yielding a prediction sensitivity (89%) and specificity (87%) relative to this dataset. Several examples of detected operons are given and analyzed. Modular gene cluster transfer and operon fusion are observed. A further use of predicted operon data to assign function to putative genes was suggested and, as an example, a previous putative gene (MJ1604) from Methanococcus jannaschii is now annotated as a phosphofructokinase, which was regarded previously as a missing enzyme in this organism. GC content changes in the operon region and nonoperon region were examined. The results reveal a clear GC content transition at the boundaries of putative operons. We looked further into the conservation of operons across genomes. A trp operon alignment is analyzed in depth to show gene loss and rearrangement in different organisms during operon evolution.
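The core idea in the abstract (neighboring enzyme genes that catalyze successive pathway reactions are operon candidates) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Gene` record, the `candidate_operons` function, the `max_gap` threshold, the toy enzyme labels (`E1` to `E4`), and all coordinates are assumptions made for illustration, and the sketch only links directly adjacent genes, whereas the published method may tolerate intervening genes or longer pathway distances.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; the field names are illustrative assumptions."""
    name: str
    start: int
    end: int
    strand: str
    enzyme: Optional[str]  # enzyme label in the pathway graph, None for non-enzymes


def candidate_operons(
    genes: List[Gene],
    pathway_edges: List[Tuple[str, str]],
    max_gap: int = 300,   # assumed intergenic distance cutoff, not from the paper
    min_size: int = 2,
) -> List[List[str]]:
    """Group adjacent same-strand genes whose enzymes are neighbors in the pathway graph."""
    # Treat the pathway graph as undirected: store each edge in both directions.
    linked = set()
    for a, b in pathway_edges:
        linked.add((a, b))
        linked.add((b, a))

    clusters: List[List[str]] = []
    current: List[Gene] = []
    for gene in sorted(genes, key=lambda g: g.start):
        if current:
            prev = current[-1]
            same_strand = prev.strand == gene.strand
            close = gene.start - prev.end <= max_gap
            successive = (prev.enzyme, gene.enzyme) in linked
            if same_strand and close and successive:
                current.append(gene)
                continue
            # The run is broken: keep it only if it is large enough.
            if len(current) >= min_size:
                clusters.append([g.name for g in current])
        current = [gene]
    if len(current) >= min_size:
        clusters.append([g.name for g in current])
    return clusters


# Toy data (entirely hypothetical): three linked enzymes on the + strand,
# followed by two genes on the - strand that do not form a cluster.
toy_genes = [
    Gene("gA", 100, 900, "+", "E1"),
    Gene("gB", 950, 1800, "+", "E2"),
    Gene("gC", 1850, 2600, "+", "E3"),
    Gene("gD", 4000, 4800, "-", "E4"),
    Gene("gE", 4850, 5600, "-", None),
]
toy_edges = [("E1", "E2"), ("E2", "E3"), ("E3", "E4")]

toy_clusters = candidate_operons(toy_genes, toy_edges)
```

On the toy data, `gA`, `gB`, and `gC` are grouped because they are close together, share a strand, and their enzymes are consecutive in the toy pathway. `gD` is excluded by the strand change and `gE` because it has no enzyme label.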
doi:10.1101/gr.200602 pmid:12176930 pmcid:PMC186635 fatcat:pdxyu2uzofhexoubmxk4et4x24