Gene family assignment-free comparative genomics

Daniel Doerr, Annelyse Thévenin, Jens Stoye
2012 BMC Bioinformatics  
The comparison of relative gene orders between two genomes offers deep insights into functional correlations of genes and the evolutionary relationships between the corresponding organisms. Methods for gene order analyses often require prior knowledge of homologies between all genes of the genomic dataset. Since such information is hard to obtain, it is common to predict homologous groups based on sequence similarity. These hypothetical groups of homologous genes are called gene families.
This manuscript promotes a new branch of gene order studies in which prior assignment of gene families is not required. As a case study, we present a new similarity measure between pairs of genomes that is related to the breakpoint distance. We propose an exact and a heuristic algorithm for its computation. We evaluate our methods on a dataset comprising 12 γ-proteobacteria from the literature. Conclusions: In evaluating our algorithms, we show that the exact algorithm is suitable for computations on small genomes. Moreover, the results of our heuristic are close to those of the exact algorithm. In general, we demonstrate that gene order studies can be improved by direct, gene family assignment-free comparisons.
doi:10.1186/1471-2105-13-s19-s3 pmid:23281826 pmcid:PMC3526435 fatcat:ydnaay67w5bc7geypzikzhfbv4