New approaches for reconstructing phylogenies from gene order data

B. M. E. Moret, L.-S. Wang, T. Warnow, S. K. Wyman
2001 Bioinformatics  
We report on new techniques we have developed for reconstructing phylogenies on whole genomes. Our mathematical techniques include new polynomial-time methods for bounding the inversion length of a candidate tree and new polynomial-time methods for estimating genomic distances, which greatly improve the accuracy of neighbor-joining analyses. We demonstrate the power of these techniques through an extensive performance study based on simulating genome evolution under a wide range of model ... ns. Combining these new tools with standard approaches (fast reconstruction with neighbor-joining, exploration of all possible refinements of strict consensus trees, etc.) has allowed us to analyze datasets that were previously considered computationally impractical. In particular, we have conducted a complete phylogenetic analysis of a subset of the Campanulaceae family, confirming various conjectures about the relationships among members of the subset and about the principal mechanism of evolution for their chloroplast genome. We give representative results of the extensive experimentation we conducted on both real and simulated datasets in order to validate and characterize our approaches. We find that our techniques provide very accurate reconstructions of the true tree topology even when the data are generated by processes that include a significant fraction of transpositions and when the data are close to saturation.
doi:10.1093/bioinformatics/17.suppl_1.s165 pmid:11473006 fatcat:k3tkrhnmcbctnlbf2nkfua3sdq