Probability models for genome rearrangement and linear invariants for phylogenetic inference

David Sankoff, Mathieu Blanchette
1999 Proceedings of the third annual international conference on Computational molecular biology - RECOMB '99  
We review the combinatorial optimization problems in calculating edit distances between genomes and phylogenetic inference based on minimizing gene order changes. With a view to avoiding the computational cost and the "long branches attract" artifact of some tree-building methods, we explore the probabilization of genome rearrangement models prior to developing a methodology based on branch-length invariants. We characterize probabilistically the evolution of the structure of the gene adjacency set for inversions on unsigned circular genomes and, using a non-trivial recurrence relation, inversions on signed genomes. Concepts from the theory of invariants developed for the phylogenetics of homologous gene sequences can be used to derive a complete set of linear invariants for unsigned inversions, as well as for a mixed rearrangement model for signed genomes, though not for pure transposition nor pure signed inversion models. The invariants are based on an extended Jukes-Cantor semigroup. We illustrate the use of these invariants to relate mitochondrial genomes from a number of invertebrate animals.
Centre de recherches mathématiques, Université de Montréal, CP 6128 Succursale Centre-ville,
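As a rough illustration of the objects the abstract refers to, the sketch below builds the gene adjacency set of an unsigned circular genome and applies a single inversion to it. The function names `adjacency_set` and `invert`, the list representation, and the example gene order are assumptions made for illustration only; they are not taken from the paper, and the probabilistic model and the linear invariants themselves are not reproduced here.

```python
def adjacency_set(genome):
    """Unordered pairs of neighbouring genes in an unsigned circular genome."""
    n = len(genome)
    # The modulo wraps the last gene around to the first (circularity).
    return {frozenset((genome[k], genome[(k + 1) % n])) for k in range(n)}


def invert(genome, i, j):
    """Reverse the segment genome[i..j] (inclusive); gene signs are ignored."""
    return genome[:i] + genome[i:j + 1][::-1] + genome[j + 1:]


# Hypothetical six-gene example.
reference = [1, 2, 3, 4, 5, 6]
rearranged = invert(reference, 1, 3)

# Adjacencies disrupted and created by the inversion.
lost = adjacency_set(reference) - adjacency_set(rearranged)
gained = adjacency_set(rearranged) - adjacency_set(reference)
```

In this example the inversion breaks the two adjacencies at the ends of the reversed segment and creates two new ones, while adjacencies inside the segment are preserved because the genome is unsigned.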
doi:10.1145/299432.299506 dblp:conf/recomb/SankoffB99 fatcat:dy7gcyf2xjfkxi6rwkrdygiriq