On the complexity and approximation of syntenic distance

B. DasGupta, T. Jiang, S. Kannan, M. Li, Z. Sweedyk
1997 Proceedings of the first annual international conference on Computational molecular biology - RECOMB '97  
The paper studies the computational complexity and approximation algorithms for a new evolutionary distance between multi-chromosomal genomes introduced recently by Ferretti, Nadeau and Sankoff. Here, a chromosome is represented as a set of genes and a genome is a collection of chromosomes. The syntenic distance between two genomes is defined as the minimum number of translocations, fusions and fissions required to transform one genome into the other. We prove that computing the syntenic distance is NP-hard and give a simple approximation algorithm with performance ratio 2. For the case when an upper bound d on the syntenic distance is known, we show that an optimal syntenic sequence can be found in O(nk + 2^{O(d^2)}) time, where n and k are the numbers of chromosomes in the two given genomes. Next, we show that if the set of operations for transforming a genome is significantly restricted, we can nevertheless find a solution that performs at most O(log d) additional moves, where d is the number of moves performed by the unrestricted optimum. This result should help in the design of approximation algorithms. Finally, we investigate the median problem: given three genomes, construct a genome minimizing the total syntenic distance to the three given genomes and compute the corresponding median distance. The problem has application in the inference of phylogenies based on the syntenic distance. We prove that the problem is NP-hard and design a polynomial time approximation algorithm with a performance ratio of 4 + ε for any constant ε > 0.
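
As an illustration of the model described in the abstract (chromosomes as sets of genes, genomes as collections of chromosomes), the three moves can be sketched in Python. The abstract does not spell out the moves formally, so this is a minimal sketch under an assumed set-theoretic reading: a fusion replaces two chromosomes by their union, a fission splits one chromosome into two non-empty parts, and a translocation redistributes the genes of two chromosomes into two new non-empty chromosomes. The function names and signatures are hypothetical and not taken from the paper.

```python
from typing import FrozenSet, Iterable, List

# Assumed representation: a chromosome is a set of gene labels,
# a genome is a list of chromosomes.
Chromosome = FrozenSet[str]
Genome = List[Chromosome]


def _without(genome: Genome, *indices: int) -> Genome:
    """Return the chromosomes of `genome` except those at `indices`."""
    return [c for idx, c in enumerate(genome) if idx not in indices]


def fusion(genome: Genome, i: int, j: int) -> Genome:
    """Replace chromosomes i and j by their union (assumed definition)."""
    if i == j:
        raise ValueError("fusion needs two distinct chromosomes")
    return _without(genome, i, j) + [genome[i] | genome[j]]


def fission(genome: Genome, i: int, part: Iterable[str]) -> Genome:
    """Split chromosome i into `part` and the remaining genes."""
    piece = frozenset(part)
    chrom = genome[i]
    if not piece or not piece < chrom:
        raise ValueError("part must be a non-empty proper subset")
    return _without(genome, i) + [piece, chrom - piece]


def translocation(genome: Genome, i: int, j: int,
                  new_first: Iterable[str]) -> Genome:
    """Redistribute the genes of chromosomes i and j into two new ones."""
    if i == j:
        raise ValueError("translocation needs two distinct chromosomes")
    pool = genome[i] | genome[j]
    first = frozenset(new_first)
    if not first or not first < pool:
        raise ValueError("new_first must be a non-empty proper subset")
    return _without(genome, i, j) + [first, pool - first]


genome = [frozenset("ab"), frozenset("cd")]
fused = fusion(genome, 0, 1)                  # one chromosome {a, b, c, d}
swapped = translocation(genome, 0, 1, "ac")   # chromosomes {a, c} and {b, d}
split = fission(fused, 0, "a")                # chromosomes {a} and {b, c, d}
```

Under this reading, the syntenic distance is the length of a shortest sequence of such moves between two genomes. The sketch only applies individual moves and makes no attempt to find a shortest sequence, which the abstract states is NP-hard.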
doi:10.1145/267521.267536 dblp:conf/recomb/DasGuptaJKLS97 fatcat:66edrttemnbwthk6v7xbqzgzc4