On the Linear-Cost Subtree-Transfer Distance between Phylogenetic Trees

B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp
1999 Algorithmica  
Different phylogenetic trees for the same group of species are often produced either by procedures that use diverse optimality criteria [16] or from different genes [12] in the study of molecular evolution. Comparing these trees to find their similarities and dissimilarities (i.e. distance) is thus an important issue in computational molecular biology. Several distance metrics, including the nearest neighbor interchange (nni) distance and the subtree-transfer distance, have been proposed and studied in the literature. This article considers a natural extension of the subtree-transfer distance, called the linear-cost subtree-transfer distance, and studies the complexity and efficient approximation algorithms for this distance as well as its relationship to the nni distance. The linear-cost subtree-transfer model seems more suitable than the (unit-cost) subtree-transfer model in some applications. The following is a list of our results.
1. The linear-cost subtree-transfer distance is in fact identical to the nni distance on unweighted phylogenies.
2. There is an algorithm to compute an optimal linear-cost subtree-transfer sequence between unweighted phylogenies in O(n 2^{O(d)}) time, where d denotes the linear-cost subtree-transfer distance. Such an algorithm is useful when d is small.
3. Computing the linear-cost subtree-transfer distance between two weighted phylogenetic trees is NP-hard, provided we allow multiple leaves of a tree to share the same label (i.e. the trees are not necessarily uniquely labeled).
4. There is an efficient approximation algorithm for computing the linear-cost subtree-transfer distance between weighted phylogenies with performance ratio 2.
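The abstract refers to the nni operation without defining it. As a minimal sketch, assuming the standard definition (an nni move swaps two subtrees across an internal edge of an unrooted binary tree), the two trees reachable by one move over a given edge could be enumerated as follows. The adjacency-dictionary representation and the names `nni_neighbors` and `leaves_on_side` are illustrative choices, not taken from the article, and the sketch covers only unweighted, uniquely labeled trees.

```python
def nni_neighbors(adj, u, v):
    """Return the two trees obtained by one nni move across internal edge (u, v).

    `adj` maps each node label (a string) to the set of its neighbours.
    Assumption: the tree is unrooted and both u and v have degree 3.
    """
    _, b = sorted(adj[u] - {v})   # the two subtrees hanging off u; b will move
    c, d = sorted(adj[v] - {u})   # the two subtrees hanging off v
    results = []
    for x in (c, d):              # swap b with c, then b with d
        new = {node: set(nbrs) for node, nbrs in adj.items()}
        # Detach b from u and x from v.
        new[u].remove(b)
        new[b].remove(u)
        new[v].remove(x)
        new[x].remove(v)
        # Reattach them on the opposite ends of the edge.
        new[u].add(x)
        new[x].add(u)
        new[v].add(b)
        new[b].add(v)
        results.append(new)
    return results


def leaves_on_side(adj, start, blocked):
    """Leaves reachable from `start` without passing through `blocked`."""
    stack, seen, leaves = [start], {start, blocked}, set()
    while stack:
        node = stack.pop()
        if len(adj[node]) == 1:   # degree-1 nodes are leaves
            leaves.add(node)
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(leaves)


# Quartet tree ((a,b),(c,d)) with internal edge u-v.
quartet = {
    "a": {"u"}, "b": {"u"}, "c": {"v"}, "d": {"v"},
    "u": {"a", "b", "v"}, "v": {"c", "d", "u"},
}
t1, t2 = nni_neighbors(quartet, "u", "v")
print(sorted(leaves_on_side(t1, "u", "v")))  # ['a', 'c']
print(sorted(leaves_on_side(t2, "u", "v")))  # ['a', 'd']
```

On the quartet, the two moves yield the other two possible topologies, ((a,c),(b,d)) and ((a,d),(b,c)). The nni distance between two trees is the minimum number of such moves needed to transform one into the other.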
doi:10.1007/pl00008273 fatcat:a55jxp6dofbw3eh56vdrqpcoue