### Computing Phylogenetic Roots with Bounded Degrees and Errors [chapter]

Zhi-Zhong Chen, Tao Jiang, Guo-Hui Lin
2001 Lecture Notes in Computer Science
Given a set of species and their similarity data, an important problem in evolutionary biology is how to reconstruct a phylogeny (also called evolutionary tree) so that species are close in the phylogeny if and only if they have high similarity. Assume that the similarity data are represented as a graph G = (V, E) where each vertex represents a species and two vertices are adjacent if they represent species of high similarity. The phylogeny reconstruction problem can then be abstracted as the
more » ... oblem of finding a (phylogenetic) tree T from the given graph G such that (1) T has no degree-2 internal nodes, (2) the external nodes (i.e. leaves) of T are exactly the elements of V , and (3) u, v) denotes the distance between u and v in tree T . This is called the Phylogenetic kth Root Problem (PRk), and such a tree T , if exists, is called a phylogenetic kth root of graph G. The computational complexity of PRk is open, except for k ≤ 4. In this paper, we investigate PRk under a natural restriction that the maximum degree of the phylogenetic root is bounded from above by a constant. Our main contribution is a linear-time algorithm that determines if G has such a phylogenetic kth root, and if so, demonstrates one. On the other hand, as in practice the collected similarity data are usually not perfect and may contain errors, we propose to study a generalized version of PRk where the output phylogeny is only required to be an approximate root of the input graph. We show that this and other related problems are computationally intractable.