Hybridization in Nonbinary Trees
S. Linz, C. Semple
2009
IEEE/ACM Transactions on Computational Biology & Bioinformatics
Reticulate evolution, the umbrella term for processes like hybridization, horizontal gene transfer, and recombination, plays an important role in the history of life of many species. Although the occurrence of such events is widely accepted, approaches for calculating the extent to which reticulation has influenced evolution are relatively rare. In this paper, we show that the NP-hard problem of calculating the minimum number of reticulation events for two (arbitrary) rooted phylogenetic trees
... terized by this minimum number is fixed-parameter tractable.

Index Terms: Rooted phylogenetic tree, reticulate evolution, hybridization network.

Draft of July 18, 2008.

Since simultaneous speciation events occur only rarely, we typically assume that all polytomies in a phylogenetic tree are soft. The reconstruction of a strictly bifurcating (binary) tree may consequently force refinements that are not necessarily optimal in terms of the hybridization number. An example is depicted in Fig. 1, where two binary refinements S_1 and S_2 of the tree T' are shown. While the hybridization number for S_1 and T is 0, it is 1 for S_2 and T. In this paper, we show that the decision problem of asking whether the minimum number of hybridization events needed to explain two (arbitrary) rooted phylogenetic trees is at most k is fixed-parameter tractable.

We now describe the above-mentioned problem formally, beginning with several definitions. A rooted phylogenetic X-tree T is a rooted tree with leaf set X and no degree-2 vertices, except possibly the root, which has degree at least two. The set X is called the label set of T and is denoted by L(T). In addition, T is binary if, apart from the root, which has degree two, all interior vertices have degree three. Let Y be a subset of X. We call Y an (edge) cluster of T if there is an edge e, or equivalently a vertex v, whose set of descendants in X is precisely Y. We denote this cluster by C_T(v), or simply C(v) if there is no ambiguity. The set of clusters of T is denoted by C(T). Furthermore, ...

Hybridization networks are a generalization of evolutionary trees that allow for a simultaneous visualization of several conflicting or alternating histories of life. Such a network embeds a collection of gene trees representing a set of present-day species, where each vertex whose indegree is greater than 1 represents a hybrid species.
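As an illustration of the cluster definition given above for rooted phylogenetic X-trees, the following minimal Python sketch computes C(v) for every vertex of a small tree. The child-list representation, the function names `clusters` and `is_binary`, and the example tree are assumptions made for this illustration and do not come from the paper. The sketch also assumes the convention that a leaf counts as its own descendant, so each leaf contributes a singleton cluster.

```python
from typing import Dict, FrozenSet, List

# Hypothetical representation: each interior vertex maps to its list of
# children; leaves are vertices that have no entry (or an empty list).
Tree = Dict[str, List[str]]


def clusters(tree: Tree, root: str) -> Dict[str, FrozenSet[str]]:
    """Map every vertex v to C(v), the set of leaves descending from v."""
    result: Dict[str, FrozenSet[str]] = {}

    def visit(v: str) -> FrozenSet[str]:
        children = tree.get(v, [])
        if not children:
            # Assumed convention: a leaf is its own descendant.
            result[v] = frozenset([v])
        else:
            result[v] = frozenset().union(*(visit(c) for c in children))
        return result[v]

    visit(root)
    return result


def is_binary(tree: Tree) -> bool:
    """True if every interior vertex (root included) has exactly two children.

    This matches the degree condition in the text: the root has degree two,
    and every other interior vertex has degree three (one parent, two children).
    """
    return all(len(children) == 2 for children in tree.values() if children)


# A tree on X = {a, b, c, d} with a polytomy at u, and one binary refinement.
nonbinary = {"r": ["u", "d"], "u": ["a", "b", "c"]}
refined = {"r": ["u", "d"], "u": ["w", "c"], "w": ["a", "b"]}

cluster_set = set(clusters(nonbinary, "r").values())          # C(T)
refined_cluster_set = set(clusters(refined, "r").values())
```

In this example the refinement resolves the polytomy at `u` by adding the cluster {a, b}, while every cluster of the nonbinary tree is still present in the refined one.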
Mathematically speaking, a hybridization network H (on X) is a rooted acyclic digraph with root ρ in which (i) X is the set of vertices of out-degree zero, (ii) the out-degree of ρ is at least 2, and (iii) for each vertex with out-degree 1, its in-degree is at least 2.
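Conditions (i) to (iii) can be checked mechanically. The sketch below is one possible reading of the definition, not the authors' implementation: the adjacency-list representation, the function names, and the example networks are assumptions for illustration. It interprets "rooted" as meaning that ρ is the only vertex of in-degree zero, and it tests acyclicity with Kahn's topological-sort procedure.

```python
from typing import Dict, List, Set

# Hypothetical representation: vertex -> list of out-neighbours.
Digraph = Dict[str, List[str]]


def in_degrees(arcs: Digraph) -> Dict[str, int]:
    """In-degree of every vertex that appears as a tail or head of an arc."""
    vertices = set(arcs) | {w for outs in arcs.values() for w in outs}
    indeg = {v: 0 for v in vertices}
    for outs in arcs.values():
        for w in outs:
            indeg[w] += 1
    return indeg


def is_hybridization_network(arcs: Digraph, root: str, leaves: Set[str]) -> bool:
    indeg = in_degrees(arcs)
    outdeg = {v: len(arcs.get(v, [])) for v in indeg}

    # Assumed meaning of "rooted": the root is the unique in-degree-0 vertex.
    if [v for v in indeg if indeg[v] == 0] != [root]:
        return False

    # Acyclicity via Kahn's algorithm: every vertex must get processed.
    remaining = dict(indeg)
    stack, processed = [root], 0
    while stack:
        v = stack.pop()
        processed += 1
        for w in arcs.get(v, []):
            remaining[w] -= 1
            if remaining[w] == 0:
                stack.append(w)
    if processed != len(indeg):
        return False

    cond_i = {v for v in outdeg if outdeg[v] == 0} == set(leaves)   # (i)
    cond_ii = outdeg[root] >= 2                                     # (ii)
    cond_iii = all(indeg[v] >= 2 for v in outdeg if outdeg[v] == 1) # (iii)
    return cond_i and cond_ii and cond_iii


def hybrid_vertices(arcs: Digraph) -> Set[str]:
    """Vertices with in-degree greater than 1, i.e. hybrid species."""
    return {v for v, d in in_degrees(arcs).items() if d > 1}


# Example on X = {a, b, c}: h has two parents (u and v) and one child (b).
network = {"r": ["u", "v"], "u": ["a", "h"], "v": ["h", "c"], "h": ["b"]}
# Counterexample: u has out-degree 1 but in-degree 1, violating (iii).
not_a_network = {"r": ["u", "c"], "u": ["w"], "w": ["a", "b"]}
```

For the example `network`, the check succeeds and `hybrid_vertices` returns the single vertex `h`. The second digraph fails only because of condition (iii), which rules out vertices that merely subdivide an arc.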
doi:10.1109/tcbb.2008.86
pmid:19179697
fatcat:y6mafdya4bdbril6gpzpf3ckvu