Age-Dependent Speciation Can Explain the Shape of Empirical Phylogenies

Oskar Hagen, Klaas Hartmann, Mike Steel, Tanja Stadler
2015 Systematic Biology  
Tens of thousands of phylogenetic trees, describing the evolutionary relationships between hundreds of thousands of taxa, are readily obtainable from various databases. From such trees, inferences can be made about the underlying macroevolutionary processes, yet remarkably these processes are still poorly understood. Simple and widely used evolutionary null models are problematic: Empirical trees show very different imbalance between the sizes of the daughter clades of ancestral taxa compared to what models predict. Obtaining a simple evolutionary model that is both biologically plausible and produces the imbalance seen in empirical trees is a challenging problem, to which none of the existing models provide a satisfying answer. Here we propose a simple, biologically plausible macroevolutionary model in which the rate of speciation decreases with species age, whereas extinction rates can vary quite generally. We show that this model provides a remarkable fit to the thousands of trees stored in the online database TreeBase. The biological motivation for the identified age-dependent speciation process may be that recently evolved taxa often colonize new regions or niches and may initially experience little competition. These new taxa are thus more likely to give rise to further new taxa than a taxon that has remained largely unchanged and is, therefore, well adapted to its niche. We show that age-dependent speciation may also be the result of different within-species populations following the same laws of lineage splitting to produce new species. As the fit of our model to the tree database shows, this simple biological motivation provides an explanation for a long-standing problem in macroevolution.

[Birth-death process; diversification; macroevolution; stochastic models.]

Macroevolutionary models generate phylogenetic trees representing processes by which an ancestor species evolves a diversity of species through speciation and extinction. Exploring the behavior of such models contributes toward explaining how present biodiversity evolved (Mooers and Heard 1997). Every model is a simplification of a complex system, but such abstractions may help identify patterns and raise new hypotheses. Comparing these models with empirical data enables us to test such hypotheses, and thus helps to understand evolutionary processes and identify particular deterministic forces (Hey 1992).
Macroevolutionary models range from simple to complex, and much can be understood and learned even from the behavior and properties of the simplest ones (Hartmann et al. 2010; Stadler 2013). The most basic macroevolutionary null model is the Yule model (Yule 1924), under which all extant species at a particular point in time are equally likely to undergo a speciation event. The Yule model is appealing for its simplicity but fails to reproduce empirical data, as empirical phylogenies of extant species are generally far less balanced (meaning that sister clades are of very different sizes) (Yule 1924; Losos Blum and François 2006). Likewise, all species-speciation-exchangeable models (Stadler 2013), which include, in particular, environmental-dependent or diversity-dependent diversification models, produce the same distribution of tree shapes (i.e., phylogenies ignoring branch lengths) as the Yule model (Stadler 2013). Therefore, these models are clearly missing important macroevolutionary features. Identifying more general macroevolutionary models that give rise to empirical tree balance will indicate which macroevolutionary dynamics may play major roles in shaping biodiversity. Under the Yule model, each "ranked" labeled tree shape (i.e., a tree shape with an ordering of internal vertices and unique leaf labels) is equally likely (Aldous 2001), and this leads to highly balanced trees. By contrast, the Proportional to Distinguishable Arrangements (PDA) model (Aldous 1996, 2001; Semple and Steel 2003) assumes that each labeled tree shape (disregarding the order of speciation events) is equally likely. The PDA model produces trees that are highly unbalanced and has been biologically motivated by explosive radiation events and the colonization of new niches (Steel and McKenzie 2001).
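The contrast between the two null models can be illustrated numerically. The following Python sketch is not taken from the article; it assumes the standard result that both models can be sampled by recursively splitting a clade of n leaves into two daughter clades, with the left clade size uniform on 1, ..., n-1 under the Yule model and weighted by the number of labeled trees on each side under the PDA model. The Colless index (the sum over internal nodes of the absolute difference in daughter clade sizes) is used here only as one common imbalance statistic, and all function names are hypothetical.

```python
import random
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def num_trees(k):
    """Number of rooted binary labeled tree shapes on k leaves: (2k-3)!!."""
    result = 1
    for m in range(3, 2 * k - 2, 2):
        result *= m
    return result


def yule_split(n, rng):
    """Yule model: the left daughter clade size is uniform on 1..n-1."""
    return rng.randint(1, n - 1)


def pda_split(n, rng):
    """PDA model: a split (i, n-i) is weighted by the number of labeled
    trees that have that split at the root."""
    sizes = range(1, n)
    weights = [comb(n, i) * num_trees(i) * num_trees(n - i) for i in sizes]
    return rng.choices(sizes, weights=weights)[0]


def colless_index(n, split, rng):
    """Sample one tree shape on n leaves and return its Colless index.

    Only clade sizes are tracked, which is all the index needs.
    """
    total = 0
    stack = [n]
    while stack:
        size = stack.pop()
        if size < 2:
            continue  # a single leaf has no internal node
        left = split(size, rng)
        right = size - left
        total += abs(left - right)
        stack.extend((left, right))
    return total


def mean_colless(n, split, replicates=2000, seed=1):
    """Average Colless index over independently sampled tree shapes."""
    rng = random.Random(seed)
    return sum(colless_index(n, split, rng) for _ in range(replicates)) / replicates


if __name__ == "__main__":
    n = 32  # illustrative tree size, not a value from the article
    print("Yule mean Colless:", mean_colless(n, yule_split))
    print("PDA  mean Colless:", mean_colless(n, pda_split))
```

Under these assumptions the PDA average should come out well above the Yule average for the same number of leaves, matching the statement that PDA trees are far less balanced.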
For examples of a perfectly balanced and a perfectly unbalanced tree, see Supplementary Figures 1a and b, respectively (Supplementary Material available on Dryad at http://dx.doi.org/10.5061/dryad.31227). The Yule and PDA models lie at opposite ends of the tree balance spectrum, with empirical trees generally somewhere in between. Aldous (2001) introduced β-splitting models, which span and extend the range of tree balance. In these models, the tree balance can be selected by altering a single parameter, β. Aldous (1996) found evidence that empirical trees support a value of β ≈ −1; however, no biological explanation supports this value. Mechanistic models that vary speciation rates across species have been suggested before; however, most of them rely on problematic assumptions. Trait-dependent speciation models can match empirical trees, but no obvious trait has been linked to tree shape. Models that
doi:10.1093/sysbio/syv001 pmid:25575504 pmcid:PMC4395845 fatcat:qxjfwqixabfvhls3bl24njnave