On Defining and Finding Islands of Trees and Mitigating Large Island Bias

Ana Serra Silva, Mark Wilkinson
2021 Systematic Biology  
How best can we summarise sets of phylogenetic trees? Systematists have relied heavily on consensus methods, but if tree distributions can be partitioned into distinct subsets it may be helpful to provide separate summaries of these rather than relying entirely upon a single consensus tree. How sets of trees can most helpfully be partitioned and represented leads to many open questions, but one natural partitioning is provided by the islands of trees found during tree searches. Islands that are
more » ... of dissimilar size have been shown to yield majority-rule consensus trees dominated by the largest sets. We illustrate this large island bias and approaches that mitigate its impact by revisiting a recent analysis of phylogenetic relationships of living and fossil amphibians. We introduce a revised definition of tree islands based on any tree-to-tree pairwise distance metric that usefully extends the notion to any set or multiset of trees, as might be produced by, for example, Bayesian or bootstrap methods, and that facilitates finding tree islands a posteriori. We extract islands from a tree distribution obtained in a Bayesian analysis of the amphibian data to investigate their impact in that context, and we compare the partitioning produced by tree islands with those resulting from some alternative approaches. Distinct subsets of trees, such as tree islands, should be of interest because of what they may reveal about evolution and/or our attempts to understand it, and are an important, sometimes overlooked, consideration when building and interpreting consensus trees.
doi:10.1093/sysbio/syab015 pmid:33749752 pmcid:PMC8513764 fatcat:vngz7rd5tndk5okc32jitztpl4