A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit <a rel="external noopener" href="https://www.biorxiv.org/content/biorxiv/early/2021/01/13/2020.10.16.343293.full.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<i title="Cold Spring Harbor Laboratory">
<span class="release-stage" >pre-print</span>
Assessing the phylogenetic compatibility between individual gene families is a crucial and often computationally demanding step in many phylogenomics analyses. Here we describe the Evolutionary Similarity Index (I_ES) to assess shared evolution between gene families using a weighted Orthogonal Distance Regression applied to sequence distances. This approach allows for straightforward pairing of paralogs between co-evolving gene families without resorting to multiple tests, or a priori<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1101/2020.10.16.343293">doi:10.1101/2020.10.16.343293</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/2shqy7vpe5aubd3cjqgasdiykq">fatcat:2shqy7vpe5aubd3cjqgasdiykq</a> </span>
... of molecular interactions between protein products from assessed genes. The utilization of pairwise distance matrices, while less informative than phylogenies, circumvents error-prone comparisons between trees whose topologies are inherently uncertain. Analyses of simulated tree datasets showed that I_ES was more accurate and less susceptible to phylogenetic noise than existing tree-based methods (Robinson-Foulds and geodesic distance) for assessing evolutionary signal compatibility. Applying I_ES to a real dataset of 1,322 genes from 42 archaeal genomes identified eight major clusters of co-evolving gene families. Four of these clusters included genes with a taxonomic distribution across all archaeal phyla, while other clusters included a subset of taxa that do not map to generally accepted archaeal clades, indicating possible shared horizontal transfers by co-evolving gene families. We identify one strongly connected set of 62 co-evolving genes occurring as both single-copy and multiple homologs per genome, with compatible evolutionary histories closely matching previously published species trees for Archaea. An I_ES implementation is available at https://github.com/lthiberiol/evolSimIndex.
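The core idea of the abstract, regressing one gene family's pairwise distances against another's with an orthogonal fit, can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository above for that). The function names, the nested-dict matrix layout, the zero-intercept line, and the uniform default weights are all assumptions made for illustration, and the sketch stops at the fit rather than computing I_ES itself.

```python
import math


def paired_distances(dist_a, dist_b):
    """Collect distances for every taxon pair present in both matrices.

    Each matrix is a nested dict: dist[taxon1][taxon2] -> distance
    (a hypothetical layout chosen for this sketch).
    """
    shared = sorted(set(dist_a) & set(dist_b))
    xs, ys = [], []
    for i, t1 in enumerate(shared):
        for t2 in shared[i + 1:]:
            xs.append(dist_a[t1][t2])
            ys.append(dist_b[t1][t2])
    return xs, ys


def orthogonal_fit_through_origin(x, y, weights=None):
    """Weighted orthogonal (total least squares) fit of y = slope * x.

    Minimizes sum(w * (y - slope*x)**2 / (1 + slope**2)), i.e. the weighted
    squared perpendicular distances to a line through the origin.
    """
    if weights is None:
        weights = [1.0] * len(x)
    sxx = sum(w * a * a for w, a in zip(weights, x))
    syy = sum(w * b * b for w, b in zip(weights, y))
    sxy = sum(w * a * b for w, a, b in zip(weights, x, y))
    if sxy == 0:
        raise ValueError("no linear association; slope is undefined")
    diff = syy - sxx
    # Closed-form root of sxy*s**2 + (sxx - syy)*s - sxy = 0 with sign(s) == sign(sxy).
    slope = (diff + math.sqrt(diff * diff + 4.0 * sxy * sxy)) / (2.0 * sxy)
    # Squared perpendicular distance of each point from the fitted line.
    residuals = [(b - slope * a) ** 2 / (1.0 + slope ** 2) for a, b in zip(x, y)]
    return slope, residuals


# Toy data: family B evolves exactly twice as fast as family A on shared taxa.
fam_a = {
    "A": {"B": 0.1, "C": 0.2, "D": 0.5},
    "B": {"A": 0.1, "C": 0.3, "D": 0.4},
    "C": {"A": 0.2, "B": 0.3, "D": 0.6},
    "D": {"A": 0.5, "B": 0.4, "C": 0.6},
}
fam_b = {
    "A": {"B": 0.2, "C": 0.4},
    "B": {"A": 0.2, "C": 0.6},
    "C": {"A": 0.4, "B": 0.6},
}

xs, ys = paired_distances(fam_a, fam_b)
slope, residuals = orthogonal_fit_through_origin(xs, ys)
print(f"slope={slope:.3f}, sum of squared residuals={sum(residuals):.3g}")
```

In this toy case only taxa A, B, and C are shared, so three distance pairs enter the fit. Small residuals indicate that the two families' distances scale together, which is the kind of signal a distance-based compatibility measure can build on without comparing tree topologies.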
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210429010359/https://www.biorxiv.org/content/biorxiv/early/2021/01/13/2020.10.16.343293.full.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/de/41/de41517895aa589fbd2d49864b3e0d2b5739669a.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1101/2020.10.16.343293"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> biorxiv.org </button> </a>