Consensus Phylogenetic trees of Fifteen Prokaryotic Aminoacyl-tRNA Synthetase Polypeptides based on Euclidean Geometry of All-Pairs Distances and Concatenation [article]

Rhishikesh R Bargaje, Milner Kumar, Sohan P Modak
2016 bioRxiv pre-print
Most molecular phylogenetic trees depict the relative closeness, or extent of similarity, among a set of taxa based on comparison of sequences of homologous genes or proteins. Because the tree topology for individual monogenic traits varies among the same set of organisms and does not match the taxonomic hierarchy, there is a need to generate multidimensional phylogenetic trees. Phylogenetic trees were constructed for 119 prokaryotes representing 2 phyla under Archaea and 11 phyla under Bacteria after comparing multiple sequence alignments for 15 different aminoacyl-tRNA synthetase polypeptides. The topology of Neighbor Joining (NJ) trees for individual tRNA synthetase polypeptides varied substantially. We used Euclidean geometry to estimate all-pairs distances in order to construct phylogenetic trees. Further, we used a novel 'Taxonomic fidelity' algorithm to estimate clade-by-clade similarity between the phylogenetic tree and the taxonomic tree. We find that, compared with trees for individual tRNA synthetase polypeptides and rDNA sequences, the topology of our Euclidean tree and that of the tree for aligned and concatenated sequences of 15 proteins are closer to the taxonomic trees and offer the best consensus. We also aligned sequences after concatenation and find that changing the order in which sequences are joined prior to alignment changes the tree topology. In contrast, changing the types of polypeptides in the grouping for Euclidean trees does not affect the tree topologies. We show that a consensus phylogenetic tree of 15 polypeptides from 14 aminoacyl-tRNA synthetases for 119 prokaryotes using Euclidean geometry exhibits better taxonomic fidelity than trees for individual tRNA synthetase polypeptides or for 16S rDNA. We have also examined Euclidean N-dimensional trees for 15 tRNA synthetase polypeptides, which give the same topology as the tree constructed by amalgamating 3-dimensional Euclidean trees for groups of 3 polypeptides. Euclidean N-dimensional trees offer a reliable way forward for multi-genic molecular phylogenetics.
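The abstract does not spell out how the Euclidean all-pairs distances are computed. As a minimal sketch, assume each polypeptide contributes one axis, so that the combined distance between two taxa is the square root of the sum of the squared per-polypeptide distances. The function name `euclidean_combine`, the helper `toy_matrix`, and the toy values below are hypothetical and are not taken from the article. Under this assumed rule, combining six matrices at once gives the same result as first combining two groups of three and then combining the group results, which is one way the reported agreement between N-dimensional trees and amalgamated 3-dimensional trees could arise.

```python
import math


def euclidean_combine(matrices):
    """Combine per-polypeptide distance matrices into a single matrix.

    Assumed rule (not confirmed by the abstract): each polypeptide is one
    axis, so D[i][j] = sqrt(sum over polypeptides k of d_k[i][j] ** 2).
    """
    n = len(matrices[0])
    return [
        [math.sqrt(sum(m[i][j] ** 2 for m in matrices)) for j in range(n)]
        for i in range(n)
    ]


def toy_matrix(ab, ac, bc):
    """Build a symmetric 3x3 distance matrix for hypothetical taxa A, B, C."""
    return [
        [0.0, ab, ac],
        [ab, 0.0, bc],
        [ac, bc, 0.0],
    ]


# Invented pairwise distances for six hypothetical polypeptides.
proteins = [
    toy_matrix(0.10, 0.40, 0.50),
    toy_matrix(0.20, 0.30, 0.60),
    toy_matrix(0.15, 0.45, 0.40),
    toy_matrix(0.25, 0.35, 0.55),
    toy_matrix(0.05, 0.50, 0.45),
    toy_matrix(0.30, 0.25, 0.65),
]

# All six axes combined in one step.
full = euclidean_combine(proteins)

# Two groups of three combined first, then the group results combined.
grouped = euclidean_combine(
    [euclidean_combine(proteins[:3]), euclidean_combine(proteins[3:])]
)

for row_full, row_grouped in zip(full, grouped):
    print([round(x, 4) for x in row_full], [round(x, 4) for x in row_grouped])
```

The combined matrix would then be passed to a distance-based tree builder such as Neighbor Joining, which is not shown here.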
doi:10.1101/051623 fatcat:xz4q2ktqgzd6nawls6enjzs3u4