Phylogenetics and homology modeling

Allen Watkins Smith
Phylogenetics uses nucleotide and/or amino acid sequences to construct evolutionary trees and to reconstruct the sequences (or other characteristics) of ancestral organisms. Proteins function almost entirely in their folded form, but phylogenetic work typically does not directly consider the structures into which protein sequences fold. Homology modeling uses a known protein structure to model the structure of a similar sequence, with the similarity arising from an evolutionary relationship, thus "homology". However, homology modeling typically does not explicitly use evolutionary data, even though the modeled proteins are part of evolved biological systems. Combining these fields is likely to be fruitful: since proteins are the product of organismal evolution, an examination of evolution is needed to understand them; since proteins are a vital component of all known organisms, an examination of protein evolution is needed to understand organismal evolution. Protein structure is more conserved than protein sequence, especially for vital proteins. Therefore, the structure of a putative ancestral protein is likely to be close enough to modern-day structures to be modeled, especially if the modeling is done in short evolutionary stages, with each step having few sequence differences. It should therefore be possible to go down a tree, homology modeling the structure of a protein at each stage, then go back up again to a modern-day sequence to derive a structure for that sequence (usable as a test if the structure is already experimentally known). Although that final stage has not been reached, considerable progress has been made. Ways in which structural data can assist phylogenetics have been found, such as checking whether predicted ancestral sequences are structurally realistic. A database of manually reviewed structural alignments of a variety of interesting proteins (with additional sequence alignments) has been created, as has a database of structures versus species. Some interesting phylogenetic findings have been made and a supertree construction techn [...]
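The stepwise route described above (from a sequence with a known structure, through reconstructed ancestors, to another modern-day sequence) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the tree, the node names, the toy sequences, and the helper functions `path_to_root`, `modeling_path`, and `stepwise_differences` are all hypothetical, and the homology modeling step itself is not performed. The sketch only lists the order in which nodes would be visited and how many sequence differences each step involves, assuming pre-aligned sequences of equal length.

```python
# Hypothetical toy tree, stored as child -> parent. "root" has no parent.
PARENT = {"A": "anc1", "B": "anc1", "anc1": "root", "C": "root"}

# Hypothetical aligned sequences, including reconstructed ancestral ones.
SEQS = {
    "A": "MKTAYIA",
    "B": "MKTAFLA",
    "anc1": "MKTAYLA",
    "root": "MKSAYLA",
    "C": "MRSAYLA",
}


def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def modeling_path(template, target, parent):
    """Nodes visited from `template` to `target` via their common ancestor."""
    from_template = path_to_root(template, parent)
    from_target = path_to_root(target, parent)
    target_ancestors = set(from_target)
    # First node on the template's path that is also an ancestor of the target.
    mrca = next(n for n in from_template if n in target_ancestors)
    towards_ancestor = from_template[: from_template.index(mrca) + 1]
    towards_target = from_target[: from_target.index(mrca)]
    return towards_ancestor + towards_target[::-1]


def hamming(a, b):
    """Count differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def stepwise_differences(path, seqs):
    """For each consecutive pair on the path, report the number of differences."""
    return [(a, b, hamming(seqs[a], seqs[b])) for a, b in zip(path, path[1:])]


if __name__ == "__main__":
    route = modeling_path("A", "C", PARENT)
    for src, dst, diffs in stepwise_differences(route, SEQS):
        print(f"{src} -> {dst}: {diffs} difference(s)")
    print("direct A -> C:", hamming(SEQS["A"], SEQS["C"]), "difference(s)")
```

With these toy data, each step along the route differs at a single position, whereas the two modern-day sequences differ from each other at three, which is the motivation for modeling in short stages.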
doi:10.7282/t3d79bs3 fatcat:q2u5ua3vdjfnfkia7rsyhjydiq