Modelling structural constraints on protein evolution via side-chain conformational states: Supplementary files [article]

Umberto Perron, Alexey M Kozlov, Alexandros Stamatakis, Nick Goldman, Iain Moal
2019 bioRxiv pre-print
Few models of sequence evolution incorporate parameters describing protein structure, despite its high conservation, essential functional role and increasing availability. We present a structurally-aware empirical substitution model for amino acid sequence evolution in which proteins are expressed using an expanded alphabet that relays both amino acid identity and structural information. Each character specifies an amino acid as well as a rotamer state: the discrete geometric pattern of permitted side-chain atomic positions. By assigning rotamer states in 251,194 protein structures and identifying 4,508,390 substitutions between closely related sequences, we generate a 55-state model that shows that the evolutionary properties of amino acids depend strongly upon side-chain geometry. The model performs as well as or better than traditional 20-state models for divergence time estimation, tree inference and ancestral state reconstruction. We conclude that the concomitant evolution of sequence and structure is a valuable source of phylogenetic information.
doi:10.1101/530634 fatcat:nb3b75ibzjadrpmay7bpjvcbay