SESCA: Predicting the Circular Dichroism Spectra of Proteins from Molecular Structure
Circular dichroism (CD) spectroscopy is a highly sensitive but low-resolution technique for studying the structure of proteins. Combined with molecular modelling and other complementary techniques, CD spectroscopy can also provide essential information at higher resolution. To this end, we introduce a new computational method to calculate the electronic circular dichroism spectra of proteins from a three-dimensional model structure or structural ensemble. The method determines the CD spectrum from the
average secondary structure composition of the protein using a pre-calculated set of basis spectra. We derived several basis spectrum sets from the experimental CD spectra and secondary structure information of 71 reference proteins and tested the prediction accuracy of these basis spectrum sets through cross-validation. Furthermore, we investigated how prediction accuracy is affected by contributions from amino acid side chain groups and protein flexibility, potential experimental errors of the reference protein spectra, as well as the choice of the secondary structure classification algorithm and the number of basis spectra. We compared the predictive power of our method to previous spectrum prediction algorithms, such as DichroCalc and PDB2CD, and found that SESCA predicts the CD spectra with up to 50% smaller deviation. Our results indicate that SESCA basis sets are robust to experimental error in the reference spectra and to the choice of the secondary structure classification algorithm. For over 80% of the globular reference proteins, SESCA basis sets could accurately predict the experimental spectrum solely from their secondary structure composition. To improve SESCA predictions for the remaining proteins, we applied corrections to account for intensity normalization, contributions from the amino acid side chains, and conformational flexibility. For globular proteins, only intensity scaling improved the prediction accuracy significantly, but our models indicate that side chain contributions and structural flexibility are pivotal for the prediction of shorter peptides and intrinsically disordered proteins.
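As an illustration of the basis-spectrum idea described above, the following minimal sketch computes a CD spectrum as the weighted sum of basis spectra, with weights given by the ensemble-averaged secondary structure fractions. It is not the SESCA implementation: the function names (`predict_cd`, `rmsd`, `best_scale`), the toy basis values, and the least-squares form of the intensity scaling (applied here to the experimental spectrum) are assumptions made for this example only.

```python
import numpy as np

def predict_cd(fractions, basis):
    """Predict a CD spectrum from secondary structure fractions.

    fractions: (n_classes,) for one structure, or (n_frames, n_classes)
               for an ensemble; fractions are averaged over frames.
    basis:     (n_classes, n_wavelengths) array of basis spectra
               (hypothetical values in this sketch).
    """
    fractions = np.atleast_2d(np.asarray(fractions, dtype=float))
    mean_fractions = fractions.mean(axis=0)  # average composition
    return mean_fractions @ np.asarray(basis, dtype=float)

def rmsd(a, b):
    """Root-mean-square deviation between two spectra."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

def best_scale(predicted, experimental):
    """Least-squares factor s minimising |predicted - s * experimental|.

    Assumed form of an intensity-normalisation correction.
    """
    p = np.asarray(predicted, dtype=float)
    e = np.asarray(experimental, dtype=float)
    return float(np.dot(p, e) / np.dot(e, e))

# Toy data: two structural classes, three wavelengths (made-up numbers).
toy_basis = np.array([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]])
toy_ensemble = np.array([[0.5, 0.5],
                         [1.0, 0.0]])  # two frames of a model ensemble

predicted = predict_cd(toy_ensemble, toy_basis)
experimental = 2.0 * predicted          # pretend the intensity is off by 2x
scale = best_scale(predicted, experimental)
print(predicted, scale, rmsd(predicted, scale * experimental))
```

In this toy case, the deviation between the predicted and experimental spectra vanishes once the scaling factor is applied, which mirrors the kind of intensity correction mentioned above for globular proteins.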