Improving decoy databases for protein folding algorithms

Aaron Lindsey, Hsin-Yi (Cindy) Yeh, Chih-Peng Wu, Shawna Thomas, Nancy M. Amato
2014 Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics - BCB '14  
Predicting protein structures and simulating protein folding are two of the most important problems in computational biology today. Simulation methods rely on a scoring function to distinguish the native structure (the most energetically stable) from non-native structures. Decoy databases are collections of non-native structures used to test and verify these functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and removing redundant structures. We test our approach on 17 different decoy databases of varying size and type and show significant improvement across a variety of metrics. We also test our improved databases on a popular modern scoring function and show that they contain a greater number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions.
doi:10.1145/2649387.2660839 dblp:conf/bcb/LindseyYWTA14 fatcat:cspilf6k3bawbo7trzn6jrdhpi