Improved method for predicting protein fold patterns with ensemble classifiers

W. Chen, X. Liu, Y. Huang, Y. Jiang, Q. Zou, C. Lin
2012 Genetics and Molecular Research  
Protein folding is recognized as a critical problem in the field of biophysics in the 21st century. Predicting protein-folding patterns is challenging due to the complex structure of proteins. In an attempt to solve this problem, we employed ensemble classifiers to improve prediction accuracy. In our experiments, 188-dimensional features were extracted based on the composition and physicochemical properties of proteins, and 20-dimensional features were selected using a coupled position-specific scoring matrix. Compared with traditional prediction methods, these methods were superior in terms of prediction accuracy. The 188-dimensional feature-based method achieved 71.2% accuracy in five cross-validations. The accuracy rose to 77% when we used a 20-dimensional feature vector. When applied to recent data, these methods achieved 54.2% accuracy. Source code and dataset, together with a web server and software tools for prediction, are available at:
doi:10.4238/2012.january.27.4 pmid:22370884 fatcat:dvvjoiwfgjhxjd6sg7pgdwsk54
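
The abstract mentions composition-based features and ensemble classifiers without giving details. As a rough illustration only, the following Python sketch computes amino-acid composition frequencies for a sequence and combines base-classifier labels by plurality vote. The function names `composition_features` and `majority_vote` are hypothetical, the vote rule is an assumption (the abstract does not say how the ensemble combines its members), and the sketch covers neither the physicochemical features nor the position-specific scoring matrix features.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return the frequency of each standard amino acid in the sequence.

    Illustrative only: this is the composition part of a feature vector,
    not the paper's full 188-dimensional representation.
    """
    seq = sequence.upper()
    # Ignore characters that are not standard residues (e.g. 'X', gaps).
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def majority_vote(predictions):
    """Combine labels from several base classifiers by plurality vote.

    Assumed combination rule; ties go to the label that appears first.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
```

In use, each base classifier would be trained on such feature vectors, and `majority_vote` would be called on the list of fold labels they predict for a given protein.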