Protein secondary structure prediction with a neural network

L. H. Holley, M. Karplus
1989 Proceedings of the National Academy of Sciences of the United States of America  
A method is presented for protein secondary structure prediction based on a neural network. A training phase was used to teach the network to recognize the relation between secondary structure and amino acid sequences on a sample set of 48 proteins of known structure. On a separate test set of 14 proteins of known structure, the method achieved a maximum overall predictive accuracy of 63% for three states: helix, sheet, and coil. A numerical measure of helix and sheet tendency for each residue was obtained from the calculations. When predictions were filtered to include only the strongest 31% of predictions, the predictive accuracy rose to 79%.

Accurate prediction of protein secondary structure is a step toward the goal of understanding protein folding. A variety of methods have been proposed that make use of the physicochemical characteristics of the amino acids (1), sequence homology (2-4), pattern matching (5), and statistical analyses (6-11) of proteins of known structure. In a recent assessment (12) of three widely used methods (1, 6, 9), accuracy was found to range from 49% to 56% for predictions of three states: helix, sheet, and coil. The limited accuracy of the predictions is believed to be due to the small size of the data base and/or the fact that secondary structure is determined by tertiary interactions not included in the local sequence.

In this paper we describe a secondary structure prediction method that makes use of neural networks. The neural network technique has its origins in efforts to produce a computer model of the information processing that takes place in the nervous system (13-16). A large number of simple, highly interconnected computational units (neurons) operate in parallel. Each unit integrates its inputs, which may be both excitatory and inhibitory, and according to some threshold generates an output, which is propagated to other units. In many applications, including the present work, the biological relevance of neural networks to nervous system function is unimportant. Rather, a neural network may simply be viewed as a highly parallel computational device.
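
The behavior of a single unit described above can be illustrated with a minimal sketch. It is not the authors' implementation: the sigmoid used as a smooth threshold, the function name `unit_output`, and the weight values are all assumptions chosen for illustration, since the text here says only that a unit integrates its inputs and generates an output "according to some threshold".

```python
import math

def unit_output(inputs, weights, bias=0.0):
    """Hypothetical unit: weighted sum of inputs passed through a sigmoid.

    The sigmoid is an assumed stand-in for the unspecified threshold.
    """
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Illustrative weights only: positive values act as excitatory
# connections, negative values as inhibitory ones.
weights = [2.0, -1.5, 0.5]

excited = unit_output([1, 0, 1], weights)    # only excitatory inputs active
inhibited = unit_output([0, 1, 0], weights)  # only the inhibitory input active

print(excited > 0.5, inhibited < 0.5)
```

With these assumed weights, activating only the excitatory inputs pushes the output above 0.5, and activating only the inhibitory input pushes it below 0.5. In a network, each such output would be passed on as an input to other units.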
doi:10.1073/pnas.86.1.152 pmid:2911565 pmcid:PMC286422 fatcat:c4dlphxxojdojdorfcx6uv6wr4