Prediction of Protein Structure Classes with Pseudo Amino Acid Composition and Fuzzy Support Vector Machine Network

Yong-Sheng Ding, Tong-Liang Zhang, Kuo-Chen Chou
2007 Protein & Peptide Letters
Developing automated methods that determine protein structures quickly and accurately is a critical challenge, because the gap between the number of sequence-known proteins and that of structure-known proteins keeps widening in the post-genomic age. Knowledge of a protein's structural class provides useful information toward determining its structure. It is therefore highly desirable to develop computational methods that identify the structural classes of newly found proteins from their primary sequences. In this study, following Chou's concept of pseudo amino acid composition (PseAA), eight PseAA vectors are used to represent protein samples. Each PseAA vector is a 40-D (dimensional) vector constructed from the conventional amino acid composition (AA) and a series of sequence-order correlation factors, as originally introduced by Chou. The eight PseAA representations differ in the physicochemical properties used to incorporate the sequence-order effects of the protein samples. Based on this framework, a dual-layer fuzzy support vector machine (FSVM) network is proposed to predict protein structural classes. In the first layer of the FSVM network, eight FSVM classifiers are established, each trained on a different PseAA vector. The second-layer FSVM classifier then reclassifies the outputs of the first layer. The results are quite promising, indicating that the new method may become a useful tool for predicting not only the structural classification of proteins but also their other attributes.
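To make the 40-D representation concrete, the following is a minimal sketch of how one PseAA vector could be assembled from 20 amino acid composition terms and 20 sequence-order correlation factors. It is not the authors' implementation. Several choices are assumptions made for illustration: the Kyte-Doolittle hydropathy index stands in for one of the physicochemical properties (the abstract does not name the eight used), the correlation factor is the squared difference of a single standardized property, the tier count `lam=20` is inferred from 40 = 20 + 20, and `weight=0.05` is an illustrative weighting factor. The names `pseaa_vector` and `standardize` and the example sequence are hypothetical.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index: an assumed stand-in for one
# physicochemical property; the abstract does not list the eight used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mu, sigma = mean(values), pstdev(values)
    return {a: (scale[a] - mu) / sigma for a in AMINO_ACIDS}


def pseaa_vector(sequence, scale, lam=20, weight=0.05):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector.

    lam and weight are illustrative assumptions, not values from the paper.
    """
    seq = sequence.upper()
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than the number of tiers")

    h = standardize(scale)
    n = len(seq)

    # First 20 components: conventional amino acid composition.
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]

    # Next lam components: k-th tier sequence-order correlation factors,
    # the mean squared property difference between residues k apart.
    thetas = []
    for k in range(1, lam + 1):
        total = sum((h[seq[i + k]] - h[seq[i]]) ** 2 for i in range(n - k))
        thetas.append(total / (n - k))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


# Arbitrary example sequence, for illustration only.
vec = pseaa_vector("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", KYTE_DOOLITTLE)
print(len(vec))  # 40
```

Under these assumptions, passing a different property table to `pseaa_vector` would yield a different 40-D representation of the same protein, which is the sense in which the eight PseAA vectors differ from one another.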
doi:10.2174/092986607781483778 pmid:17979824 fatcat:h7p26rwc55es7k4jeg4nuc7zly