Support vector machine approach for protein subcellular localization prediction

S. Hua, Z. Sun
2001 Bioinformatics  
Motivation: Subcellular localization is a key functional characteristic of proteins. A fully automatic and reliable prediction system for protein subcellular localization is needed, especially for the analysis of large-scale genome sequences. Results: In this paper, Support Vector Machine has been introduced to predict the subcellular localization of proteins from their amino acid compositions. The total prediction accuracies reach 91.4% for three subcellular locations in prokaryotic organisms and 79.4% for four locations in eukaryotic organisms. Predictions by our approach are robust to errors in the protein N-terminal sequences. This new approach provides superior prediction performance compared with existing algorithms based on amino acid composition and can be a complementary method to other existing methods based on sorting signals.
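
The abstract names amino acid composition as the input to the classifier. As a minimal sketch of what such a feature vector could look like (not the authors' code), the snippet below turns a sequence into the fractions of the 20 standard residues. The function name `amino_acid_composition`, the residue ordering, the handling of non-standard characters, and the example sequence are all assumptions made for illustration; the abstract does not describe the paper's preprocessing or SVM settings.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an arbitrary choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Characters outside the 20 standard residues are ignored here;
    that is an assumption of this sketch, not a detail from the abstract.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, used only to show the shape of the output.
features = amino_acid_composition("MKTAYIAKQR")
```

Each protein becomes a fixed-length vector of 20 fractions regardless of sequence length, and vectors of this kind, paired with location labels, are what an SVM implementation would be trained on.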
doi:10.1093/bioinformatics/17.8.721 pmid:11524373 fatcat:bnvkqa7sdzf3hgz47lqi3yeb5u