BioPPISVMExtractor: A protein–protein interaction extractor for biomedical literature using SVM and rich feature sets

Zhihao Yang, Hongfei Lin, Yanpeng Li
2010 Journal of Biomedical Informatics  
Protein-protein interactions play a key role in various aspects of the structural and functional organization of the cell. Knowledge about them unveils the molecular mechanisms of biological processes. However, the amount of biomedical literature regarding protein interactions is increasing rapidly, and it is difficult for interaction database curators to detect and curate protein interaction information manually. This paper presents an SVM-based system, named BioPPISVMExtractor, to identify protein-protein interactions in biomedical literature. This system uses rich feature sets including word features, a keyword feature, a protein names distance feature, and a Link path feature for SVM classification. In addition, the Link Grammar extraction result feature is introduced to improve the precision rate. Experimental evaluations against other state-of-the-art PPI extraction systems on the DIP corpus indicate that BioPPISVMExtractor can substantially improve recall at the cost of a moderate decline in precision.
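
The feature sets named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_features`, the keyword list, and the feature encoding are all assumptions made for illustration, and the Link path and Link Grammar features are omitted because they require a Link Grammar parse. It only shows how word, keyword, and protein-name-distance features for one candidate protein pair might be collected into a sparse feature dictionary.

```python
# Hypothetical sketch: the keyword list and feature names below are
# assumptions for illustration, not taken from the paper.
INTERACTION_KEYWORDS = {"interacts", "binds", "activates", "inhibits", "phosphorylates"}


def extract_features(tokens, i, j):
    """Build a sparse feature dict for the candidate protein pair at token indices i < j."""
    if not 0 <= i < j < len(tokens):
        raise ValueError("expected 0 <= i < j < len(tokens)")
    features = {}
    between = [t.lower() for t in tokens[i + 1 : j]]
    # Word features: bag of words between the two protein mentions.
    for word in between:
        features["word=" + word] = 1
    # Keyword feature: 1 if any token in the sentence is an interaction keyword.
    features["has_keyword"] = int(any(t.lower() in INTERACTION_KEYWORDS for t in tokens))
    # Protein names distance feature: number of tokens separating the two mentions.
    features["distance"] = j - i - 1
    return features


# Protein mentions are assumed to be already tagged and replaced by placeholders.
sentence = "PROT1 binds directly to PROT2 in yeast".split()
feats = extract_features(sentence, 0, 4)
```

Dictionaries of this shape could then be vectorized (for example with scikit-learn's `DictVectorizer`) and passed to an SVM classifier. That pairing is one possible design choice and is not described in the abstract.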
doi:10.1016/j.jbi.2009.08.013 pmid:19706337 fatcat:rwb6ukl535dozaxtp3asbfrjna