Struct-NB: predicting protein-RNA binding sites using structural features

Fadi Towfic, Cornelia Caragea, David C. Gemperline, Drena Dobbs, Vasant Honavar
2010 International Journal of Data Mining and Bioinformatics  
We explore whether protein-RNA interfaces differ from non-interfaces in terms of their structural features and whether structural features vary according to the type of the bound RNA (e.g., mRNA, siRNA, etc.), using a non-redundant dataset of 147 protein chains extracted from protein-RNA complexes in the Protein Data Bank. Furthermore, we use machine learning algorithms for training classifiers to predict protein-RNA interfaces using information derived from the sequence and structural
... We develop the Struct-NB classifier, which takes structural information into account. We compare the performance of Naïve Bayes and Gaussian Naïve Bayes classifiers with that of Struct-NB on the 147-chain protein-RNA dataset, using sequence and structural features, respectively, as input. In our experiments, Struct-NB outperforms Naïve Bayes and Gaussian Naïve Bayes at predicting protein-RNA binding interfaces in a protein sequence across a range of standard classifier performance measures.
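As a rough illustration of the Gaussian Naïve Bayes baseline named in the abstract, the sketch below labels residues as interface (1) or non-interface (0) from continuous features. It is not the authors' Struct-NB method or their feature set: the two feature columns, the toy values, and the function names `fit_gaussian_nb` and `predict` are all hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one residue."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))

# Hypothetical toy data, one row per residue:
# [relative solvent accessibility, surface roughness]
X = [[0.80, 0.60], [0.70, 0.70], [0.90, 0.65],
     [0.20, 0.30], [0.10, 0.20], [0.30, 0.25]]
y = [1, 1, 1, 0, 0, 0]  # 1 = interface residue, 0 = non-interface residue
model = fit_gaussian_nb(X, y)
```

Calling `predict(model, [0.85, 0.60])` scores a new residue against both classes. A real evaluation would use features computed from PDB structures and cross-validation over protein chains rather than hand-written values.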
doi:10.1504/ijdmb.2010.030965 pmid:20300450 pmcid:PMC2840657 fatcat:6kcslp2m4bfc7gnfx7ocprezs4