Towards site-based protein functional annotations

Seak Fei Lei, Jun Huan
2010 International Journal of Data Mining and Bioinformatics  
The exact relationship between protein active centers and protein functions remains unclear even after decades of intensive study. To improve functional prediction based on local protein structures, we proposed three methods. 1) We used a statistical model, a Markov Random Field, to describe protein active regions based on structure motifs. 2) We developed a filter that considers the local environment around the active sites to remove false positives. 3) We ... d multiple structure motifs by extending the motif to neighboring residues to delineate their functions. Our experimental results, evaluated on five sets of enzyme families with less than 40% sequence identity, demonstrated that our methods can recover more remote homologs that could not be detected by traditional sequence-based methods. At the same time, our methods reduced a large number of random matches. With the extended motif method, our methods improved functional annotation ability (measured by the area under the ROC curve) by up to 70%.
doi:10.1504/ijdmb.2010.034200 pmid:20815142 pmcid:PMC2936724 fatcat:uyx5m4lvl5dnzgok4l72xkmivu