Novelty Indicator for Enhanced Prioritization of Predicted Gene Ontology Annotations
IEEE/ACM Transactions on Computational Biology & Bioinformatics
Biomolecular controlled annotations have become pivotal in computational biology, because they allow scientists to analyze large amounts of biological data to better understand their test results, and to infer new knowledge. Yet, biomolecular annotation databases are incomplete by definition, like our knowledge of biology, and may contain errors and inconsistent information. In this context, machine-learning algorithms able to predict and prioritize new biomolecular annotations are both
... e and efficient, especially compared with the time-consuming trials of biological validation. To limit the possibility that these techniques predict obvious and trivial high-level features, and to help prioritize their results, we introduce a new element that can improve the accuracy and relevance of the results of an annotation prediction and prioritization pipeline. We propose a novelty indicator that states the level of "newness" (or "originality") of the Gene Ontology term annotations predicted for a specific gene, and that helps prioritize the most novel and interesting predicted annotations. We performed a thorough biological functional analysis of the prioritized annotations that were predicted with high accuracy using this indicator and our previously proposed prediction algorithms. The relevance of our biological findings supports the effectiveness and trustworthiness of the proposed indicator and of its prioritization of annotation prediction pipeline results.