LitPathExplorer: a confidence-based visual text analytics tool for exploring literature-enriched pathway models

Axel J Soto, Chrysoula Zerva, Riza Batista-Navarro, Sophia Ananiadou, Jonathan Wren
2017 Bioinformatics  
Motivation: Pathway models are valuable resources that help us understand the various mechanisms underpinning complex biological processes. Their curation is typically carried out through manual inspection of published scientific literature to find information relevant to a model, which is a laborious and knowledge-intensive task. Furthermore, models curated manually cannot be easily updated and maintained with new evidence extracted from the literature without automated support.

Results: We developed LitPathExplorer, a visual text analytics tool that integrates advanced text mining, semi-supervised learning and interactive visualization, to facilitate the exploration and analysis of pathway models using statements (i.e., events) extracted automatically from the literature and organized according to levels of confidence. LitPathExplorer supports pathway modellers and curators alike by: 1) extracting events from the literature that corroborate existing models with evidence; 2) discovering new events which can update models; and 3) providing a confidence value for each event that is automatically computed based on linguistic features and article metadata. Our evaluation of event extraction showed a precision of 89% and a recall of 71%. Evaluation of our confidence measure, when used for ranking sampled events, showed an average precision ranging between 61% and 73%, which can be improved to 95% when the user is involved in the semi-supervised learning process. Qualitative evaluation using pair analytics based on the feedback of three domain experts confirmed the utility of our tool within the context of pathway model exploration.

Availability: LitPathExplorer is available at http://nactem.ac.uk/LitPathExplorer_BI/

Contact: sophia.ananiadou@manchester.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.
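The abstract reports average precision for events ranked by their confidence value. As a rough illustration of what that metric measures, the sketch below ranks a handful of events by confidence and computes average precision over the ranking. The function names, the `(confidence, is_correct)` tuple layout and the sample values are all hypothetical; they are not taken from LitPathExplorer or its evaluation data.

```python
from typing import List, Tuple

# Hypothetical representation: (confidence score, whether an expert judged the event correct).
Event = Tuple[float, bool]


def rank_by_confidence(events: List[Event]) -> List[bool]:
    """Return correctness labels ordered from highest to lowest confidence."""
    ordered = sorted(events, key=lambda e: e[0], reverse=True)
    return [is_correct for _, is_correct in ordered]


def average_precision(ranked_labels: List[bool]) -> float:
    """Mean of precision@k taken at every rank k that holds a correct event."""
    hits = 0
    precision_sum = 0.0
    for rank, is_correct in enumerate(ranked_labels, start=1):
        if is_correct:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Made-up sample, for illustration only.
events = [(0.92, True), (0.35, False), (0.81, True), (0.60, False), (0.55, True)]
ranked = rank_by_confidence(events)
ap = average_precision(ranked)
print(f"average precision: {ap:.3f}")
```

A confidence measure that places correct events near the top of the ranking pushes this value towards 1; correct events buried below incorrect ones pull it down.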
doi:10.1093/bioinformatics/btx774 pmid:29228271 fatcat:gx33npwxprd6rn3jk6coe677oa