Domain Informational Vocabulary Extraction Experiences with Publication Pipeline Integration and Ontology Curation

Amit Gupta, Weijia Xu, Pankaj Jaiswal, Crispin Taylor, Jennifer Regala
2018 International Conference on Biomedical Ontology  
We present updates on DIVE (Domain Informational Vocabulary Extraction), an ongoing project and a system designed for extracting domain information from scientific publications. DIVE implements an ensemble of text mining methods for biological entity extraction from article text. DIVE also attempts to use the co-occurrence patterns of these entities to establish probable relationships between them. DIVE further features an improved web interface for expert user curation of extracted information, thereby providing a means for a constantly growing, expert-curated body of domain information for an article corpus. We also discuss our experiences from the successful integration of DIVE with the publishing pipeline of two prominent plant biology journals (The Plant Cell and Plant Physiology) from ASPB (American Society of Plant Biologists). The extracted results are embedded at the end of the final proof of the published article to enhance its accessibility and discoverability. Furthermore, DIVE tracks expert user curation actions on its web interface for future training and improvement of the entity detection algorithm.
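The co-occurrence idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which text unit DIVE counts over, how it scores pairs, or what cutoff it applies, so sentence-level counting, the `min_count` threshold, the function names, and the placeholder entity names below are all assumptions made for illustration, not DIVE's actual method.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences_entities):
    """Count how often each unordered entity pair appears in the same sentence.

    `sentences_entities` is a list of lists: the entities already detected
    in each sentence (hypothetical input format, not DIVE's).
    """
    counts = Counter()
    for entities in sentences_entities:
        # Deduplicate and sort so (a, b) and (b, a) share one key.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


def probable_relationships(counts, min_count=2):
    """Keep pairs seen together at least `min_count` times (assumed cutoff)."""
    return sorted(pair for pair, n in counts.items() if n >= min_count)


# Placeholder entity mentions, one inner list per sentence.
sentences = [
    ["GENE_A", "trait_x"],
    ["GENE_A", "GENE_B", "trait_x"],
    ["GENE_B", "trait_y"],
]
counts = cooccurrence_counts(sentences)
related = probable_relationships(counts, min_count=2)
print(related)
```

A raw count threshold is the simplest possible choice; a real pipeline would more likely normalize for how frequent each entity is overall, and would route the candidate pairs to expert curators rather than treat them as confirmed relationships.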
dblp:conf/icbo/GuptaXJTR18 fatcat:4ctwu7meefgkxjkx3coztavbui