v3NLP Framework: Tools to Build Applications for Extracting Concepts from Clinical Text

Guy Divita, Marjorie Carter, Le-Thuy Tran, Doug Redd, Qing T. Zeng, Scott Duvall, Matthew H. Samore, Adi V. Gundlapalli
2016 eGEMs  
Substantial amounts of clinically significant information are contained only within the narrative of the clinical notes in electronic medical records. v3NLP Framework is a set of best-of-breed functionalities developed to transform this information into structured data for use in quality improvement, research, population health surveillance, and decision support. Background: MetaMap, cTAKES, and similar well-known NLP tools do not have sufficient scalability out of the box. v3NLP Framework ... d out of the necessity to scale these tools up and to provide a framework for customizing and tuning techniques to fit a variety of tasks, including document classification, tuned concept extraction for specific conditions, patient classification, and information retrieval. Innovation: Beyond scalability, several projects developed with v3NLP Framework have been tested for efficacy and benchmarked. While v3NLP Framework includes annotators, pipelines, and applications, its functionalities also enable developers to create novel annotators and to assemble annotators into pipelines and scaled applications. Discussion: v3NLP Framework has been successfully used in many projects, including general concept extraction, identification of risk factors for homelessness among veterans, and identification of mentions of the presence of an indwelling urinary catheter. Projects as diverse as predicting colonization with methicillin-resistant Staphylococcus aureus and extracting references to military sexual trauma are being built using v3NLP Framework components. Conclusion: v3NLP Framework is a set of functionalities and components that give Java developers the ability to create novel annotators, place annotators into pipelines, and build applications that extract concepts from clinical text. Scale-up and scale-out functionalities are provided to process large numbers of records.
doi:10.13063/2327-9214.1228 pmid:27683667 pmcid:PMC5019303 fatcat:hbsndtmntvgxnd3mhvkl3y2vga