A Study of Concept Extraction Across Different Types of Clinical Notes

Youngjun Kim, Ellen Riloff, John F Hurdle
2015 AMIA Annual Symposium Proceedings  
Our research investigates methods for creating effective concept extractors for specialty clinical notes. First, we present three new "specialty area" datasets consisting of Cardiology, Neurology, and Orthopedics clinical notes manually annotated with medical concepts. We analyze the medical concepts in each dataset and compare them with the widely used i2b2 2010 corpus. Second, we create several types of concept extraction models and examine the effects of training supervised learners with ... area data versus i2b2 data. We find substantial differences in performance across the datasets, and obtain the best results for all three specialty areas by training with both i2b2 and specialty data. Third, we explore strategies to improve concept extraction on specialty notes with ensemble methods. We compare two types of ensemble methods (Voting and Stacking) and a domain adaptation model, and show that a Stacked ensemble of classifiers trained with i2b2 and specialty data yields the best performance.
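To make the Voting idea mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes each extractor emits one flat label per token (one of the i2b2 2010 concept types "problem", "test", "treatment", or "O" for no concept). The function name `vote_labels`, the `min_votes` parameter, and the three example models are hypothetical and introduced only for illustration.

```python
from collections import Counter


def vote_labels(predictions, min_votes=None):
    """Combine per-token labels from several extractors by voting.

    predictions: list of label sequences, one per model, all the same length.
    A concept label is kept only if at least min_votes models agree on it
    (default: a strict majority); otherwise the token falls back to "O".
    This per-token scheme is an assumption, not the paper's exact method.
    """
    if not predictions:
        return []
    n_models = len(predictions)
    if min_votes is None:
        min_votes = n_models // 2 + 1
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all label sequences must have the same length")

    combined = []
    for token_labels in zip(*predictions):
        # Count only concept labels; "O" means the model found nothing.
        counts = Counter(label for label in token_labels if label != "O")
        if counts:
            label, votes = counts.most_common(1)[0]
            combined.append(label if votes >= min_votes else "O")
        else:
            combined.append("O")
    return combined


# Hypothetical outputs for the tokens: chest pain treated with aspirin
i2b2_model = ["problem", "problem", "O", "O", "treatment"]
specialty_model = ["problem", "problem", "O", "O", "O"]
combined_data_model = ["O", "problem", "O", "O", "treatment"]

models = [i2b2_model, specialty_model, combined_data_model]
majority = vote_labels(models)
unanimous = vote_labels(models, min_votes=3)
```

Lowering `min_votes` tends to favor recall, while raising it tends to favor precision. A Stacked ensemble would instead train a second-level classifier on the component models' outputs, which this sketch does not attempt.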
pmid:26958209 pmcid:PMC4765588 fatcat:zeki4hfkongllaw5owsb744m4y