Bleeding Event Detection in EHR Notes Using CNN Models Enhanced with RNN Autoencoders (Preprint) [post]

Rumeng Li, Baotian Hu, Feifan Liu, Hong Yu
2018 unpublished
BACKGROUND Bleeding events are common and critical, and may cause significant morbidity and mortality. Studies show a high incidence of bleeding events among cardiovascular disease (CVD) patients on anticoagulant therapy. Prompt and accurate detection of bleeding events is essential for preventing serious consequences. As bleeding events are often described in clinical notes, automatic detection of bleeding events from Electronic Health Record (EHR) narratives has the potential to improve drug safety surveillance and pharmacovigilance.
OBJECTIVE We developed a natural language processing (NLP) system to automatically classify whether an EHR note sentence contains a bleeding event.
METHODS Experts annotated 878 EHR notes (76,577 sentences and 562,630 word tokens) for bleeding events at the sentence level. This annotated corpus was then used to train and validate our NLP systems. We developed an innovative hybrid CNN and LSTM Autoencoder model (HCLA), which integrates a convolutional neural network (CNN) architecture with a bidirectional long short-term memory (BiLSTM) autoencoder model to leverage large unlabeled EHR data.
RESULTS HCLA achieved an F-score of 93.79% for identifying whether a sentence contains a bleeding event, surpassing a strong SVM baseline and other CNN models.
CONCLUSIONS By combining a supervised CNN model with a pre-trained unsupervised BiLSTM autoencoder, HCLA achieved high performance in detecting bleeding events.
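The abstract reports a sentence-level F-score but does not say how it was computed. As a minimal sketch, assuming the score is the balanced F1 on the positive class (a sentence that contains a bleeding event), the calculation could look like the following. The function name `f_score` and the toy labels are hypothetical and are not taken from the paper or its data.

```python
def f_score(gold, pred, beta=1.0):
    """Return (precision, recall, F-beta) for the positive class.

    Labels are 1 (sentence contains a bleeding event) or 0 (it does not).
    Assumption: the reported F-score is F1 on the positive class; the
    abstract does not state the exact variant or averaging.
    """
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical toy labels for eight sentences (illustration only).
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]

precision, recall, f1 = f_score(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 0.75.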
doi:10.2196/preprints.10788 fatcat:33u4e6ftmfchfotf6ycrsjmw6a