De-Identifying Swedish EHR Text Using Public Resources in the General Domain
2020
Studies in Health Technology and Informatics
Developing rule-based models or training machine learning-based models for de-identifying electronic health record (EHR) clinical notes normally requires sensitive data, which presents important problems for patient privacy. In this study, we add non-sensitive public datasets to EHR training data: (i) scientific medical text and (ii) Wikipedia word vectors. The data, all in Swedish, is used to train a deep learning model based on recurrent neural networks. Tests on pseudonymized Swedish EHR clinical
doi:10.3233/shti200140
pmid:32570364
fatcat:7o3gyfiuhzen5cqjuwncnxmsq4