Recognition and Pseudonymization of Personal Data in Paper-Based Health Records [chapter]

Stefan Fenz, Johannes Heurix, Thomas Neubauer
2012 Lecture Notes in Business Information Processing  
E-health requires the sharing of patient-related data when and where necessary. Electronic health records (EHR) allow the structured and expandable collection of medical data needed for clinical research studies and thereby not only enable the optimization of clinical studies, but also results in higher statistical significance due to a larger number of samples. While the digitization of medical data and the organization of this data within EHRs have been introduced in some areas, massive
more » ... s of paper-based health records are still produced on a daily basis. This data has to be stored for decades due to legal reasons but is of no benefit for research organizations, as the unstructured medical data in paper-based health records cannot be efficiently used for clinical studies. Furthermore, legal regulations prohibit the use of documents containing both personal and medical data for clinical studies, which leads to expensive data acquisition phases and limited samples. This paper presents the MEDSEC system for the recognition and pseudonymization of personal data in paper-based health records. MEDSEC integrates unique methods for (i) automatically identifying personal and medical data, (ii) automatically annotating the optical character recognition (OCR) output data of paper-based health records with standard-compliant metadata, and (iii) automatically pseudonymizing the personal data. With MEDSEC, health care organizations profit by (i) strengthening clinical research resulting in faster and more reliable results and reduced costs, and (ii) providing an environment of trust for its patients and employees that guarantees privacy.
doi:10.1007/978-3-642-30359-3_14 fatcat:in7gvmsphfa4nb5xagxlmjn5gq