ANONYMIZATION OF SENSITIVE DATA IN UNSTRUCTURED DOCUMENTS USING NLP

Anushree Raj, Rio D'Souza
2021 International Journal of Mechanical Engineering & Technology (IJMET)  
Many researchers have worked on anonymizing structured data using spreadsheets and database tools. Masking sensitive information and anonymizing structured data is possible through established algorithms and techniques. Anonymizing unstructured data, however, is a real challenge, since such data exists in many different forms. Natural Language Processing (NLP), the study of interactions between human language and computers, is the sub-field of AI that focuses on enabling computers to understand and process human languages. We provide deeper insight into how NLP works and show how a system can understand unstructured text, extract sensitive data, and perform anonymization. The proposed anonymization procedure provides a system that applies text anonymization to an individual's original unstructured medical records and releases the anonymized document to support further study or investigation by researchers while preserving the privacy of the individual concerned.
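The extract-then-mask idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not specify their method, so the rule-based patterns, the `PATTERNS` table, the `anonymize` function, and the sample record below are all hypothetical, chosen only to show the general shape of detecting sensitive spans in free text and replacing them with category labels. A practical system would likely use a trained named-entity recognizer instead of hand-written rules.

```python
import re

# Hypothetical patterns for illustration only; they are assumptions,
# not the rules used in the paper.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    # A title followed by one or two capitalized words.
    "NAME": re.compile(r"\b(?:Mr|Mrs|Ms|Dr)\.\s+[A-Z][a-z]+(?:\s+[A-Z][a-z]+)?"),
}


def anonymize(text: str) -> str:
    """Replace every span matched by a pattern with its category label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Invented sample text, not real patient data.
record = "Dr. Asha Rao saw Mr. Kumar on 12/03/2020. Callback: 555-123-4567."
print(anonymize(record))
```

Replacing each span with a category label rather than deleting it keeps the sentence readable for later analysis, which matches the stated goal of releasing documents that remain useful to researchers.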
doi:10.34218/ijmet.12.4.2021.002 fatcat:ustj3kkngrctxpjpuhllpifchm