Machine learning and features selection for semi-automatic ICD-9-CM encoding

Julia Medori, Cédrick Fairon
2010 International Workshop on Health Text Mining and Information Analysis  
This paper describes the architecture of an encoding system whose aim is to be implemented as a coding aid at the Cliniques universitaires Saint-Luc, a hospital in Brussels. It focuses on machine learning methods and, more specifically, on the set of attributes to choose in order to optimize the results of these methods. A series of four experiments was conducted on a baseline method: Naïve Bayes with varying sets of attributes. These experiments showed that a first step consisting in the extraction of the information to be coded (such as diseases, procedures, aggravating factors, etc.) is essential. They also demonstrated the importance of stemming features. Restricting the classes to categories resulted in a recall of 81.1%.
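The idea of a Naïve Bayes baseline over stemmed features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy training phrases, the `crude_stem` suffix stripper (a stand-in for a real stemmer), the helper names `train_nb` and `rank_codes`, the use of Laplace smoothing, and the choice to skip features unseen in training. The labels are ICD-9-CM categories (250, 401, 486), mirroring the restriction of classes to categories.

```python
import math
from collections import Counter, defaultdict

# Toy examples invented for illustration; labels are ICD-9-CM categories.
TRAIN = [
    ("type 2 diabetes with insulin therapy", "250"),
    ("diabetes with elevated glucose", "250"),
    ("essential hypertension treated", "401"),
    ("hypertension with high blood pressure", "401"),
    ("pneumonia with fever and cough", "486"),
    ("lung infection pneumonia antibiotics", "486"),
]


def crude_stem(token):
    """Hypothetical suffix stripper: one plural pass, then one derivational pass."""
    token = token.lower()
    for suffix in ("es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            token = token[: -len(suffix)]
            break
    for suffix in ("ive", "ion", "ic"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            token = token[: -len(suffix)]
            break
    return token


def featurize(text, stem=True):
    """Whitespace tokens, optionally stemmed."""
    tokens = text.lower().split()
    return [crude_stem(t) if stem else t for t in tokens]


def train_nb(examples, stem=True):
    """Collect per-class document counts and per-class feature counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for feature in featurize(text, stem):
            word_counts[label][feature] += 1
            vocab.add(feature)
    return class_counts, word_counts, vocab


def rank_codes(model, text, stem=True):
    """Return (label, log score) pairs, best first, as coding suggestions."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scored = []
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for feature in featurize(text, stem):
            if feature in vocab:  # assumption: unseen features are ignored
                # Laplace-smoothed likelihood of the feature given the class
                score += math.log((word_counts[label][feature] + 1) / denom)
        scored.append((label, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = "hypertensive patient"
    for use_stem in (True, False):
        model = train_nb(TRAIN, stem=use_stem)
        print("stem =", use_stem, rank_codes(model, query, stem=use_stem))
```

In this toy setting, "hypertensive" never occurs in the training phrases, so the unstemmed model has no evidence to separate the categories, whereas the stemmed model maps it to the same feature as "hypertension". Returning a ranked list rather than a single label fits the semi-automatic use case, where a human coder picks from suggestions.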
dblp:conf/acl-louhi/MedoriF10 fatcat:inaol2szmrhuhlaejdhxjsdjna