Exploiting semantic structure for mapping user-specified form terms to SNOMED CT concepts
Proceedings of the 2nd ACM SIGHIT symposium on International health informatics - IHI '12
The elements of clinical databases are usually named after the clinical terms used in various design artifacts. These terms are supplied instinctively by users, and hence different users often use different terms to describe the same clinical concept. This term diversity makes future database integration and analysis a major challenge. In this paper, we study the problem of standardizing the terms used in a specific kind of user-designed artifact, the encounter forms or templates,
... a popular clinical terminology, SNOMED CT. In particular, we focus on the problem of mapping the terms on an encounter form to SNOMED CT concepts. Existing term mapping techniques are based solely on syntactic string similarity. Such techniques cannot disambiguate terms that resemble one another linguistically yet differ semantically. To improve on existing techniques, we consider the context of a term in the mapping process and propose a hybrid approach that relies on linguistic as well as structural information. For a given form term, this approach (i) exploits the semantic structure of the form to derive the term's context, and (ii) maps the term to a linguistically matching SNOMED CT concept that is compatible with the derived context. We test the approach on over 900 clinician-specified terms used in 26 forms. The method achieves a 23% improvement in precision and a 38% improvement in recall over a purely linguistic approach. Our first contribution is to introduce and address the new problem of mapping form terms to standard concepts. Our second contribution is an experimental evaluation confirming that structural information plays a major role in improving mapping performance and in addressing the key challenges associated with semantic mapping.
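The two-step idea in the abstract (derive a term's context from the form structure, then keep only the linguistically matching concepts that are compatible with that context) can be illustrated with a minimal sketch. It is not the authors' implementation. The concept table, the identifiers `T1` to `T3`, the category labels, the section-to-category map, and the 0.8 threshold are all hypothetical, and Python's `difflib.SequenceMatcher` stands in for whatever linguistic matcher the paper actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical toy concepts: (id, name, semantic category).
# These are NOT real SNOMED CT identifiers or hierarchies.
TOY_CONCEPTS = [
    ("T1", "discharge", "procedure"),
    ("T2", "discharge", "finding"),
    ("T3", "fever", "finding"),
]

# Hypothetical context rule: the form section a term appears under
# suggests the semantic category of the concept it should map to.
SECTION_CONTEXT = {"Symptoms": "finding", "Procedures": "procedure"}


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def linguistic_candidates(term, threshold=0.8):
    """Purely linguistic step: all concepts whose name resembles the term."""
    scored = [(similarity(term, name), cid, cat) for cid, name, cat in TOY_CONCEPTS]
    return [c for c in scored if c[0] >= threshold]


def map_term(term, section):
    """Hybrid step: filter linguistic candidates by the context of the section."""
    candidates = linguistic_candidates(term)
    context = SECTION_CONTEXT.get(section)
    if context is not None:
        # Keep only candidates compatible with the derived context.
        candidates = [c for c in candidates if c[2] == context]
    if not candidates:
        return None
    # Return the id of the best-scoring remaining candidate.
    return max(candidates, key=lambda c: c[0])[1]
```

In this toy setup, string similarity alone cannot choose between the two "discharge" concepts, whereas `map_term("Discharge", "Symptoms")` and `map_term("Discharge", "Procedures")` resolve to different concepts because the enclosing section supplies the context.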