Record linkage: making the most out of errors in linking variables

M Tromp, J B Reitsma, A C J Ravelli, N Méray, G J Bonsel
2006 AMIA Annual Symposium Proceedings  
This paper presents a refinement of the probabilistic medical record linking algorithm. We introduced "close agreement" to account for typical errors in administrative variables used for record linkage. Linking data on early pregnancy determinants with data on late child outcomes was used as a case study. We analyzed whether the addition of close agreement resulted in a higher discriminating power of the linking key, reflected in a reduction of the number of links with an uncertain linking ... Incorporating close agreement for postal code and date of birth in the record linking algorithm reduced the number of pairs in the uncertain region by 95%. We showed that adding a third outcome, "close", when comparing values of corresponding linking variables led to a major improvement in our probabilistic record linkage study. Similar improvements are likely in other studies because the frequency, nature, and type of errors in other large databases will not be substantially different.
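
The three-outcome comparison described in the abstract can be illustrated with a minimal sketch in the style of Fellegi-Sunter weighting. Everything specific here is an assumption made for illustration and is not taken from the paper: the m- and u-probabilities in `PROBS`, the rule that "close" means exactly one character or date component differs, and the function names `compare`, `weight`, and `link_weight`.

```python
import math

# Hypothetical probabilities per field and outcome, as (m, u) pairs:
#   m = P(outcome | both records belong to the same person)
#   u = P(outcome | records belong to different persons)
# These numbers are invented for illustration; each column sums to 1.
PROBS = {
    "postal_code": {
        "agree": (0.90, 0.001),
        "close": (0.08, 0.010),
        "disagree": (0.02, 0.989),
    },
    "date_of_birth": {
        "agree": (0.95, 0.0003),
        "close": (0.04, 0.0100),
        "disagree": (0.01, 0.9897),
    },
}


def compare(a, b):
    """Return 'agree', 'close', or 'disagree' for two field values.

    Assumed rule: values of equal length that differ in exactly one
    position (one character of a postal code, one of year/month/day)
    count as 'close'.
    """
    if a == b:
        return "agree"
    if len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1:
        return "close"
    return "disagree"


def weight(field, outcome):
    """Log2 likelihood ratio m/u for one field and one outcome."""
    m, u = PROBS[field][outcome]
    return math.log2(m / u)


def link_weight(rec_a, rec_b):
    """Sum the per-field weights into a total weight for a record pair."""
    return sum(weight(f, compare(rec_a[f], rec_b[f])) for f in PROBS)


if __name__ == "__main__":
    a = {"postal_code": "1105AZ", "date_of_birth": (1999, 4, 12)}
    b = {"postal_code": "1105AX", "date_of_birth": (1999, 4, 12)}
    print(compare(a["postal_code"], b["postal_code"]))
    print(round(link_weight(a, b), 2))
```

With only two outcomes, a pair such as `a` and `b` would be scored as a full disagreement on postal code and could fall into the uncertain region. Under the assumed probabilities, giving "close" its own weight places such pairs between the exact-agreement and disagreement scores.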
pmid:17238447 pmcid:PMC1839331 fatcat:acfpwsbsh5hmtp54nzf755tiye