Evaluation of entity resolution approaches on real-world match problems

Hanna Köpcke, Andreas Thor, Erhard Rahm
2010 Proceedings of the VLDB Endowment  
Despite the huge amount of recent research efforts on entity resolution (matching) there has not yet been a comparative evaluation on the relative effectiveness and efficiency of alternate approaches. We therefore present such an evaluation of existing implementations on challenging real-world match tasks. We consider approaches both with and without using machine learning to find suitable parameterization and combination of similarity functions. In addition to approaches from the research community we also consider a state-of-the-art commercial entity resolution implementation. Our results indicate significant quality and efficiency differences between different approaches. We also find that some challenging resolution tasks such as matching product entities from online shops are not sufficiently solved with conventional approaches based on the similarity of attribute values.
doi:10.14778/1920841.1920904 fatcat:yxmus33jcnhs5kf6hqsara44ye