A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is
The importance of probability-based approaches for duplicate detection has been recognized in both research and practice. However, existing approaches do not aim to consider the underlying real-world events resulting in duplicates (e.g., that a relocation may lead to the storage of two records for the same customer, once before and after the relocation). Duplicates resulting from real-world events exhibit specific characteristics. For instance, duplicates resulting from relocations tend to havedoi:10.5283/epub.37587 fatcat:jqgryj3grjhbtl7v3wrifaimhy