A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2006.
The VLDB Journal
Many database applications have an emerging need to support fuzzy queries that ask for strings similar to a given string, such as "name similar to smith" and "telephone number similar to 412-0964." Query optimization needs the selectivity of such a fuzzy predicate, i.e., the fraction of records in the database that satisfy the condition. In this paper, we study the problem of estimating selectivities of fuzzy string predicates. We develop a novel technique, called Sepia, to solve the ...
doi:10.1007/s00778-007-0061-2 fatcat:axxg7w7tkjdbhjkus4vzeu5acu
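To make the notion of selectivity concrete, the following minimal sketch computes the exact selectivity of a fuzzy predicate by brute force. It is not the Sepia technique, which estimates this quantity without scanning every record. The abstract does not say which similarity measure is used, so edit distance (Levenshtein) is an assumption here, and the names `edit_distance`, `fuzzy_selectivity`, and `max_distance` are hypothetical.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (assumed similarity measure)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_selectivity(records: Sequence[str], query: str, max_distance: int) -> float:
    """Fraction of records within max_distance edits of query (exact, full scan)."""
    if not records:
        return 0.0
    matches = sum(1 for r in records if edit_distance(r, query) <= max_distance)
    return matches / len(records)


# Hypothetical sample data, for illustration only.
names = ["smith", "smyth", "smithe", "jones", "schmidt"]
selectivity = fuzzy_selectivity(names, "smith", max_distance=1)
```

A full scan like this costs one distance computation per record, which is why a query optimizer would rely on an estimate instead of the exact value.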