Lecture Notes in Computer Science
Algorithms for computing similarity joins in MapReduce were offered in . A similarity join finds all pairs of inputs that are within a certain distance d of each other according to some distance measure. Here we explore the "anchor-points algorithm" of . We continue looking at Hamming distance and show that the method of that paper can be improved; in particular, if we want to find strings within Hamming distance d, and anchor points are chosen so that every possible input is within Hamming distance k of
doi:10.1007/11402763_10 fatcat:f4torhyuyzg6rjs5k5fblc2mw4
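To make the setting concrete, the following is a minimal single-machine sketch of an anchor-point style similarity join for Hamming distance. It is not the improved method the abstract refers to, whose details are not given here. It uses a simple baseline routing rule that follows from the triangle inequality: if every input is within distance k of some anchor, then sending each string to every anchor within distance k + d guarantees that any two strings within distance d meet at a common anchor. The function names `hamming` and `anchor_join`, the in-memory dictionary standing in for reducers, and the sample data are all assumptions made for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Hamming distance between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def anchor_join(strings, anchors, d, k):
    """Find all pairs of distinct input strings within Hamming distance d.

    Assumes (hypothetically) that every input string is within distance k
    of at least one anchor, and that all strings have the same length.
    """
    # "Map" phase: route each string to every anchor within distance k + d.
    # If t is within k of anchor a and s is within d of t, then s is
    # within k + d of a, so both s and t arrive at a.
    reducers = {a: [] for a in anchors}
    for s in strings:
        for a in anchors:
            if hamming(s, a) <= k + d:
                reducers[a].append(s)

    # "Reduce" phase: compare all pairs that share an anchor.
    # A set removes pairs found at more than one anchor.
    pairs = set()
    for bucket in reducers.values():
        for s, t in combinations(bucket, 2):
            if hamming(s, t) <= d:
                pairs.add((min(s, t), max(s, t)))
    return pairs


if __name__ == "__main__":
    # Every 4-bit string is within distance 2 of "0000" or "1111".
    data = ["0000", "0001", "0111", "1111"]
    print(sorted(anchor_join(data, ["0000", "1111"], d=1, k=2)))
    # prints [('0000', '0001'), ('0111', '1111')]
```

In this baseline each string may be replicated to many anchors, which is the communication cost that a tighter routing radius would reduce.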