Boyer-Moore strategy to efficient approximate string matching [chapter]

Nadia El-Mabrouk, Maxime Crochemore
1996 Lecture Notes in Computer Science  
We propose a simple but efficient algorithm for searching all occurrences of a pattern or a class of patterns (length m) in a text (length n) with at most k mismatches. This algorithm relies on the Shift-Add algorithm of Baeza-Yates and Gonnet [6], which represents the current state of the search as a bit number and uses the ability of programming languages to handle bit words. The state representation should therefore not exceed the word size ω, that is, $m(\lceil \log_2 (k + 1) \rceil + 1) \le \omega$. This algorithm consists of a preprocessing step and a searching step. It is linear and performs 3n operations during the searching step. Notions of shift and character skip found in the Boyer-Moore (BM) [9] approach are introduced in this algorithm. Provided that the considered alphabet is large enough (compared to the pattern length), the average number of operations performed by our algorithm during the searching step becomes $n\left(2 + \frac{k+4}{m-k}\right)$.
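To make the bit-word state concrete, here is a minimal Python sketch of plain Shift-Add matching with at most k mismatches, assuming the usual layout of one small counter field per pattern position. It illustrates only the underlying technique, not the Boyer-Moore style shifts and character skips the abstract adds on top of it. The function name `shift_add_search` and the `word_size` parameter are hypothetical; Python integers are unbounded, so the word-size bound is checked explicitly rather than enforced by the hardware.

```python
def shift_add_search(pattern, text, k, word_size=64):
    """Return (start, mismatches) for each window of text within k mismatches.

    Illustrative sketch only: name and word_size parameter are assumptions.
    """
    m = len(pattern)
    if m == 0 or k < 0:
        raise ValueError("pattern must be non-empty and k non-negative")
    # k.bit_length() equals ceil(log2(k + 1)); one extra bit flags overflow.
    b = k.bit_length() + 1
    if m * b > word_size:
        raise ValueError("state does not fit in one machine word")

    ones = sum(1 << (i * b) for i in range(m))  # low bit of every field
    overflow_bits = ones << (b - 1)             # top bit of every field
    full = (1 << (m * b)) - 1                   # mask covering all m fields

    # Preprocessing: field i of table[c] is 0 if pattern[i] == c, else 1.
    table = {}
    for i, ch in enumerate(pattern):
        table[ch] = table.get(ch, ones) & ~(1 << (i * b))

    state = 0      # field i counts mismatches of pattern[:i + 1] ending here
    overflow = 0   # remembers which alignments already exceeded a field
    last = (m - 1) * b
    hits = []
    for j, ch in enumerate(text):
        # Shift every counter to the next pattern position, then add mismatches.
        state = ((state << b) & full) + table.get(ch, ones)
        overflow = ((overflow << b) & full) | (state & overflow_bits)
        state &= ~overflow_bits
        if j >= m - 1:  # a full window has been read
            count = state >> last
            if not (overflow >> last) and count <= k:
                hits.append((j - m + 1, count))
    return hits


print(shift_add_search("abc", "abcabdxbc", 1))
```

For the pattern `abc` and k = 1, each field is 2 bits wide, so the whole state takes 6 bits. The search step does a shift, an add, and a small amount of masking per text character, which is the kind of constant per-character cost behind a linear bound.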
doi:10.1007/3-540-61258-0_2 fatcat:acxrjtdsafao3jmwc5tbuall5a