A four-stage algorithm for updating a Burrows–Wheeler transform

M. Salson, T. Lecroq, M. Léonard, L. Mouchard
2009 Theoretical Computer Science  
We present a four-stage algorithm that updates the Burrows-Wheeler Transform of a text T when this text is modified. The Burrows-Wheeler Transform is used by many text compression applications and some self-index data structures. It operates by reordering the letters of a text T to obtain a new text bwt(T) which can be better compressed. Even though recent advances are offering this structure new applications, a major bottleneck still exists: bwt(T) has to be entirely reconstructed from scratch whenever T is modified. We study how standard edit operations (insertion, deletion, or substitution of a letter or a factor) that transform a text T into T′ affect bwt(T). We then present an algorithm that directly converts bwt(T) into bwt(T′). Based on this algorithm, we also sketch a method for converting the suffix array of T into the suffix array of T′. We finally show, based on the experiments we conducted, that this algorithm, whose worst-case time complexity is O(|T| log |T| (1 + log σ / log log |T|)), performs well in practice and can advantageously replace the traditional approach.
doi:10.1016/j.tcs.2009.07.016 fatcat:7epl7gq3lzatnf75hc7jb47qai
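
For readers unfamiliar with the transform, the following is a minimal sketch of the "traditional approach" the abstract refers to: recomputing bwt(T′) from scratch after an edit. It is not the paper's four-stage algorithm. The function names `bwt`, `suffix_array`, and `bwt_from_suffix_array`, the `$` sentinel, and the sample texts are assumptions chosen for illustration, and the rotation sort is deliberately naive.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all cyclic rotations of text+sentinel, keep the last column.

    Assumes the sentinel does not occur in text and sorts before every
    other character (true for "$" versus ASCII letters).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def suffix_array(text: str, sentinel: str = "$") -> list[int]:
    """Naive suffix array: starting positions of the sorted suffixes of text+sentinel."""
    s = text + sentinel
    return sorted(range(len(s)), key=lambda i: s[i:])


def bwt_from_suffix_array(text: str, sentinel: str = "$") -> str:
    """The BWT letter for a suffix starting at i is the letter preceding it (cyclically)."""
    s = text + sentinel
    return "".join(s[i - 1] for i in suffix_array(text, sentinel))


original = "banana"
edited = "bandana"  # hypothetical edit: one letter inserted into T to give T'

bwt_original = bwt(original)  # "annb$aa"
bwt_edited = bwt(edited)      # recomputed from scratch, the cost the paper avoids
```

Every call to `bwt` here re-sorts all rotations, even though the edit touched a single position. The link between the two structures, visible in `bwt_from_suffix_array`, is why an update method for the transform also suggests one for the suffix array.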