Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic [article]

Pawel Gawrychowski
2011-04-21, arXiv pre-print
Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression methods: given an uncompressed pattern s[1..m] and a Lempel-Ziv representation of a string t[1..N], does s occur in t? Farach and Thorup gave a randomized O(nlog^2(N/n)+m) time solution for this problem, where n is the size of the compressed representation of
... We improve their result by developing a faster and fully deterministic O(nlog(N/n)+m) time algorithm with the same space complexity. Note that for highly compressible texts, log(N/n) might be of order n, so for such inputs the improvement is very significant. A (tiny) fragment of our method can be used to give an asymptotically optimal solution for the substring hashing problem considered by Farach and Muthukrishnan.
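
To make the problem statement concrete, here is a minimal sketch of the naive baseline: decompress the text completely, then search it. This is not the paper's algorithm. The phrase encoding (`Phrase`), and the function names `decompress` and `occurs`, are assumptions made for illustration only; the abstract does not specify which Lempel-Ziv variant or encoding is used.

```python
from typing import List, Tuple, Union

# Hypothetical phrase encoding (an assumption, not taken from the paper):
# a phrase is either a literal string, or a pair (start, length) that copies
# `length` characters beginning at 0-based position `start` of the text
# decoded so far. A copy may overlap the characters it is producing.
Phrase = Union[str, Tuple[int, int]]


def decompress(phrases: List[Phrase]) -> str:
    """Expand a list of phrases into the full text t[1..N]."""
    out: List[str] = []
    for phrase in phrases:
        if isinstance(phrase, str):
            out.extend(phrase)  # literal characters
        else:
            start, length = phrase
            # Copy one character at a time so self-overlapping copies work.
            for i in range(length):
                out.append(out[start + i])
    return "".join(out)


def occurs(s: str, phrases: List[Phrase]) -> bool:
    """Naive baseline: does the pattern s occur in the decompressed text?"""
    return s in decompress(phrases)


# n = 3 phrases describe a text of length N = 8.
example: List[Phrase] = ["a", "b", (0, 6)]
print(decompress(example))
print(occurs("baba", example), occurs("bb", example))
```

The baseline needs time and space at least proportional to N, the uncompressed length. Since log(N/n) can be of order n for highly compressible texts, N may be exponentially larger than n, which is the cost that algorithms working directly on the compressed representation avoid.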
arXiv:1104.4203v1 fatcat:szdm2ymgg5boddghl74d5xoj6q