A Modification of the Landau-Vishkin Algorithm Computing Longest Common Extensions via Suffix Arrays
[chapter]
2005
Lecture Notes in Computer Science
We present a variation of the Landau-Vishkin algorithm which instead of suffix trees uses suffix arrays for computing the longest common extensions, thereby improving actual space usage. ...
Suffix trees are used for preprocessing the sequences allowing an O(1) running time computation of the longest common extensions between substrings. ...
We have shown that it is possible to change the Landau-Vishkin approximate string matching algorithm to use enhanced suffix arrays instead of suffix trees for its computation of longest common extensions ...
doi:10.1007/11532323_25
fatcat:dyplnkcednbohfqttzzdwmqfe4
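As a rough illustration of the idea in the entry above (longest common extensions answered from a suffix array instead of a suffix tree), here is a minimal sketch. It is not the chapter's implementation: the class name `LCE`, the naive suffix-array construction, and the sparse-table range-minimum structure are all assumptions chosen for brevity. A longest common extension of suffixes `i` and `j` is the minimum of the LCP array over the interval between their ranks.

```python
def build_suffix_array(s):
    # Naive construction by sorting suffixes; fine for a demo, too slow for large inputs.
    return sorted(range(len(s)), key=lambda i: s[i:])

def build_lcp(s, sa):
    # Kasai-style LCP array: lcp[r] is the LCP of suffixes sa[r-1] and sa[r]; lcp[0] = 0.
    n = len(s)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return rank, lcp

def build_sparse_table(a):
    # table[k][i] = min(a[i : i + 2**k]); O(n log n) space, O(1) range-minimum queries.
    table = [a[:]]
    k = 1
    while (1 << k) <= len(a):
        prev, half = table[-1], 1 << (k - 1)
        table.append([min(prev[i], prev[i + half])
                      for i in range(len(a) - (1 << k) + 1)])
        k += 1
    return table

def range_min(table, lo, hi):
    # Minimum over the inclusive index range lo..hi (requires lo <= hi).
    k = (hi - lo + 1).bit_length() - 1
    return min(table[k][lo], table[k][hi - (1 << k) + 1])

class LCE:
    """Hypothetical helper: constant-time LCE queries after preprocessing."""
    def __init__(self, s):
        self.n = len(s)
        self.rank, lcp = build_lcp(s, build_suffix_array(s))
        self.table = build_sparse_table(lcp)

    def query(self, i, j):
        # Length of the longest common prefix of s[i:] and s[j:], for 0 <= i, j < n.
        if i == j:
            return self.n - i
        a, b = sorted((self.rank[i], self.rank[j]))
        return range_min(self.table, a + 1, b)

lce = LCE("banana")
print(lce.query(1, 3))  # "anana" vs "ana" -> 3
```

To compare a pattern against a text with this sketch, one common approach is to index the concatenation of the two strings joined by a separator character that occurs in neither.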
Algorithmic Advances for Searching Biosequence Databases
[chapter]
1994
Computational Methods in Genome Research
The asymptotically most efficient are the suffix tree [22, 23] and suffix array [26]. Suffix arrays are a particularly space efficient alternative to suffix trees, requiring only 2N integers, but they take ...
In a scan of the database, quickly eliminate regions that can't possibly match via some easily computed criterion. ...
doi:10.1007/978-1-4615-2451-9_10
fatcat:cvl4eygyovcynebnmqtnezgx2a
Parallel Construction and Query of Index Data Structures for Pattern Matching on Square Matrices
1999
Journal of Complexity
The main data structure is the Lsuffix tree, which is a generalization of the classical suffix tree for strings. ...
The query algorithms are work optimal while the construction algorithm is work optimal only for arbitrary and large alphabets. ...
The Lstrings representing the matrix "suffixes" A_ij allow us to re-use some of the ideas presented for strings and suffix trees in the algorithm by Apostolico, Iliopoulos, Landau, Schieber, and Vishkin ...
doi:10.1006/jcom.1998.0496
fatcat:li3m4yo2s5hcxnm5nmaafy35iq
Orthogonal Range Searching for Text Indexing
[article]
2013
arXiv
pre-print
Text indexing, the problem in which one desires to preprocess a (usually large) text for future (shorter) queries, has been researched ever since the suffix tree was invented in the early 70's. ...
Initially, in the mid 90's there were a couple of results recognizing this connection. ...
I wanted to thank my numerous colleagues who were kind enough to provide insightful comments on an earlier version and pointers to work that I was unaware of. ...
arXiv:1306.0615v1
fatcat:g4nztbapzna3bhuj2nazlyw6re
Full-text and Keyword Indexes for String Searching
[article]
2015
arXiv
pre-print
The first contribution is the FM-bloated index, which is a modification of the well-known FM-index (a compressed, full-text index) that trades space for speed. ...
Query times in the order of 1 microsecond were reported for one mismatch for a few-megabyte natural language dictionary on a medium-end PC. ...
The enhanced suffix array (ESA) is a variant where additional information in the form of a longest common prefix (LCP) table is stored [AKO02]. ...
arXiv:1508.06610v1
fatcat:5pmce2d72veuxpw3s5u6hbidim
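For readers unfamiliar with the full-text indexes mentioned in the entry above, the following is a minimal sketch of exact pattern search over a plain suffix array (without the LCP table that an enhanced suffix array adds). The function names are illustrative assumptions, and the naive construction is only suitable for small inputs; it is not taken from the cited work.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; real indexes use faster algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text, sa, pattern):
    """Return the sorted start positions of pattern in text using binary search on sa."""
    m = len(pattern)

    # First suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])

text = "banana"
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "ana"))  # [1, 3]
```

All suffixes that begin with the pattern are contiguous in the suffix array, so two binary searches delimit the whole set of occurrences.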
Pattern matching in pseudo real-time
2011
Journal of Discrete Algorithms
The resulting online algorithms bound the worst case running time per input character to within a log factor of their comparable offline counterpart. ...
It has recently been shown how to construct online, non-amortised approximate pattern matching algorithms for a class of problems whose distance functions can be classified as being local. ...
Acknowledgements The authors would like to thank Benny and Ely Porat for many helpful discussions at an early stage of this work. ...
doi:10.1016/j.jda.2010.09.005
fatcat:sr3p42h2vjfmzjc5rs7ippqpb4
Semi-local string comparison: algorithmic techniques and applications
[article]
2013
arXiv
pre-print
A classical measure of string comparison is given by the longest common subsequence (LCS) problem on a pair of strings. ...
The same approach can also be applied to permutation strings, providing efficient solutions for local versions of the longest increasing subsequence (LIS) problem, and for the problem of computing a maximum ...
Acknowledgement This work was conceived in a discussion with Gad Landau in Haifa. The imaginative term "seaweeds" was coined by Yuri Matiyasevich during a presentation by the author in St. ...
arXiv:0707.3619v21
fatcat:ufmpjbkmsvbvhf6l6zxugdcyc4
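The longest common subsequence (LCS) measure named in the entry above can be illustrated with the textbook quadratic dynamic program. This is only the classical baseline, not the semi-local technique the cited work develops; the function name `lcs_length` is an assumption for illustration.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of a and b, in O(len(a) * len(b)) time."""
    # prev[j] holds the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

print(lcs_length("ABCBDAB", "BDCABA"))  # 4
```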
Faster Approximate Pattern Matching: A Unified Approach
[article]
2020
arXiv
pre-print
with a common period. ...
Exact occurrences of P in T have a very simple structure: If we assume for simplicity that |T| ≤ 3|P|/2 and trim T so that P occurs both as a prefix and as a suffix of T, then both P and T are periodic ...
We proceed, as in Main Theorem 8, by separately considering each of the three possible outcomes of Analyze (P, k). Consider Algorithm 18 for a visualization of the whole algorithm as pseudo-code. ...
arXiv:2004.08350v2
fatcat:zfgicxdvgjadribq3ep4knuqhu
Faster Approximate Pattern Matching: A Unified Approach
2020
2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS)
LCP^R(S, T): Compute the length of the longest common suffix of S and T. ...
Again, a classic algorithm by Landau and Vishkin [35] runs in O(nk) time. Subsequent research [44, 17] . ...
We proceed, as in Main Theorem 8, by separately considering each of the three possible outcomes of Analyze (P, k). Consider Algorithm 18 for a visualization of the whole algorithm as pseudo-code. ...
doi:10.1109/focs46700.2020.00095
fatcat:sm62sj3eizhybdegrr3gjz2o6a
Small-space and streaming pattern matching with $k$ edits
2022
2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS)
For any string of length at most n, the sketch is of size Õ(k^2) and it can be computed with an Õ(k^2)-space streaming algorithm. ...
In order to do so, we compute the encoding for substrings of the text and of the pattern, which requires read-only access to the latter. ...
the longest common prefix of two suffixes of a string in constant time. ...
doi:10.1109/focs52979.2021.00090
fatcat:ty2zzcs3ordyph6olsxolhiaru
Approximating Text-to-Pattern Hamming Distances
[article]
2020
arXiv
pre-print
We revisit a fundamental problem in string matching: given a pattern of length m and a text of length n, both over an alphabet of size σ, compute the Hamming distance between the pattern and the text at ...
Several (1+ϵ)-approximation algorithms have been proposed in the literature, with running time of the form O(ϵ^{-O(1)} n log n log m), all using fast Fourier transform (FFT). ...
If C does not contain two such positions, then |C| = O( n k ), and the algorithm spends O(d_i) = O(k) time for each i ∈ C to compute d_i using 1 + d_i Longest Common Extension (LCE) queries. ...
arXiv:2001.00211v1
fatcat:2uatcpj7tzdzjincj7tmupcjke
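The remark in the entry above, that a Hamming distance d can be found with 1 + d LCE queries, can be sketched as follows. This is an illustrative sketch only: the names `naive_lce`, `hamming_via_lce`, and `all_hamming_distances` are assumptions, and the LCE oracle here is a plain character-by-character scan rather than a constant-time data structure, so it does not reproduce the running times discussed in the cited work.

```python
def naive_lce(p, t, i, j):
    # Length of the longest common prefix of p[i:] and t[j:], by direct comparison.
    # A real implementation would answer this in O(1) after preprocessing.
    k = 0
    while i + k < len(p) and j + k < len(t) and p[i + k] == t[j + k]:
        k += 1
    return k

def hamming_via_lce(p, t, pos, lce=naive_lce):
    """Return (distance, lce_calls) for p against t[pos : pos + len(p)]."""
    m = len(p)
    if pos < 0 or pos + m > len(t):
        raise ValueError("alignment does not fit inside the text")
    distance = 0
    calls = 0
    i = 0
    while True:
        # Jump over the maximal run of matching characters.
        i += lce(p, t, i, pos + i)
        calls += 1
        if i >= m:
            break
        # Position i is a mismatch: count it and step past it.
        distance += 1
        i += 1
    return distance, calls

def all_hamming_distances(p, t):
    # Distance at every alignment of the pattern inside the text.
    return [hamming_via_lce(p, t, pos)[0] for pos in range(len(t) - len(p) + 1)]

print(hamming_via_lce("abc", "xxabdxx", 2))  # (1, 2)
print(all_hamming_distances("ab", "abab"))   # [0, 2, 0]
```

Each loop iteration issues exactly one LCE query and all but the last end at a mismatch, which is where the 1 + d count comes from.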
Pattern Matching in Trees and Strings
[article]
2007
arXiv
pre-print
We study the design of efficient algorithms for combinatorial pattern matching. More concretely, we study algorithms for tree matching, string matching, and string matching in compressed texts. ...
The algorithm uses techniques from Ukkonen [Ukk85b] and Landau and Vishkin [LV89]. ...
Let X_A be the state-array modeling the set of states reachable via a path of forward ε-transitions in A, and let X_A be the state array modelling Close(S) in A. ...
arXiv:0708.4288v1
fatcat:55quki3onrfa3cqvmvsbfsavwq
How Compression and Approximation Affect Efficiency in String Distance Measures
[article]
2021
arXiv
pre-print
For two strings of total length N and total compressed size n, it is known that the edit distance and a longest common subsequence (LCS) can be computed exactly in time Õ(nN), as opposed to O(N^2) for ...
In contrast, for uncompressed strings, there is not even a subquadratic algorithm for LCS that has less than a polynomial gap in the approximation factor. ...
The O(N ) term in the running time of the Landau-Vishkin algorithm [LV88] is solely needed to construct a data structure efficiently answering the Longest Common Extension (LCE) queries. ...
arXiv:2112.05836v1
fatcat:rqyk3xg2gbbcjaymnarzue4qhy
Data structures and algorithms for approximate string matching
Zvi Galil, Raffaele Giancarlo
2017
Special attention is given to the methods for the construction of data structures that efficiently support primitive operations needed in approximate string matching. ...
This paper surveys techniques for designing efficient sequential and parallel approximate string matching algorithms. ...
Gadi Landau, Greg Wasilkowsky and Henryk Wozniakowsky for reading an early version of this paper. ...
doi:10.7916/d8dr33kb
fatcat:pghbzueanvcj3k4v3ji4f66vti