Efficiently Approximating Edit Distance Between Pseudorandom Strings
2018
arXiv
arXiv:1811.04300v1
fatcat:2ijf6cq2tna3vhxe52lrljdqbi
We present an algorithm for approximating the edit distance ed(x, y) between two strings x and y in time parameterized by the degree to which one of the strings x satisfies a natural pseudorandomness property ... Given parameters p and B, our algorithm computes the edit distance between a (p, B)-pseudorandom string x and an arbitrary string y within a factor of O(1/p) in time Õ(nB), with high probability. ... Approximation Algorithm for x a (p, B)-Pseudorandom String In Section 3, we present our approximation algorithm for computing the edit distance between a (p, B)-pseudorandom string x and an arbitrary string ...
###
Tandem repeats over the edit distance

2007
Bioinformatics
doi:10.1093/bioinformatics/btl309
pmid:17237101
fatcat:suwnfk2m75hulbmf4a5g3yw43q
Results: In this paper we describe an efficient algorithm for finding all tandem repeats within a sequence, under the edit distance measure. ... We present a precise definition for tandem repeats over the edit distance and an efficient, deterministic algorithm for finding these repeats. ... Let ed(s_1, s_2) denote the minimum edit distance between two strings, s_1 and s_2. DEFINITION 1. ...
###
Dynamic Time Warping in Strongly Subquadratic Time: Algorithms for the Low-Distance Regime and Approximate Evaluation
2019
arXiv
arXiv:1904.09690v2
fatcat:jhnyu252bvbapj5n2lvzpeqnae
Dynamic time warping distance (DTW) is a widely used distance measure between time series. ... Extending our techniques further, we also obtain the first approximation algorithm for edit distance to work with characters taken from an arbitrary metric space, providing an n^ϵ-approximation in time ... edit distance and LCS. ...
###
Block Edit Errors with Transpositions: Deterministic Document Exchange Protocols and Almost Optimal Binary Codes

2019
International Colloquium on Automata, Languages and Programming
doi:10.4230/lipics.icalp.2019.37
dblp:conf/icalp/ChengJ0W19
fatcat:xta7b5uclze5znqtg4ucxb3n4a
In both problems, an upper bound is placed on the number of errors between the two strings or that the channel can add, and a major goal is to minimize the size of the sketch or the redundant information ... In the first problem, Alice and Bob each holds a string, and the goal is for Alice to send a short sketch to Bob, so that Bob can recover Alice's string. ... For example, Shapira and Storer [24] showed that finding the distance between two given strings under this metric is NP-hard, and they gave an efficient algorithm that achieves O(log n) approximation ...
###
Similarity Hashing Based on Levenshtein Distances
2014
IFIP Advances in Information and Communication Technology
doi:10.1007/978-3-662-44952-3_10
fatcat:vq57fauzo5b3vopdhcos5qvady
approaches for approximate matching. ... Given the hash values of two byte sequences, saHash returns a lower bound on the number of Levenshtein operations between the two byte sequences as their similarity score. ... We employ an approximate matching function based on the Levenshtein distance, one of the most popular string metrics. ...
###
Inference Control for Privacy-Preserving Genome Matching
2014
arXiv
arXiv:1405.0205v1
fatcat:zbwvkj7dw5edrfoqwfy5iohmwa
We combine two known cryptographic primitives -- secure computation of the edit distance and fuzzy commitments -- in order to prevent submission of similar genome sequences. ... Particularly, we contribute an efficient zero-knowledge proof that the same input has been used in both primitives. ... The accuracy of approximating the edit distance is not affected, as the Pearson correlation between the edit distance of the original strings and our distance measure is still at 0.997. ...
###
Block Edit Errors with Transpositions: Deterministic Document Exchange Protocols and Almost Optimal Binary Codes
2019
arXiv
arXiv:1809.00725v4
fatcat:vgchuq5sgrezdhicv7vy5pspq4
In the first problem, Alice and Bob each holds a string, and the goal is for Alice to send a short sketch to Bob, so that Bob can recover Alice's string. ... In a recent work CJLW18, the authors constructed explicit deterministic document exchange protocols and binary error correcting codes for edit errors with almost optimal parameters. ... For example, Shapira and Storer [25] showed that finding the distance between two given strings under this metric is NP-hard, and they gave an efficient algorithm that achieves O(log n) approximation ...
###
XML stream processing using tree-edit distance embeddings

2005
ACM Transactions on Database Systems
doi:10.1145/1061318.1061326
fatcat:ikk4cndsj5fwnjaurx2tgfxgbm
tree-edit distance computations; and (2) approximate the result of tree-edit-distance similarity joins over continuous XML document streams. ... the distance distortion between any data trees with at most n nodes. ... In a nutshell, the tree-edit distance metric is the natural generalization of edit distance from the string domain; thus, the tree-edit distance between two tree structures represents the minimum number ...
###
Page 2844 of Mathematical Reviews Vol. , Issue 94e
1994
Mathematical Reviews
Italiano, Efficient algorithms for sequence analysis (225-244); Roberto Grossi, Fabrizio Luccio and Linda Pagli, Coding trees as strings for approximate tree matching (245-259); Tom Head and Andreas Weber ... Rabin, Optimal parallel pattern matching through randomization (292-299); Esko Ukkonen, Approximate string-matching and the q-gram distance (300-312); Michele Elia, Some comments on the computation of ...
###
Efficient Similarity Search over Encrypted Data

2012
2012 IEEE 28th International Conference on Data Engineering
doi:10.1109/icde.2012.23
dblp:conf/icde/KuzuIK12
fatcat:gtszzuv66bbajgfkffyamiqcdy
In this paper, we propose an efficient scheme for similarity search over encrypted data. ... In such a case, we can embed strings into the Euclidean space by approximately preserving the relative edit distance between them [16]. ... Various distance measures such as edit distance [10] and approximation of Hamming distance [9] can be computed securely. ...
###
Synchronization strings: codes for insertions and deletions approaching the singleton bound

2017
Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing - STOC 2017
doi:10.1145/3055399.3055498
dblp:conf/stoc/HaeuplerS17
fatcat:u4yvuqbgfjgo7bdqe4me7r2vre
We introduce synchronization strings as a novel way of efficiently dealing with synchronization errors, i.e., insertions and deletions. ... Most notably, we obtain efficient insdel codes which get arbitrarily close to the optimal rate-distance tradeoff given by the Singleton bound for the complete noise spectrum. ... Normalization follows from the fact that the edit distance between two length k strings can be at most 2k. ...
###
A taxonomy of privacy-preserving record linkage techniques

2013
Information Systems
doi:10.1016/j.is.2012.11.005
fatcat:3kzh22vpjbexrpcxss4nyg55je
Two surveys of edit-distance based approximate string comparison functions can be found in [46, 47]. ... The Levenshtein edit-distance [47] is a commonly used comparison method for approximate string and sequence matching. ...
###
Computational Limitations in Robust Classification and Win-Win Results
2019
arXiv
arXiv:1902.01086v2
fatcat:7o2osvqcmbckpkc7xznpmujkny
This leads us to a win-win scenario: either we can learn an efficient robust classifier, or we can construct new instances of cryptographic primitives. ... First, we demonstrate classification tasks where computationally efficient robust classification is impossible, even when computationally unbounded robust classifiers exist. ... In the case of LPN, this distance is approximately m · r where r is the error rate. ...
###
Efficient Linear and Affine Codes for Correcting Insertions/Deletions
2022
arXiv
arXiv:2007.09075v4
fatcat:wj3bccg5obgw5iwtxaty5ip5ve
(edit) distance trade-off of linear insdel codes. ... We complement our existential results with an efficient synchronization-string-based transformation that converts any asymptotically-good linear code for Hamming errors into an asymptotically-good linear ... Edit distance between two strings is the minimum number of insertions, deletions and replacements that can modify one string to be the other. ...
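The definition quoted in this entry (minimum number of insertions, deletions and replacements turning one string into the other) can be illustrated with the textbook quadratic-time dynamic program. This is only a sketch for orientation: the function name `levenshtein` is hypothetical, and it is not the method of any paper listed here, whose algorithms are approximate, sketch-based, or code-theoretic.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and replacements turning a into b."""
    # prev[j] is the distance between the prefix of a processed so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # replace ca by cb, or match at no cost
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
```

The table is kept one row at a time, so memory is proportional to the length of the second string while running time is proportional to the product of the two lengths.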
###
Small-space and streaming pattern matching with $k$ edits

2022
2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS)
doi:10.1109/focs52979.2021.00090
fatcat:ty2zzcs3ordyph6olsxolhiaru
In this work, we revisit the fundamental and well-studied problem of approximate pattern matching under edit distance. ... Given the sketches of two strings, in Õ(k^3) time we can compute their edit distance or certify that it is larger than k. ... Recall that the Hamming distance between the embeddings of two strings X, Y ∈ Σ^{≤n} is bounded in terms of the edit distance ed(X, Y), which allows using Hamming distance sketches to approximate edit distance ...
