Unified Compression-Based Acceleration of Edit-Distance Computation
2011
Algorithmica
For two strings of total length N having straight-line program representations of total size n, we present an algorithm running in O(nN lg(N/n)) time for computing the edit-distance of these two strings ...
The standard dynamic programming solution for this problem computes the edit-distance between a pair of strings of total length O(N) in O(N^2) time. ...
This algorithm was later improved in a sequence of papers [5, 6, 10, 25] to an algorithm running in time O(nN), for strings of total length N that encode into run-length strings of total length n. ...
doi:10.1007/s00453-011-9590-6
fatcat:stli6vovyfcwhbowchoxn34cze
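For orientation, the standard quadratic dynamic programming baseline mentioned in the abstract above can be sketched as follows. This is the textbook unit-cost (Levenshtein) recurrence, not the compression-based algorithm of the paper; the function name `edit_distance` and the choice of unit costs are assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via the classic O(len(a) * len(b)) recurrence."""
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1      # drop ca from a
            insertion = curr[j - 1] + 1  # insert cb into a
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so the sketch uses linear space while still taking quadratic time, which is the cost the compression-based results aim to beat.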
Unified Compression-Based Acceleration of Edit-Distance Computation
[article]
2010
arXiv
pre-print
For two strings of total length N having straight-line program representations of total size n, we present an algorithm running in O(nN log(N/n)) time for computing the edit-distance of these two strings ...
The standard dynamic programming solution for this problem computes the edit-distance between a pair of strings of total length O(N) in O(N^2) time. ...
This algorithm was later improved in a sequence of papers [5, 6, 10, 22] to an algorithm running in time O(nN), for strings of total length N that encode into run-length strings of total length n. ...
arXiv:1004.1194v1
fatcat:7ljidy3hgfcdxd2xgigo7n6xgu
Longest common subsequence between run-length-encoded strings: a new algorithm with improved parallelism
2004
Information Processing Letters
In this paper we address the problem of computing the length of the longest common subsequence (LCS) between run-length-encoded (RLE) strings. ...
We also discuss the application of the proposed algorithm to the related problem of edit distance (ED) computation. ...
In particular, several algorithms have been recently proposed that exploit either run-length encoding (RLE) [13] or Lempel-Ziv compression (LZ78) [14]. ...
doi:10.1016/j.ipl.2004.02.011
fatcat:q5lbsxhowzgy7nmk4dpyd6za5a
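As a reminder of what a run-length-encoded string looks like, here is a minimal sketch of encoding and decoding. It only illustrates the representation that the algorithms in this entry operate on; it does not implement the LCS algorithm of the paper, and the names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(s: str) -> list[tuple[str, int]]:
    """Collapse each maximal run of equal characters into (char, run_length)."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (char, run_length) pairs back into the original string."""
    return "".join(ch * count for ch, count in runs)


if __name__ == "__main__":
    runs = rle_encode("aaabbc")
    print(runs)
    print(rle_decode(runs))
```

A string of length N with n runs is stored in O(n) pairs, which is why running times for RLE inputs are usually stated in terms of the number of runs rather than the uncompressed length.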
Restructuring Compressed Texts without Explicit Decompression
[article]
2011
arXiv
pre-print
algorithms including LZ77, LZ78, run length encoding, and some grammar based compression algorithms. ...
Since most of the representations we consider can achieve exponential compression, our algorithms are theoretically faster in the worst case, than any algorithm which first decompresses the string for ...
Conversions from Run Length Encoding: For conversions from Run Length Encodings, we obtain the results below. ...
arXiv:1107.2729v1
fatcat:r4jlshcz4jbkxckopgunoeabfu
Solving Classical String Problems on Compressed Texts
[article]
2006
arXiv
pre-print
Then we present polynomial algorithms for computing fingerprint table and compressed representation of all covers (for the first time) and for finding periods of a given compressed string (our algorithm ...
The main result is a new algorithm for pattern matching when both a text T and a pattern P are presented by SLPs (so-called fully compressed pattern matching problem). ...
Surprisingly, while the compression methods vary in many practical algorithms of Lempel-Ziv family and run-length encoding, the decompression goes in almost the same way. ...
arXiv:cs/0604058v1
fatcat:bnahqndw6nefhc4fjwy7uss5rq
Computation over Compressed Structured Data (Dagstuhl Seminar 16431)
2017
Dagstuhl Reports
This report documents the program and the outcomes of Dagstuhl Seminar 16431 "Computation over Compressed Structured Data". ...
We plan to combine the recently proposed GLOUDS representation [1] with DSM, a technique used to compress Web and social graphs by exploiting the presence of bicliques and dense subgraphs [2]. ...
Since GLOUDS benefits from a representation with fewer edges per node and DSM reduces the number of edges from m * n to m + n when representing an (m, n)-biclique, we believe the combination can lead to ...
doi:10.4230/dagrep.6.10.99
dblp:journals/dagstuhl-reports/BilleLMN16
fatcat:jel4wyc2gje6thmu5zj7aryofu
Hardness of comparing two run-length encoded strings
2010
Journal of Complexity
In this paper, we consider a commonly used compression scheme called run-length encoding. We provide both lower and upper bounds for the problems of comparing two run-length encoded strings. ...
Given two run-length encoded strings of m and n runs, such a result implies that it is very unlikely to devise an o(mn)-time algorithm for either of them. ...
Acknowledgments The authors would like to thank the anonymous reviewers for their helpful comments. ...
doi:10.1016/j.jco.2010.03.003
fatcat:l5yh57wr7zbpfohznq2sn4bw2m
Page 1911 of Mathematical Reviews Vol. , Issue 97C
[page]
1997
Mathematical Reviews
Masek and Paterson (1980) designed an O(nm/log n)-time algorithm to compute the edit distance of two strings. ...
The problem is thus to find all segments of a given string that are within edit distance d to strings generated by the regular expression. ...
Compressed parameterized pattern matching
2016
Theoretical Computer Science
respectively denote the number of runs in the encodings for T and P. ...
Apostolico et al. [6] address the p-match in terms of fully compressed run-length encodings in O(n + (r_P × r_T) α(r_T) log(r_T)) time, where α is the inverse of Ackermann's function and r_T and r_P ...
Unlike in [6, 5] where p-matching via compressed strings is studied for the run-length encodings of T and P, our work differs in that (1) p-matching is performed on an uncompressed P and T_c, a compressed ...
doi:10.1016/j.tcs.2015.09.015
fatcat:3drdwykatbc4dfuks3kri5bpjm
Compressed Parameterized Pattern Matching
2013
2013 Data Compression Conference
respectively denote the number of runs in the encodings for T and P. ...
Apostolico et al. [6] address the p-match in terms of fully compressed run-length encodings in O(n + (r_P × r_T) α(r_T) log(r_T)) time, where α is the inverse of Ackermann's function and r_T and r_P ...
Unlike in [6, 5] where p-matching via compressed strings is studied for the run-length encodings of T and P, our work differs in that (1) p-matching is performed on an uncompressed P and T_c, a compressed ...
doi:10.1109/dcc.2013.54
dblp:conf/dcc/BealA13
fatcat:qojdw2in2jaurocwgisow4zxve
On Estimating Edit Distance: Alignment, Dimension Reduction, and Embeddings
2018
International Colloquium on Automata, Languages and Programming
Edit distance is a fundamental measure of distance between strings and has been widely studied in computer science. ...
Closely related to the study of approximation algorithms is the study of metric embeddings for edit distance. ...
An embedding of edit distance on length-𝑛 strings to edit distance on length-𝑛/𝑐 strings (with larger alphabet size) is called a dimension-reduction map with contraction 𝑐. ...
doi:10.4230/lipics.icalp.2018.34
dblp:conf/icalp/CharikarGKK18
fatcat:rxp5an5etbcfdpkr2i6jezap7i
Algorithmics on SLP-compressed strings: A survey
2012
Groups - Complexity - Cryptology
Among others, we study pattern matching for compressed strings, membership problems for compressed strings in various kinds of formal languages, and the problem of querying compressed strings. ...
Results on algorithmic problems on strings that are given in a compressed form via straightline programs are surveyed. ...
Acknowledgment The author is very grateful to Pawel Gawrychowski, Artur Jez, Sebastian Maneth, Wojciech Rytter, Marcus Schäfer, and Manfred Schmidt-Schauß for many valuable comments and reading a preliminary ...
doi:10.1515/gcc-2012-0016
fatcat:o7lrrx3cgvhqrhmf4bsulc7rsa
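To make the notion of a straight-line program (SLP) concrete, the sketch below models an SLP as a list of rules, each either a single terminal character or a pair of indices of earlier rules to concatenate. The encoding and the names `slp_lengths` and `slp_expand` are assumptions chosen for illustration and are not taken from the survey.

```python
from typing import Union

# A rule is a terminal character or a pair (left, right) of earlier rule indices.
Rule = Union[str, tuple[int, int]]


def slp_lengths(rules: list[Rule]) -> list[int]:
    """Length of each rule's expansion, computed without decompressing."""
    lengths: list[int] = []
    for rule in rules:
        if isinstance(rule, str):
            lengths.append(1)
        else:
            left, right = rule
            lengths.append(lengths[left] + lengths[right])
    return lengths


def slp_expand(rules: list[Rule]) -> str:
    """Fully expand the last rule; the output may be exponentially long."""
    expansions: list[str] = []
    for rule in rules:
        if isinstance(rule, str):
            expansions.append(rule)
        else:
            left, right = rule
            expansions.append(expansions[left] + expansions[right])
    return expansions[-1]


if __name__ == "__main__":
    # Five rules generate a string of length 8; each doubling rule adds one more.
    rules: list[Rule] = ["a", "b", (0, 1), (2, 2), (3, 3)]
    print(slp_lengths(rules)[-1], slp_expand(rules))
```

Because each doubling rule squares nothing but doubles the length, n rules can describe a string of length exponential in n, which is why algorithms that work directly on the grammar can beat decompress-then-solve approaches.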
A Space-Optimal Grammar Compression
2017
European Symposium on Algorithms
We propose a fully-online algorithm that requires the fewest bits of working space asymptotically equal to the lower bound in O(N lg lg n) compression time. ...
Although there is an online grammar compression algorithm that directly computes the succinct encoding of its output CFG with O(lg N lg* N) approximation guarantee, the problem of optimizing its working ...
learning [41], edit-distance computation [14, 43], and regularities detection [29, 15]. ...
doi:10.4230/lipics.esa.2017.67
dblp:conf/esa/TakabatakeIS17
fatcat:chxvgfhnyvalnlzoigvtma7dvu
How Compression and Approximation Affect Efficiency in String Distance Measures
[article]
2021
arXiv
pre-print
For two strings of total length N and total compressed size n, it is known that the edit distance and a longest common subsequence (LCS) can be computed exactly in time Õ(nN), as opposed to O(N^2) for ...
Many applications need to align multiple sequences simultaneously, and the fastest known exact algorithms for median edit distance and LCS of k strings run in O(N^k) time. ...
Given an instance of k-center edit distance on strings of lengths M_1 ≤ M_2 ≤ ··· ≤ M_k where these strings can all be compressed into an SLP of size m, then, an algorithm for k-center edit distance that ...
arXiv:2112.05836v1
fatcat:rqyk3xg2gbbcjaymnarzue4qhy
Small-space and streaming pattern matching with $k$ edits
2022
2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS)
For any string of length at most n, the sketch is of size Õ(k^2) and it can be computed with an Õ(k^2)-space streaming algorithm. ...
On strings of length at most n, the encoding occupies Õ(k^2) space. We use the encoding to compress substrings of the text that are close to the pattern. ...
Our results: The main result of our work is a fully streaming algorithm for approximate pattern matching under the edit distance that uses Õ(k^5) space and Õ(k^8) amortized time per character of the text ...
doi:10.1109/focs52979.2021.00090
fatcat:ty2zzcs3ordyph6olsxolhiaru