Unified Compression-Based Acceleration of Edit-Distance Computation

Danny Hermelin, Gad M. Landau, Shir Landau, Oren Weimann
2011 Algorithmica  
For two strings of total length N having straight-line program representations of total size n, we present an algorithm running in O(nN lg(N/n)) time for computing the edit-distance of these two strings  ...  The standard dynamic programming solution for this problem computes the edit-distance between a pair of strings of total length O(N) in O(N^2) time.  ...  This algorithm was later improved in a sequence of papers [5, 6, 10, 25] to an algorithm running in time O(nN), for strings of total length N that encode into run-length strings of total length n.  ...
doi:10.1007/s00453-011-9590-6 fatcat:stli6vovyfcwhbowchoxn34cze

Unified Compression-Based Acceleration of Edit-Distance Computation [article]

Danny Hermelin, Gad M. Landau, Shir Landau, Oren Weimann
2010 arXiv   pre-print
For two strings of total length N having straight-line program representations of total size n, we present an algorithm running in O(nN log(N/n)) time for computing the edit-distance of these two strings  ...  The standard dynamic programming solution for this problem computes the edit-distance between a pair of strings of total length O(N) in O(N^2) time.  ...  This algorithm was later improved in a sequence of papers [5, 6, 10, 22] to an algorithm running in time O(nN), for strings of total length N that encode into run-length strings of total length n.  ...
arXiv:1004.1194v1 fatcat:7ljidy3hgfcdxd2xgigo7n6xgu

Longest common subsequence between run-length-encoded strings: a new algorithm with improved parallelism

Valerio Freschi, Alessandro Bogliolo
2004 Information Processing Letters  
In this paper we address the problem of computing the length of the longest common subsequence (LCS) between run-length-encoded (RLE) strings.  ...  We also discuss the application of the proposed algorithm to the related problem of edit distance (ED) computation.  ...  In particular, several algorithms have been recently proposed that exploit either run-length encoding (RLE) [13] or Lempel-Ziv compression (LZ78) [14].  ...
doi:10.1016/j.ipl.2004.02.011 fatcat:q5lbsxhowzgy7nmk4dpyd6za5a

Restructuring Compressed Texts without Explicit Decompression [article]

Keisuke Goto, Shirou Maruyama, Shunsuke Inenaga, Hideo Bannai, Hiroshi Sakamoto, Masayuki Takeda
2011 arXiv   pre-print
algorithms including LZ77, LZ78, run length encoding, and some grammar based compression algorithms.  ...  Since most of the representations we consider can achieve exponential compression, our algorithms are theoretically faster in the worst case, than any algorithm which first decompresses the string for  ...  Conversions from Run Length Encoding For conversions from Run Length Encodings, we obtain the results below.  ... 
arXiv:1107.2729v1 fatcat:r4jlshcz4jbkxckopgunoeabfu

Solving Classical String Problems on Compressed Texts [article]

Yury Lifshits
2006 arXiv   pre-print
Then we present polynomial algorithms for computing fingerprint table and compressed representation of all covers (for the first time) and for finding periods of a given compressed string (our algorithm  ...  The main result is a new algorithm for pattern matching when both a text T and a pattern P are presented by SLPs (so-called fully compressed pattern matching problem).  ...  Surprisingly, while the compression methods vary in many practical algorithms of Lempel-Ziv family and run-length encoding, the decompression goes in almost the same way.  ... 
arXiv:cs/0604058v1 fatcat:bnahqndw6nefhc4fjwy7uss5rq

Computation over Compressed Structured Data (Dagstuhl Seminar 16431)

Philip Bille, Markus Lohrey, Sebastian Maneth, Gonzalo Navarro, Marc Herbstritt
2017 Dagstuhl Reports  
This report documents the program and the outcomes of Dagstuhl Seminar 16431 "Computation over Compressed Structured Data".  ...  We plan to combine the recently proposed GLOUDS representation [1] with DSM, a technique used to compress Web and social graphs by exploiting the presence of bicliques and dense subgraphs [2].  ...  Since GLOUDS benefits from a representation with fewer edges per node and DSM reduces the number of edges from m * n to m + n when representing an (m, n)-biclique, we believe the combination can lead to  ... 
doi:10.4230/dagrep.6.10.99 dblp:journals/dagstuhl-reports/BilleLMN16 fatcat:jel4wyc2gje6thmu5zj7aryofu

Hardness of comparing two run-length encoded strings

Kuan-Yu Chen, Ping-Hui Hsu, Kun-Mao Chao
2010 Journal of Complexity  
In this paper, we consider a commonly used compression scheme called run-length encoding. We provide both lower and upper bounds for the problems of comparing two run-length encoded strings.  ...  Given two run-length encoded strings of m and n runs, such a result implies that it is very unlikely to devise an o(mn)-time algorithm for either of them.  ...  Acknowledgments The authors would like to thank the anonymous reviewers for their helpful comments.  ... 
doi:10.1016/j.jco.2010.03.003 fatcat:l5yh57wr7zbpfohznq2sn4bw2m

Page 1911 of Mathematical Reviews Vol. , Issue 97C [page]

1997 Mathematical Reviews  
Masek and Paterson (1980) designed an O(nm/log n)-time algorithm to compute the edit distance of two strings.  ...  The problem is thus to find all segments of a given string that are within edit distance d to strings generated by the regular expression.  ...

Compressed parameterized pattern matching

Richard Beal, Donald Adjeroh
2016 Theoretical Computer Science  
respectively denote the number of runs in the encodings for T and P.  ...  Apostolico et al. [6] address the p-match in terms of fully compressed run-length encodings in O(n + (r_P × r_T) α(r_T) log(r_T)) time, where α is the inverse of Ackermann's function and r_T and r_P  ...  Unlike in [6, 5] where p-matching via compressed strings is studied for the run-length encodings of T and P, our work differs in that (1) p-matching is performed on an uncompressed P and T c, a compressed  ...
doi:10.1016/j.tcs.2015.09.015 fatcat:3drdwykatbc4dfuks3kri5bpjm

Compressed Parameterized Pattern Matching

R. Beal, D. A. Adjeroh
2013 2013 Data Compression Conference  
respectively denote the number of runs in the encodings for T and P.  ...  Apostolico et al. [6] address the p-match in terms of fully compressed run-length encodings in O(n + (r_P × r_T) α(r_T) log(r_T)) time, where α is the inverse of Ackermann's function and r_T and r_P  ...  Unlike in [6, 5] where p-matching via compressed strings is studied for the run-length encodings of T and P, our work differs in that (1) p-matching is performed on an uncompressed P and T c, a compressed  ...
doi:10.1109/dcc.2013.54 dblp:conf/dcc/BealA13 fatcat:qojdw2in2jaurocwgisow4zxve

On Estimating Edit Distance: Alignment, Dimension Reduction, and Embeddings

Moses Charikar, Ofir Geri, Michael P. Kim, William Kuszmaul, Michael Wagner
2018 International Colloquium on Automata, Languages and Programming  
Edit distance is a fundamental measure of distance between strings and has been widely studied in computer science.  ...  Closely related to the study of approximation algorithms is the study of metric embeddings for edit distance.  ...  An embedding of edit distance on length-𝑛 strings to edit distance on length-𝑛/𝑐 strings (with larger alphabet size) is called a dimension-reduction map with contraction 𝑐.  ... 
doi:10.4230/lipics.icalp.2018.34 dblp:conf/icalp/CharikarGKK18 fatcat:rxp5an5etbcfdpkr2i6jezap7i

Algorithmics on SLP-compressed strings: A survey

Markus Lohrey
2012 Groups - Complexity - Cryptology  
Among others, we study pattern matching for compressed strings, membership problems for compressed strings in various kinds of formal languages, and the problem of querying compressed strings.  ...  Results on algorithmic problems on strings that are given in a compressed form via straightline programs are surveyed.  ...  Acknowledgment The author is very grateful to Pawel Gawrychowski, Artur Jez, Sebastian Maneth, Wojciech Rytter, Marcus Schäfer, and Manfred Schmidt-Schauß for many valuable comments and reading a preliminary  ... 
doi:10.1515/gcc-2012-0016 fatcat:o7lrrx3cgvhqrhmf4bsulc7rsa

A Space-Optimal Grammar Compression

Yoshimasa Takabatake, Tomohiro I, Hiroshi Sakamoto, Marc Herbstritt
2017 European Symposium on Algorithms  
We propose a fully-online algorithm that requires the fewest bits of working space asymptotically equal to the lower bound in O(N lg lg n) compression time.  ...  Although there is an online grammar compression algorithm that directly computes the succinct encoding of its output CFG with O(lg N lg* N) approximation guarantee, the problem of optimizing its working  ...  learning [41], edit-distance computation [14, 43], and regularities detection [29, 15].  ...
doi:10.4230/lipics.esa.2017.67 dblp:conf/esa/TakabatakeIS17 fatcat:chxvgfhnyvalnlzoigvtma7dvu

How Compression and Approximation Affect Efficiency in String Distance Measures [article]

Arun Ganesh, Tomasz Kociumaka, Andrea Lincoln, Barna Saha
2021 arXiv   pre-print
For two strings of total length N and total compressed size n, it is known that the edit distance and a longest common subsequence (LCS) can be computed exactly in time Õ(nN), as opposed to O(N^2) for  ...  Many applications need to align multiple sequences simultaneously, and the fastest known exact algorithms for median edit distance and LCS of k strings run in O(N^k) time.  ...  Given an instance of k-center edit distance on strings of lengths M_1 ≤ M_2 ≤ ⋯ ≤ M_k where these strings can all be compressed into a SLP of size m, then, an algorithm for k-center edit distance that  ...
arXiv:2112.05836v1 fatcat:rqyk3xg2gbbcjaymnarzue4qhy

Small-space and streaming pattern matching with $k$ edits

Tomasz Kociumaka, Ely Porat, Tatiana Starikovskaya
2022 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS)  
For any string of length at most n, the sketch is of size Õ(k^2) and it can be computed with an Õ(k^2)-space streaming algorithm.  ...  On strings of length at most n, the encoding occupies Õ(k^2) space. We use the encoding to compress substrings of the text that are close to the pattern.  ...  Our results The main result of our work is a fully streaming algorithm for approximate pattern matching under the edit distance that uses Õ(k^5) space and Õ(k^8) amortized time per character of the text  ...
doi:10.1109/focs52979.2021.00090 fatcat:ty2zzcs3ordyph6olsxolhiaru
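
Several of the abstracts above take as their baseline the standard dynamic programming solution for edit distance, which runs in O(N^2) time on uncompressed strings. As a point of reference, here is a minimal sketch of that baseline, assuming unit costs for insertion, deletion, and substitution. The function name `edit_distance` is illustrative and is not taken from any of the listed papers, and none of the compression-based speedups are implemented.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit (Levenshtein) distance via the classic quadratic DP.

    Assumption: insertions, deletions, and substitutions each cost 1.
    Only two rows of the DP table are kept, so space is O(len(b)).
    """
    # prev[j] is the distance between the current prefix of a and b[:j];
    # initially the prefix of a is empty, so the distance is j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Distance between a[:i] and the empty prefix of b is i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1        # drop ca from a
            insertion = curr[j - 1] + 1   # insert cb into a
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]
```

For two inputs of lengths p and q the nested loops do O(pq) work, which is the quadratic cost that the compressed-string algorithms in this listing aim to beat.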