The Minimum Substring Cover problem

2008
*Information and Computation*

doi:10.1016/j.ic.2008.06.002
fatcat:wvpjnp5fjzhphkbl37nb3btt3m
In this paper we consider the problem of covering a set of strings S with a set C of substrings in S, where C is said to cover S if every string in S can be written as a concatenation of the substrings ... We then proceed to show that this problem is at least as hard as the Minimum Set Cover problem. ... In our covering problem, the base elements are strings and the covering elements are their substrings. ...
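
The covering condition quoted above (every string in S can be written as a concatenation of substrings from C) can be checked with a small dynamic program. The following is a minimal sketch, not the authors' algorithm: the function names `can_concatenate` and `is_substring_cover` are hypothetical, and the sketch only verifies a given candidate cover; it does not search for a minimum one.

```python
from __future__ import annotations


def can_concatenate(s: str, cover: set[str]) -> bool:
    """Return True if s can be written as a concatenation of strings in cover."""
    # reachable[i] is True when the prefix s[:i] is a concatenation of cover elements.
    reachable = [False] * (len(s) + 1)
    reachable[0] = True  # the empty prefix needs no pieces
    for i in range(len(s)):
        if not reachable[i]:
            continue
        for piece in cover:
            # Skip empty pieces; extend the prefix when a piece matches at position i.
            if piece and s.startswith(piece, i):
                reachable[i + len(piece)] = True
    return reachable[len(s)]


def is_substring_cover(strings: list[str], cover: set[str]) -> bool:
    """Check that every element of cover is a substring of some input string
    and that every input string is a concatenation of cover elements."""
    drawn_from_input = all(any(piece in s for s in strings) for piece in cover)
    return drawn_from_input and all(can_concatenate(s, cover) for s in strings)
```

For example, under these assumptions `is_substring_cover(["abab", "aba", "b"], {"ab", "a", "b"})` holds, while the single-element candidate `{"ab"}` fails because "aba" cannot be assembled from it.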
###
The Minimum Substring Cover Problem
[chapter]

*Approximation and Online Algorithms*

doi:10.1007/978-3-540-77918-6_14
dblp:conf/waoa/HermelinRRV07
fatcat:wpreleq5lrbztjkyk7liynwtyq
In this paper we consider the problem of covering a set of strings S with a set C of substrings in S, where C is said to cover S if every string in S can be written as a concatenation of the substrings ... We then proceed to show that this problem is at least as hard as the Minimum Set Cover problem. ... In our covering problem, the base elements are strings and the covering elements are their substrings. ...
###
Generalized approximate regularities in strings

2008
*International Journal of Computer Mathematics*

doi:10.1080/00207160701389168
fatcat:qnrb7ega45fdjidtz6yqwwlopa
Given a string x of length n and an integer λ, the minimum approximate λ-cover (resp. seed) problem is to find a set of λ substrings each of equal length that covers x (resp. a superstring of x) with the ... We concentrate on the generalized string regularities and study the minimum approximate λ-cover problem and the minimum approximate λ-seed problem of a string. ... Solving the Minimum Approximate λ-cover Problem Now we turn to investigate the minimum approximate λ-cover problem. ...
###
New complexity results for the k-covers problem

2011
*Information Sciences*

The k-covers problem (kCP) asks us to compute a minimum cardinality set of strings of given length k > 1 that covers a given string. ... It was shown in a recent paper, by reduction to 3-SAT, that the k-covers problem is NP-complete. ... Open Problems We have shown that for k ≥ 2, the k-covers problem kCP is equivalent to RVCP_k, hence that efficient algorithms can be used to approximate a minimum k-cover as specified in Section 3. ...

###
Solving the Minimum String Cover Problem
[chapter]

2012
*2012 Proceedings of the Fourteenth Workshop on Algorithm Engineering and Experiments (ALENEX)*

doi:10.1137/1.9781611972924.8
dblp:conf/alenex/CanzarMRS12
fatcat:irvnjdjjbfabjh33zjnxu7ailm
Given costs assigned to each substring from S, the Minimum String Cover (MSC) problem asks for a cover of minimum total cost. ... A string cover C of a set of strings S is a set of substrings from S such that every string in S can be written as a concatenation of the strings in C. ... In the basic Minimum String Cover (MSC) problem, we want to find a cover with minimal cardinality. In a more general version, we assign costs to substrings and aim at a cover of minimal total cost. ...
###
COVERING A CIRCULAR STRING WITH SUBSTRINGS OF FIXED LENGTH

1996
*International Journal of Foundations of Computer Science*

doi:10.1142/s0129054196000075
fatcat:zfvdma7rcjd4tpxzg2m4nlowjm
In this paper we consider the problem of determining the minimum cardinality of a set U_k which guarantees that every circular string C(x) of length n ≥ k can be covered. ... m, where u_k = |U_k| and is the size of the alphabet on which the strings are defined. The problem has application to DNA sequencing by hybridization using oligonucleotide probes. ... Clearly each such chip, or set of substrings, guarantees that every position in x is contained in a substring which matches one of the substrings on the chip: we say then that U_k covers (or is a cover ...
###
Genomic Distances under Deletions and Insertions

2004
*Theoretical Computer Science*

doi:10.1016/j.tcs.2004.02.039
fatcat:6hjzacvhpbbmdbxzfyxmy2y3tu
As more and more genomes are sequenced, evolutionary biologists are becoming increasingly interested in evolution at the level of whole genomes, in scenarios in which the genome evolves through insertions ... In the mathematical model pioneered by Sankoff and others, a unichromosomal genome is represented by a signed permutation of a multi-set of genes; Hannenhalli and Pevzner showed that the edit distance ... This work is supported by the National Science Foundation under grants ACI 00-81404, DEB 01-20709, EIA 01-13095, EIA 01-21377, and EIA 02-03584. ...
###
Genomic Distances under Deletions and Insertions
[chapter]

2003
*Lecture Notes in Computer Science*

doi:10.1007/3-540-45071-8_54
fatcat:tfee3g6jd5fjhdtifksjaqecma
As more and more genomes are sequenced, evolutionary biologists are becoming increasingly interested in evolution at the level of whole genomes, in scenarios in which the genome evolves through insertions ... In the mathematical model pioneered by Sankoff and others, a unichromosomal genome is represented by a signed permutation of a multi-set of genes; Hannenhalli and Pevzner showed that the edit distance ... This work is supported by the National Science Foundation under grants ACI 00-81404, DEB 01-20709, EIA 01-13095, EIA 01-21377, and EIA 02-03584. ...
###
Computing regularities in strings: A survey

2013
*European journal of combinatorics (Print)*

The aim of this survey is to provide insight into the sequential algorithms that have been proposed to compute exact "regularities" in strings; that is, covers (or quasiperiods), seeds, repetitions, runs ... After outlining and evaluating the algorithms that have been proposed for their computation, I suggest possibly productive future directions of research. ... It was shown further that the decision problem is a special case of the set cover problem, hence the minimum k-cover can be computed to within a logarithmic factor by an efficient greedy algorithm. ...
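
The set-cover view mentioned in this snippet can be illustrated with the textbook greedy heuristic. This is a sketch under stated assumptions, not the algorithm from the survey: it assumes a k-cover is a set of length-k substrings of x whose occurrences together touch every position of x, and the function name `greedy_k_cover` is hypothetical. The result is an approximation and need not be a minimum k-cover.

```python
from __future__ import annotations


def greedy_k_cover(x: str, k: int) -> list[str]:
    """Greedily pick length-k substrings of x until every position of x is covered."""
    if not 1 <= k <= len(x):
        raise ValueError("k must satisfy 1 <= k <= len(x)")

    # For each distinct k-mer, collect the positions of x touched by its occurrences.
    covered_by: dict[str, set[int]] = {}
    for i in range(len(x) - k + 1):
        covered_by.setdefault(x[i:i + k], set()).update(range(i, i + k))

    uncovered = set(range(len(x)))
    chosen: list[str] = []
    while uncovered:
        # Classic greedy set-cover step: take the k-mer covering the most
        # still-uncovered positions (ties broken lexicographically).
        best = max(sorted(covered_by), key=lambda w: len(covered_by[w] & uncovered))
        chosen.append(best)
        uncovered -= covered_by[best]
    return chosen
```

Every position of x lies inside at least one length-k window when k ≤ len(x), so each round covers at least one new position and the loop terminates.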

###
Implementing approximate regularities

2005
*Mathematical and computer modelling*

doi:10.1016/j.mcm.2005.09.013
fatcat:w2cxwl2jjvc7nc4a7mhm4skmoa
We explore their similarities and differences and we implement algorithms for solving the smallest distance approximate period/cover/seed problem and the restricted smallest approximate period/cover/seed ... problem in polynomial time, under a variety of distance rules (the Hamming distance, the edit distance, and the weighted edit distance). ... Step 1 Step Restricted Smallest Approximate Period/Cover/Seed Problem Definition 5 Given a string x of length n, the Restricted Smallest Approximate Period/Cover/Seed problem is to find a substring ...
###
Memory efficient minimum substring partitioning

2013
*Proceedings of the VLDB Endowment*

doi:10.14778/2535569.2448951
fatcat:qqjxkmuqcfexpjqcw4mhrca63m
We propose a disk-based partition method, called Minimum Substring Partitioning (MSP), to complete the task using less than 10 gigabytes memory, without runtime slowdown. ... By leveraging the overlaps among the k-mers (substring of length k), MSP achieves astonishing compression ratio: The total size of partitions is reduced from Θ(kn) to Θ(n), where n is the size of the short ... Here the p-substrings are sorted according to the percentage of k-mers they cover. The figure uses logarithm on both axes. ...
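
The overlap idea in this snippet can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes the minimum p-substring of a k-mer is its lexicographically smallest length-p substring, it merges consecutive k-mers of a read that share that minimum into one longer string, and it ignores reverse complements and disk partitioning. The names `minimum_substring` and `super_kmers` are hypothetical.

```python
from __future__ import annotations


def minimum_substring(kmer: str, p: int) -> str:
    """Lexicographically smallest length-p substring of a k-mer."""
    return min(kmer[i:i + p] for i in range(len(kmer) - p + 1))


def super_kmers(read: str, k: int, p: int) -> list[tuple[str, str]]:
    """Split a read into maximal runs of consecutive k-mers that share the same
    minimum p-substring. Each run is returned as (minimum p-substring, merged string)."""
    if not 1 <= p <= k <= len(read):
        raise ValueError("need 1 <= p <= k <= len(read)")

    runs: list[tuple[str, str]] = []
    start = 0  # index of the first k-mer in the current run
    current = minimum_substring(read[:k], p)
    for i in range(1, len(read) - k + 1):
        m = minimum_substring(read[i:i + k], p)
        if m != current:
            # The k-mers start..i-1 overlap, so together they span read[start:i-1+k].
            runs.append((current, read[start:i - 1 + k]))
            start, current = i, m
    runs.append((current, read[start:]))
    return runs
```

With the read "GATTACA", k = 4 and p = 2, the four k-mers (16 characters in total) collapse into two merged strings of 5 characters each, which is the kind of saving the Θ(kn) to Θ(n) claim refers to.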
###
Memory Efficient De Bruijn Graph Construction
[article]

2012
*arXiv*
pre-print

arXiv:1207.3532v1
fatcat:7f75zvbmofbfhann652zph2ztq
We propose a disk-based partition method, called Minimum Substring Partitioning (MSP), to complete the task using less than 10 gigabytes memory, without runtime slowdown. ... By leveraging the overlaps among the k-mers (substring of length k), MSP achieves astonishing compression ratio: The total size of partitions is reduced from Θ(kn) to Θ(n), where n is the size of the short ... In this study, we resort to another approach to bypass this problem. Definition 3 (Minimum Substring [20]). ...
###
Approximate Periods of Strings
[chapter]

1999
*Lecture Notes in Computer Science*

doi:10.1007/3-540-48452-3_10
fatcat:gvcnmruo5zdvja4yn3r6so4qb4
We consider three related problems, for two of which we derive polynomial-time algorithms; we then show that the third problem is NP-complete. ... The study of approximately periodic strings is relevant to diverse applications such as molecular biology, data compression, and computer-assisted music analysis. ... A two-dimensional variant of the covering problem was studied in [11, 14], and minimum covering by substrings of a given length in [19]. • Seeds: Iliopoulos et al. ...
###
Approximate periods of strings

2001
*Theoretical Computer Science*

doi:10.1016/s0304-3975(00)00365-0
fatcat:nf6c26kezralldevgqiophx66i
We consider three related problems, for two of which we derive polynomial-time algorithms; we then show that the third problem is NP-complete. ... The study of approximately periodic strings is relevant to diverse applications such as molecular biology, data compression, and computer-assisted music analysis. ... A two-dimensional variant of the covering problem was studied in [11, 14], and minimum covering by substrings of a given length in [19]. • Seeds: Iliopoulos et al. ...
###
At the roots of dictionary compression: string attractors

2018
*Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing - STOC 2018*

doi:10.1145/3188745.3188814
dblp:conf/stoc/KempaP18
fatcat:pr67vs6oynhkdlcpbqcwrb5qmu
CCS CONCEPTS • Theory of computation → Data compression; Packing and covering problems; Problems, reductions and completeness; Cell probe models and lower bounds; ... In this paper, we show that these techniques are different solutions to the same, elegant, combinatorial problem: to find a small set of positions capturing all distinct text's substrings. ... ACKNOWLEDGMENTS We received useful suggestions from many people during the writeup of this paper. ...
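
The combinatorial problem described in this snippet can be made concrete with a brute-force checker. This is a sketch under an assumed reading of "capturing": a set of positions captures a substring when at least one occurrence of that substring spans one of the chosen positions. The function name `is_string_attractor` and the 0-based indexing are choices made for this example, and the cubic-time check is for illustration only.

```python
from __future__ import annotations


def is_string_attractor(text: str, positions: set[int]) -> bool:
    """Return True if every distinct non-empty substring of text has at least
    one occurrence that spans a position in `positions` (0-based indices)."""
    n = len(text)
    for length in range(1, n + 1):
        seen: set[str] = set()      # all distinct substrings of this length
        captured: set[str] = set()  # those with an occurrence crossing a chosen position
        for i in range(n - length + 1):
            sub = text[i:i + length]
            seen.add(sub)
            if any(i <= pos < i + length for pos in positions):
                captured.add(sub)
        if captured != seen:
            return False
    return True
```

For "abab", the positions {0, 1} pass this check, while {0} alone does not, because no occurrence of "b" spans position 0.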
