The Minimum Substring Cover Problem [chapter]

Danny Hermelin, Dror Rawitz, Romeo Rizzi, Stéphane Vialette
Approximation and Online Algorithms  
In this paper we consider the problem of covering a set of strings S with a set C of substrings in S, where C is said to cover S if every string in S can be written as a concatenation of the substrings in C. We discuss applications for the problem that arise in the context of computational biology and formal language theory. We then proceed to show that this problem is at least as hard as the Minimum Set Cover problem. In the main part of the paper, we focus on devising approximation algorithms for the problem using two generic paradigms: the local-ratio technique and linear programming rounding.

Introduction

In a covering problem we are faced with the following situation: we are given two (not necessarily disjoint) sets of elements, the base elements and the covering elements, and the goal is to find a minimum (weight) subset of covering elements that "covers" all the base elements. The exact notion of covering differs from problem to problem, yet this abstract setting is common to many classical combinatorial problems in various application areas. Two famous examples are Minimum Set Cover (where the covering elements are subsets of the base elements and the notion of covering corresponds to set inclusion) and Minimum Vertex Cover (where the setting is graph-theoretic and the notion of covering corresponds to incidence between vertices and edges).

Ever since the early days of combinatorial optimization, research on covering problems such as the two examples above proved extremely fruitful in laying down fundamental techniques and ideas. The early work of Johnson [13] and Lovász [14] on Minimum Set Cover pioneered the greedy analysis approach, while Chvátal [7] gave the first analysis based on linear programming (LP) while tackling the same problem. The first LP-rounding algorithm by Hochbaum [12] was also designed for Minimum Set Cover, while Bar-Yehuda and Even gave the first Primal-Dual [3] and Local-Ratio [4] algorithms for Minimum Vertex Cover.

In this paper we introduce a new covering problem which resides in the realm of strings. A string c is a substring of a string s, if c can be obtained by deleting any number of consecutive
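The covering condition stated in the abstract (every string in S can be written as a concatenation of strings in C) can be checked mechanically. The following is a minimal sketch of such a check, not the paper's algorithm: the function name `covers` and its parameters are hypothetical, it assumes that a string in C may be used more than once within a concatenation, and it does not verify that the elements of C are themselves substrings of strings in S.

```python
def covers(cover, strings):
    """Return True if every string in `strings` is a concatenation of
    strings drawn from `cover` (reuse allowed; an assumption of this sketch)."""
    pieces = set(cover)

    def is_concatenation(s):
        # reachable[i] is True when the prefix s[:i] can be written as a
        # concatenation of pieces; the empty prefix is trivially reachable.
        reachable = [True] + [False] * len(s)
        for i in range(len(s)):
            if not reachable[i]:
                continue
            for p in pieces:
                # Skip empty pieces: they never extend a prefix.
                if p and s.startswith(p, i):
                    reachable[i + len(p)] = True
        return reachable[len(s)]

    return all(is_concatenation(s) for s in strings)
```

For example, `covers({"a", "b"}, ["ab", "ba"])` holds, whereas `covers({"ab"}, ["ab", "ba"])` does not, because "ba" cannot be assembled from copies of "ab". This only verifies a candidate cover; finding a minimum one is the hard problem the paper addresses.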
doi:10.1007/978-3-540-77918-6_14 dblp:conf/waoa/HermelinRRV07 fatcat:wpreleq5lrbztjkyk7liynwtyq