The Complexity of Separation for Levels in Concatenation Hierarchies

Thomas Place, Marc Zeitoun, Michael Wagner
2018 Foundations of Software Technology and Theoretical Computer Science  
We investigate the complexity of the separation problem associated with classes of regular languages. For a class C, C-separation takes two regular languages as input and asks whether there exists a third language in C which includes the first and is disjoint from the second. First, in contrast with the situation for the classical membership problem, we prove that for most classes C, the complexity of C-separation does not depend on how the input languages are represented: it is the same for nondeterministic finite automata and monoid morphisms. Then, we investigate specific classes belonging to finitely based concatenation hierarchies. It was recently proved that the problem is always decidable for levels 1/2 and 1 of any such hierarchy (with inefficient algorithms). Here, we build on these results to show that when the alphabet is fixed, there are polynomial time algorithms for both levels. Finally, we investigate levels 3/2 and 2 of the famous Straubing-Thérien hierarchy. We show that separation is PSpace-complete for level 3/2 and between PSpace-hard and EXPTime for level 2.

ACM Subject Classification: Theory of computation → Formal languages and automata theory

... with n states, the smallest monoid recognizing the same language may have an exponential number of elements (the standard construction yields 2^(n^2) elements). This explains why the complexity of the membership problem depends on the representation of the input. For instance, for the class of star-free languages, it is PSpace-complete if one starts from NFAs (and actually, even from DFAs [2]) while it is in NL when starting from monoid morphisms.

Recently, another problem, called separation, has replaced membership as the cornerstone in the investigation of regular languages. It takes as input two regular languages instead of one, and asks whether there exists a third language from the class under investigation that includes the first input language and has empty intersection with the second one. This problem has recently served as a major ingredient in the resolution of difficult membership problems, such as the so-called dot-depth two problem [16], which remained open for 40 years (see [13, 18, 6] for recent surveys on the topic). Dot-depth two is a class belonging to a famous concatenation hierarchy which stratifies the star-free languages: the dot-depth [1]. A specific concatenation hierarchy is built in a generic way.
One starts from a base class (level 0 of the hierarchy) and builds increasingly large classes (called levels and denoted by 1/2, 1, 3/2, 2, ...) by alternating two standard closure operations: polynomial and Boolean closure. Concatenation hierarchies account for a significant part of the open questions in this research area. The state of the art regarding separation is captured by only three results [17, 9]: in finitely based concatenation hierarchies (i.e. those whose basis is a finite class) levels 1/2, 1 and 3/2 have decidable separation. Moreover, using specific transfer results [15], this can be pushed to the levels 3/2 and 2 for the two most famous finitely based hierarchies: the dot-depth [1] and the Straubing-Thérien hierarchy [21, 22].

Unlike the situation for membership and despite these recent decidability results for separability in concatenation hierarchies, the complexity of the problems and of the corresponding algorithms has not been investigated so far (except for the class of piecewise testable languages [3, 11, 5], which is level 1 in the Straubing-Thérien hierarchy). The aim of this paper is to establish such complexity results. Our contributions are the following:

- We present a generic reduction, which shows that for many natural classes, the way the input is given (by NFAs or finite monoids) has no impact on the complexity of the separation problem. This is proved using two LogSpace reductions from one problem to the other. This situation is surprising and opposite to that of the membership problem, where an exponential blow-up is unavoidable when going from NFAs to monoids.
- Building on the results of [17], we show that when the alphabet is fixed, there are polynomial time algorithms for levels 1/2 and 1 in any finitely based hierarchy.
- We investigate levels 3/2 and 2 of the famous Straubing-Thérien hierarchy, and we show that separation is PSpace-complete for level 3/2 and between PSpace-hard and EXPTime for level 2.
The upper bounds are based on the results of [17] while the lower bounds are based on independent reductions.

Organization. In Section 2, we give preliminary terminology on the objects investigated in the paper. Sections 3, 4 and 5 are then devoted to the three above points. Due to space limitations, many proofs are postponed to the full version of the paper.

Preliminaries

In this section, we present the key objects of this paper. We define words and regular languages, classes of languages, the separation problem and finally, concatenation hierarchies.

Words and regular languages

An alphabet is a finite set A of symbols, called letters. Given some alphabet A, we denote by A^+ the set of all nonempty finite words and by A^* the set of all finite words over A (i.e., A^* = A^+ ∪ {ε}). If u ∈ A^* and v ∈ A^* we write u · v ∈ A^* or uv ∈ A^* for the concatenation of u and v. A language over an alphabet A is a subset of A^*. Abusing terminology, if u ∈ A^* is some word, we denote by u the singleton language {u}. It is standard to extend concatenation to languages: given K, L ⊆ A^*, we write KL = {uv | u ∈ K and v ∈ L}. Moreover, we also consider marked concatenation, which is less standard. Given K, L ⊆ A^*, a marked concatenation of K with L is a language of the form KaL, for some a ∈ A.

We consider regular languages, which can be equivalently defined by regular expressions, nondeterministic finite automata (NFAs), finite monoids or monadic second-order logic (MSO). In the paper, we investigate the separation problem which takes regular languages as input. Since we are focused on complexity, how we represent these languages in our inputs matters. We shall consider two kinds of representations: NFAs and monoids. Let us briefly recall these objects and fix the terminology (we refer the reader to [7] for details).
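
To make the gap between the two representations concrete, here is a minimal Python sketch of the standard transition-monoid construction for an NFA. It is an illustration only, not code from the paper: the function name `transition_monoid`, the encoding of states as integers 0..n-1, and the dictionary format for the transition relation are all assumptions made for this example. Each monoid element is the binary relation on states induced by some word, so an NFA with n states yields at most 2^(n^2) elements.

```python
def transition_monoid(n_states, alphabet, delta):
    """Compute the transition monoid of an NFA (hypothetical helper).

    delta maps (state, letter) to a set of successor states.
    Each monoid element is a frozenset of pairs (p, q), meaning that
    some run on the corresponding word leads from state p to state q.
    """
    states = range(n_states)

    def compose(r, s):
        # Relational composition: follow r first, then s.
        return frozenset(
            (p, q2) for (p, q) in r for (q1, q2) in s if q == q1
        )

    # The empty word induces the identity relation.
    identity = frozenset((q, q) for q in states)

    # Each letter induces one generator of the monoid.
    generators = [
        frozenset((p, q) for p in states for q in delta.get((p, a), ()))
        for a in alphabet
    ]

    # Close the identity under right multiplication by the generators.
    elements = {identity}
    frontier = [identity]
    while frontier:
        r = frontier.pop()
        for g in generators:
            rg = compose(r, g)
            if rg not in elements:
                elements.add(rg)
                frontier.append(rg)
    return elements


# A two-state NFA over {a, b} that guesses the last letter is 'a'.
delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
monoid = transition_monoid(2, "ab", delta)
print(len(monoid), "elements; upper bound", 2 ** (2 * 2))
```

On this small automaton the monoid stays well below the 2^(n^2) bound; the bound only describes the worst case of the construction, which is what makes the choice of input representation matter for membership.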
doi:10.4230/lipics.fsttcs.2018.47 dblp:conf/fsttcs/PlaceZ18 fatcat:wbhnpdtok5f37pwghrju7v4l2m