### Scale Reduction Techniques for Computing Maximum Induced Bicliques

Shahram Shahinpour, Shirin Shirvani, Zeynep Ertem, Sergiy Butenko
2017 Algorithms
Given a simple, undirected graph G, a biclique is a subset of vertices inducing a complete bipartite subgraph in G. In this paper, we consider two associated optimization problems, the maximum biclique problem, which asks for a biclique of the maximum cardinality in the graph, and the maximum edge biclique problem, aiming to find a biclique with the maximum number of edges in the graph. These NP-hard problems find applications in biclustering-type tasks arising in complex network analysis.
Real-life instances of these problems often involve massive, but sparse networks. We develop exact approaches for detecting optimal bicliques in large-scale graphs that combine effective scale reduction techniques with integer programming methodology. Results of computational experiments with numerous real-life network instances demonstrate the performance of the proposed approach.

[...] called an independent set if G[C] is edgeless, and C is a clique if G[C] is a complete graph (i.e., has all possible edges). Let α(G) be the independence number of G, which is the cardinality of a maximum independent set in G. A graph is called bipartite if its set of vertices can be partitioned into two non-overlapping independent sets, referred to as parts.

[...] is a complete bipartite graph, which is a bipartite graph with all possible edges between the parts. A biclique is called maximal if it is not a subset of a larger biclique, and maximum if there is no larger biclique in G. Note that according to Definition 1, an independent set is a biclique with one of the parts being an empty set. However, from a practical perspective, one is not interested in independent set solutions when searching for large bicliques in a graph. This observation is reflected in the definitions of the corresponding optimization problems given next.

Definition 2. Given a simple, undirected graph G = (V, E), the maximum biclique (MB) problem is to find a maximum cardinality biclique in G that is not an independent set.

Definition 3. Given a simple, undirected graph G = (V, E), the maximum edge biclique (MEB) problem is to find a biclique with the maximum number of edges in G.
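The difference between the MB and MEB objectives can be seen on a toy graph. The following is a minimal sketch, not the scale reduction or integer programming approach of the paper: it enumerates all vertex subsets, so it is only usable on very small graphs. The function names `build_adjacency`, `biclique_parts`, and `brute_force_mb_meb` are hypothetical and were chosen for this illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def biclique_parts(adj, subset):
    """Return the two parts (A, B) if `subset` induces a complete bipartite
    graph with two non-empty parts, otherwise None."""
    s = set(subset)
    if len(s) < 2:
        return None
    v = next(iter(s))
    part_b = s & adj[v]   # neighbors of v inside the subset
    part_a = s - part_b   # v and its non-neighbors inside the subset
    if not part_b:        # v has no neighbor: independent set or not a biclique
        return None
    for part in (part_a, part_b):
        if any(y in adj[x] for x, y in combinations(part, 2)):
            return None   # an edge inside a part
    if any(b not in adj[a] for a in part_a for b in part_b):
        return None       # a missing edge between the parts
    return part_a, part_b


def brute_force_mb_meb(adj):
    """Enumerate all vertex subsets; exponential, for toy graphs only."""
    best_mb = best_meb = None
    vertices = sorted(adj)
    for k in range(2, len(vertices) + 1):
        for subset in combinations(vertices, k):
            parts = biclique_parts(adj, subset)
            if parts is None:
                continue
            a, b = parts
            if best_mb is None or len(a) + len(b) > len(best_mb[0]) + len(best_mb[1]):
                best_mb = parts
            if best_meb is None or len(a) * len(b) > len(best_meb[0]) * len(best_meb[1]):
                best_meb = parts
    return best_mb, best_meb


# Toy graph: a star K_{1,5} (vertices 0..5) plus a disjoint K_{2,3} (vertices 6..10).
star = [(0, leaf) for leaf in range(1, 6)]
k23 = [(a, b) for a in (6, 7) for b in (8, 9, 10)]
adj = build_adjacency(star + k23)
mb, meb = brute_force_mb_meb(adj)
print(sorted(mb[0] | mb[1]))    # vertices of a maximum biclique
print(sorted(meb[0] | meb[1]))  # vertices of a maximum edge biclique
```

On this toy graph the two problems have different optimal solutions: the star has six vertices but only five edges, whereas the K_{2,3} has five vertices but six edges.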
Different algorithms have been proposed for biclique community detection, ranging from enumeration of all maximal (non-induced) bicliques of a graph [13, 15-18], finding a maximum edge cardinality biclique [5], and finding maximum balanced bicliques [19] to exact, exponential-time methods [20], approximation algorithms [21], and mining quasi-bicliques [22, 23]. In particular, the MB problem has been proved to be NP-hard in general graphs, but polynomial-time solvable in bipartite graphs [21]. The maximum edge biclique (MEB) problem has been successfully used for biclustering and formal concept analysis [4, 5, 24]. It has been proved to be NP-hard in general and hard to approximate even for bipartite graphs [25, 26], but polynomial-time solvable in convex bipartite and biconvex graphs [5]. (A bipartite graph G = (V_1, V_2, E) is called convex on V_2 if there exists an ordering of the vertices of V_2 such that for any v ∈ V_1, N_G(v) consists of vertices that are consecutive in V_2. The graph G is biconvex if it is convex on both V_1 and V_2.) The maximum edge weight biclique problem, an edge-weighted generalization of MEB, has also been considered in the literature [11, 27].

It should be noted that the NP-hardness of the MB problem in some of the previous publications [21, 25] has been claimed based on a more general result by Yannakakis [28], as discussed next. A graph property Π is said to be hereditary on induced subgraphs if, for a graph G with property Π, the deletion of any subset of vertices does not produce a graph violating Π. A property Π is said to be nontrivial if it is true for a single-vertex graph and is not satisfied by every graph. Also, a property is said to be interesting if there are arbitrarily large graphs satisfying Π. The maximum Π problem is to find the largest order induced subgraph that does not violate property Π.
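As a side illustration of the convexity definition given in the parenthetical remark above, the sketch below checks whether a given ordering of V_2 makes every neighborhood N_G(v), v ∈ V_1, consecutive. It also searches for such an ordering by brute force over all permutations. This is only a direct transcription of the definition for tiny examples, not one of the polynomial-time methods cited in the text; the names `is_convex_under` and `find_convex_ordering` are hypothetical.

```python
from itertools import permutations


def is_convex_under(order, neighborhoods):
    """True if every neighborhood occupies consecutive positions in `order`.

    `neighborhoods` maps each vertex of V_1 to the set of its neighbors in V_2.
    """
    position = {u: i for i, u in enumerate(order)}
    for nbrs in neighborhoods.values():
        if not nbrs:
            continue
        idx = [position[u] for u in nbrs]
        # Distinct positions are consecutive exactly when their span equals their count.
        if max(idx) - min(idx) + 1 != len(idx):
            return False
    return True


def find_convex_ordering(v2, neighborhoods):
    """Brute force over all orderings of V_2; factorial time, toy inputs only."""
    for order in permutations(v2):
        if is_convex_under(order, neighborhoods):
            return list(order)
    return None


# Hypothetical example: V_1 = {a, b, c}, V_2 = {1, 2, 3, 4}.
nbrs = {"a": {1, 3}, "b": {3, 4}, "c": {1, 2}}
print(is_convex_under([1, 2, 3, 4], nbrs))  # N(a) = {1, 3} is split by vertex 2
print(is_convex_under([2, 1, 3, 4], nbrs))  # every neighborhood is consecutive
```

A graph in which three vertices of V_1 have neighborhoods {1, 2}, {2, 3}, and {1, 3} is not convex on V_2 = {1, 2, 3}, since no linear ordering of three vertices makes all three pairs adjacent.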
Yannakakis [28] proved that the maximum Π problem for nontrivial, interesting graph properties that are hereditary on induced subgraphs is NP-hard. If an independent set is accepted as a feasible solution for the MB problem, then clearly the property "a graph whose set of vertices is a biclique" is nontrivial, interesting, and hereditary on induced subgraphs, which proves that this version of the MB problem (that accepts independent set solutions) is NP-hard. However, requiring bicliques to have two nonempty parts (as in this paper and in [21]) violates the heredity property, since removing the central vertex of a star graph gives an independent set, which is infeasible under this requirement. Thus, the result of Yannakakis is not applicable to this more practically reasonable version of the MB problem. Nonetheless, the reduction from the independent set problem used in [21] does prove that the non-hereditary version of the MB problem is indeed NP-hard. On a practical note, the lack of heredity implies that the general-purpose combinatorial branch-and-bound framework known as Russian Doll Search (RDS) [29, 30] is not directly applicable to the problems considered in this paper.
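The star graph argument above can be checked mechanically. The sketch below, written under the assumption that graphs are stored as adjacency sets, tests the biclique property in both variants (independent sets allowed or not) and lists the single-vertex deletions that destroy it. The function names `induces_biclique` and `violating_deletions` are hypothetical.

```python
def induces_biclique(adj, subset, allow_empty_part):
    """True if `subset` induces a complete bipartite graph.

    With allow_empty_part=True, independent sets count as bicliques;
    with allow_empty_part=False, both parts must be non-empty.
    """
    s = set(subset)
    if not s:
        return False
    v = next(iter(s))
    part_b = s & adj[v]   # neighbors of v inside the subset
    part_a = s - part_b   # v and its non-neighbors inside the subset
    if not part_b and not allow_empty_part:
        return False
    # No edges inside either part.
    independent = all(not (adj[x] & part) for part in (part_a, part_b) for x in part)
    # Every vertex of part_a is adjacent to every vertex of part_b.
    complete = all(part_b <= adj[a] for a in part_a)
    return independent and complete


def violating_deletions(adj, vertices, allow_empty_part):
    """Vertices whose removal from `vertices` destroys the biclique property."""
    vs = set(vertices)
    return [v for v in sorted(vs)
            if not induces_biclique(adj, vs - {v}, allow_empty_part)]


# Star K_{1,4}: center 0, leaves 1..4.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(violating_deletions(star, star, allow_empty_part=True))   # no deletion breaks it
print(violating_deletions(star, star, allow_empty_part=False))  # deleting the center breaks it
```

When independent sets are allowed, no deletion breaks the property on the star. When two nonempty parts are required, deleting the center leaves only the leaves, which form an independent set and are therefore infeasible.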