Finding biclusters by random projections

Stefano Lonardi, Wojciech Szpankowski, Qiaofeng Yang
2006 Theoretical Computer Science  
Given a matrix X composed of symbols, a bicluster is a submatrix of X obtained by removing some of the rows and some of the columns of X in such a way that each row of what is left reads the same string. In this paper, we are concerned with the problem of finding the bicluster with the largest area in a large matrix X. The problem is first proved to be NP-complete. We present a fast and efficient randomized algorithm that discovers the largest bicluster by random projections. A detailed
... istic analysis of the algorithm and an asymptotic study of the statistical significance of the solutions are given. We report results of extensive simulations on synthetic data.
doi:10.1016/j.tcs.2006.09.023 fatcat:moai2q3rk5gblodxq6qyglioqu
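
The abstract's definition of a bicluster (a set of rows and a set of columns on which every selected row reads the same string) and its random-projection idea can be illustrated with a minimal sketch. This is not the authors' algorithm as published; it is one plausible reading, assuming that each trial samples `k` columns at random, groups the rows that agree on those columns, and then extends each group to every column on which its rows agree. The function name `largest_bicluster_by_projection` and the parameters `k`, `trials`, `min_rows`, and `seed` are hypothetical, chosen only for this illustration.

```python
import random
from collections import defaultdict


def largest_bicluster_by_projection(X, k, trials, min_rows=2, seed=0):
    """Heuristic search for a large-area bicluster in a symbol matrix X.

    X is a list of equal-length strings (or lists of symbols).
    Returns (rows, cols); both are empty if no group reaches min_rows.
    All parameters are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    n_cols = len(X[0])
    best_rows, best_cols = [], []
    for _ in range(trials):
        # Random projection: look at the matrix through k random columns.
        sample = rng.sample(range(n_cols), k)
        groups = defaultdict(list)
        for i, row in enumerate(X):
            groups[tuple(row[j] for j in sample)].append(i)
        for rows in groups.values():
            if len(rows) < min_rows:
                continue
            # Keep every column on which all rows of the group agree.
            cols = [j for j in range(n_cols)
                    if len({X[i][j] for i in rows}) == 1]
            if len(rows) * len(cols) > len(best_rows) * len(best_cols):
                best_rows, best_cols = rows, cols
    return best_rows, best_cols


# Toy matrix with the string "xyz" planted in rows 0, 2, 3 and columns 1, 3, 4.
X = ["axbyzc",
     "bacaba",
     "cxayzb",
     "bxcyza",
     "acabcc"]
rows, cols = largest_bicluster_by_projection(X, k=2, trials=200)
print(rows, cols)
```

Because the search is randomized, a trial succeeds only when the sampled columns all fall inside the target bicluster, so the number of trials trades running time against the chance of missing the largest solution.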