cMonkey2: Automated, systematic, integrated detection of co-regulated gene modules for any organism

David J. Reiss, Christopher L. Plaisier, Wei-Ju Wu, Nitin S. Baliga
2015 Nucleic Acids Research  
The cMonkey integrated biclustering algorithm identifies conditionally co-regulated modules of genes (biclusters). cMonkey integrates various orthogonal pieces of information which support evidence of gene co-regulation, and optimizes biclusters to be supported simultaneously by one or more of these prior constraints. The algorithm served as the cornerstone for constructing the first global, predictive Environmental Gene Regulatory Influence Network (EGRIN) model for a free-living cell, and has now been applied to many more organisms. However, due to its computational inefficiencies, long run-time and complexity of various input data types, cMonkey was not readily usable by the wider community. To address these primary concerns, we have significantly updated the cMonkey algorithm and refactored its implementation, improving its usability and extendibility. These improvements provide a fully functioning and user-friendly platform for building co-regulated gene modules and the tools necessary for their exploration and interpretation. We show, via three separate analyses of data for E. coli, M. tuberculosis and H. sapiens, that the updated algorithm and inclusion of novel scoring functions for new data types (e.g. ChIP-seq and transcription factor over-expression [TFOE]) improve discovery of biologically informative co-regulated modules. The complete cMonkey2 software package, including source code, is available at https://github.com/baliga-lab/cmonkey2.
doi:10.1093/nar/gkv300 pmid:25873626 pmcid:PMC4513845 fatcat:3nkbr7wbojcg5mzuwoh3wtrqd4