Past Roadblocks and New Opportunities in Transcription Factor Network Mapping

Michael R. Brent
2016 Trends in Genetics  
One of the principal mechanisms by which cells differentiate and respond to changes in external signals or conditions is by changing the activity levels of transcription factors (TFs). This changes the transcription rates of target genes via the cell's TF network, which ultimately contributes to reconfiguring cellular state. Since microarrays provided our first window into global cellular state, computational biologists have eagerly attacked the problem of mapping TF networks, a key part of the
cell's control circuitry. In retrospect, however, steady-state mRNA abundance levels were a poor substitute for TF activity levels and gene transcription rates. Likewise, mapping TF binding through chromatin immunoprecipitation proved less predictive of functional regulation and less amenable to systematic elucidation of complete networks than originally hoped. This review explains these roadblocks and the current, unprecedented blossoming of new experimental techniques built on second-generation sequencing, which hold out the promise of rapid progress in TF network mapping.

The development of genome sequencing technologies is the paradigm for the broader group of technologies related to genomics and systems biology. Researchers first set their sights on sequencing a viral genome, then a bacterium, yeast, invertebrate models, and human. Despite much talk of the "post-genomic" era, the publication of the human genome now appears to be a taking-off point in the demand for genome sequencing, starting with other yeasts, invertebrates, and mammals for comparative genomics. This was followed by the sequencing of many individuals to sample population diversity. Now that genome sequencing is ...

... binding events are not necessarily functional or sequence-specific [70, 71, 82].
2. Binding potential. Models of TF binding specificity obtained from in vitro experiments complement in vivo location methods like ChIP-seq and can provide additional information about whether a physical interaction is sequence-specific.
3. Functional regulation. Transcript abundance data on cells in which a single TF has been perturbed can be used to determine whether the TF functionally regulates each target gene. Note that functional regulation does not imply binding.
4. Functional binding.
If a TF regulates a target by binding to a particular site or sites, TF perturbation should affect the target in wildtype cells, but not in cells where the site(s) have been removed. Such experiments have never been done on a genome-wide scale. A more feasible, if somewhat less definitive, experiment is to synthesize pairs of promoters/enhancers, one of which matches a WT genomic sequence and the other of which has a predicted TF binding site disabled. Thousands of pairs can be synthesized in parallel. If these sequences are fused to a minimal promoter driving a reporter gene and the two members of a pair express the reporter at different levels, that supports the hypothesis that the disabled TF binding site is functional (see [83] for a review of related methods).

This review focuses on systematic procedures ("algorithms") for mapping TF networks, which comprise both data generation and data analysis. We are currently in the midst of an explosion of experimental methods, each of which generates a new type of data. These new data types demand new computational approaches that can effectively analyze and integrate them for network mapping. If the field succeeds in developing TF network mapping algorithms that are as robust and scalable as genome sequencing, we can expect demand for network maps to follow the same trajectory as demand for genome sequences.
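
The paired-reporter comparison described under "Functional binding" can be illustrated with a minimal Python sketch. It is not the analysis used in the review or in the methods it cites: the `ReporterPair` type, the `site_effect` and `call_functional_sites` helpers, the two-fold cutoff, and the example values are all assumptions made here for illustration. The sketch only shows the core logic, namely that a predicted site is supported as functional when the wild-type construct and its site-disabled partner express the reporter at clearly different levels.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class ReporterPair:
    """Replicate reporter measurements (log2 expression) for one synthesized pair."""
    wildtype: List[float]  # construct matching the WT genomic sequence
    disabled: List[float]  # same construct with the predicted TF site disabled


def site_effect(pair: ReporterPair) -> float:
    """Mean log2 expression of the WT construct minus that of the disabled one."""
    return mean(pair.wildtype) - mean(pair.disabled)


def call_functional_sites(pairs: Dict[str, ReporterPair],
                          min_abs_effect: float = 1.0) -> Dict[str, bool]:
    """Flag pairs whose two members express the reporter at different levels.

    min_abs_effect is a hypothetical cutoff (1.0 is two-fold on a log2 scale).
    A real analysis would also test significance across replicates.
    """
    return {name: abs(site_effect(pair)) >= min_abs_effect
            for name, pair in pairs.items()}


# Made-up illustrative values, not data from the review.
example_pairs = {
    "site_A": ReporterPair(wildtype=[8.1, 7.9, 8.0], disabled=[5.9, 6.1, 6.0]),
    "site_B": ReporterPair(wildtype=[7.0, 7.2, 7.1], disabled=[7.1, 6.9, 7.0]),
}
calls = call_functional_sites(example_pairs)
```

In this toy input, disabling the site in `site_A` changes reporter expression by about two log2 units, so it is flagged, while `site_B` changes by about 0.1 and is not. The fixed threshold is a simplification; with thousands of pairs, a replicate-aware statistical test and multiple-testing correction would likely be needed.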
doi:10.1016/j.tig.2016.08.009 pmid:27720190 pmcid:PMC5117949 fatcat:jj2k3sxne5birgn2e46psxj5eu