Moment-based inference predicts bimodality in transient gene expression

C. Zechner, J. Ruess, P. Krenn, S. Pelet, M. Peter, J. Lygeros, H. Koeppl
2012 Proceedings of the National Academy of Sciences of the United States of America  
Recent computational studies indicate that the molecular noise of a cellular process may be a rich source of information about process dynamics and parameters. However, accessing this source requires stochastic models that are usually difficult to analyze. Therefore, parameter estimation for stochastic systems using distribution measurements, as provided for instance by flow cytometry, currently remains limited to very small and simple systems. Here we propose a new method that makes use of
... order moments of the measured distribution and thereby keeps the essential parts of the provided information, while still staying applicable to systems of realistic size. We demonstrate how cell-to-cell variability can be incorporated into the analysis, obviating the need for the ubiquitous assumption that the measurements stem from a homogeneous cell population. We demonstrate the method for a simple example of gene expression using synthetic data generated by stochastic simulation. Subsequently, we use time-lapsed flow cytometry data for the osmo-stress induced transcriptional response in budding yeast to calibrate a stochastic model, which is then used as a basis for predictions. Our results show that measurements of the mean and the variance can be enough to determine the model parameters, even if the measured distributions are not well-characterized by low-order moments only, e.g., if they are bimodal.
extrinsic variability | high-osmolarity glycerol pathway | moment dynamics | parameter inference | stochastic kinetic models
Building predictive computational models of intracellular reaction kinetics is still a dauntingly ill-posed task (1), characterized by low-dimensional experimental readouts of the hypothesized high-dimensional process. Single-cell technologies hold promise to partly alleviate this ill-posedness by exploiting the observed variability for the calibration of stochastic kinetic models (2, 3). The same technologies, however, also reveal that isogenic cells in a single population exhibit large cell-to-cell variability (4, 5). The variation can be shown to be a convolution of two sources, namely the intrinsic molecular noise and extrinsic factors that render single cells different even in the absence of molecular noise; in many cases, the latter was reported to dominate the former (4, 5). Extrinsic factors comprise differences in cell size, cell-cycle stage, expression capacity, and local growth conditions, to name but a few (6, 7).
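As a sketch of the two-source decomposition described above, the law of total variance gives one standard way to separate the contributions. The symbols X (a measured molecule count) and Z (the collection of extrinsic factors of a cell) are notation assumed here for illustration; they are not taken from the paper.

```latex
\operatorname{Var}[X]
  = \underbrace{\mathbb{E}\big[\operatorname{Var}[X \mid Z]\big]}_{\text{intrinsic contribution}}
  + \underbrace{\operatorname{Var}\big[\mathbb{E}[X \mid Z]\big]}_{\text{extrinsic contribution}}
```

The first term averages the molecular noise over cells that share the same extrinsic state. The second term measures how much the conditional mean shifts from cell to cell, and it remains nonzero even when molecular noise is absent.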
Thus, although single-cell technology offers a way out of the predicament of ill-posedness, it requires new methods to deal properly with intrinsic and extrinsic variability. The effect of extrinsic variability on the dynamics of stochastic models is studied in refs. 7 and 8, whereas first attempts have been made to address the inverse problem of quantifying the extrinsic (9) and any additional intrinsic (10) components from measurements. Because the latter is based on path sampling, its applicability remains limited to small systems. Naturally, extrinsic variability is bypassed when calibrating a stochastic model to one single cell (3, 11), for instance through live-cell imaging data. However, the extent to which such a single path observation of the hypothesized stochastic process, given the notoriously sparse acquisition times, can sufficiently confine unknown process parameters remains questionable.
Stochastic kinetic models, able to capture the intrinsic noise, were proposed for modeling single-cell data and its variability (2, 12, 13). Models that track probabilities over the integer-valued state-space of molecule-counts suffer from the curse of dimensionality and are computationally prohibitive for all but the simplest systems. Similar limitations apply to approximations thereof that retain the discreteness of the state-space (14, 15). While extracting sample paths of such processes is straightforward (16), acquiring their statistics, often necessary for calibration, is hampered by the slow convergence of empirical estimates for high-dimensional models (17). This is particularly challenging because most calibration or inference methods rely on iterative schemes, making it necessary to recompute statistics. Alternative methods set out to reduce the computational burden by tracking only low-order moments instead of the whole probability distribution.
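To illustrate what tracking only low-order moments can look like, here is a minimal sketch for a toy birth-death model of gene expression (production at constant rate `k`, degradation at rate `g` per molecule). The model, the function names, and the parameter values are assumptions chosen for illustration and are not the model used in the paper. For this linear system the mean and variance obey two ordinary differential equations that can be integrated directly, with no sampling of paths.

```python
import math


def moment_rhs(mean, var, k, g):
    """Time derivatives of the mean and variance of X for the reactions
    0 -> X (rate k) and X -> 0 (rate g * X)."""
    d_mean = k - g * mean
    d_var = k + g * mean - 2.0 * g * var
    return d_mean, d_var


def integrate_moments(k, g, t_end, steps=10000):
    """Integrate the two moment ODEs from X(0) = 0 with classical RK4."""
    h = t_end / steps
    mean, var = 0.0, 0.0
    for _ in range(steps):
        a1, b1 = moment_rhs(mean, var, k, g)
        a2, b2 = moment_rhs(mean + 0.5 * h * a1, var + 0.5 * h * b1, k, g)
        a3, b3 = moment_rhs(mean + 0.5 * h * a2, var + 0.5 * h * b2, k, g)
        a4, b4 = moment_rhs(mean + h * a3, var + h * b3, k, g)
        mean += h * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        var += h * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return mean, var


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for the demonstration.
    mean, var = integrate_moments(k=10.0, g=0.5, t_end=4.0)
    print("mean:", mean, "variance:", var)
```

Starting from zero molecules, this toy model has a Poisson distribution at every time, so the integrated mean and variance should coincide with `(k / g) * (1 - exp(-g * t))`. The equations are closed here only because both propensities are linear in the molecule count; with nonlinear propensities the low-order moment equations depend on higher-order moments and cannot be integrated on their own.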
A standard scheme in this class is moment closure, which provides a means to capture the stochasticity of reactions while leveraging the scalability of ordinary differential equation models (18–20). Here we introduce a moment-based inference scheme for calibrating stochastic models with heterogeneous single-cell measurements. We show how, by extending the method of moment closure with conditional moment equations, one can properly account for extrinsic factors. The proposed method requires no Monte Carlo simulations over extrinsic factors, making this approach very scalable. Moreover, besides parameter estimates and their confidence bounds, the method allows one to quantitatively characterize the cell-to-cell variability, ultimately dissecting the unspecific conglomerate of extrinsic factors (6). Every additionally accounted moment of the stochastic process can make the calibration less ill-posed; in the same way as the mean of the process contains information, so does its variance. Importantly, we show that this also holds true if the process is poorly captured by the accounted moments, for instance, if we just consider first and second order moments of a multimodal process distribution.
We instantiate this computational framework by addressing a widely discussed (and, we believe, ubiquitous) process motif, namely the transiently induced gene expression (21, 22). Often signaling pathways are activated for a short time window, in which the activated signaling output, such as a mitogen-activated protein kinase (MAPK), needs to initiate transcription by translocation and interaction with possibly several intermediates. If many intermediates need to be in place, some cells do not manage to transcribe at all, ultimately giving rise to bimodal protein expression profiles.
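The idea of accounting for extrinsic factors through conditional moments, without Monte Carlo sampling over those factors, can be sketched on a toy example. The following is an illustration under stated assumptions, not the authors' scheme: a birth-death model whose production rate `k` varies from cell to cell, with only the mean `k_mean` and variance `k_var` of that rate assumed known. All names and numbers are hypothetical. Conditional on `k`, the count is Poisson when starting from zero molecules, and the population moments follow from the law of total variance.

```python
import math


def conditional_moments(k, g, t):
    """Mean and variance of X(t) in a single cell with production rate k
    and degradation rate g, starting from X(0) = 0 (Poisson, so equal)."""
    a = (1.0 - math.exp(-g * t)) / g
    return k * a, k * a


def population_moments(k_mean, k_var, g, t):
    """Mean and variance of X(t) across a population in which the
    production rate k differs between cells (extrinsic variability).

    Only the first two moments of k are needed, so no sampling over k
    is performed.
    """
    a = (1.0 - math.exp(-g * t)) / g
    mean = k_mean * a
    intrinsic = k_mean * a        # E[ Var[X | k] ]
    extrinsic = k_var * a * a     # Var[ E[X | k] ]
    return mean, intrinsic + extrinsic


if __name__ == "__main__":
    # Hypothetical values: average rate 10, rate variance 4, decay 0.5.
    mean, var = population_moments(k_mean=10.0, k_var=4.0, g=0.5, t=100.0)
    print("population mean:", mean, "population variance:", var)
```

In this sketch the population variance exceeds the mean whenever `k_var` is positive, which is the kind of signature that lets measured variances carry information about extrinsic variability beyond what the mean provides.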
The particular case study we consider is the high-osmolarity glycerol (HOG) pathway in budding yeast (23), where for intermediate induction levels a bimodality in the induced stress proteins was observed (21). We perform time-lapsed flow cytometry measurements to calibrate a stochastic kinetic
doi:10.1073/pnas.1200161109 pmid:22566653 pmcid:PMC3361437 fatcat:egkzjrx3p5fmdncg2tqpgfmu2e