Bayesian estimation for stochastic gene expression using multifidelity models

2018 ◽  
Author(s):  
Huy D. Vo ◽  
Zachary Fox ◽  
Ania Baetica ◽  
Brian Munsky

Abstract: The finite state projection (FSP) approach to solving the chemical master equation has enabled successful inference of discrete stochastic models to predict single-cell gene regulation dynamics. Unfortunately, the FSP approach is highly computationally intensive for all but the simplest models, an issue that is highly problematic when parameter inference and uncertainty quantification require enormous numbers of parameter evaluations. To address this issue, we propose two new computational methods for the Bayesian inference of stochastic gene expression parameters given single-cell experiments. We formulate and verify an Adaptive Delayed Acceptance Metropolis-Hastings (ADAMH) algorithm for use with reduced Krylov-basis projections of the FSP. We then introduce an extension of the ADAMH into a Hybrid scheme that consists of an initial phase to construct a reduced model and a faster second phase to sample from the approximate posterior distribution determined by the constructed model. We test and compare both algorithms to an adaptive Metropolis algorithm with full FSP-based likelihood evaluations on three example models and simulated data to show that the new ADAMH variants achieve substantial speedup in comparison to the full FSP approach. By reducing the computational costs of parameter estimation, we expect the ADAMH approach to enable efficient data-driven estimation for more complex gene regulation models.
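As a rough illustration of the delayed-acceptance idea behind ADAMH, the minimal Python sketch below screens each proposal with a cheap approximate posterior and evaluates the expensive posterior only for proposals that pass the screen. It is not the authors' implementation: the adaptive proposal, the Krylov-reduced FSP surrogate, and the Hybrid scheme are all omitted, and the function names, the toy Gaussian targets, and every parameter value are assumptions made purely for illustration.

```python
import math
import random


def log_full(x):
    """Stand-in for an expensive log-posterior (assumed: standard normal)."""
    return -0.5 * x * x


def log_cheap(x):
    """Stand-in for a cheap, slightly wrong surrogate (assumed: normal, sd 1.2)."""
    return -0.5 * (x / 1.2) ** 2


def delayed_acceptance_mh(log_post_full, log_post_cheap, x0, n_steps,
                          step=1.0, seed=0):
    """Two-stage Metropolis-Hastings with a symmetric random-walk proposal.

    Returns the chain and the number of expensive evaluations performed.
    """
    rng = random.Random(seed)
    x = x0
    lp_full, lp_cheap = log_post_full(x), log_post_cheap(x)
    samples, n_full_evals = [], 1
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        lp_cheap_y = log_post_cheap(y)
        # Stage 1: accept or reject using only the cheap surrogate.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lp_cheap_y - lp_cheap:
            # Stage 2: correct the surrogate's error with the full model,
            # so the chain still targets the full posterior.
            lp_full_y = log_post_full(y)
            n_full_evals += 1
            log_alpha2 = (lp_full_y - lp_full) - (lp_cheap_y - lp_cheap)
            if math.log(1.0 - rng.random()) < log_alpha2:
                x, lp_full, lp_cheap = y, lp_full_y, lp_cheap_y
        samples.append(x)
    return samples, n_full_evals


if __name__ == "__main__":
    chain, n_evals = delayed_acceptance_mh(log_full, log_cheap, 0.0, 20000,
                                           step=2.0, seed=1)
    print("expensive evaluations:", n_evals, "of", len(chain), "proposals")
```

Proposals rejected in the first stage never trigger the expensive evaluation, which is where any saving comes from; how large it is depends on how closely the surrogate tracks the full posterior.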

2017 ◽  
Author(s):  
Lisa Weber ◽  
William Raymond ◽  
Brian Munsky

Abstract: In quantitative analyses of biological processes, one may use many different scales of models (e.g., spatial or non-spatial, deterministic or stochastic, time-varying or at steady-state) or many different approaches to match models to experimental data (e.g., model fitting or parameter uncertainty/sloppiness quantification with different experiment designs). These different analyses can lead to surprisingly different results, even when applied to the same data and the same model. We use a simplified gene regulation model to illustrate many of these concerns, especially for ODE analyses of deterministic processes, chemical master equation and finite state projection analyses of heterogeneous processes, and stochastic simulations. For each analysis, we employ Matlab and Python software to consider a time-dependent input signal (e.g., a kinase nuclear translocation) and several model hypotheses, along with simulated single-cell data. We illustrate different approaches (e.g., deterministic and stochastic) to identify the mechanisms and parameters of the same model from the same simulated data. For each approach, we explore how uncertainty in parameter space varies with respect to the chosen analysis approach or specific experiment design. We conclude with a discussion of how our simulated results relate to the integration of experimental and computational investigations to explore signal-activated gene expression models in yeast [1] and human cells [2].

PACS numbers: 87.10.+e, 87.15.Aa, 05.10.Gg, 05.40.Ca, 02.50.-r

Submitted to: Phys. Biol.
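To make the finite state projection idea concrete, here is a minimal Python sketch for the simplest possible case, a constitutive birth-death model of mRNA. This is not the model or the software used in the paper: the reaction scheme, the rate values, the truncation size, and the explicit Euler integrator are all assumptions chosen to keep the example short (practical FSP codes typically use matrix exponentials or stiff ODE solvers). The chemical master equation is truncated to the states 0..n_max, and probability that would flow to higher states is allowed to leak out, so the missing probability mass indicates the truncation error.

```python
import math


def fsp_birth_death(k_prod, k_deg, n_max, t_end, dt=1e-3):
    """Integrate the truncated master equation for 0 -> mRNA -> 0.

    Returns p, where p[n] approximates P(mRNA count = n at time t_end),
    starting from zero molecules at time 0.
    """
    p = [0.0] * (n_max + 1)
    p[0] = 1.0
    for _ in range(round(t_end / dt)):
        dp = [0.0] * (n_max + 1)
        for n in range(n_max + 1):
            # Total outflow from state n. For n == n_max the production
            # term leaves the projection and is not returned (the leak).
            dp[n] -= (k_prod + k_deg * n) * p[n]
            if n + 1 <= n_max:
                dp[n + 1] += k_prod * p[n]      # production: n -> n + 1
            if n > 0:
                dp[n - 1] += k_deg * n * p[n]   # degradation: n -> n - 1
        p = [pi + dt * di for pi, di in zip(p, dp)]
    return p


if __name__ == "__main__":
    k_prod, k_deg, t_end = 10.0, 1.0, 5.0
    p = fsp_birth_death(k_prod, k_deg, n_max=60, t_end=t_end)
    fsp_mean = sum(n * pn for n, pn in enumerate(p))
    # The deterministic (ODE) mean of the same model, for comparison.
    ode_mean = (k_prod / k_deg) * (1.0 - math.exp(-k_deg * t_end))
    print("FSP mean:", fsp_mean, "ODE mean:", ode_mean)
    print("leaked probability:", 1.0 - sum(p))
```

For this linear model the stochastic and deterministic means coincide, so the comparison mainly checks the projection; the two descriptions generally differ once reactions are nonlinear.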


Open Biology ◽  
2017 ◽  
Vol 7 (5) ◽  
pp. 170030 ◽  
Author(s):  
Peng Dong ◽  
Zhe Liu

Animal development is orchestrated by spatio-temporal gene expression programmes that drive precise lineage commitment, proliferation and migration events at the single-cell level, collectively leading to large-scale morphological change and functional specification in the whole organism. Efforts over decades have uncovered two ‘seemingly contradictory’ mechanisms in gene regulation governing these intricate processes: (i) stochasticity at individual gene regulatory steps in single cells and (ii) highly coordinated gene expression dynamics in the embryo. Here we discuss how these two layers of regulation arise from the molecular and the systems level, and how they might interplay to determine cell fate and to control the complex body plan. We also review recent technological advancements that enable quantitative analysis of gene regulation dynamics at single-cell, single-molecule resolution. These approaches outline next-generation experiments to decipher general principles bridging gaps between molecular dynamics in single cells and robust gene regulations in the embryo.


2020 ◽  
Author(s):  
Krishna Choudhary ◽  
Atul Narang

Abstract: Fitting the probability mass functions from analytical solutions of stochastic models of gene expression to the count distributions of mRNA and protein molecules in single cells can yield valuable insights into mechanisms of gene regulation. Solutions of chemical master equations are available for various kinetic schemes but, even for the models of regulation with a basic ON-OFF switch, they take complex forms with generating functions given as hypergeometric functions. Gene expression studies that have used these to fit the data have interpreted the parameters as burst size and frequency. However, this is consistent with the hypergeometric functions only if a gene stays active for short time intervals separated by relatively long intervals of inactivity. Physical insights into the probability mass functions are essential to ensure proper interpretations but are lacking for models of gene regulation. We fill this gap by developing urn models for regulated gene expression, which are of immense value to interpret probability distributions. Our model consists of a master urn, which represents the cytosol. We sample RNA polymerases and ribosomes from it and assign them to recipient urns of two or more colors, which represent time intervals with a homogeneous propensity for gene expression. Colors of the recipient urns represent sub-systems of the promoter states, and the assignments to urns of a specific color represent gene expression. We use elementary principles of discrete probability theory to derive the solutions for a range of kinetic models, including the Peccoud-Ycart model, the Shahrezaei-Swain model, and models with an arbitrary number of promoter states. For activated genes, we show that transcriptional lapses, which are events of gene inactivation for short time intervals separated by long active intervals, quantify the transcriptional dynamics better than bursts. Our approach reveals the physics underlying the solutions, which has important implications for single-cell data analysis.


2018 ◽  
Author(s):  
Anissa Guillemin ◽  
Ronan Duchesne ◽  
Fabien Crauste ◽  
Sandrine Gonin-Giraud ◽  
Olivier Gandrillon

Abstract

Background: To understand how a metazoan cell makes the decision to differentiate, we assessed the role of stochastic gene expression (SGE) during the erythroid differentiation process. Our hypothesis is that stochastic gene expression has a role in single-cell decision-making. In agreement with this hypothesis, we and others recently showed that SGE significantly increased during differentiation. However, evidence for the causative role of SGE is still lacking. Such demonstration would require being able to experimentally manipulate SGE levels and analyze the resulting impact of these variations on cell differentiation.

Result: We identified three drugs that modulate SGE in primary erythroid progenitor cells. Artemisinin and Indomethacin simultaneously decreased SGE and reduced the amount of differentiated cells. Inversely, α-methylene-γ-butyrolactone-3 (MB-3) simultaneously increased the level of SGE and the amount of differentiated cells. We then used a dynamical modelling approach which confirmed that differentiation rates were indeed affected by the drug treatment.

Conclusion: Using single-cell analysis and modeling tools, we provide experimental evidence that in a physiologically relevant cellular system, control of SGE can directly modify differentiation, supporting a causal link between the two.


Author(s):  
Frits Veerman ◽  
Nikola Popović ◽  
Carsten Marr

Abstract: Stochastic gene expression in regulatory networks is conventionally modelled via the chemical master equation (CME). As explicit solutions to the CME, in the form of so-called propagators, are oftentimes not readily available, various approximations have been proposed. A recently developed analytical method is based on a separation of time scales that assumes significant differences in the lifetimes of mRNA and protein in the network, allowing for the efficient approximation of propagators from asymptotic expansions for the corresponding generating functions. Here, we showcase the applicability of that method to simulated data from a ‘telegraph’ model for gene expression that is extended with an autoregulatory mechanism. We demonstrate that the resulting approximate propagators can be applied successfully for parameter inference in the non-regulated model; moreover, we show that, in the extended autoregulated model, autoactivation or autorepression may be refuted under certain assumptions on the model parameters. These results indicate that our approach may allow for successful parameter inference and model identification from longitudinal single cell data.
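For readers unfamiliar with the telegraph model, the following minimal Python sketch shows one way simulated data of this kind could be generated with the Gillespie stochastic simulation algorithm. It is a simplification and not the authors' setup: it tracks only the promoter state and the mRNA count, leaves out protein and autoregulation, and the function names and rate values are assumptions for illustration. The time-averaged mRNA count is compared against the standard stationary mean of the two-state model, k_syn / k_deg * k_on / (k_on + k_off).

```python
import random


def telegraph_mean(k_on, k_off, k_syn, k_deg):
    """Stationary mean mRNA count of the two-state (ON/OFF) model."""
    return (k_syn / k_deg) * k_on / (k_on + k_off)


def simulate_telegraph(k_on, k_off, k_syn, k_deg, t_end, seed=0):
    """Gillespie simulation; returns the time-averaged mRNA count.

    Assumes k_on and k_off are positive, so the total rate is never zero.
    """
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, False, 0
    weighted_sum = 0.0  # integral of the mRNA count over time
    while True:
        r_switch = k_off if gene_on else k_on
        r_syn = k_syn if gene_on else 0.0
        r_deg = k_deg * mrna
        total = r_switch + r_syn + r_deg
        dt = rng.expovariate(total)  # waiting time to the next reaction
        if t + dt >= t_end:
            weighted_sum += mrna * (t_end - t)
            break
        weighted_sum += mrna * dt
        t += dt
        # Pick which reaction fired, with probability proportional to rate.
        u = rng.random() * total
        if u < r_switch:
            gene_on = not gene_on
        elif u < r_switch + r_syn:
            mrna += 1
        else:
            mrna = max(mrna - 1, 0)  # guard against rounding edge cases
    return weighted_sum / t_end


if __name__ == "__main__":
    rates = dict(k_on=1.0, k_off=1.0, k_syn=20.0, k_deg=1.0)
    print("simulated mean:", simulate_telegraph(t_end=5000.0, seed=1, **rates))
    print("analytical mean:", telegraph_mean(**rates))
```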


2017 ◽  
Author(s):  
Gustavo Valadares Barroso ◽  
Natasa Puzovic ◽  
Julien Y Dutheil

Abstract: Biochemical reactions within individual cells result from the interactions of molecules, typically in small numbers. Consequently, the inherent stochasticity of binding and diffusion processes generates noise along the cascade that leads to the synthesis of a protein from its encoding gene. As a result, isogenic cell populations display phenotypic variability even in homogeneous environments. The extent and consequences of this stochastic gene expression have only recently been assessed on a genome-wide scale, in particular owing to the advent of single cell transcriptomics. However, the evolutionary forces shaping this stochasticity have yet to be unraveled. We take advantage of two recently published data sets of the single-cell transcriptome of the domestic mouse Mus musculus in order to characterize the effect of natural selection on gene-specific transcriptional stochasticity. We show that noise levels in the mRNA distributions (a.k.a. transcriptional noise) significantly correlate with three-dimensional nuclear domain organization, evolutionary constraint on the encoded protein and gene age. The position of the encoded protein in biological pathways, however, is the main factor that explains observed levels of transcriptional noise, in agreement with models of noise propagation within gene networks. Because transcriptional noise is under widespread selection, we argue that it constitutes an important component of the phenotype and that variance of expression is a potential target of adaptation. Stochastic gene expression should therefore be considered together with mean expression level in functional and evolutionary studies of gene expression.


2016 ◽  
Author(s):  
Thomas Blasi ◽  
Florian Buettner ◽  
Michael K. Strasser ◽  
Carsten Marr ◽  
Fabian J. Theis

Abstract

Motivation: Accessing gene expression at the single cell level has unraveled often large heterogeneity among seemingly homogeneous cells, which remained obscured in traditional population based approaches. The computational analysis of single-cell transcriptomics data, however, still imposes unresolved challenges with respect to normalization, visualization and modeling the data. One such issue is differences in cell size, which introduce additional variability into the data, for which appropriate normalization techniques are needed. Otherwise, these differences in cell size may obscure genuine heterogeneities among cell populations and lead to overdispersed steady-state distributions of mRNA transcript numbers.

Results: We present cgCorrect, a statistical framework to correct for differences in cell size that are due to cell growth in single-cell transcriptomics data. We derive the probability for the cell growth corrected mRNA transcript number given the measured, cell size dependent mRNA transcript number, based on the assumption that the average number of transcripts in a cell increases in proportion to the cell’s volume during cell cycle. cgCorrect can be used both for data normalization and to analyze steady-state distributions used to infer the gene expression mechanism. We demonstrate its applicability on both simulated data and single-cell quantitative real-time PCR data from mouse blood stem and progenitor cells. We show that correcting for differences in cell size affects the interpretation of the data obtained by typically performed computational analysis.

Availability: A Matlab implementation of cgCorrect is available at http://icb.helmholtz-muenchen.de/cgCorrect

Supplementary information: Supplementary information is available online. The simulated data set is available at http://icb.helmholtz-muenchen.de/cgCorrect


Science ◽  
2002 ◽  
Vol 297 (5584) ◽  
pp. 1183-1186 ◽  
Author(s):  
M. B. Elowitz
