ABSTRACT
The rise of single-cell data highlights the need for a nondeterministic view of gene expression, while offering new opportunities regarding gene regulatory network inference. We recently introduced two strategies that specifically exploit time-course data, where single-cell profiling is performed after a stimulus: HARISSA, a mechanistic network model with a highly efficient simulation procedure, and CARDAMOM, a scalable inference method seen as model calibration. Here, we combine the two approaches and show that the same model driven by transcriptional bursting can be used simultaneously as an inference tool, to reconstruct biologically relevant networks, and as a simulation tool, to generate realistic transcriptional profiles emerging from gene interactions. We verify that CARDAMOM quantitatively reconstructs causal links when the data is simulated from HARISSA, and demonstrate its performance on experimental data collected on in vitro differentiating mouse embryonic stem cells. Overall, this integrated strategy largely overcomes the limitations of disconnected inference and simulation.
Subjects
Algorithms; Gene Regulatory Networks; Animals; Mice; Gene Regulatory Networks/genetics; Computer Simulation; Gene Expression Profiling/methods
ABSTRACT
BACKGROUND: Mature blood cells arise from hematopoietic stem cells in the bone marrow by a process of differentiation along one of several different lineage trajectories. This is often represented as a series of discrete steps of increasing progenitor cell commitment to a given lineage, but as for differentiation in general, whether the process is instructive or stochastic remains controversial. Here, we examine this question by analyzing single-cell transcriptomic data from human bone marrow cells, assessing cell-to-cell variability along the trajectories of hematopoietic differentiation into four different types of mature blood cells. The instructive model predicts that cells will be following the same sequence of instructions and that there will be minimal variability of gene expression between them throughout the process, while the stochastic model predicts a role for cell-to-cell variability when lineage commitments are being made. RESULTS: Applying Shannon entropy to measure cell-to-cell variability among human hematopoietic bone marrow cells at the same stage of differentiation, we observed a transient peak of gene expression variability occurring at characteristic points in all hematopoietic differentiation pathways. Strikingly, the genes whose cell-to-cell variation of expression fluctuated the most over the course of a given differentiation trajectory are pathway-specific genes, whereas genes which showed the greatest variation of mean expression are common to all pathways. Finally, we showed that the level of cell-to-cell variation is increased in the most immature compartment of hematopoiesis in myelodysplastic syndromes. CONCLUSIONS: These data suggest that human hematopoietic differentiation could be better conceptualized as a dynamical stochastic process with a transient stage of cellular indetermination, and strongly support the stochastic view of differentiation. They also highlight the need to consider the role of stochastic gene expression in complex physiological processes and pathologies such as cancers, paving the way for possible noise-based therapies through epigenetic regulation.
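As an illustration of the entropy measure described in this abstract, the following minimal sketch computes a per-gene Shannon entropy across cells sampled at one stage. The fixed bin edges, the function name `gene_entropy`, and the toy data are assumptions made for illustration; they are not the authors' pipeline.

```python
import numpy as np

def gene_entropy(expr, bin_edges):
    """Shannon entropy (in bits) of one gene's expression across cells.

    expr      : 1-D array, expression of a single gene in each cell
    bin_edges : histogram bin edges, shared across stages so that
                entropies are comparable (an illustrative choice)
    """
    counts, _ = np.histogram(expr, bins=bin_edges)
    p = counts[counts > 0] / counts.sum()  # empirical probabilities, zeros dropped
    return float(-(p * np.log2(p)).sum())

# Toy data (hypothetical): one homogeneous and one heterogeneous population.
rng = np.random.default_rng(0)
edges = np.linspace(0.0, 10.0, 11)           # 10 bins on [0, 10]
homogeneous = np.full(200, 5.0)              # every cell expresses the same level
heterogeneous = rng.uniform(0.0, 10.0, 200)  # cells spread over the whole range

h_low = gene_entropy(homogeneous, edges)
h_high = gene_entropy(heterogeneous, edges)
print(h_low, h_high)
```

Computing this quantity for each gene at each stage and tracking it along a trajectory is one simple way to look for a transient rise in cell-to-cell variability; the result depends on the binning, so the same edges should be used at every stage.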
Subjects
Epigenesis, Genetic; Hematopoiesis; Cell Differentiation/genetics; Entropy; Hematopoiesis/genetics; Hematopoietic Stem Cells/metabolism; Humans
ABSTRACT
Differentiation is the process whereby a cell acquires a specific phenotype, by differential gene expression as a function of time. This is thought to result from the dynamical functioning of an underlying Gene Regulatory Network (GRN). The precise path from the stochastic GRN behavior to the resulting cell state is still an open question. In this work we propose to reduce a stochastic model of gene expression, where a cell is represented by a vector in a continuous space of gene expression, to a discrete coarse-grained model on a limited number of cell types. We develop analytical results and numerical tools to perform this reduction for a specific model characterizing the evolution of a cell by a system of piecewise deterministic Markov processes (PDMP). Solving a spectral problem, we find the explicit variational form of the rate function associated with a large deviations principle, for any number of genes. The resulting Lagrangian dynamics allows us to define a deterministic limit whose basins of attraction can be identified with cell types. In this context the quasipotential, describing the transitions between these basins in the weak noise limit, can be defined as the unique solution of a Hamilton-Jacobi equation under a particular constraint. We develop a numerical method for approximating the coarse-grained model parameters, and show its accuracy for a symmetric toggle-switch network. We deduce from the reduced model an approximation of the stationary distribution of the PDMP system, which appears as a Beta mixture. Altogether, these results establish a rigorous framework for connecting GRN behavior to the resulting cellular behavior, including the calculation of the probability of jumps between cell types.
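To give some intuition for the PDMP description and the Beta-type stationary law mentioned in this abstract, here is a minimal sketch of a single, non-interacting gene: a promoter state E switches between 0 and 1, and the expression level x relaxes toward E. For this one-gene case the stationary distribution of x is known to be Beta(k_on/d, k_off/d), whose mean is k_on/(k_on + k_off). The normalized one-gene model, the parameter values, and the name `simulate_pdmp` are assumptions for illustration, not the multi-gene model of the paper.

```python
import math
import numpy as np

def simulate_pdmp(k_on, k_off, d, t_end, rng):
    """Simulate x' = d * (E - x), where the promoter E in {0, 1} jumps
    0 -> 1 at rate k_on and 1 -> 0 at rate k_off.

    Returns the time-average of x over [0, t_end].
    """
    t, x, e = 0.0, 0.0, 0
    integral = 0.0
    while t < t_end:
        rate = k_on if e == 0 else k_off
        # Waiting time until the next promoter jump, truncated at t_end.
        tau = min(rng.exponential(1.0 / rate), t_end - t)
        # Exact deterministic flow between jumps:
        #   x(t + s) = e + (x - e) * exp(-d * s)
        decay = math.exp(-d * tau)
        integral += e * tau + (x - e) * (1.0 - decay) / d
        x = e + (x - e) * decay
        t += tau
        e = 1 - e  # promoter switches state
    return integral / t_end

rng = np.random.default_rng(0)
k_on, k_off, d = 2.0, 3.0, 1.0  # illustrative rates
avg = simulate_pdmp(k_on, k_off, d, t_end=5000.0, rng=rng)
beta_mean = k_on / (k_on + k_off)  # mean of Beta(k_on/d, k_off/d)
print(avg, beta_mean)
```

Because the flow between jumps is integrated in closed form, no time step has to be chosen; only the jump times are random. In a network, the switching rates would depend on the expression levels of other genes, which is where the coarse-graining into basins of attraction becomes relevant.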
Subjects
Biochemical Phenomena; Gene Expression; Gene Regulatory Networks; Markov Chains; Stochastic Processes
ABSTRACT
In some recent studies, a view emerged that stochastic dynamics governing the switching of cells from one differentiation state to another could be characterized by a peak in gene expression variability at the point of fate commitment. We have tested this hypothesis at the single-cell level by analyzing primary chicken erythroid progenitors through their differentiation process and measuring the expression of selected genes at six sequential time-points after induction of differentiation. In contrast to population-based expression data, single-cell gene expression data revealed a high cell-to-cell variability, which was masked by averaging. We were able to show that the correlation network was a highly dynamic entity and that a subgroup of genes tends to follow the predictions from the dynamical network biomarker (DNB) theory. In addition, we also identified a small group of functionally related genes encoding proteins involved in sterol synthesis that could act as the initial drivers of the differentiation. In order to assess quantitatively the cell-to-cell variability in gene expression and its evolution in time, we used Shannon entropy as a measure of the heterogeneity. Entropy values showed a significant increase in the first 8 h of the differentiation process, reaching a peak between 8 and 24 h, before decreasing to significantly lower values. Moreover, we observed that this point of maximum entropy precedes two key events: an irreversible commitment to differentiation between 24 and 48 h, followed by a significant increase in cell size variability at 48 h. In conclusion, when analyzed at the single-cell level, the differentiation process looks very different from its classical population-average view. New observables (like entropy) can be computed, the behavior of which is fully compatible with the idea that differentiation is not a "simple" program that all cells execute identically but results from the dynamical behavior of the underlying molecular network.
Subjects
Cell Differentiation; Single-Cell Analysis; Entropy; Gene Expression Profiling; Models, Biological; Stem Cells/cytology; Stem Cells/metabolism
ABSTRACT
BACKGROUND: The recent development of single-cell transcriptomics has enabled gene expression to be measured in individual cells instead of being population-averaged. Despite this considerable precision improvement, inferring regulatory networks remains challenging because stochasticity now proves to play a fundamental role in gene expression. In particular, mRNA synthesis is now acknowledged to occur in a highly bursty manner. RESULTS: We propose to view the inference problem as a fitting procedure for a mechanistic gene network model that is inherently stochastic and takes not only protein, but also mRNA levels into account. We first explain how to build and simulate this network model based upon the coupling of genes that are described as piecewise-deterministic Markov processes. Our model is modular and can be used to implement various biochemical hypotheses including causal interactions between genes. However, a naive fitting procedure would be intractable. By performing a relevant approximation of the stationary distribution, we derive a tractable procedure that corresponds to a statistical hidden Markov model with interpretable parameters. This approximation turns out to be extremely close to the theoretical distribution in the case of a simple toggle-switch, and we show that it can indeed fit real single-cell data. As a first step toward inference, our approach was applied to a number of simple two-gene networks simulated in silico from the mechanistic model and satisfactorily recovered the original networks. CONCLUSIONS: Our results demonstrate that functional interactions between genes can be inferred from the distribution of a mechanistic, dynamical stochastic model that is able to describe gene expression in individual cells. This approach seems promising in relation to the current explosion of single-cell expression data.
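To illustrate why bursty mRNA synthesis matters for inference, the sketch below draws counts from a Gamma-Poisson (negative binomial) mixture, a common stand-in for bursty expression, and compares them with Poisson counts of the same mean. The parameter values and variable names are illustrative assumptions, and this toy mixture is not the hidden Markov model derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells = 5000
burst_freq = 0.5   # Gamma shape: burst frequency relative to degradation (assumed)
burst_size = 20.0  # Gamma scale: typical burst size (assumed)

# Bursty gene: a latent Gamma-distributed level per cell, then Poisson counts.
latent = rng.gamma(shape=burst_freq, scale=burst_size, size=n_cells)
bursty = rng.poisson(latent)

# Non-bursty reference: plain Poisson counts with the same mean.
steady = rng.poisson(burst_freq * burst_size, size=n_cells)

def fano(counts):
    """Fano factor (variance / mean): 1 for Poisson, larger when overdispersed."""
    return counts.var() / counts.mean()

fano_bursty = fano(bursty)
fano_steady = fano(steady)
print(fano_bursty, fano_steady)
```

Both populations have the same average expression, so a population-averaged measurement could not tell them apart; the difference only shows up in the cell-to-cell distribution, which is the information a stochastic mechanistic model can exploit.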