ABSTRACT
Major computational challenges exist in relation to the collection, curation, processing and analysis of large genomic and imaging datasets, as well as the simulation of larger and more realistic models in systems biology. Here we discuss how a relative newcomer among programming languages, Julia, is poised to meet the current and emerging demands in the computational biosciences and beyond. Speed, flexibility, a thriving package ecosystem and readability are major factors that make high-performance computing and data analysis available to an unprecedented degree. We highlight how Julia's design is already enabling new ways of analyzing biological data and systems, and we provide a list of resources that can facilitate the transition into Julian computing.
Subjects
Ecosystem, Programming Languages, Computer Simulation, Computing Methodologies, Systems Biology, Software
ABSTRACT
SUMMARY: BondGraphs.jl is a Julia implementation of bond graphs. Bond graphs provide a modelling framework that describes energy flow through a physical system and by construction enforce thermodynamic constraints. The framework is widely used in engineering and has recently been shown to be a powerful approach for modelling biology. Models are mutable, hierarchical, multiscale, and multiphysics, and BondGraphs.jl is compatible with the Julia modelling ecosystem. AVAILABILITY AND IMPLEMENTATION: BondGraphs.jl is freely available under the MIT license. Source code and documentation can be found at https://github.com/jedforrest/BondGraphs.jl.
ABSTRACT
SUMMARY: HyperGraphs.jl is a Julia package that implements hypergraphs. These are a generalization of graphs that allow us to represent n-ary relationships and not just binary, pairwise relationships. High-order interactions are commonplace in biological systems and are of critical importance to their dynamics; hypergraphs thus offer a natural way to accurately describe and model these systems. AVAILABILITY AND IMPLEMENTATION: HyperGraphs.jl is freely available under the MIT license. Source code and documentation can be found at https://github.com/lpmdiaz/HyperGraphs.jl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
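The idea of an n-ary relationship can be illustrated with a short Python sketch. This is not the HyperGraphs.jl API; the `Hypergraph` class, its method names and the example species are hypothetical, chosen only to show how a hyperedge can join more than two vertices.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    """Minimal undirected hypergraph: each hyperedge is a set of any size."""
    vertices: set = field(default_factory=set)
    hyperedges: list = field(default_factory=list)

    def add_hyperedge(self, members):
        """Add a hyperedge and register any vertices it introduces."""
        edge = frozenset(members)
        self.vertices |= edge
        self.hyperedges.append(edge)

    def degree(self, vertex):
        """Number of hyperedges that contain the vertex."""
        return sum(vertex in edge for edge in self.hyperedges)

    def neighbours(self, vertex):
        """Vertices that share at least one hyperedge with the vertex."""
        shared = set()
        for edge in self.hyperedges:
            if vertex in edge:
                shared |= edge
        shared.discard(vertex)
        return shared


# Hypothetical example: one ternary interaction and one pairwise interaction.
hg = Hypergraph()
hg.add_hyperedge({"enzyme", "substrate", "cofactor"})
hg.add_hyperedge({"enzyme", "inhibitor"})
```

An ordinary graph would need three pairwise edges to approximate the ternary interaction and would lose the information that all three species act together; the hyperedge keeps it as a single object.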
Subjects
Computational Biology, Software
ABSTRACT
The predictive power of machine learning models often exceeds that of mechanistic modeling approaches. However, interpreting purely data-driven models that lack any mechanistic basis is often difficult, and predictive power by itself can be a poor metric by which to judge different methods. In this work, we focus on the relatively new modeling technique of neural ordinary differential equations (neural ODEs). We discuss how they relate to machine learning and mechanistic models, with the potential to narrow the gulf between these two frameworks: they constitute a class of hybrid model that integrates ideas from data-driven and dynamical systems approaches. Training neural ODEs as representations of dynamical systems data has its own specific demands, and we here propose a collocation scheme as a fast and efficient training strategy. This alleviates the need for costly ODE solvers. We illustrate the advantages that collocation approaches offer, as well as their robustness to qualitative features of a dynamical system, and the quantity and quality of observational data. We focus on systems that exemplify some of the hallmarks of complex dynamical systems encountered in systems biology, and we map out how these methods can be used in the analysis of mathematical models of cellular and physiological processes.
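The core of a collocation approach, fitting the right-hand side of an ODE to derivative estimates taken directly from the data so that no ODE solver is needed during training, can be sketched as follows. This is a simplified illustration and not the scheme proposed in the paper: the neural network is replaced by a two-parameter model f(u) = a*u + b*u**2, the data are noise-free samples of logistic growth, and derivatives are estimated by central differences rather than by smoothing. All names and parameter values are assumptions.

```python
import math


def logistic(t, r=1.0, K=10.0, u0=1.0):
    """Analytic solution of du/dt = r*u*(1 - u/K), used as synthetic data."""
    return K / (1.0 + (K / u0 - 1.0) * math.exp(-r * t))


# Sample the trajectory on a uniform time grid.
dt = 0.05
ts = [i * dt for i in range(201)]
us = [logistic(t) for t in ts]

# Step 1: estimate derivatives from the data alone (central differences).
u_mid = us[1:-1]
du_mid = [(us[i + 1] - us[i - 1]) / (2 * dt) for i in range(1, len(us) - 1)]

# Step 2: fit f(u) = a*u + b*u**2 to the derivative estimates by least
# squares, solving the 2x2 normal equations explicitly.
s11 = sum(u ** 2 for u in u_mid)
s12 = sum(u ** 3 for u in u_mid)
s22 = sum(u ** 4 for u in u_mid)
y1 = sum(u * d for u, d in zip(u_mid, du_mid))
y2 = sum(u ** 2 * d for u, d in zip(u_mid, du_mid))
det = s11 * s22 - s12 ** 2
a = (y1 * s22 - y2 * s12) / det
b = (s11 * y2 - s12 * y1) / det
```

For these data the fitted `a` and `b` should lie close to r = 1.0 and -r/K = -0.1. With a neural network in place of the quadratic model, step 2 would become a standard regression problem trained by gradient descent, still without any numerical integration of the ODE.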
Subjects
Theoretical Models, Systems Biology, Machine Learning, Systems Biology/methods
ABSTRACT
Modeling and simulation of complex biochemical reaction networks form cornerstones of modern biophysics. Many of the approaches developed so far capture temporal fluctuations due to the inherent stochasticity of the biophysical processes, referred to as intrinsic noise. Stochastic fluctuations, however, predominantly stem from the interplay of the network with many other (and mostly unknown) fluctuating processes, as well as with various random signals arising from the extracellular world; these sources contribute extrinsic noise. Here, we provide a computational simulation method to probe the stochastic dynamics of biochemical systems subject to both intrinsic and extrinsic noise. We develop an extrinsic chemical Langevin equation (CLE), a physically motivated extension of the CLE, to model intrinsically noisy reaction networks embedded in a stochastically fluctuating environment. The extrinsic CLE is a continuous approximation to the chemical master equation (CME) with time-varying propensities. In our approach, noise is incorporated at the level of the CME, and it can account for the full dynamics of the exogenous noise process, irrespective of timescales and their mismatches. We show that our method accurately captures the first two moments of the stationary probability density when compared with exact stochastic simulation methods while reducing the computational runtime by several orders of magnitude. Our approach provides a method that is practical, computationally efficient, and physically accurate to study systems that are simultaneously subject to a variety of noise sources.
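A chemical Langevin equation with a time-varying propensity can be illustrated on a birth-death process. The sketch below is a generic Euler-Maruyama integration and is not the authors' extrinsic CLE: it assumes a single species produced at rate k and degraded at rate gamma*x, and it assumes that the extrinsic noise enters through k, which follows an Ornstein-Uhlenbeck process. Function names and parameter values are hypothetical.

```python
import math
import random


def simulate_cle(k_mean, gamma, ext_sd, ext_tau, dt=0.01, steps=100_000, seed=1):
    """Euler-Maruyama for dx = (k - gamma*x) dt + sqrt(k + gamma*x) dW.

    The production rate k is an Ornstein-Uhlenbeck process with mean k_mean,
    stationary standard deviation ext_sd and correlation time ext_tau.
    Returns the list of x values, one per time step.
    """
    rng = random.Random(seed)
    x, k = k_mean / gamma, k_mean
    samples = []
    for _ in range(steps):
        k_eff = max(k, 0.0)  # a propensity cannot be negative
        drift = k_eff - gamma * x
        diffusion = math.sqrt(max(k_eff + gamma * x, 0.0))
        x = max(x + drift * dt + diffusion * math.sqrt(dt) * rng.gauss(0.0, 1.0), 0.0)
        # Update the extrinsic variable (Ornstein-Uhlenbeck step).
        noise = ext_sd * math.sqrt(2.0 * dt / ext_tau) * rng.gauss(0.0, 1.0)
        k += -(k - k_mean) / ext_tau * dt + noise
        samples.append(x)
    return samples


def mean_var(xs):
    """Sample mean and (population) variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return m, sum((v - m) ** 2 for v in xs) / len(xs)


# Intrinsic noise only versus intrinsic plus extrinsic noise.
m_int, v_int = mean_var(simulate_cle(50.0, 1.0, ext_sd=0.0, ext_tau=5.0))
m_ext, v_ext = mean_var(simulate_cle(50.0, 1.0, ext_sd=15.0, ext_tau=5.0))
```

With `ext_sd=0.0` the production rate stays constant and the scheme reduces to the ordinary CLE; switching the extrinsic process on leaves the mean roughly unchanged but inflates the variance, which is the qualitative effect the abstract describes.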
Subjects
Algorithms, Biological Models, Computer Simulation, Stochastic Processes
ABSTRACT
The complexity of biological systems, and the increasingly large amount of associated experimental data, necessitates that we develop mathematical models to further our understanding of these systems. Because biological systems are generally not well understood, most mathematical models of these systems are based on experimental data, resulting in a seemingly heterogeneous collection of models that ostensibly represent the same system. To understand the system we therefore need to understand how the different models are related to each other, with a view to obtaining a unified mathematical description. This goal is complicated by the fact that a number of distinct mathematical formalisms may be employed to represent the same system, making direct comparison of the models very difficult. A methodology for comparing mathematical models based on their underlying conceptual structure is therefore required. In previous work we developed an appropriate framework for model comparison where we represent models, specifically the conceptual structure of the models, as labelled simplicial complexes and compare them with the two general methodologies of comparison by distance and comparison by equivalence. In this article we continue the development of our model comparison methodology in two directions. First, we present a rigorous and automatable methodology for the core process of comparison by equivalence, namely determining the vertices in a simplicial representation, corresponding to model components, that are conceptually related and the identification of these vertices via simplicial operations. Our methodology is based on considerations of vertex symmetry in the simplicial representation, for which we develop the required mathematical theory of group actions on simplicial complexes. This methodology greatly simplifies and expedites the process of determining model equivalence. Second, we provide an alternative mathematical framework for our model-comparison methodology by representing models as groups, which allows for the direct application of group-theoretic techniques within our model-comparison methodology.
Subjects
Theoretical Models, Mathematics
ABSTRACT
MOTIVATION: Approximate Bayesian computation (ABC) is an important framework within which to infer the structure and parameters of a systems biology model. It is especially suitable for biological systems with stochastic and nonlinear dynamics, for which the likelihood functions are intractable. However, the associated computational cost often limits ABC to models that are relatively quick to simulate in practice. RESULTS: We here present a Julia package, GpABC, that implements parameter inference and model selection for deterministic or stochastic models using (i) standard rejection ABC or sequential Monte Carlo ABC or (ii) ABC with Gaussian process emulation. The latter significantly reduces the computational cost. AVAILABILITY AND IMPLEMENTATION: https://github.com/tanhevg/GpABC.jl.
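Standard rejection ABC, the simplest of the schemes mentioned above, can be sketched in a few lines of Python. This does not use or reproduce the GpABC API: the toy model (exponential waiting times), the summary statistic (sample mean), the uniform prior and the tolerance are all assumptions made for illustration.

```python
import random


def simulate(rate, n, rng):
    """Toy stochastic model: n exponential waiting times with the given rate."""
    return [rng.expovariate(rate) for _ in range(n)]


def summary(data):
    """Summary statistic used to compare data sets: the sample mean."""
    return sum(data) / len(data)


def abc_rejection(observed, prior_low, prior_high, eps, n_draws, rng):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) < eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 100, rng)  # synthetic data with true rate 2.0
posterior = abc_rejection(observed, 0.1, 5.0, eps=0.05, n_draws=5000, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
```

Every prior draw costs one full model simulation, which is why rejection ABC becomes impractical for expensive models and why replacing the simulator with a cheap emulator, as the abstract describes, can reduce the cost.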
Subjects
Systems Biology, Bayes Theorem, Computer Simulation, Likelihood Functions, Monte Carlo Method, Normal Distribution
ABSTRACT
Cell fate decision-making events involve the interplay of many molecular processes, ranging from signal transduction to genetic regulation, as well as a set of molecular and physiological feedback loops. Each aspect offers a rich field of investigation in its own right, but to understand the whole process, even in simple terms, we need to consider them together. Here we attempt to characterise this process by focussing on the roles of noise during cell fate decisions. We use a range of recent results to develop a view of the sequence of events by which a cell progresses from a pluripotent or multipotent to a differentiated state: chromatin organisation, transcription factor stoichiometry, and cellular signalling all change during this progression, and all shape cellular variability, which becomes maximal at the transition state.
Subjects
Cell Differentiation/physiology, Signal Transduction, Chromatin/physiology, Multipotent Stem Cells/physiology, Pluripotent Stem Cells/physiology, Transcription Factors/metabolism
ABSTRACT
The formation of spatial structures lies at the heart of developmental processes. However, many of the underlying gene regulatory and biochemical processes remain poorly understood. Turing patterns constitute a main candidate to explain such processes, but they appear sensitive to fluctuations and variations in kinetic parameters, raising the question of how they may be adopted and realised in naturally evolved systems. The vast majority of mathematical studies of Turing patterns have used continuous models specified in terms of partial differential equations. Here, we complement this work by studying Turing patterns using discrete cellular automata models. We perform a large-scale study on all possible two-species networks and find the same Turing pattern producing networks as in the continuous framework. In contrast to continuous models, however, we find these Turing pattern topologies to be substantially more robust to changes in the parameters of the model. We also find that diffusion-driven instabilities are substantially weaker predictors for Turing patterns in our discrete modelling framework in comparison to the continuous case, in the sense that the presence of an instability does not guarantee a pattern emerging in simulations. We show that a more refined criterion constitutes a stronger predictor. The similarity of the results for the two modelling frameworks suggests a deeper underlying principle of Turing mechanisms in nature. Together with the larger robustness in the discrete case this suggests that Turing patterns may be more robust than previously thought.
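For reference, the classical diffusion-driven instability test for a two-species system in the continuous (partial differential equation) setting can be written as a short Python function. It is the textbook linear-stability criterion, not the refined criterion or the cellular automaton analysis of the paper; the Jacobian entries `fu, fv, gu, gv` and diffusion constants `Du, Dv` are generic symbols assumed here.

```python
def is_turing_unstable(fu, fv, gu, gv, Du, Dv):
    """Classical two-species diffusion-driven instability conditions.

    fu, fv, gu, gv are the entries of the reaction Jacobian at the
    homogeneous steady state; Du and Dv are the diffusion constants.
    """
    trace = fu + gv
    det = fu * gv - fv * gu
    # The steady state must be stable in the absence of diffusion ...
    stable_without_diffusion = trace < 0 and det > 0
    # ... and some spatial mode must become unstable once diffusion is added.
    h = Dv * fu + Du * gv
    unstable_with_diffusion = h > 0 and h * h > 4 * Du * Dv * det
    return stable_without_diffusion and unstable_with_diffusion


# Hypothetical activator-inhibitor Jacobian with a slowly and a rapidly
# diffusing inhibitor.
slow_inhibitor = is_turing_unstable(1.0, -2.0, 3.0, -4.0, Du=1.0, Dv=10.0)
fast_inhibitor = is_turing_unstable(1.0, -2.0, 3.0, -4.0, Du=1.0, Dv=20.0)
```

For this Jacobian the conditions are met only when the inhibitor diffuses sufficiently faster than the activator, so `slow_inhibitor` is `False` and `fast_inhibitor` is `True`. As the abstract notes, satisfying such an instability condition does not by itself guarantee that a pattern emerges in simulations.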
Subjects
Biological Models, Diffusion, Kinetics
ABSTRACT
Turing patterns have morphed from mathematical curiosities into highly desirable targets for synthetic biology. For a long time, their biological significance was sometimes disputed but there is now ample evidence for their involvement in processes ranging from skin pigmentation to digit and limb formation. While their role in developmental biology is now firmly established, their synthetic design has so far proved challenging. Here, we review recent large-scale mathematical analyses that have attempted to narrow down potential design principles. We consider different aspects of robustness of these models and outline why this perspective will be helpful in the search for synthetic Turing-patterning systems. We conclude by considering robustness in the context of developmental modelling more generally. This article is part of the theme issue 'Recent progress and open frontiers in Turing's theory of morphogenesis'.
Subjects
Biological Models, Synthetic Biology, Morphogenesis
ABSTRACT
Efficient immunosurveillance by CD8+ T cells in the periphery depends on positive/negative selection of thymocytes and thus on the dynamics of antigen degradation and epitope production by thymoproteasome and immunoproteasome in the thymus. Although studies in mouse systems have shown how thymoproteasome activity differs from that of immunoproteasome and strongly impacts the T cell repertoire, the proteolytic dynamics and the regulation of human thymoproteasome are unknown. By combining biochemical and computational modeling approaches, we show here that human 20S thymoproteasome and immunoproteasome differ not only in the proteolytic activity of the catalytic sites but also in the peptide transport. These differences impinge upon the quantity of peptide products rather than where the substrates are cleaved. The comparison of the two human 20S proteasome isoforms depicts different processing of antigens that are associated with tumors and autoimmune diseases.
Subjects
Antigen Presentation, CD8-Positive T-Lymphocytes/enzymology, Computer Simulation, Proteasome Endopeptidase Complex/chemistry, A549 Cells, Animals, CD8-Positive T-Lymphocytes/immunology, Catalysis, HeLa Cells, Human Umbilical Vein Endothelial Cells, Humans, Mice, Proteasome Endopeptidase Complex/genetics, Proteasome Endopeptidase Complex/immunology, THP-1 Cells
ABSTRACT
Noise in gene expression is one of the hallmarks of life at the molecular scale. Here we derive analytical solutions to a set of models describing the molecular mechanisms underlying transcription of DNA into RNA. Our ansatz allows us to incorporate the effects of extrinsic noise, encompassing factors external to the transcription of the individual gene, and to discuss the ramifications for heterogeneity in gene product abundance that has been widely observed in single cell data. Crucially, we are able to show that heavy-tailed distributions of RNA copy numbers cannot result from the intrinsic stochasticity in gene expression alone, but must instead reflect extrinsic sources of variability.
Subjects
Gene Expression, Genetic Models, DNA/genetics, RNA/genetics, Stochastic Processes, Genetic Transcription
ABSTRACT
Stochastic models are key to understanding the intricate dynamics of gene expression. However, the simplest models that only account for active and inactive states of a gene fail to capture common observations in both prokaryotic and eukaryotic organisms. Here, we consider multistate models of gene expression that generalize the canonical Telegraph process and are capable of capturing the joint effects of transcription factors, heterochromatin state, and DNA accessibility (or, in prokaryotes, sigma-factor activity) on transcript abundance. We propose two approaches for solving classes of these generalized systems. The first approach offers a fresh perspective on a general class of multistate models and allows us to "decompose" more complicated systems into simpler processes, each of which can be solved analytically. This enables us to obtain a solution of any model from this class. Next, we develop an approximation method based on a power series expansion of the stationary distribution for an even broader class of multistate models of gene transcription. We further show that models from both classes cannot have a heavy-tailed distribution in the absence of extrinsic noise. The combination of analytical and computational solutions for these realistic gene expression models also holds the potential to design synthetic systems and control the behavior of naturally evolved gene expression systems in guiding cell-fate decisions.
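The canonical Telegraph process that these multistate models generalize can be simulated exactly with the Gillespie algorithm. The Python sketch below covers only the two-state baseline, not the multistate models or the analytical solutions of the paper; the rate names `k_on`, `k_off`, `k_syn` and `k_deg` and their values are assumptions.

```python
import random


def telegraph_ssa(k_on, k_off, k_syn, k_deg, t_end, seed=0):
    """Gillespie simulation of the two-state telegraph model.

    The gene switches between inactive and active states, mRNA is produced
    only in the active state and degrades at a rate proportional to its
    copy number. Returns the time-averaged mRNA copy number on [0, t_end].
    """
    rng = random.Random(seed)
    t, gene_on, mrna = 0.0, 0, 0
    weighted_sum = 0.0
    while t < t_end:
        rates = [
            k_on * (1 - gene_on),  # gene activation
            k_off * gene_on,       # gene inactivation
            k_syn * gene_on,       # transcription (active state only)
            k_deg * mrna,          # mRNA degradation
        ]
        total = sum(rates)
        # Time to the next reaction, truncated at the end of the simulation.
        dwell = min(rng.expovariate(total), t_end - t)
        weighted_sum += mrna * dwell
        t += dwell
        if t >= t_end:
            break
        # Choose which reaction fires, with probability proportional to its rate.
        event = rng.choices(range(4), weights=rates)[0]
        if event == 0:
            gene_on = 1
        elif event == 1:
            gene_on = 0
        elif event == 2:
            mrna += 1
        else:
            mrna -= 1
    return weighted_sum / t_end


mean_mrna = telegraph_ssa(k_on=1.0, k_off=1.0, k_syn=20.0, k_deg=1.0, t_end=2000.0)
expected_mean = (20.0 / 1.0) * (1.0 / (1.0 + 1.0))  # (k_syn/k_deg) * k_on/(k_on+k_off)
```

The time average from a long run can be compared with the known stationary mean of the telegraph model, (k_syn/k_deg) * k_on/(k_on + k_off). Adding further gene states to the `rates` list is the natural route from this sketch towards the multistate setting.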
Subjects
DNA/genetics, Gene Expression, Genetic Models, Stochastic Processes
ABSTRACT
One of the central tasks in systems biology is to understand how cells regulate their metabolism. Hierarchical regulation analysis is a powerful tool to study this regulation at the metabolic, gene-expression, and signaling levels. It has been widely applied to study steady-state regulation, but analysis of the metabolic dynamics remains challenging because it is difficult to measure time-dependent metabolic flux. Here, we develop a nonparametric method that uses Gaussian processes to accurately infer the dynamics of a metabolic pathway based only on metabolite measurements; from this, we then go on to obtain a dynamical view of the hierarchical regulation processes invoked over time to control the activity in a pathway. Our approach allows us to use hierarchical regulation analysis in a dynamic setting but without the need for explicitly time-dependent flux measurements.
Subjects
Metabolic Networks and Pathways, Biological Models, Nonparametric Statistics
ABSTRACT
BACKGROUND: Reverse engineering of gene regulatory networks from time series gene-expression data is a challenging problem, not only because of the vast sets of candidate interactions but also due to the stochastic nature of gene expression. We limit our analysis to nonlinear differential equation based inference methods. In order to avoid the computational cost of large-scale simulations, a two-step Gaussian process interpolation based gradient matching approach has been proposed to solve differential equations approximately. RESULTS: We apply a gradient matching inference approach to a large number of candidate models, including parametric differential equations or their corresponding non-parametric representations, and we evaluate the network inference performance under various settings for different inference objectives. We use model averaging, based on the Bayesian Information Criterion (BIC), to combine the different inferences. The performance of different inference approaches is evaluated using area under the precision-recall curves. CONCLUSIONS: We found that parametric methods can provide comparable, and often improved, inference relative to non-parametric methods; the latter, however, require no kinetic information and are computationally more efficient.
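BIC-based model averaging can be sketched with the standard definitions: BIC = k ln(n) - 2 ln(L) for a model with k parameters, n observations and maximised likelihood L, and weights proportional to exp(-ΔBIC/2). The Python functions below are a generic illustration and may differ from the weighting actually used in the paper; the function names and the edge-scoring helper are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k*ln(n) - 2*ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def bic_weights(bics):
    """Normalised model weights proportional to exp(-(BIC - min BIC) / 2)."""
    best = min(bics)
    raw = [math.exp(-(b - best) / 2.0) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def averaged_edge_score(edge_indicators, weights):
    """Total weight of the candidate models that contain a given edge."""
    return sum(w for has_edge, w in zip(edge_indicators, weights) if has_edge)


# Hypothetical example: three candidate models, the first two containing
# the regulatory edge of interest.
weights = bic_weights([100.0, 102.0, 110.0])
score = averaged_edge_score([True, True, False], weights)
```

Subtracting the smallest BIC before exponentiating avoids numerical underflow when BIC values are large. The resulting edge scores lie between 0 and 1 and can be ranked to build precision-recall curves.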
Subjects
Gene Regulatory Networks/genetics, Algorithms, Normal Distribution
ABSTRACT
Motivation: Different experiments provide differing levels of information about a biological system. This makes it difficult, a priori, to select one of them beyond mere speculation and/or belief, especially when resources are limited. With the increasing diversity of experimental approaches and general advances in quantitative systems biology, methods that tell us how much information a given experiment carries about the question we want to answer become crucial. Results: PEITH(Θ) is a general-purpose Python framework for experimental design in systems biology. PEITH(Θ) uses Bayesian inference and information theory to determine which experiments are most informative for estimating all model parameters and/or performing model predictions. Availability and implementation: https://github.com/MichaelPHStumpf/Peitho. Contact: m.stumpf@imperial.ac.uk or juliane.liepe@mpibpc.mpg.de.
Subjects
Information Theory, Software, Systems Biology/methods, Bayes Theorem
ABSTRACT
Models describing the process of stem-cell differentiation are plentiful, and may offer insights into the underlying mechanisms and experimentally observed behaviour. Waddington's epigenetic landscape has been providing a conceptual framework for differentiation processes since its inception. It also allows, however, for detailed mathematical and quantitative analyses, as the landscape can, at least in principle, be related to mathematical models of dynamical systems. Here we focus on a set of dynamical systems features that are intimately linked to cell differentiation, by considering exemplar dynamical models that capture important aspects of stem cell differentiation dynamics. These models allow us to map the paths that cells take through gene expression space as they move from one fate to another, e.g. from a stem-cell to a more specialized cell type. Our analysis highlights the role of the transition state (TS) that separates distinct cell fates, and how the nature of the TS changes as the underlying landscape changes, a change that can be induced by, e.g., cellular signaling. We demonstrate that models for stem cell differentiation may be interpreted in terms of either a static or transitory landscape. For the static case the TS represents a particular transcriptional profile that all cells approach during differentiation. Alternatively, the TS may refer to the commonly observed period of heterogeneity as cells undergo stochastic transitions.
Subjects
Cell Differentiation, Genetic Epigenesis, Stem Cells/cytology, Algorithms, Cell Lineage, Gene Expression Profiling, Gene Expression Regulation, Gene Regulatory Networks, Humans, Linear Models, Genetic Models, Normal Distribution, Probability, Signal Transduction, Stochastic Processes
ABSTRACT
Stem cells are fundamental to human life and offer great therapeutic potential, yet their biology remains incompletely (or in some cases even poorly) understood. The field of stem cell biology has grown substantially in recent years due to a combination of experimental and theoretical contributions: the experimental branch of this work provides data in an ever-increasing number of dimensions, while the theoretical branch seeks to determine suitable models of the fundamental stem cell processes that these data describe. The application of population dynamics to biology is amongst the oldest applications of mathematics to biology, and the population dynamics perspective continues to offer much today. Here we describe the impact that such a perspective has made in the field of stem cell biology. Using hematopoietic stem cells as our model system, we discuss the approaches that have been used to study their key properties, such as capacity for self-renewal, differentiation, and cell fate lineage choice. We will also discuss the relevance of population dynamics in models of stem cells and cancer, where competition naturally emerges as an influential factor on the temporal evolution of cell populations. Stem Cells 2017;35:80-88.
Subjects
Hematopoiesis, Hematopoietic Stem Cells/cytology, Animals, Hematopoietic Stem Cells/metabolism, Humans, Biological Models, Stem Cell Niche
ABSTRACT
The hematopoietic stem cell (HSC) niche provides essential microenvironmental cues for the production and maintenance of HSCs within the bone marrow. During inflammation, hematopoietic dynamics are perturbed, but it is not known whether changes to the HSC-niche interaction occur as a result. We visualize HSCs directly in vivo, enabling detailed analysis of the 3D niche dynamics and migration patterns in murine bone marrow following Trichinella spiralis infection. Spatial statistical analysis of these HSC trajectories reveals two distinct modes of HSC behavior: (a) a pattern of revisiting previously explored space and (b) a pattern of exploring new space. Whereas HSCs from control donors predominantly follow pattern (a), those from infected mice adopt both strategies. Using detailed computational analyses of cell migration tracks and life-history theory, we show that the increased motility of HSCs following infection can, perhaps counterintuitively, enable mice to cope better in deteriorating HSC-niche microenvironments following infection. Stem Cells 2017;35:2292-2304.