Results 1 - 10 of 10
1.
Bioinformatics ; 37(13): 1860-1867, 2021 Jul 27.
Article in English | MEDLINE | ID: mdl-33471072

ABSTRACT

MOTIVATION: Longitudinal study designs are indispensable for studying disease progression. Inferring covariate effects from longitudinal data, however, requires interpretable methods that can model complicated covariance structures and detect non-linear effects of both categorical and continuous covariates, as well as their interactions. Detecting disease effects is hindered by the fact that they often occur rapidly near the disease initiation time, and this time point cannot be exactly observed. An additional challenge is that the effect magnitude can be heterogeneous over the subjects. RESULTS: We present lgpr, a widely applicable and interpretable method for non-parametric analysis of longitudinal data using additive Gaussian processes. We demonstrate that it outperforms previous approaches in identifying the relevant categorical and continuous covariates in various settings. Furthermore, it implements important novel features, including the ability to account for the heterogeneity of covariate effects, their temporal uncertainty, and appropriate observation models for different types of biomedical data. The lgpr tool is implemented as a comprehensive and user-friendly R-package. AVAILABILITY AND IMPLEMENTATION: lgpr is available at jtimonen.github.io/lgpr-usage with documentation, tutorials, test data and code for reproducing the experiments of this article. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
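
To illustrate what "additive Gaussian processes" means in general terms, the sketch below builds a covariance function as a sum of components, one shared across subjects and one specific to each subject. It is a generic illustration, not the lgpr implementation: the function names, the choice of kernels, and all hyperparameter values are assumptions made for this example.

```python
import math

def se_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel on a continuous covariate."""
    return variance * math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)

def category_kernel(c1, c2):
    """Indicator kernel on a categorical covariate (1 if same category)."""
    return 1.0 if c1 == c2 else 0.0

def additive_kernel(a, b):
    """Sum of a shared age effect and a subject-specific age deviation.

    a and b are (age, subject_id) pairs. Lengthscales and variances are
    hypothetical values chosen only for illustration.
    """
    (age_a, id_a), (age_b, id_b) = a, b
    shared = se_kernel(age_a, age_b, lengthscale=12.0)
    subject = category_kernel(id_a, id_b) * se_kernel(
        age_a, age_b, lengthscale=6.0, variance=0.5
    )
    return shared + subject

# Same age, same subject: both components contribute.
print(additive_kernel((10.0, 1), (10.0, 1)))
# Same age, different subjects: only the shared component contributes.
print(additive_kernel((10.0, 1), (10.0, 2)))
```

Because each component is a valid kernel, their sum is one too, and each term can be read as the contribution of one covariate or interaction.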

2.
Bioinformatics ; 35(14): i548-i557, 2019 Jul 15.
Article in English | MEDLINE | ID: mdl-31510676

ABSTRACT

MOTIVATION: Metabolic flux balance analysis (FBA) is a standard tool in analyzing metabolic reaction rates compatible with measurements, steady-state and the metabolic reaction network stoichiometry. Flux analysis methods commonly place model assumptions on fluxes due to the convenience of formulating the problem as a linear programming model, while many methods do not consider the inherent uncertainty in flux estimates. RESULTS: We introduce a novel paradigm of Bayesian metabolic flux analysis that models the reactions of the whole genome-scale cellular system in probabilistic terms, and can infer the full flux vector distribution of genome-scale metabolic systems based on exchange and intracellular (e.g. 13C) flux measurements, steady-state assumptions, and objective function assumptions. The Bayesian model couples all fluxes jointly in a simple truncated multivariate posterior distribution, which reveals informative flux couplings. Our model is a plug-in replacement for conventional metabolic balance methods, such as FBA. Our experiments indicate that we can characterize the genome-scale flux covariances, reveal flux couplings, and determine more intracellular unobserved fluxes in Clostridium acetobutylicum from 13C data than flux variability analysis. AVAILABILITY AND IMPLEMENTATION: The COBRA compatible software is available at github.com/markusheinonen/bamfa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Clostridium acetobutylicum; Metabolic Flux Analysis; Bayes Theorem; Metabolic Networks and Pathways; Models, Biological
3.
Comput Biol Med ; 143: 105268, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35131609

ABSTRACT

High-throughput technologies produce gene expression time-series data that need fast and specialized algorithms to be processed. While current methods already deal with different aspects, such as the non-stationarity of the process and the temporal correlation, they often fail to take into account the pairing among replicates. We propose PairGP, a non-stationary Gaussian process method to compare gene expression time-series across several conditions that can account for paired longitudinal study designs and can identify groups of conditions that have different gene expression dynamics. We demonstrate the method on both simulated data and previously unpublished RNA sequencing (RNA-seq) time-series with five conditions. The results show the advantage of modeling the pairing effect to better identify groups of conditions with different dynamics. The pairing effect model is able to select the most probable grouping of conditions even when the number of conditions is high. The developed method is general and can be applied to any gene expression time-series dataset. The model can identify common replicate effects among the samples coming from the same biological replicates and model those as separate components. Learning the pairing effect as a separate component not only allows us to exclude it from the model to get better estimates of the condition effects, but also improves the precision of the model selection process. The pairing effect, previously treated as noise, is now identified as a separate component, resulting in more accurate and explanatory models of the data.

4.
BMC Mol Biol ; 12: 21, 2011 May 14.
Article in English | MEDLINE | ID: mdl-21569576

ABSTRACT

BACKGROUND: Gene expression in Escherichia coli is regulated by several mechanisms. We measured in single cells the expression level of a single-copy gene coding for green fluorescent protein (GFP), integrated into the genome and driven by a tetracycline-inducible promoter, for varying induction strengths. Also, we measured the transcriptional activity of a tetracycline-inducible promoter controlling the transcription of an RNA with 96 binding sites for MS2-GFP. RESULTS: The distribution of GFP levels in single cells is found to change significantly as induction reaches high levels, causing the Fano factor of the cells' protein levels to increase with mean level, beyond what would be expected from a Poisson-like process of RNA transcription. In agreement, the Fano factor of the cells' number of RNA molecules targeted by MS2-GFP follows a similar trend. The results provide evidence that the dynamics of the promoter complex formation, namely, the variability in its duration from one transcription event to the next, explains the change in the distribution of expression levels in the cell population with induction strength. CONCLUSIONS: The results suggest that the open complex formation of the tetracycline-inducible promoter, in the regime of strong induction, significantly affects the dynamics of RNA production due to the variability of its duration from one event to the next.
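
For reference, the Fano factor used in this abstract is the variance of a count divided by its mean; it equals 1 for a Poisson distribution, so values above 1 indicate more variability than a Poisson process would produce. The following sketch shows the calculation. The function name and the per-cell counts are made up for illustration and are not data from the study.

```python
from statistics import fmean, pvariance

def fano_factor(counts):
    """Return variance / mean for a sequence of non-negative counts."""
    mean = fmean(counts)
    if mean == 0:
        raise ValueError("Fano factor is undefined for a zero mean")
    return pvariance(counts, mu=mean) / mean

# Hypothetical per-cell molecule counts with the same mean (10).
narrow = [9, 10, 11, 10, 10, 9, 11, 10]
broad = [2, 25, 4, 18, 1, 15, 12, 3]

print(fano_factor(narrow))  # well below 1: less variable than Poisson
print(fano_factor(broad))   # well above 1: more variable than Poisson
```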


Subject(s)
Anti-Bacterial Agents/pharmacology; Escherichia coli/genetics; Gene Expression Regulation, Bacterial/drug effects; Green Fluorescent Proteins/genetics; Promoter Regions, Genetic/drug effects; Tetracycline/pharmacology; Escherichia coli/drug effects
5.
IEEE/ACM Trans Comput Biol Bioinform ; 16(6): 1843-1854, 2019.
Article in English | MEDLINE | ID: mdl-29993837

ABSTRACT

Ordinary differential equations (ODEs) provide a powerful formalism to model molecular networks mechanistically. However, inferring the model structure, given a set of time course measurements and a large number of alternative molecular mechanisms, is a challenging and open research question. Existing search heuristics are designed only for finding a single best model configuration and cannot account for the uncertainty in selecting the network components. In this study, we present a novel Markov chain Monte Carlo approach for performing Bayesian model structure inference over ODE models. We formulate a Metropolis algorithm that explores the model space efficiently and is suitable for obtaining probabilistic inferences about the network structure. The method and its special parallelization possibilities are demonstrated using simulated data. Furthermore, we apply the method to a time course RNA sequencing data set to infer the structure of the transiently evolving core regulatory network that steers the T helper 17 (Th17) cell differentiation. Our results are in agreement with the earlier finding that the Th17 lineage-specific differentiation program evolves in three sequential phases. Further, the analysis provides us with probabilistic predictions on the molecular interactions that are active in different phases of Th17 cell differentiation.
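
The general idea of Metropolis sampling over model structures can be sketched as follows. This is a generic toy example, not the algorithm of the article: a structure is encoded as a tuple of 0/1 indicators for candidate interactions, the proposal flips one indicator at random, and `toy_log_post` is a hypothetical stand-in for the log posterior that, in a real application, would come from fitting the corresponding ODE model to data.

```python
import math
import random

def metropolis_structures(log_post, n_links, n_iter, rng):
    """Sample binary structure vectors using a symmetric single-flip proposal."""
    current = tuple(0 for _ in range(n_links))
    current_lp = log_post(current)
    samples = []
    for _ in range(n_iter):
        i = rng.randrange(n_links)
        proposal = current[:i] + (1 - current[i],) + current[i + 1:]
        proposal_lp = log_post(proposal)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples

def toy_log_post(structure):
    """Hypothetical unnormalised log posterior favouring links 0 and 2."""
    weights = (2.0, -2.0, 2.0)
    return sum(w * s for w, s in zip(weights, structure))

samples = metropolis_structures(toy_log_post, 3, 20000, random.Random(0))
# Posterior inclusion probability of each candidate link.
inclusion = [sum(s[i] for s in samples) / len(samples) for i in range(3)]
print([round(p, 2) for p in inclusion])
```

Averaging the indicators over the samples gives a probability for each candidate interaction, which is the kind of uncertainty statement that a single best model cannot provide.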


Subject(s)
Computational Biology/methods; Sequence Analysis, RNA; Algorithms; Bayes Theorem; Cell Differentiation; Cell Lineage; Computer Simulation; Gene Regulatory Networks; Humans; Likelihood Functions; Markov Chains; Models, Statistical; Monte Carlo Method; Probability; RNA/analysis; Signal Transduction; Software; Th17 Cells
6.
Genome Biol ; 17: 49, 2016 Mar 14.
Article in English | MEDLINE | ID: mdl-26975309

ABSTRACT

We present a generative model, Lux, to quantify DNA methylation modifications from any combination of bisulfite sequencing approaches, including reduced, oxidative, TET-assisted, chemical-modification assisted, and methylase-assisted bisulfite sequencing data. Lux models all cytosine modifications (C, 5mC, 5hmC, 5fC, and 5caC) simultaneously together with experimental parameters, including bisulfite conversion and oxidation efficiencies, as well as various chemical labeling and protection steps. We show that Lux improves the quantification and comparison of cytosine modification levels and that Lux can process any oxidized methylcytosine sequencing data sets to quantify all cytosine modifications. Analysis of targeted data from Tet2-knockdown embryonic stem cells and T cells during development demonstrates DNA modification quantification at unprecedented detail, quantifies active demethylation pathways and reveals 5hmC localization in putative regulatory regions.


Subject(s)
5-Methylcytosine/metabolism; DNA Methylation/genetics; DNA-Binding Proteins/genetics; DNA/genetics; Bayes Theorem; Cytosine/metabolism; DNA-Binding Proteins/metabolism; Embryonic Stem Cells/metabolism; Humans; Oxidation-Reduction; Sequence Analysis, DNA/methods
7.
Genome Med ; 7: 122, 2015 Nov 20.
Article in English | MEDLINE | ID: mdl-26589177

ABSTRACT

BACKGROUND: Activation and differentiation of T-helper (Th) cells into Th1 and Th2 types is a complex process orchestrated by distinct gene activation programs engaging a number of genes. This process is crucial for a robust immune response and an imbalance might lead to disease states such as autoimmune diseases or allergy. Therefore, identification of genes involved in this process is paramount to further understand the pathogenesis of, and design interventions for, immune-mediated diseases. METHODS: We aimed at identifying protein-coding genes and long non-coding RNAs (lncRNAs) involved in early differentiation of T-helper cells by transcriptome analysis of cord blood-derived naïve precursor, primary and polarized cells. RESULTS: Here, we identified lineage-specific genes involved in early differentiation of Th1 and Th2 subsets by integrating transcriptional profiling data from multiple platforms. We have obtained a high confidence list of genes as well as a list of novel genes by employing more than one profiling platform. We show that the density of lineage-specific epigenetic marks is higher around lineage-specific genes than anywhere else in the genome. Based on next-generation sequencing data we identified lineage-specific lncRNAs involved in early Th1 and Th2 differentiation and predicted their expected functions through Gene Ontology analysis. We show that there is a positive trend in the expression of the closest lineage-specific lncRNA and gene pairs. We also found that there is an enrichment of disease SNPs around a number of lncRNAs identified, suggesting that these lncRNAs might play a role in the etiology of autoimmune diseases. CONCLUSION: The results presented here show the involvement of several new actors in the early differentiation of T-helper cells and will be a valuable resource for a better understanding of autoimmune processes.


Subject(s)
T-Lymphocytes, Helper-Inducer/physiology; Autoimmune Diseases/genetics; Autoimmune Diseases/immunology; CD4-Positive T-Lymphocytes/immunology; Cell Differentiation/genetics; Cell Differentiation/immunology; Cell Lineage; Cells, Cultured; Epigenesis, Genetic; Fetal Blood/cytology; Fetal Blood/immunology; Gene Expression Profiling; Gene Expression Regulation; Humans; Open Reading Frames/genetics; RNA, Long Noncoding/genetics; Sequence Analysis, RNA/methods; Signal Transduction/genetics; T-Lymphocytes, Helper-Inducer/cytology; T-Lymphocytes, Helper-Inducer/immunology; Th1 Cells/cytology; Th1 Cells/immunology; Th1 Cells/physiology; Th2 Cells/cytology; Th2 Cells/immunology; Th2 Cells/physiology
8.
EURASIP J Bioinform Syst Biol ; 2011: 572876, 2011.
Article in English | MEDLINE | ID: mdl-21234243

ABSTRACT

We propose a Markov chain approximation of the delayed stochastic simulation algorithm to infer properties of the mechanisms in prokaryote transcription from the dynamics of RNA levels. We model transcription using the delayed stochastic modelling strategy and realistic parameter values for rate of transcription initiation and RNA degradation. From the model, we generate time series of RNA levels at the single molecule level, from which we use the method to infer the duration of the promoter open complex formation. This is found to be possible even when adding external Gaussian noise to the RNA levels.
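
A delayed stochastic simulation of transcription can be sketched as below. This is a simplified illustration and not the model or approximation of the article: it assumes a single promoter, a fixed (not random) open complex delay `tau`, first-order RNA degradation, and hypothetical parameter values. Delayed events are kept in a waiting list; if one is due before the next sampled reaction, it is executed first and the reaction time is re-sampled, which is valid because exponential waiting times are memoryless.

```python
import heapq
import random

def delayed_ssa(k_t, k_d, tau, t_end, rng):
    """Simulate RNA counts with a delayed promoter release.

    Assumed reactions (illustrative only):
      free promoter --k_t--> occupied promoter; after `tau` seconds the
      promoter is released and one RNA is produced;
      RNA --k_d--> degraded.
    Returns a list of (time, rna_count) recorded at every event.
    """
    t, rna, promoter_free = 0.0, 0, True
    pending = []  # min-heap of times at which delayed products are released
    trajectory = [(t, rna)]
    while t < t_end:
        a_init = k_t if promoter_free else 0.0
        a_deg = k_d * rna
        a_total = a_init + a_deg
        dt = rng.expovariate(a_total) if a_total > 0 else float("inf")
        if pending and pending[0] <= t + dt:
            # A delayed event is due first: release promoter, add one RNA.
            t = heapq.heappop(pending)
            promoter_free = True
            rna += 1
        elif dt == float("inf"):
            break  # no reaction or delayed event can occur any more
        else:
            t += dt
            if rng.random() * a_total < a_init:
                # Initiation: promoter becomes occupied for `tau` seconds.
                promoter_free = False
                heapq.heappush(pending, t + tau)
            else:
                rna -= 1  # degradation of one RNA
        trajectory.append((t, rna))
    return trajectory

# Hypothetical rates (per second) and delay (seconds).
traj = delayed_ssa(k_t=0.01, k_d=0.002, tau=40.0, t_end=5000.0,
                   rng=random.Random(1))
print(len(traj), traj[-1])
```

In this simplified setting the mean interval between consecutive RNA productions is 1/k_t + tau, which is why time series of RNA levels carry information about the delay.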

9.
BMC Syst Biol ; 5: 149, 2011 Sep 25.
Article in English | MEDLINE | ID: mdl-21943372

ABSTRACT

BACKGROUND: In Escherichia coli the mean and cell-to-cell diversity in RNA numbers of different genes vary widely. This is likely due to different kinetics of transcription initiation, a complex process with multiple rate-limiting steps that affect RNA production. RESULTS: We measured the in vivo kinetics of production of individual RNA molecules under the control of the lar promoter in E. coli. From the analysis of the distributions of intervals between transcription events in the regimes of weak and medium induction, we find that the process of transcription initiation of this promoter involves a sequential mechanism with two main rate-limiting steps, each lasting hundreds of seconds. Both steps become faster with increasing induction by IPTG and arabinose. CONCLUSIONS: The two rate-limiting steps in initiation are found to be important regulators of the dynamics of RNA production under the control of the lar promoter in the regimes of weak and medium induction. Variability in the intervals between consecutive RNA productions is much lower than if there were only one rate-limiting step with a duration following an exponential distribution. The methodology proposed here to analyze the in vivo dynamics of transcription may be applicable at a genome-wide scale and provide valuable insight into the dynamics of prokaryotic genetic networks.
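
The statement about variability can be checked numerically. For a single exponentially distributed step, the squared coefficient of variation (variance divided by squared mean) of the intervals is 1; for two sequential exponential steps with mean durations a and b it is (a² + b²)/(a + b)², which lies between 0.5 and 1. The simulation below is a sketch of that comparison; the step durations are hypothetical and are not the values measured in the study.

```python
import random
from statistics import fmean, pvariance

def cv2(samples):
    """Squared coefficient of variation: variance / mean**2."""
    m = fmean(samples)
    return pvariance(samples, mu=m) / m ** 2

def simulate_intervals(step_means, n, rng):
    """Draw n intervals, each the sum of sequential exponential steps."""
    return [
        sum(rng.expovariate(1.0 / m) for m in step_means)
        for _ in range(n)
    ]

rng = random.Random(0)
# Same mean interval (600 s) produced by one step or by two equal steps.
one_step = simulate_intervals([600.0], 20000, rng)
two_steps = simulate_intervals([300.0, 300.0], 20000, rng)

print(round(cv2(one_step), 2))   # close to 1
print(round(cv2(two_steps), 2))  # close to 0.5
```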


Subject(s)
Escherichia coli/physiology; Models, Biological; RNA/biosynthesis; Transcriptional Activation/physiology; Arabinose/genetics; Arabinose/metabolism; DNA Primers/genetics; Kinetics; Lac Operon/genetics; Promoter Regions, Genetic/genetics; Systems Biology
10.
Phys Rev E Stat Nonlin Soft Matter Phys ; 81(1 Pt 1): 011912, 2010 Jan.
Article in English | MEDLINE | ID: mdl-20365404

ABSTRACT

Little is known about the biological mechanisms that shape the distribution of intervals between the completion of RNA molecules (T(p)RNA), and thus transcriptional noise. We characterize numerically and analytically how the promoter open complex delay (tau(P)) and the transcription initiation rate (k(t)) shape T(p)RNA. From this, we assess the noise and mean of transcript levels and show that these can be tuned both independently and simultaneously by tau(P) and k(t). Finally, we characterize how tau(P) affects bursting in RNA production and show that the tau(P) measured for a lac promoter best fits independent measurements of the burst distribution of the same promoter. Since tau(P) affects noise in gene expression, and given that it is sequence dependent, it is likely to be evolvable.


Subject(s)
Gene Expression/physiology; Models, Genetic; Promoter Regions, Genetic; Computer Simulation; RNA Precursors/metabolism; RNA, Transfer/metabolism; Stochastic Processes; Time Factors; Transcription, Genetic/physiology