1.
Bioinformatics ; 37(13): 1860-1867, 2021 Jul 27.
Article in English | MEDLINE | ID: mdl-33471072

ABSTRACT

MOTIVATION: Longitudinal study designs are indispensable for studying disease progression. Inferring covariate effects from longitudinal data, however, requires interpretable methods that can model complicated covariance structures and detect non-linear effects of both categorical and continuous covariates, as well as their interactions. Detecting disease effects is hindered by the fact that they often occur rapidly near the disease initiation time, and this time point cannot be exactly observed. An additional challenge is that the effect magnitude can be heterogeneous over the subjects. RESULTS: We present lgpr, a widely applicable and interpretable method for non-parametric analysis of longitudinal data using additive Gaussian processes. We demonstrate that it outperforms previous approaches in identifying the relevant categorical and continuous covariates in various settings. Furthermore, it implements important novel features, including the ability to account for the heterogeneity of covariate effects, their temporal uncertainty, and appropriate observation models for different types of biomedical data. The lgpr tool is implemented as a comprehensive and user-friendly R-package. AVAILABILITY AND IMPLEMENTATION: lgpr is available at jtimonen.github.io/lgpr-usage with documentation, tutorials, test data and code for reproducing the experiments of this article. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

2.
Bioinformatics ; 35(14): i548-i557, 2019 Jul 15.
Article in English | MEDLINE | ID: mdl-31510676

ABSTRACT

MOTIVATION: Metabolic flux balance analysis (FBA) is a standard tool in analyzing metabolic reaction rates compatible with measurements, steady-state and the metabolic reaction network stoichiometry. Flux analysis methods commonly place model assumptions on fluxes due to the convenience of formulating the problem as a linear programming model, while many methods do not consider the inherent uncertainty in flux estimates. RESULTS: We introduce a novel paradigm of Bayesian metabolic flux analysis that models the reactions of the whole genome-scale cellular system in probabilistic terms, and can infer the full flux vector distribution of genome-scale metabolic systems based on exchange and intracellular (e.g. 13C) flux measurements, steady-state assumptions, and objective function assumptions. The Bayesian model couples all fluxes jointly in a simple truncated multivariate posterior distribution, which reveals informative flux couplings. Our model is a plug-in replacement to conventional metabolic balance methods, such as FBA. Our experiments indicate that we can characterize the genome-scale flux covariances, reveal flux couplings, and determine more intracellular unobserved fluxes in Clostridium acetobutylicum from 13C data than flux variability analysis. AVAILABILITY AND IMPLEMENTATION: The COBRA compatible software is available at github.com/markusheinonen/bamfa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Clostridium acetobutylicum , Metabolic Flux Analysis , Bayes Theorem , Metabolic Networks and Pathways , Biological Models
3.
Comput Biol Med ; 143: 105268, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35131609

ABSTRACT

High-throughput technologies produce gene expression time-series data that need fast and specialized algorithms to be processed. While current methods already deal with different aspects, such as the non-stationarity of the process and the temporal correlation, they often fail to take into account the pairing among replicates. We propose PairGP, a non-stationary Gaussian process method to compare gene expression time-series across several conditions that can account for paired longitudinal study designs and can identify groups of conditions that have different gene expression dynamics. We demonstrate the method on both simulated data and previously unpublished RNA sequencing (RNA-seq) time-series with five conditions. The results show the advantage of modeling the pairing effect to better identify groups of conditions with different dynamics. The pairing effect model is able to select the most probable grouping of conditions even when the number of conditions is high. The developed method is of general application and can be applied to any gene expression time-series dataset. The model can identify common replicate effects among the samples coming from the same biological replicates and model those as separate components. Learning the pairing effect as a separate component not only allows us to exclude it from the model to get better estimates of the condition effects, but also improves the precision of the model selection process. The pairing effect, which was previously accounted for as noise, is now identified as a separate component, resulting in more accurate and explanatory models of the data.

4.
BMC Mol Biol ; 12: 21, 2011 May 14.
Article in English | MEDLINE | ID: mdl-21569576

ABSTRACT

BACKGROUND: Gene expression in Escherichia coli is regulated by several mechanisms. We measured in single cells the expression level of a single copy gene coding for green fluorescent protein (GFP), integrated into the genome and driven by a tetracycline inducible promoter, for varying induction strengths. Also, we measured the transcriptional activity of a tetracycline inducible promoter controlling the transcription of an RNA with 96 binding sites for MS2-GFP. RESULTS: The distribution of GFP levels in single cells is found to change significantly as induction reaches high levels, causing the Fano factor of the cells' protein levels to increase with mean level, beyond what would be expected from a Poisson-like process of RNA transcription. In agreement, the Fano factor of the cells' number of RNA molecules targeted by MS2-GFP follows a similar trend. The results provide evidence that the dynamics of the promoter complex formation, namely, the variability in its duration from one transcription event to the next, explains the change in the distribution of expression levels in the cell population with induction strength. CONCLUSIONS: The results suggest that the open complex formation of the tetracycline inducible promoter, in the regime of strong induction, significantly affects the dynamics of RNA production due to the variability of its duration from one event to the next.


Subjects
Anti-Bacterial Agents/pharmacology , Escherichia coli/genetics , Bacterial Gene Expression Regulation/drug effects , Green Fluorescent Proteins/genetics , Genetic Promoter Regions/drug effects , Tetracycline/pharmacology , Escherichia coli/drug effects
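
The Fano factor used in the abstract above is the variance of a count distribution divided by its mean; a Poisson distribution has a Fano factor of 1, so values above 1 indicate more cell-to-cell variability than a Poisson-like process would give. A minimal sketch of the calculation follows. The function name and the per-cell counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Return the variance-to-mean ratio of per-cell molecule counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined for a zero mean")
    return pvariance(counts) / m


# Hypothetical per-cell counts (illustration only, same mean of 5).
narrow = [4, 5, 6, 5, 4, 6, 5, 5]
broad = [0, 1, 12, 2, 15, 0, 9, 1]

print(fano_factor(narrow))  # well below 1: narrower than Poisson
print(fano_factor(broad))   # well above 1: broader than Poisson
```

Both samples share the same mean, so the difference in Fano factor comes entirely from the spread of the counts.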
5.
IEEE/ACM Trans Comput Biol Bioinform ; 16(6): 1843-1854, 2019.
Article in English | MEDLINE | ID: mdl-29993837

ABSTRACT

Ordinary differential equations (ODEs) provide a powerful formalism to model molecular networks mechanistically. However, inferring the model structure, given a set of time course measurements and a large number of alternative molecular mechanisms, is a challenging and open research question. Existing search heuristics are designed only for finding a single best model configuration and cannot account for the uncertainty in selecting the network components. In this study, we present a novel Markov chain Monte Carlo approach for performing Bayesian model structure inference over ODE models. We formulate a Metropolis algorithm that explores the model space efficiently and is suitable for obtaining probabilistic inferences about the network structure. The method and its special parallelization possibilities are demonstrated using simulated data. Furthermore, we apply the method to a time course RNA sequencing data set to infer the structure of the transiently evolving core regulatory network that steers the T helper 17 (Th17) cell differentiation. Our results are in agreement with the earlier finding that the Th17 lineage-specific differentiation program evolves in three sequential phases. Further, the analysis provides us with probabilistic predictions on the molecular interactions that are active in different phases of Th17 cell differentiation.


Subjects
Computational Biology/methods , RNA Sequence Analysis , Algorithms , Bayes Theorem , Cell Differentiation , Cell Lineage , Computer Simulation , Gene Regulatory Networks , Humans , Likelihood Functions , Markov Chains , Statistical Models , Monte Carlo Method , Probability , RNA/analysis , Signal Transduction , Software , Th17 Cells
6.
Genome Biol ; 17: 49, 2016 Mar 14.
Article in English | MEDLINE | ID: mdl-26975309

ABSTRACT

We present a generative model, Lux, to quantify DNA methylation modifications from any combination of bisulfite sequencing approaches, including reduced, oxidative, TET-assisted, chemical-modification assisted, and methylase-assisted bisulfite sequencing data. Lux models all cytosine modifications (C, 5mC, 5hmC, 5fC, and 5caC) simultaneously together with experimental parameters, including bisulfite conversion and oxidation efficiencies, as well as various chemical labeling and protection steps. We show that Lux improves the quantification and comparison of cytosine modification levels and that Lux can process any oxidized methylcytosine sequencing data sets to quantify all cytosine modifications. Analysis of targeted data from Tet2-knockdown embryonic stem cells and T cells during development demonstrates DNA modification quantification at unprecedented detail, quantifies active demethylation pathways and reveals 5hmC localization in putative regulatory regions.


Subjects
5-Methylcytosine/metabolism , DNA Methylation/genetics , DNA-Binding Proteins/genetics , DNA/genetics , Bayes Theorem , Cytosine/metabolism , DNA-Binding Proteins/metabolism , Embryonic Stem Cells/metabolism , Humans , Oxidation-Reduction , DNA Sequence Analysis/methods
7.
Genome Med ; 7: 122, 2015 Nov 20.
Article in English | MEDLINE | ID: mdl-26589177

ABSTRACT

BACKGROUND: Activation and differentiation of T-helper (Th) cells into Th1 and Th2 types is a complex process orchestrated by distinct gene activation programs engaging a number of genes. This process is crucial for a robust immune response and an imbalance might lead to disease states such as autoimmune diseases or allergy. Therefore, identification of genes involved in this process is paramount to further understand the pathogenesis of, and design interventions for, immune-mediated diseases. METHODS: We aimed to identify protein-coding genes and long non-coding RNAs (lncRNAs) involved in early differentiation of T-helper cells by transcriptome analysis of cord blood-derived naïve precursor, primary and polarized cells. RESULTS: Here, we identified lineage-specific genes involved in early differentiation of Th1 and Th2 subsets by integrating transcriptional profiling data from multiple platforms. We have obtained a high confidence list of genes as well as a list of novel genes by employing more than one profiling platform. We show that the density of lineage-specific epigenetic marks is higher around lineage-specific genes than anywhere else in the genome. Based on next-generation sequencing data we identified lineage-specific lncRNAs involved in early Th1 and Th2 differentiation and predicted their expected functions through Gene Ontology analysis. We show that there is a positive trend in the expression of the closest lineage-specific lncRNA and gene pairs. We also found that there is an enrichment of disease SNPs around a number of lncRNAs identified, suggesting that these lncRNAs might play a role in the etiology of autoimmune diseases. CONCLUSION: The results presented here show the involvement of several new actors in the early differentiation of T-helper cells and will be a valuable resource for better understanding of autoimmune processes.


Subjects
Helper-Inducer T-Lymphocytes/physiology , Autoimmune Diseases/genetics , Autoimmune Diseases/immunology , CD4-Positive T-Lymphocytes/immunology , Cell Differentiation/genetics , Cell Differentiation/immunology , Cell Lineage , Cultured Cells , Genetic Epigenesis , Fetal Blood/cytology , Fetal Blood/immunology , Gene Expression Profiling , Gene Expression Regulation , Humans , Open Reading Frames/genetics , Long Noncoding RNA/genetics , RNA Sequence Analysis/methods , Signal Transduction/genetics , Helper-Inducer T-Lymphocytes/cytology , Helper-Inducer T-Lymphocytes/immunology , Th1 Cells/cytology , Th1 Cells/immunology , Th1 Cells/physiology , Th2 Cells/cytology , Th2 Cells/immunology , Th2 Cells/physiology
8.
EURASIP J Bioinform Syst Biol ; 2011: 572876, 2011.
Article in English | MEDLINE | ID: mdl-21234243

ABSTRACT

We propose a Markov chain approximation of the delayed stochastic simulation algorithm to infer properties of the mechanisms in prokaryote transcription from the dynamics of RNA levels. We model transcription using the delayed stochastic modelling strategy and realistic parameter values for rate of transcription initiation and RNA degradation. From the model, we generate time series of RNA levels at the single molecule level, from which we use the method to infer the duration of the promoter open complex formation. This is found to be possible even when adding external Gaussian noise to the RNA levels.

9.
BMC Syst Biol ; 5: 149, 2011 Sep 25.
Article in English | MEDLINE | ID: mdl-21943372

ABSTRACT

BACKGROUND: In Escherichia coli the mean and cell-to-cell diversity in RNA numbers of different genes vary widely. This is likely due to different kinetics of transcription initiation, a complex process with multiple rate-limiting steps that affect RNA production. RESULTS: We measured the in vivo kinetics of production of individual RNA molecules under the control of the lar promoter in E. coli. From the analysis of the distributions of intervals between transcription events in the regimes of weak and medium induction, we find that the process of transcription initiation of this promoter involves a sequential mechanism with two main rate-limiting steps, each lasting hundreds of seconds. Both steps become faster with increasing induction by IPTG and arabinose. CONCLUSIONS: The two rate-limiting steps in initiation are found to be important regulators of the dynamics of RNA production under the control of the lar promoter in the regimes of weak and medium induction. Variability in the intervals between consecutive RNA productions is much lower than if there were only one rate-limiting step with a duration following an exponential distribution. The methodology proposed here to analyze the in vivo dynamics of transcription may be applicable at a genome-wide scale and provide valuable insight into the dynamics of prokaryotic genetic networks.


Subjects
Escherichia coli/physiology , Biological Models , RNA/biosynthesis , Transcriptional Activation/physiology , Arabinose/genetics , Arabinose/metabolism , DNA Primers/genetics , Kinetics , Lac Operon/genetics , Genetic Promoter Regions/genetics , Systems Biology
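
The claim in the abstract above, that two sequential rate-limiting steps give less variable intervals than a single exponential step, can be illustrated with a small calculation. The sketch below assumes that each step is independent and exponentially distributed; the abstract states this only for the single-step comparison, so the two-step form is an assumption made here. Under that assumption the squared coefficient of variation (variance divided by squared mean) of the total interval is the sum of squared step means over the squared sum of step means. The function names and the step durations are hypothetical.

```python
import random
from statistics import mean, pvariance


def cv2_sequential(step_means):
    """Analytic CV^2 of a sum of independent exponential steps."""
    total = sum(step_means)
    return sum(m * m for m in step_means) / (total * total)


def simulate_intervals(step_means, n, seed=0):
    """Draw n intervals, each the sum of one exponential draw per step."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0 / m) for m in step_means) for _ in range(n)]


def cv2(samples):
    """Empirical squared coefficient of variation."""
    m = mean(samples)
    return pvariance(samples) / (m * m)


# Hypothetical step durations in seconds (illustration only).
one_step = [600.0]
two_steps = [300.0, 300.0]

print(cv2_sequential(one_step))   # 1.0
print(cv2_sequential(two_steps))  # 0.5
print(cv2(simulate_intervals(two_steps, 20000)))  # close to 0.5
```

With the same mean interval, two equal steps halve the squared coefficient of variation relative to one step; unequal steps fall between 0.5 and 1.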
10.
Phys Rev E Stat Nonlin Soft Matter Phys ; 81(1 Pt 1): 011912, 2010 Jan.
Article in English | MEDLINE | ID: mdl-20365404

ABSTRACT

Little is known about the biological mechanisms that shape the distribution of intervals between the completion of RNA molecules (T(p)RNA), and thus transcriptional noise. We characterize numerically and analytically how the promoter open complex delay (τ_P) and the transcription initiation rate (k_t) shape T(p)RNA. From this, we assess the noise and mean of transcript levels and show that these can be tuned both independently and simultaneously by τ_P and k_t. Finally, we characterize how τ_P affects bursting in RNA production and show that the τ_P measured for a lac promoter best fits independent measurements of the burst distribution of the same promoter. Since τ_P affects noise in gene expression, and given that it is sequence dependent, it is likely to be evolvable.


Subjects
Gene Expression/physiology , Genetic Models , Genetic Promoter Regions , Computer Simulation , RNA Precursors/metabolism , Transfer RNA/metabolism , Stochastic Processes , Time Factors , Genetic Transcription/physiology