Results 1 - 20 of 32
1.
Bioinformatics ; 39(5), 2023 05 04.
Article in English | MEDLINE | ID: mdl-37099704

ABSTRACT

MOTIVATION: The human microbiome, which is linked to various diseases by growing evidence, has a profound impact on human health. Since changes in the composition of the microbiome across time are associated with disease and clinical outcomes, microbiome analysis should be performed in a longitudinal study. However, due to limited sample sizes and differing numbers of timepoints for different subjects, a significant amount of data cannot be utilized, directly affecting the quality of analysis results. Deep generative models have been proposed to address this lack of data issue. Specifically, a generative adversarial network (GAN) has been successfully utilized for data augmentation to improve prediction tasks. Recent studies have also shown improved performance of GAN-based models for missing value imputation in a multivariate time series dataset compared with traditional imputation methods. RESULTS: This work proposes DeepMicroGen, a bidirectional recurrent neural network-based GAN model, trained on the temporal relationship between the observations, to impute the missing microbiome samples in longitudinal studies. DeepMicroGen outperforms standard baseline imputation methods, showing the lowest mean absolute error for both simulated and real datasets. Finally, the proposed model improved the predicted clinical outcome for allergies, by providing imputation for an incomplete longitudinal dataset used to train the classifier. AVAILABILITY AND IMPLEMENTATION: DeepMicroGen is publicly available at https://github.com/joungmin-choi/DeepMicroGen.


Subject(s)
Microbiota , Humans , Longitudinal Studies , Neural Networks, Computer , Sample Size , Time Factors
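The abstract above compares imputation methods by mean absolute error. As a minimal sketch of that metric only (not of DeepMicroGen itself), the error can be computed over the held-out timepoints alone. The function name `masked_mae`, the toy abundance values, and the mask layout are all assumptions made for illustration.

```python
def masked_mae(truth, imputed, mask):
    """Mean absolute error over entries where mask is True (held-out values)."""
    errors = [
        abs(t - p)
        for truth_row, imputed_row, mask_row in zip(truth, imputed, mask)
        for t, p, held_out in zip(truth_row, imputed_row, mask_row)
        if held_out
    ]
    return sum(errors) / len(errors)


# Hypothetical relative abundances: rows are timepoints, columns are taxa.
truth = [[0.2, 0.5, 0.3], [0.1, 0.6, 0.3]]
imputed = [[0.2, 0.5, 0.3], [0.2, 0.4, 0.4]]
# Only the second timepoint was hidden from the imputation method.
mask = [[False, False, False], [True, True, True]]

mae = masked_mae(truth, imputed, mask)
```

Restricting the average to masked entries keeps observed values, which any method reproduces exactly, from diluting the score.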
2.
PLoS Comput Biol ; 18(1): e1009847, 2022 01.
Article in English | MEDLINE | ID: mdl-35089921

ABSTRACT

The cell cycle of Caulobacter crescentus involves polar morphogenesis and an asymmetric cell division driven by precise interactions and regulations of proteins, which makes Caulobacter an ideal model organism for investigating bacterial cell development and differentiation. The abundance of molecular data accumulated on Caulobacter motivates systems biologists to analyze the complex regulatory network of the cell cycle via quantitative modeling. In this paper, we propose a comprehensive model to accurately characterize the underlying mechanisms of cell cycle regulation based on the study of: a) chromosome replication and methylation; b) interactive pathways of five master regulatory proteins including DnaA, GcrA, CcrM, CtrA, and SciP, as well as novel consideration of their corresponding mRNAs; c) cell cycle-dependent proteolysis of CtrA through hierarchical protease complexes. The temporal dynamics of our simulation results are able to closely replicate an extensive set of experimental observations and capture the main phenotype of seven mutant strains of Caulobacter crescentus. Collectively, the proposed model can be used to predict phenotypes of other mutant cases, especially for nonviable strains which are hard to cultivate and observe. Moreover, the module of cyclic proteolysis is an efficient tool to study the metabolism of proteins with similar mechanisms.


Subject(s)
Caulobacter crescentus , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Caulobacter crescentus/genetics , Caulobacter crescentus/metabolism , Cell Cycle/physiology , DNA-Binding Proteins/metabolism , Gene Expression Regulation, Bacterial , Proteolysis
3.
J Chem Theory Comput ; 16(7): 4669-4684, 2020 Jul 14.
Article in English | MEDLINE | ID: mdl-32450041

ABSTRACT

Accuracy of protein-ligand binding free energy calculations utilizing implicit solvent models is critically affected by parameters of the underlying dielectric boundary, specifically, the atomic and water probe radii. Here, a global multidimensional optimization pipeline is developed to find optimal atomic radii specifically for protein-ligand binding calculations in implicit solvent. The computational pipeline has these three key components: (1) a massively parallel implementation of a deterministic global optimization algorithm (VTDIRECT95), (2) an accurate yet reasonably fast generalized Born implicit solvent model (GBNSR6), and (3) a novel robustness metric that helps distinguish between nearly degenerate local minima via a postprocessing step of the optimization. A graph-based "kT-connectivity" approach to explore and visualize the multidimensional energy landscape is proposed: local minima that can be reached from the global minimum without exceeding a given energy threshold (kT) are considered to be connected. As an illustration of the capabilities of the optimization pipeline, we apply it to find a global optimum in the space of just five radii: four atomic (O, H, N, and C) radii and water probe radius. The optimized radii, ρW = 1.37 Å, ρC = 1.40 Å, ρH = 1.55 Å, ρN = 2.35 Å, and ρO = 1.28 Å, lead to a closer agreement of electrostatic binding free energies with the explicit solvent reference than two commonly used sets of radii previously optimized for small molecules. At the same time, the ability of the optimizer to find the global optimum reveals fundamental limits of the common two-dielectric implicit solvation model: the computed electrostatic binding free energies are still almost 4 kcal/mol away from the explicit solvent reference. The proposed computational approach opens the possibility to further improve the accuracy of practical computational protocols for binding free energy calculations.


Subject(s)
Ligands , Proteins/chemistry , Algorithms , Models, Chemical , Protein Binding , Proteins/metabolism , Solvents/chemistry , Static Electricity , Thermodynamics
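The "kT-connectivity" idea in the abstract above can be read as a graph search: starting at the global minimum, follow only those links whose barrier stays under the energy threshold. The sketch below is one possible reading, not the authors' implementation. It assumes the threshold is measured relative to the global-minimum energy, and the function name `kt_connected`, the minima labels, and all energies are hypothetical.

```python
from collections import deque


def kt_connected(energies, barriers, kT):
    """Return the minima reachable from the global minimum.

    energies: dict mapping minimum label -> energy
    barriers: list of (a, b, barrier_energy) for paths between minima
    kT: allowed rise above the global-minimum energy (assumed convention)
    """
    start = min(energies, key=energies.get)
    limit = energies[start] + kT
    adjacency = {label: [] for label in energies}
    for a, b, barrier in barriers:
        if barrier <= limit:  # keep only links that stay under the threshold
            adjacency[a].append(b)
            adjacency[b].append(a)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the retained links
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical landscape with four local minima.
energies = {"A": 0.0, "B": 0.3, "C": 0.5, "D": 0.4}
barriers = [("A", "B", 0.5), ("B", "C", 0.55), ("C", "D", 1.2)]
connected = kt_connected(energies, barriers, kT=0.6)
```

In this toy landscape the barrier between C and D exceeds the threshold, so D is left out of the connected set even though its own energy is low.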
4.
Methods Mol Biol ; 1945: 119-139, 2019.
Article in English | MEDLINE | ID: mdl-30945244

ABSTRACT

Biologists seek to create increasingly complex molecular regulatory network models. Writing such a model is a creative effort that requires flexible analysis tools and better modeling languages than offered by many of today's biochemical model editors. Our Multistate Model Builder (MSMB) supports multistate models created using different modeling styles that suit the modeler rather than the software. MSMB defines a simple but powerful syntax to describe multistate species. Our syntax reduces the number of reactions needed to encode the model, thereby reducing the cognitive load involved with model creation. MSMB gives extensive feedback during all stages of model creation. Users can activate error notifications, and use these notifications as a guide toward a consistent, syntactically correct model. Any consistent model can be exported to SBML or COPASI formats. We show the effectiveness of MSMB's multistate syntax through realistic models of cell cycle regulation and mRNA transcription. MSMB is an open-source project implemented in Java and it uses the COPASI API. Complete information and the installation package can be found at http://copasi.org/Projects/ .


Subject(s)
Computational Biology/methods , Models, Biological , Software , Systems Biology/methods , Algorithms , Computer Graphics , Computer Simulation , Programming Languages
5.
Article in English | MEDLINE | ID: mdl-29990127

ABSTRACT

Parameter estimation in discrete or continuous deterministic cell cycle models is challenging for several reasons, including the nature of what can be observed, and the accuracy and quantity of those observations. The challenge is even greater for stochastic models, where the number of simulations and amount of empirical data must be even larger to obtain statistically valid parameter estimates. The two main contributions of this work are (1) stochastic model parameter estimation based on directly matching multivariate probability distributions, and (2) a new quasi-Newton algorithm class QNSTOP for stochastic optimization problems. QNSTOP directly uses the random objective function value samples rather than creating ensemble statistics. QNSTOP is used here to directly match empirical and simulated joint probability distributions rather than matching summary statistics. Results are given for a current state-of-the-art stochastic cell cycle model of budding yeast, whose predictions match well some summary statistics and one-dimensional distributions from empirical data, but do not match well the empirical joint distributions. The nature of the mismatch provides insight into the weakness in the stochastic model.


Subject(s)
Cell Cycle/physiology , Saccharomycetales , Systems Biology/methods , Algorithms , Computer Simulation , Models, Biological , Saccharomycetales/cytology , Saccharomycetales/genetics , Saccharomycetales/physiology , Stochastic Processes
6.
BMC Med Genomics ; 11(1): 78, 2018 Sep 10.
Article in English | MEDLINE | ID: mdl-30200981

ABSTRACT

BACKGROUND: CRISPR/CAS9 (epi)genome editing revolutionized the field of gene and cell therapy. Our previous study demonstrated that a rapid and robust reactivation of the HIV latent reservoir by a catalytically-deficient Cas9 (dCas9)-synergistic activation mediator (SAM) via HIV long terminal repeat (LTR)-specific MS2-mediated single guide RNAs (msgRNAs) directly induces cellular suicide without additional immunotherapy. However, potential off-target effects remain a concern for any clinical application of Cas9 genome editing and dCas9 epigenome editing. After dCas9 treatment, potential off-target responses have been analyzed through different strategies such as mRNA sequence analysis and functional screening. In this study, a comprehensive analysis of the host transcriptome including mRNA, lncRNA, and alternative splicing was performed using human cell lines expressing dCas9-SAM and HIV-targeting msgRNAs. RESULTS: The control scrambled msgRNA (LTR_Zero), and two LTR-specific msgRNAs (LTR_L and LTR_O) groups show very similar expression profiles of the whole transcriptome. Among 839 identified lncRNAs, none exhibited significantly different expression in LTR_L vs. LTR_Zero group. In LTR_O group, only TERC and scaRNA2 lncRNAs were significantly decreased. Among 142,791 mRNAs, four genes were differentially expressed in LTR_L vs. LTR_Zero group. There were 21 genes significantly downregulated in LTR_O vs. either LTR_Zero or LTR_L group and one third of them are histone related. The distributions of different types of alternative splicing were very similar either within or between groups. There were no apparent changes in all the lncRNA and mRNA transcripts between the LTR_L and LTR_Zero groups. CONCLUSION: This is an extremely comprehensive study demonstrating the rare off-target effects of the HIV-specific dCas9-SAM system in human cells. This finding is encouraging for the safe application of dCas9-SAM technology to induce target-specific reactivation of latent HIV for an effective "shock-and-kill" strategy.


Subject(s)
CRISPR-Associated Protein 9/metabolism , HIV Long Terminal Repeat/genetics , HIV-1/genetics , HIV-1/physiology , High-Throughput Nucleotide Sequencing , RNA, Long Noncoding/genetics , Virus Activation/genetics , Alternative Splicing , Gene Expression Profiling , HeLa Cells , Humans , Polymorphism, Single Nucleotide , RNA, Messenger/genetics , Sequence Analysis, RNA
7.
Simulation ; 94(11): 993-1008, 2018 Nov.
Article in English | MEDLINE | ID: mdl-31303682

ABSTRACT

The growing size and complexity of molecular network models makes them increasingly difficult to construct and understand. Modifying a model that consists of tens of reactions is no easy task. Attempting the same on a model containing hundreds of reactions can seem nearly impossible. We present the JigCell Model Connector, a software tool that supports large-scale molecular network modeling. Our approach to developing large models is to combine smaller models, making the result easier to comprehend. At the base, the smaller models (called modules) are defined by small collections of reactions. Modules connect together to form larger modules through clearly defined interfaces, called ports. In this work, we enhance the port concept by defining three types of ports. An output port is linked to an internal component that will send a value. An input port is linked to an internal component that will receive a value. An equivalence port is linked to an internal component that will both receive and send values. Not all modules connect together in the same way; therefore, multiple connection options need to exist.

8.
Sci Rep ; 7(1): 14106, 2017 10 26.
Article in English | MEDLINE | ID: mdl-29074871

ABSTRACT

Storing biologically equivalent indels as distinct entries in databases causes data redundancy, and misleads downstream analysis. It is thus desirable to have a unified system for identifying and representing equivalent indels. Moreover, a unified system is also desirable to compare the indel calling results produced by different tools. This paper describes UPS-indel, a utility tool that creates a universal positioning system for indels so that equivalent indels can be uniquely determined by their coordinates in the new system, which also can be used to compare different indel calling results. UPS-indel identifies 15% redundant indels in dbSNP, 29% in COSMIC coding, and 13% in COSMIC noncoding datasets across all human chromosomes, higher than previously reported. Comparing the performance of UPS-indel with existing variant normalization tools vt normalize, BCFtools, and GATK LeftAlignAndTrimVariants shows that UPS-indel is able to identify 456,352 more redundant indels in dbSNP; 2,118 more in COSMIC coding, and 553 more in COSMIC noncoding indel dataset in addition to the ones reported jointly by these tools. Moreover, comparing UPS-indel to state-of-the-art approaches for indel call set comparison demonstrates its clear superiority in finding common indels among call sets. UPS-indel is theoretically proven to find all equivalent indels, and thus exhaustive.
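To make "biologically equivalent indels" concrete, the sketch below shows three differently recorded insertions that yield the same mutated sequence, followed by the common left-alignment convention that maps them to one representative. This is not the UPS-indel algorithm, which the abstract does not detail. The reference string, the 0-based coordinates, and the function names are assumptions chosen for illustration.

```python
def apply_insertion(reference, position, inserted):
    """Insert `inserted` before 0-based index `position` of `reference`."""
    return reference[:position] + inserted + reference[position:]


def left_align_insertion(reference, position, inserted):
    """Shift an insertion left while the mutated sequence stays unchanged."""
    while position > 0 and reference[position - 1] == inserted[-1]:
        # Rotate the inserted string by one base and move one step left.
        inserted = reference[position - 1] + inserted[:-1]
        position -= 1
    return position, inserted


reference = "GATCACACAT"  # hypothetical reference containing a CA repeat

# Three records that look different but describe the same event.
records = [(3, "CA"), (5, "CA"), (4, "AC")]
mutated = [apply_insertion(reference, pos, ins) for pos, ins in records]
normalized = {left_align_insertion(reference, pos, ins) for pos, ins in records}
```

Storing all three records as distinct database entries is the redundancy the abstract describes. After left alignment they collapse to a single coordinate and inserted string.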

9.
BMC Syst Biol ; 11(1): 30, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28241833

ABSTRACT

BACKGROUND: Parameter estimation in systems biology is typically done by enforcing experimental observations through an objective function as the parameter space of a model is explored by numerical simulations. Past studies have shown that one usually finds a set of "feasible" parameter vectors that fit the available experimental data equally well, and that these alternative vectors can make different predictions under novel experimental conditions. In this study, we characterize the feasible region of a complex model of the budding yeast cell cycle under a large set of discrete experimental constraints in order to test whether the statistical features of relative protein abundance predictions are influenced by the topology of the cell cycle regulatory network. RESULTS: Using differential evolution, we generate an ensemble of feasible parameter vectors that reproduce the phenotypes (viable or inviable) of wild-type yeast cells and 110 mutant strains. We use this ensemble to predict the phenotypes of 129 mutant strains for which experimental data is not available. We identify 86 novel mutants that are predicted to be viable and then rank the cell cycle proteins in terms of their contributions to cumulative variability of relative protein abundance predictions. Proteins involved in "regulation of cell size" and "regulation of G1/S transition" contribute most to predictive variability, whereas proteins involved in "positive regulation of transcription involved in exit from mitosis," "mitotic spindle assembly checkpoint" and "negative regulation of cyclin-dependent protein kinase by cyclin degradation" contribute the least. These results suggest that the statistics of these predictions may be generating patterns specific to individual network modules (START, S/G2/M, and EXIT). To test this hypothesis, we develop random forest models for predicting the network modules of cell cycle regulators using relative abundance statistics as model inputs. Predictive performance is assessed by the areas under receiver operating characteristics curves (AUC). Our models generate an AUC range of 0.83-0.87 as opposed to randomized models with AUC values around 0.50. CONCLUSIONS: By using differential evolution and random forest modeling, we show that the model prediction statistics generate distinct network module-specific patterns within the cell cycle network.


Subject(s)
Cell Cycle Proteins/metabolism , Cell Cycle , Models, Biological , Cell Cycle Proteins/genetics , Mutation , Phenotype , Saccharomycetales/cytology , Saccharomycetales/genetics , Saccharomycetales/metabolism
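The abstract above reports AUC values of 0.83-0.87 against roughly 0.50 for randomized models. As a reminder of what that number measures, the sketch below computes a binary AUC as the fraction of positive/negative pairs that the classifier ranks correctly, counting ties as half. The scores and labels are made up, and the study's random forest models are not reproduced here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for membership in one network module.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
value = auc(scores, labels)
```

A classifier that scores items at random wins about half of the pairs, which is why 0.50 serves as the baseline in the abstract.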
10.
BMC Syst Biol ; 9: 95, 2015 Dec 24.
Article in English | MEDLINE | ID: mdl-26704692

ABSTRACT

BACKGROUND: Most biomolecular reaction modeling tools allow users to build models with a single list of parameter values. However, a common scenario involves different parameterizations of the model to account for the results of related experiments, for example, to define the phenotypes for a variety of mutations (gene knockout, over expression, etc.) of a specific biochemical network. This scenario is not well supported by existing model editors, forcing the user to manually generate, store, and maintain many variations of the same model. RESULTS: We developed an extension to our modeling editor called the JigCell Run Manager (JC-RM). JC-RM allows the modeler to define a hierarchy of parameter values, simulations, and plot settings, and to save them together with the initial model. JC-RM supports generation of simulation plots, as well as export to COPASI and SBML (L3V1) for further analysis. CONCLUSIONS: Developing a model with its initial list of parameter values is just the first step in modeling a biological system. Models are often parameterized in many different ways to account for mutations of the organism and/or for sets of related experiments performed on the organism. JC-RM offers two critical features: it supports the everyday management of a large model, complete with its parameterizations, and it facilitates sharing this information before and after publication. JC-RM allows the modeler to define a hierarchy of parameter values, simulation, and plot settings, and to maintain a relationship between this hierarchy and the initial model. JC-RM is implemented in Java and uses the COPASI API. JC-RM runs on all major operating systems, with minimal system requirements. Installers, source code, user manual, and examples can be found at the COPASI website ( http://www.copasi.org/Projects ).


Subject(s)
Models, Biological , Software , Systems Biology/methods , Computer Graphics , Time Factors
11.
BMC Bioinformatics ; 16: 351, 2015 Oct 30.
Article in English | MEDLINE | ID: mdl-26518340

ABSTRACT

BACKGROUND: Numerous tools have been developed to predict the fitness effects (i.e., neutral, deleterious, or beneficial) of genetic variants on corresponding proteins. However, prediction in terms of whether a variant causes the variant-bearing protein to lose its original function or gain a new function is also needed for better understanding of how the variant contributes to disease/cancer. To address this problem, the present work introduces and computationally defines four types of functional outcome of a variant: gain, loss, switch, and conservation of function. The deployment of multiple hidden Markov models is proposed to computationally classify mutations by the four functional impact types. RESULTS: The functional outcome is predicted for over a hundred thyroid stimulating hormone receptor (TSHR) mutations, as well as cancer related mutations in oncogenes or tumor suppressor genes. The results show that the proposed computational method is effective in fine-grained prediction of the functional outcome of a mutation, and can be used to help elucidate the molecular mechanism of disease/cancer causing mutations. The program is freely available at http://bioinformatics.cs.vt.edu/zhanglab/HMMvar/download.php. CONCLUSION: This work is the first to computationally define and predict the functional impact of mutations as loss, switch, gain, or conservation of function. These fine-grained predictions can be especially useful for identifying mutations that cause or are linked to cancer.


Subject(s)
Computational Biology/methods , Genetic Variation , Humans , Internet , Markov Chains , Mutation , Neoplasms/genetics , Neoplasms/pathology , Receptors, Thyrotropin/genetics , User-Computer Interface
12.
Hum Genomics ; 9: 18, 2015 Jul 30.
Article in English | MEDLINE | ID: mdl-26223264

ABSTRACT

BACKGROUND: Many genetic variants have been identified in the human genome. The functional effects of a single variant have been intensively studied. However, the joint effects of multiple variants in the same genes have been largely ignored due to their complexity or lack of data. This paper uses HMMvar, a hidden Markov model based approach, to investigate the combined effect of multiple variants from the 1000 Genomes Project. Two tumor suppressor genes, TP53 and phosphatase and tensin homolog (PTEN), are also studied for the joint effect of compensatory indel variants. RESULTS: Results show that there are cases where the joint effect of having multiple variants in the same genes is significantly different from that of a single variant. The deleterious effect of a single indel variant can be alleviated by its compensatory indels in TP53 and PTEN. Compound mutations in two genes, β-MHC and MyBP-C, leading to more severe cardiovascular disease compared to single mutations, are also validated. CONCLUSIONS: This paper extends the functionality of HMMvar, a tool for assigning a quantitative score to a variant, to measure not only the deleterious effect of a single variant but also the joint effect of multiple variants. HMMvar is the first tool that can predict the functional effects of both single and general multiple variations on proteins. The precomputed scores for multiple variants from the 1000 Genomes Project and the HMMvar package are available at https://bioinformatics.cs.vt.edu/zhanglab/HMMvar/.


Subject(s)
Cardiovascular Diseases/genetics , Genetic Variation/genetics , PTEN Phosphohydrolase/genetics , Tumor Suppressor Protein p53/genetics , Cardiovascular Diseases/pathology , Genome, Human , Human Genome Project , Humans , INDEL Mutation/genetics , Markov Chains , Polymorphism, Single Nucleotide
13.
PLoS One ; 9(5): e96726, 2014.
Article in English | MEDLINE | ID: mdl-24816736

ABSTRACT

In this study, we focus on a recent stochastic budding yeast cell cycle model. First, we estimate the model parameters using extensive data sets: phenotypes of 110 genetic strains, single cell statistics of wild type and cln3 strains. Optimization of stochastic model parameters is achieved by an automated algorithm we recently used for a deterministic cell cycle model. Next, in order to test the predictive ability of the stochastic model, we focus on a recent experimental study in which forced periodic expression of CLN2 cyclin (driven by MET3 promoter in cln3 background) has been used to synchronize budding yeast cell colonies. We demonstrate that the model correctly predicts the experimentally observed synchronization levels and cell cycle statistics of mother and daughter cells under various experimental conditions (numerical data that is not enforced in parameter optimization), in addition to correctly predicting the qualitative changes in size control due to forced CLN2 expression. Our model also generates a novel prediction: under frequent CLN2 expression pulses, G1 phase duration is bimodal among small-born cells. These cells originate from daughters with extended budded periods due to size control during the budded period. This novel prediction and the experimental trends captured by the model illustrate the interplay between cell cycle dynamics, synchronization of cell colonies, and size control in budding yeast.


Subject(s)
Cell Cycle , Cyclins/genetics , Gene Expression Regulation, Fungal , Models, Biological , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Cell Size , Mutation , Stochastic Processes
14.
BMC Syst Biol ; 8: 42, 2014 Apr 04.
Article in English | MEDLINE | ID: mdl-24708852

ABSTRACT

BACKGROUND: Building models of molecular regulatory networks is challenging not just because of the intrinsic difficulty of describing complex biological processes. Writing a model is a creative effort that calls for more flexibility and interactive support than offered by many of today's biochemical model editors. Our model editor MSMB - Multistate Model Builder - supports multistate models created using different modeling styles. RESULTS: MSMB provides two separate advances on existing network model editors. (1) A simple but powerful syntax is used to describe multistate species. This reduces the number of reactions needed to represent certain molecular systems, thereby reducing the complexity of model creation. (2) Extensive feedback is given during all stages of the model creation process on the existing state of the model. Users may activate error notifications of varying stringency on the fly, and use these messages as a guide toward a consistent, syntactically correct model. MSMB default values and behavior during model manipulation (e.g., when renaming or deleting an element) can be adapted to suit the modeler, thus supporting creativity rather than interfering with it. MSMB's internal model representation allows saving a model with errors and inconsistencies (e.g., an undefined function argument; a syntactically malformed reaction). A consistent model can be exported to SBML or COPASI formats. We show the effectiveness of MSMB's multistate syntax through models of the cell cycle and mRNA transcription. CONCLUSIONS: Using multistate reactions reduces the number of reactions needed to encode many biochemical network models. This reduces the cognitive load for a given model, thereby making it easier for modelers to build more complex models. The many interactive editing support features provided by MSMB make it easier for modelers to create syntactically valid models, thus speeding model creation. Complete information and the installation package can be found at http://www.copasi.org/SoftwareProjects. MSMB is based on Java and the COPASI API.


Subject(s)
Models, Biological , Software , Algorithms , Binding Sites , Phosphorylation , Protein Biosynthesis , RNA, Messenger/genetics , Systems Biology , User-Computer Interface
15.
BMC Bioinformatics ; 15: 5, 2014 Jan 09.
Article in English | MEDLINE | ID: mdl-24405700

ABSTRACT

BACKGROUND: With the development of sequencing technologies, more and more sequence variants are available for investigation. Different classes of variants in the human genome have been identified, including single nucleotide substitutions, insertion and deletion, and large structural variations such as duplications and deletions. Insertion and deletion (indel) variants comprise a major proportion of human genetic variation. However, little is known about their effects on humans. The absence of understanding is largely due to the lack of both biological data and computational resources. RESULTS: This paper presents a new indel functional prediction method HMMvar based on HMM profiles, which capture the conservation information in sequences. The results demonstrate that a scoring strategy based on HMM profiles can achieve good performance in identifying deleterious or neutral variants for different data sets, and can predict the protein functional effects of both single and multiple mutations. CONCLUSIONS: This paper proposed a quantitative prediction method, HMMvar, to predict the effect of genetic variation using hidden Markov models. The HMM based pipeline program implementing the method HMMvar is freely available at https://bioinformatics.cs.vt.edu/zhanglab/hmm.


Subject(s)
Genetic Variation , Genome, Human/genetics , INDEL Mutation/genetics , INDEL Mutation/physiology , Computational Biology/methods , Genome, Human/physiology , Humans , Markov Chains , Models, Genetic , Models, Statistical , Proteins/genetics , Proteins/metabolism , Proteins/physiology , ROC Curve
16.
BMC Syst Biol ; 7: 53, 2013 Jun 28.
Article in English | MEDLINE | ID: mdl-23809412

ABSTRACT

BACKGROUND: Parameter estimation from experimental data is critical for mathematical modeling of protein regulatory networks. For realistic networks with dozens of species and reactions, parameter estimation is an especially challenging task. In this study, we present an approach for parameter estimation that is effective in fitting a model of the budding yeast cell cycle (comprising 26 nonlinear ordinary differential equations containing 126 rate constants) to the experimentally observed phenotypes (viable or inviable) of 119 genetic strains carrying mutations of cell cycle genes. RESULTS: Starting from an initial guess of the parameter values, which correctly captures the phenotypes of only 72 genetic strains, our parameter estimation algorithm quickly improves the success rate of the model to 105-111 of the 119 strains. This success rate is comparable to the best values achieved by a skilled modeler manually choosing parameters over many weeks. The algorithm combines two search and optimization strategies. First, we use Latin hypercube sampling to explore a region surrounding the initial guess. From these samples, we choose ∼20 different sets of parameter values that correctly capture wild type viability. These sets form the starting generation of differential evolution that selects new parameter values that perform better in terms of their success rate in capturing phenotypes. In addition to producing highly successful combinations of parameter values, we analyze the results to determine the parameters that are most critical for matching experimental outcomes and the most competitive strains whose correct outcome with a given parameter vector forces numerous other strains to have incorrect outcomes. These "most critical parameters" and "most competitive strains" provide biological insights into the model. Conversely, the "least critical parameters" and "least competitive strains" suggest ways to reduce the computational complexity of the optimization. CONCLUSIONS: Our approach proves to be a useful tool to help systems biologists fit complex dynamical models to large experimental datasets. In the process of fitting the model to the data, the tool identifies suggestive correlations among aspects of the model and the data.


Subject(s)
Cell Cycle , Models, Biological , Saccharomycetales/cytology , Algorithms , Phenotype , Phosphorylation , Saccharomycetales/metabolism , Time Factors
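The two-stage search in the abstract above (Latin hypercube sampling to seed a population, then differential evolution to improve it) can be sketched on a toy problem. Everything here is an assumption for illustration: the scoring function is a stand-in for "number of strains captured", the DE variant is the textbook DE/rand/1/bin scheme rather than the authors' configuration, and the filtering of samples by wild type viability is omitted.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw one point per stratum in each dimension, shuffled independently."""
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        points = [low + (i + rng.random()) * width for i in range(n_samples)]
        rng.shuffle(points)
        columns.append(points)
    return [list(row) for row in zip(*columns)]


def de_step(population, score, bounds, rng, weight=0.8, crossover=0.9):
    """One DE/rand/1/bin generation; a trial replaces its parent only if it scores at least as well."""
    new_population = []
    for index, parent in enumerate(population):
        others = [p for j, p in enumerate(population) if j != index]
        a, b, c = rng.sample(others, 3)
        forced = rng.randrange(len(parent))  # at least one mutated coordinate
        trial = []
        for d, (low, high) in enumerate(bounds):
            if d == forced or rng.random() < crossover:
                value = a[d] + weight * (b[d] - c[d])
                trial.append(min(max(value, low), high))  # clip to bounds
            else:
                trial.append(parent[d])
        new_population.append(trial if score(trial) >= score(parent) else parent)
    return new_population


rng = random.Random(0)
bounds = [(0.0, 1.0), (0.0, 10.0)]  # hypothetical rate-constant ranges
target = [0.3, 7.0]                 # hypothetical best parameters


def toy_score(params):
    """Stand-in for a success rate: higher is better."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))


population = latin_hypercube(20, bounds, rng)
initial_best = max(toy_score(p) for p in population)
for _ in range(50):
    population = de_step(population, toy_score, bounds, rng)
final_best = max(toy_score(p) for p in population)
```

Because each trial vector only replaces a parent it matches or beats, the best score in the population never decreases from one generation to the next.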
17.
BMC Med Genomics ; 6 Suppl 3: S6, 2013.
Article in English | MEDLINE | ID: mdl-24565418

ABSTRACT

BACKGROUND: Insulin secreted by pancreatic islet β-cells is the principal regulating hormone of glucose metabolism and plays a key role in controlling glucose level in blood. Impairment of the pancreatic islet function may cause glucose to accumulate in blood, and result in diabetes mellitus. Recent studies have shown that mitochondrial dysfunction has a strong negative effect on insulin secretion. METHODS: In order to study the cause of dysfunction of pancreatic islets, a multiple cell model containing healthy and unhealthy cells is proposed based on an existing single cell model. A parameter that represents the function of mitochondria is modified for unhealthy cells. A 3-D hexagonal lattice structure is used to model the spatial differences among β-cells in a pancreatic islet. The β-cells in the model are connected through direct electrical connections between neighboring β-cells. RESULTS: The simulation results show that a low ratio of total mitochondrial volume to cytoplasm volume per β-cell is a main reason that some mitochondria lose their function. The results also show that the overall insulin secretion will be seriously disrupted when more than 15% of the β-cells in pancreatic islets become unhealthy. CONCLUSION: Analysis of the model shows that the insulin secretion can be reinstated by increasing the glucokinase level. This new discovery sheds light on antidiabetic medication.


Subject(s)
Algorithms ; Insulin-Secreting Cells/metabolism ; Insulin/metabolism ; Islets of Langerhans/metabolism ; Models, Biological ; Animals ; Cell Membrane/metabolism ; Computer Simulation ; Cytoplasm/metabolism ; Glucokinase/metabolism ; Glucose/metabolism ; Glucose/pharmacology ; Glycolysis ; Humans ; Insulin Secretion ; Insulin-Secreting Cells/drug effects ; Insulin-Secreting Cells/pathology ; Islets of Langerhans/pathology ; Mitochondria/metabolism
18.
BMC Bioinformatics ; 12: 191, 2011 May 23.
Article in English | MEDLINE | ID: mdl-21635719

ABSTRACT

BACKGROUND: The Structural Classification of Proteins (SCOP) database uses a large number of hidden Markov models (HMMs) to represent families and superfamilies composed of proteins that presumably share the same evolutionary origin. However, how the HMMs are related to one another has not been examined before. RESULTS: In this work, taking into account the processes used to build the HMMs, we propose a working hypothesis to examine the relationships between HMMs and the families and superfamilies that they represent. Specifically, we perform an all-against-all HMM comparison using the HHsearch program (similar to BLAST) and construct a network where the nodes are HMMs and the edges connect similar HMMs. We hypothesize that the HMMs in a connected component belong to the same family or superfamily more often than expected under a random network connection model. Results show a pattern consistent with this working hypothesis. Moreover, the HMM network possesses features distinctly different from the previously documented biological networks, exemplified by the exceptionally high clustering coefficient and the large number of connected components. CONCLUSIONS: The current finding may provide guidance in devising computational methods to reduce the degree of overlaps between the HMMs representing the same superfamilies, which may in turn enable more efficient large-scale sequence searches against the database of HMMs.
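
The network construction in this abstract (nodes are HMMs, edges join similar HMMs, followed by counting connected components and measuring clustering) can be sketched with a small pure-Python example. The `hits` list, its score column, and the cutoff of 90 are hypothetical placeholders for thresholded all-against-all comparison results; nothing here reads real HHsearch output.

```python
from itertools import combinations

def build_graph(nodes, hits, cutoff):
    """Undirected graph keeping only pairs whose score reaches the cutoff."""
    adj = {n: set() for n in nodes}
    for a, b, score in hits:
        if a != b and score >= cutoff:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def connected_components(adj):
    """Depth-first search that collects each component as a set of nodes."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        components.append(comp)
    return components

def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

# Hypothetical HMM identifiers and similarity scores (assumed, for illustration).
nodes = ["A", "B", "C", "D", "E", "F", "G"]
hits = [("A", "B", 99.0), ("B", "C", 97.5), ("A", "C", 95.0),
        ("C", "D", 92.0), ("E", "F", 98.0), ("D", "E", 40.0)]

graph = build_graph(nodes, hits, cutoff=90.0)
components = connected_components(graph)
clustering = average_clustering(graph)
print(sorted(len(c) for c in components), clustering)
```

Testing the working hypothesis would additionally require family or superfamily labels for each HMM and a random-network baseline, neither of which is modeled in this sketch.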


Subject(s)
Databases, Protein ; Markov Chains ; Proteins/chemistry ; Proteins/genetics ; Evolution, Molecular ; Multigene Family ; Protein Structure, Secondary
19.
J Chem Phys ; 134(5): 054105, 2011 Feb 07.
Article in English | MEDLINE | ID: mdl-21303090

ABSTRACT

Typical multiscale biochemical models contain fast-scale and slow-scale reactions, where "fast" reactions fire much more frequently than "slow" ones. This feature often causes stiffness in discrete stochastic simulation methods such as Gillespie's algorithm and the Tau-Leaping method, leading to inefficient simulation. This paper proposes a new strategy to automatically detect stiffness and identify species that cause stiffness for the Tau-Leaping method, as well as two stiffness reduction methods. Numerical results on a stiff decaying dimerization model and a heat shock protein regulation model demonstrate the efficiency and accuracy of the proposed methods for multiscale biochemical systems.
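
For readers unfamiliar with the Tau-Leaping method mentioned in this abstract, the following is a minimal sketch of plain explicit tau-leaping on a decaying-dimerization reaction set. It does not implement the paper's stiffness detection or reduction methods. The rate constants, the initial population, the fixed step `tau`, and the step-halving guard against negative populations are all assumptions chosen for illustration.

```python
import math
import random

# Decaying-dimerization reactions (rate constants are illustrative assumptions):
#   R1: S1 -> 0          R2: S1 + S1 -> S2
#   R3: S2 -> S1 + S1    R4: S2 -> S3
RATES = (1.0, 0.002, 0.5, 0.04)
STOICH = ((-1, 0, 0), (-2, 1, 0), (2, -1, 0), (0, -1, 1))

def propensities(x):
    x1, x2, _ = x
    c1, c2, c3, c4 = RATES
    return (c1 * x1, c2 * x1 * (x1 - 1) / 2.0, c3 * x2, c4 * x2)

def poisson(mean, rng):
    """Knuth's multiplication method; adequate only for small means."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def tau_leap(x0, t_end, tau, rng):
    """Explicit tau-leaping; returns final state and the number of R1 firings."""
    x, t, decayed = list(x0), 0.0, 0
    while t < t_end:
        # Each reaction fires a Poisson number of times during the leap.
        fires = [poisson(a * tau, rng) for a in propensities(x)]
        new_x = list(x)
        for count, change in zip(fires, STOICH):
            for s in range(3):
                new_x[s] += count * change[s]
        if min(new_x) < 0:
            tau /= 2.0  # reject the leap; the smaller step is kept afterwards
            continue
        x, t = new_x, t + tau
        decayed += fires[0]
    return x, decayed

rng = random.Random(1)
x0 = (400, 0, 0)
state, decayed = tau_leap(x0, t_end=5.0, tau=0.01, rng=rng)
print(state, decayed)
```

With a fixed step, the fastest reaction dictates how small `tau` must be, which is the inefficiency that stiffness introduces. Every S1 monomer is accounted for, so `state[0] + 2 * state[1] + 2 * state[2] + decayed` equals the initial S1 count.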


Subject(s)
Computer Simulation ; Models, Biological ; Algorithms ; Biochemical Phenomena ; Computer Simulation/economics ; Dimerization ; Heat-Shock Proteins/metabolism ; Models, Chemical ; Stochastic Processes ; Time Factors
20.
Proc Natl Acad Sci U S A ; 107(28): 12511-6, 2010 Jul 13.
Article in English | MEDLINE | ID: mdl-20571120

ABSTRACT

Biological processes such as circadian rhythms, cell division, metabolism, and development occur as ordered sequences of events. The synchronization of these coordinated events is essential for proper cell function, and hence the determination of critical time points in biological processes is an important component of all biological investigations. In particular, such critical time points establish logical ordering constraints on subprocesses, impose prerequisites on temporal regulation and spatial compartmentalization, and situate dynamic reorganization of functional elements in preparation for subsequent stages. Thus, building temporal phenomenological representations of biological processes from genome-wide datasets is relevant in formulating biological hypotheses on how processes are mechanistically regulated, how the regulations vary on an evolutionary scale, and how their inadvertent dysregulation leads to a diseased state or fatality. This paper presents a general framework (GOALIE) to reconstruct temporal models of cellular processes from time-course gene expression data. We mathematically formulate the problem as one of optimally segmenting datasets into a succession of "informative" windows such that time points within a window expose concerted clusters of gene action whereas time points straddling window boundaries constitute points of significant restructuring. We illustrate here how GOALIE successfully brings out the interplay between multiple yeast processes, inferred from combined experimental datasets for the cell cycle and the metabolic cycle.
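
The idea of optimally segmenting a time course into a succession of windows can be illustrated with a generic dynamic program. This sketch is not GOALIE's formulation: it assumes a one-dimensional toy series, a fixed number of windows `k`, and a within-window squared-deviation cost, whereas the abstract describes windows judged by concerted clusters of gene action.

```python
def segment(series, k):
    """Split `series` into k contiguous windows minimising total squared deviation."""
    n = len(series)
    # Prefix sums let each window cost be evaluated in constant time.
    s1, s2 = [0.0], [0.0]
    for x in series:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(i, j):
        """Sum of squared deviations from the mean over series[i:j], with j > i."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[w][j]: minimal cost of covering the first j points with w windows.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for w in range(1, k + 1):
        for j in range(w, n + 1):
            for i in range(w - 1, j):
                candidate = best[w - 1][i] + cost(i, j)
                if candidate < best[w][j]:
                    best[w][j] = candidate
                    back[w][j] = i

    # Walk the back-pointers from the end to recover window boundaries.
    windows, j = [], n
    for w in range(k, 0, -1):
        i = back[w][j]
        windows.append((i, j))
        j = i
    return windows[::-1], best[k][n]

toy_series = [1, 1, 1, 5, 5, 5, 9, 9]  # hypothetical expression summary per time point
windows, total_cost = segment(toy_series, k=3)
print(windows, total_cost)
```

Each window is reported as a half-open index range `(start, end)`; the boundaries between consecutive windows play the role of the restructuring points described in the abstract.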


Subject(s)
Cell Physiological Phenomena ; Biological Phenomena ; Cell Cycle/genetics ; Cell Division ; Cluster Analysis ; Gene Expression ; Saccharomyces cerevisiae/genetics