RESUMO
: BioNetGen is an open-source software package for rule-based modeling of complex biochemical systems. Version 2.2 of the software introduces numerous new features for both model specification and simulation. Here, we report on these additions, discussing how they facilitate the construction, simulation and analysis of larger and more complex models than previously possible. AVAILABILITY AND IMPLEMENTATION: Stable BioNetGen releases (Linux, Mac OS/X and Windows), with documentation, are available at http://bionetgen.org Source code is available at http://github.com/RuleWorld/bionetgen CONTACT: bionetgen.help@gmail.comSupplementary information: Supplementary data are available at Bioinformatics online.
Assuntos
Bioquímica , Software , Humanos , Modelos Teóricos , Linguagens de ProgramaçãoRESUMO
Detailed modeling and simulation of biochemical systems is complicated by the problem of combinatorial complexity, an explosion in the number of species and reactions due to myriad protein-protein interactions and post-translational modifications. Rule-based modeling overcomes this problem by representing molecules as structured objects and encoding their interactions as pattern-based rules. This greatly simplifies the process of model specification, avoiding the tedious and error prone task of manually enumerating all species and reactions that can potentially exist in a system. From a simulation perspective, rule-based models can be expanded algorithmically into fully-enumerated reaction networks and simulated using a variety of network-based simulation methods, such as ordinary differential equations or Gillespie's algorithm, provided that the network is not exceedingly large. Alternatively, rule-based models can be simulated directly using particle-based kinetic Monte Carlo methods. This "network-free" approach produces exact stochastic trajectories with a computational cost that is independent of network size. However, memory and run time costs increase with the number of particles, limiting the size of system that can be feasibly simulated. Here, we present a hybrid particle/population simulation method that combines the best attributes of both the network-based and network-free approaches. The method takes as input a rule-based model and a user-specified subset of species to treat as population variables rather than as particles. The model is then transformed by a process of "partial network expansion" into a dynamically equivalent form that can be simulated using a population-adapted network-free simulator. The transformation method has been implemented within the open-source rule-based modeling platform BioNetGen, and resulting hybrid models can be simulated using the particle-based simulator NFsim. Performance tests show that significant memory savings can be achieved using the new approach and a monetary cost analysis provides a practical measure of its utility.
Assuntos
Modelos Biológicos , Modelos Químicos , Método de Monte CarloRESUMO
BACKGROUND: Staphylococcus aureus is associated with a spectrum of symbiotic relationships with its human host from carriage to sepsis and is frequently associated with nosocomial and community-acquired infections, thus the differential gene content among strains is of interest. RESULTS: We sequenced three clinical strains and combined these data with 13 publically available human isolates and one bovine strain for comparative genomic analyses. All genomes were annotated using RAST, and then their gene similarities and differences were delineated. Gene clustering yielded 3,155 orthologous gene clusters, of which 2,266 were core, 755 were distributed, and 134 were unique. Individual genomes contained between 2,524 and 2,648 genes. Gene-content comparisons among all possible S. aureus strain pairs (n = 136) revealed a mean difference of 296 genes and a maximum difference of 476 genes. We developed a revised version of our finite supragenome model to estimate the size of the S. aureus supragenome (3,221 genes, with 2,245 core genes), and compared it with those of Haemophilus influenzae and Streptococcus pneumoniae. There was excellent agreement between RAST's annotations and our CDS clustering procedure providing for high fidelity metabolomic subsystem analyses to extend our comparative genomic characterization of these strains. CONCLUSIONS: Using a multi-species comparative supragenomic analysis enabled by an improved version of our finite supragenome model we provide data and an interpretation explaining the relatively larger core genome of S. aureus compared to other opportunistic nasopharyngeal pathogens. In addition, we provide independent validation for the efficiency and effectiveness of our orthologous gene clustering algorithm.
Assuntos
Genoma Bacteriano , Haemophilus influenzae/genética , Staphylococcus aureus/genética , Streptococcus pneumoniae/genética , Algoritmos , Animais , Bovinos , Regulação Bacteriana da Expressão Gênica , Haemophilus influenzae/isolamento & purificação , Humanos , Modelos Genéticos , Família Multigênica , Fases de Leitura Aberta , Infecções Estafilocócicas/microbiologia , Staphylococcus aureus/isolamento & purificação , Streptococcus pneumoniae/isolamento & purificaçãoRESUMO
Models of biological systems often have many unknown parameters that must be determined in order for model behavior to match experimental observations. Commonly-used methods for parameter estimation that return point estimates of the best-fit parameters are insufficient when models are high dimensional and under-constrained. As a result, Bayesian methods, which treat model parameters as random variables and attempt to estimate their probability distributions given data, have become popular in systems biology. Bayesian parameter estimation often relies on Markov Chain Monte Carlo (MCMC) methods to sample model parameter distributions, but the slow convergence of MCMC sampling can be a major bottleneck. One approach to improving performance is parallel tempering (PT), a physics-based method that uses swapping between multiple Markov chains run in parallel at different temperatures to accelerate sampling. The temperature of a Markov chain determines the probability of accepting an unfavorable move, so swapping with higher temperatures chains enables the sampling chain to escape from local minima. In this work we compared the MCMC performance of PT and the commonly-used Metropolis-Hastings (MH) algorithm on six biological models of varying complexity. We found that for simpler models PT accelerated convergence and sampling, and that for more complex models, PT often converged in cases MH became trapped in non-optimal local minima. We also developed a freely-available MATLAB package for Bayesian parameter estimation called PTEMPEST (http://github.com/RuleWorld/ptempest), which is closely integrated with the popular BioNetGen software for rule-based modeling of biological systems.
RESUMO
The distributed-genome hypothesis (DGH) states that pathogenic bacteria possess a supragenome that is much larger than the genome of any single bacterium and that these pathogens utilize genetic recombination and a large, noncore set of genes as a means of diversity generation. We sequenced the genomes of eight nasopharyngeal strains of Streptococcus pneumoniae isolated from pediatric patients with upper respiratory symptoms and performed quantitative genomic analyses among these and nine publicly available pneumococcal strains. Coding sequences from all strains were grouped into 3,170 orthologous gene clusters, of which 1,454 (46%) were conserved among all 17 strains. The majority of the gene clusters, 1,716 (54%), were not found in all strains. Genic differences per strain pair ranged from 35 to 629 orthologous clusters, with each strain's genome containing between 21 and 32% noncore genes. The distribution of the orthologous clusters per genome for the 17 strains was entered into the finite-supragenome model, which predicted that (i) the S. pneumoniae supragenome contains more than 5,000 orthologous clusters and (ii) 99% of the orthologous clusters ( approximately 3,000) that are represented in the S. pneumoniae population at frequencies of >or=0.1 can be identified if 33 representative genomes are sequenced. These extensive genic diversity data support the DGH and provide a basis for understanding the great differences in clinical phenotype associated with various pneumococcal strains. When these findings are taken together with previous studies that demonstrated the presence of a supragenome for Streptococcus agalactiae and Haemophilus influenzae, it appears that the possession of a distributed genome is a common host interaction strategy.
Assuntos
Genoma Bacteriano , Genômica , Streptococcus pneumoniae/classificação , Streptococcus pneumoniae/genética , Regulação Bacteriana da Expressão Gênica , Família Multigênica , FilogeniaRESUMO
The body responds to endotoxins by triggering the acute inflammatory response system to eliminate the threat posed by gram-negative bacteria (endotoxin) and restore health. However, an uncontrolled inflammatory response can lead to tissue damage, organ failure, and ultimately death; this is clinically known as sepsis. Mathematical models of acute inflammatory disease have the potential to guide treatment decisions in critically ill patients. In this work, an 8-state (8-D) differential equation model of the acute inflammatory response system to endotoxin challenge was developed. Endotoxin challenges at 3 and 12 mg/kg were administered to rats, and dynamic cytokine data for interleukin (IL)-6, tumor necrosis factor (TNF), and IL-10 were obtained and used to calibrate the model. Evaluation of competing model structures was performed by analyzing model predictions at 3, 6, and 12 mg/kg endotoxin challenges with respect to experimental data from rats. Subsequently, a model predictive control (MPC) algorithm was synthesized to control a hemoadsorption (HA) device, a blood purification treatment for acute inflammation. A particle filter (PF) algorithm was implemented to estimate the full state vector of the endotoxemic rat based on time series cytokine measurements. Treatment simulations show that: (i) the apparent primary mechanism of HA efficacy is white blood cell (WBC) capture, with cytokine capture a secondary benefit; and (ii) differential filtering of cytokines and WBC does not provide substantial improvement in treatment outcomes vs. existing HA devices.
RESUMO
BACKGROUND: The distributed genome hypothesis (DGH) posits that chronic bacterial pathogens utilize polyclonal infection and reassortment of genic characters to ensure persistence in the face of adaptive host defenses. Studies based on random sequencing of multiple strain libraries suggested that free-living bacterial species possess a supragenome that is much larger than the genome of any single bacterium. RESULTS: We derived high depth genomic coverage of nine nontypeable Haemophilus influenzae (NTHi) clinical isolates, bringing to 13 the number of sequenced NTHi genomes. Clustering identified 2,786 genes, of which 1,461 were common to all strains, with each of the remaining 1,328 found in a subset of strains; the number of clusters ranged from 1,686 to 1,878 per strain. Genic differences of between 96 and 585 were identified per strain pair. Comparisons of each of the NTHi strains with the Rd strain revealed between 107 and 158 insertions and 100 and 213 deletions per genome. The mean insertion and deletion sizes were 1,356 and 1,020 base-pairs, respectively, with mean maximum insertions and deletions of 26,977 and 37,299 base-pairs. This relatively large number of small rearrangements among strains is in keeping with what is known about the transformation mechanisms in this naturally competent pathogen. CONCLUSION: A finite supragenome model was developed to explain the distribution of genes among strains. The model predicts that the NTHi supragenome contains between 4,425 and 6,052 genes with most uncertainty regarding the number of rare genes, those that have a frequency of <0.1 among strains; collectively, these results support the DGH.