ABSTRACT
The design of biocatalytic reaction systems is highly complex owing to the dependency of the estimated kinetic parameters on the enzyme, the reaction conditions, and the modeling method. Consequently, reproducibility of enzymatic experiments and reusability of enzymatic data are challenging. We developed the XML-based markup language EnzymeML to enable storage and exchange of enzymatic data such as reaction conditions, the time course of the substrate and the product, kinetic parameters and the kinetic model, thus making enzymatic data findable, accessible, interoperable and reusable (FAIR). The feasibility and usefulness of the EnzymeML toolbox are demonstrated in six scenarios, for which data and metadata of different enzymatic reactions are collected and analyzed. EnzymeML serves as a seamless communication channel between experimental platforms, electronic lab notebooks, tools for modeling of enzyme kinetics, publication platforms and enzymatic reaction databases. EnzymeML is open and transparent, and invites the community to contribute. All documents and code are freely available at https://enzymeml.org.
Subjects
Data Management, Metadata, Reproducibility of Results, Factual Databases, Kinetics
ABSTRACT
Computational models have great potential to accelerate bioscience, bioengineering, and medicine. However, it remains challenging to reproduce and reuse simulations, in part, because the numerous formats and methods for simulating various subsystems and scales remain siloed by different software tools. For example, each tool must be executed through a distinct interface. To help investigators find and use simulation tools, we developed BioSimulators (https://biosimulators.org), a central registry of the capabilities of simulation tools and consistent Python, command-line and containerized interfaces to each version of each tool. The foundation of BioSimulators is standards, such as CellML, SBML, SED-ML and the COMBINE archive format, and validation tools for simulation projects and simulation tools that ensure these standards are used consistently. To help modelers find tools for particular projects, we have also used the registry to develop recommendation services. We anticipate that BioSimulators will help modelers exchange, reproduce, and combine simulations.
Subjects
Computer Simulation, Software, Humans, Bioengineering, Biological Models, Registries, Research Personnel
ABSTRACT
MOTIVATION: COPASI is a biochemical simulator and model analyzer which has found widespread use in academic research, teaching and beyond. One of COPASI's strengths is its graphical user interface, and this is what most users work with. COPASI also provides a command-line tool. So far, however, an intuitive scripting interface that allows the creation and documentation of systems biology workflows has been missing. RESULTS: We have developed CoRC, the COPASI R Connector, an R package which provides a high-level scripting interface for COPASI. It closely mirrors the thought process of a (graphical interface) user and should therefore be very easy to use. This allows for complex workflows to be reproducibly scripted, utilizing COPASI's powerful analytic toolset in combination with R's extensive analysis and package ecosystem. AVAILABILITY AND IMPLEMENTATION: CoRC is a free and open-source R package, available via GitHub at https://jpahle.github.io/CoRC/ under the Artistic-2.0 license. SUPPLEMENTARY INFORMATION: We provide tutorial articles as well as several example scripts on the project's website.
ABSTRACT
Reproducibility and reusability of the results of data-based modeling studies are essential. Yet, there has been, so far, no broadly supported format for the specification of parameter estimation problems in systems biology. Here, we introduce PEtab, a format which facilitates the specification of parameter estimation problems using Systems Biology Markup Language (SBML) models and a set of tab-separated value files describing the observation model and experimental data as well as parameters to be estimated. We have already implemented PEtab support in eight well-established model simulation and parameter estimation toolboxes with hundreds of users in total. We provide a Python library for validation and modification of a PEtab problem and currently 20 example parameter estimation problems based on recent studies.
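As a rough illustration of what a tab-separated table of experimental data can look like, the sketch below reads a small measurement table with Python's standard csv module. It does not use the PEtab Python library. The column names (observableId, simulationConditionId, measurement, time) follow the PEtab measurement table as we understand it and should be checked against the format documentation; all identifiers and values are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical measurement table; identifiers and numbers are invented for illustration.
MEASUREMENTS_TSV = (
    "observableId\tsimulationConditionId\tmeasurement\ttime\n"
    "obs_product\tcond_low\t0.10\t0\n"
    "obs_product\tcond_low\t0.42\t10\n"
    "obs_product\tcond_high\t0.15\t0\n"
    "obs_product\tcond_high\t0.81\t10\n"
)

# Columns assumed to be mandatory in a measurement table (assumption, see lead-in).
REQUIRED_COLUMNS = {"observableId", "simulationConditionId", "measurement", "time"}


def load_measurements(text):
    """Group (time, measurement) pairs by (observable, condition)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    grouped = defaultdict(list)
    for row in reader:
        key = (row["observableId"], row["simulationConditionId"])
        grouped[key].append((float(row["time"]), float(row["measurement"])))
    return dict(grouped)


series = load_measurements(MEASUREMENTS_TSV)
for key, points in series.items():
    print(key, points)
```

Grouping by observable and condition mirrors how a fitting tool would pair each data series with one simulated trajectory.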
Subjects
Programming Languages, Systems Biology/methods, Algorithms, Factual Databases, Biological Models, Statistical Models, Reproducibility of Results
ABSTRACT
Systems biology has experienced dramatic growth in the number, size, and complexity of computational models. To reproduce simulation results and reuse models, researchers must exchange unambiguous model descriptions. We review the latest edition of the Systems Biology Markup Language (SBML), a format designed for this purpose. A community of modelers and software authors developed SBML Level 3 over the past decade. Its modular form consists of a core suited to representing reaction-based models and packages that extend the core with features suited to other model types including constraint-based models, reaction-diffusion models, logical network models, and rule-based models. The format leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models. More recently, the rise of multiscale models of whole cells and organs, and new data sources such as single-cell measurements and live imaging, has precipitated new ways of integrating data with models. We provide our perspectives on the challenges presented by these developments and how SBML Level 3 provides the foundation needed to support this evolution.
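To make the idea of a reaction-based core model concrete, here is a minimal sketch that parses a toy SBML Level 3 Version 1 document with Python's standard XML library and lists its species and reactions. The model (toy_model, species S and P, reaction conversion) is hypothetical, and real projects would more likely use a dedicated SBML library; this is only meant to show the shape of the format.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"

# Hypothetical toy model: one compartment, two species, one reaction S -> P.
MINIMAL_SBML = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" size="1" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S" compartment="cell" initialConcentration="10"
               hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
      <species id="P" compartment="cell" initialConcentration="0"
               hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="S" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="P" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def summarize(sbml_text):
    """Return species ids and, per reaction, (reactant ids, product ids)."""
    ns = {"sbml": SBML_NS}
    root = ET.fromstring(sbml_text.strip())
    species = [s.get("id") for s in root.findall(".//sbml:listOfSpecies/sbml:species", ns)]
    reactions = {}
    for reaction in root.findall(".//sbml:listOfReactions/sbml:reaction", ns):
        reactants = [r.get("species") for r in
                     reaction.findall("sbml:listOfReactants/sbml:speciesReference", ns)]
        products = [p.get("species") for p in
                    reaction.findall("sbml:listOfProducts/sbml:speciesReference", ns)]
        reactions[reaction.get("id")] = (reactants, products)
    return {"species": species, "reactions": reactions}


summary = summarize(MINIMAL_SBML)
print(summary)
```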
Subjects
Systems Biology/methods, Animals, Humans, Logistic Models, Biological Models, Software
ABSTRACT
Summary: The Simulation Experiment Description Markup Language (SED-ML) is a standardized format for exchanging simulation studies independently of software tools. We present the SED-ML Web Tools, an online application for creating, editing, simulating and validating SED-ML documents. The Web Tools implement all current SED-ML specifications and, thus, support complex modifications and co-simulation of models in SBML and CellML formats. Ultimately, the Web Tools lower the bar on working with SED-ML documents and help users create valid simulation descriptions. Availability and Implementation: http://sysbioapps.dyndns.org/SED-ML_Web_Tools/. Contact: fbergman@caltech.edu.
Subjects
Computer Simulation, Software, Internet, Programming Languages
ABSTRACT
MOTIVATION: Computational modeling is widely used for deepening the understanding of biological processes. Parameterizing models to experimental data needs computationally efficient techniques for parameter estimation. Challenges for parameter estimation include, in general, the high dimensionality of the parameter space with local minima and, specifically for stochastic modeling, the intrinsic stochasticity. RESULTS: We implemented the recently suggested multiple shooting for stochastic systems (MSS) objective function for parameter estimation in stochastic models into COPASI. This MSS objective function can be used for parameter estimation in stochastic models but also shows beneficial properties when used for ordinary differential equation models. The method can be applied with all of COPASI's optimization algorithms, and can be used for SBML models as well. AVAILABILITY AND IMPLEMENTATION: The methodology is available in COPASI as of version 4.15.95 and can be downloaded from http://www.copasi.org. CONTACT: frank.bergmann@bioquant.uni-heidelberg.de or fbergman@caltech.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Software, Algorithms, Biological Models, Systems Biology
ABSTRACT
SBtab is a table-based data format for Systems Biology, designed to support automated data integration and model building. It uses the structure of spreadsheets and defines conventions for table structure, controlled vocabularies and semantic annotations. The format comes with predefined table types for experimental data and SBML-compliant model structures and can easily be customized to cover new types of data. AVAILABILITY AND IMPLEMENTATION: SBtab documents can be created and edited with any text editor or spreadsheet tool. The website www.sbtab.net provides online tools for syntax validation and conversion to SBML and HTML, as well as software for using SBtab in MS Excel, MATLAB and R. The stand-alone Python code contains functions for file parsing, validation, conversion to SBML and HTML and an interface to SQLite databases, to be integrated into Systems Biology workflows. A detailed specification of SBtab, including examples and descriptions of table types and available tools, can be found at www.sbtab.net. CONTACT: wolfram.liebermeister@gmail.com.
Subjects
Factual Databases, Biological Models, Systems Biology, Theoretical Models, Software
ABSTRACT
BACKGROUND: With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. These constitute various separate components required to reproduce a given published scientific result. RESULTS: We describe the Open Modeling EXchange format (OMEX). Together with the use of other standard formats from the Computational Modeling in Biology Network (COMBINE), OMEX is the basis of the COMBINE Archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE Archive consists of files encoded in COMBINE standards whenever possible, but may include additional files defined by an Internet Media Type. Several tools that support the COMBINE Archive are available, either as independent libraries or embedded in modeling software. CONCLUSIONS: The COMBINE Archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also helps in building activity logs and audit trails. We anticipate that the COMBINE Archive will become a significant help for modellers, as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
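The container layout described above (a ZIP file holding a manifest plus the model and simulation files) can be sketched with Python's standard library. This is an illustrative sketch, not one of the supporting tools mentioned in the abstract. The manifest namespace and the format identifiers are written as we understand them from the COMBINE specifications and should be verified against the specification; the file names and payloads are hypothetical placeholders.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Identifiers assumed from the COMBINE Archive specification (verify before relying on them).
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
FORMAT_BASE = "http://identifiers.org/combine.specifications/"


def build_manifest(entries):
    """Create manifest.xml text listing the archive itself, the manifest and each file."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f'<omexManifest xmlns="{OMEX_NS}">',
             f'  <content location="." format="{FORMAT_BASE}omex"/>',
             f'  <content location="./manifest.xml" format="{FORMAT_BASE}omex-manifest"/>']
    for name, (fmt, _payload) in entries.items():
        lines.append(f'  <content location={quoteattr("./" + name)} format={quoteattr(fmt)}/>')
    lines.append("</omexManifest>")
    return "\n".join(lines)


def build_archive(entries):
    """Return the bytes of a ZIP container holding the manifest and all entries."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", build_manifest(entries))
        for name, (_fmt, payload) in entries.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def list_manifest(archive_bytes):
    """Read the manifest back and return (location, format) pairs."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    return [(c.get("location"), c.get("format")) for c in root.findall(f"{{{OMEX_NS}}}content")]


# Hypothetical placeholder payloads; a real archive would carry valid SBML and SED-ML.
entries = {
    "model.xml": (FORMAT_BASE + "sbml", b"<sbml/>"),
    "simulation.sedml": (FORMAT_BASE + "sed-ml", b"<sedML/>"),
}
archive_bytes = build_archive(entries)
for location, fmt in list_manifest(archive_bytes):
    print(location, fmt)
```

Writing the archive to an in-memory buffer keeps the sketch free of file system side effects; writing `archive_bytes` to a file with an `.omex` extension would give a file on disk.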
Subjects
Computational Biology/methods, Computer Simulation, Nucleic Acid Databases, Software, Archives, Humans, Information Storage and Retrieval, Internet
ABSTRACT
Modern biological research is increasingly informed by computational simulation experiments, which necessitate the development of methods for annotating, archiving, sharing, and reproducing the conducted experiments. These simulations increasingly require extensive collaboration among modelers, experimentalists, and engineers. The Minimum Information About a Simulation Experiment (MIASE) guidelines outline the information needed to share simulation experiments. SED-ML is a computer-readable format for the information outlined by MIASE, created as a community project and supported by many investigators and software tools. Level 1 Version 5 of SED-ML expands the ability of modelers to define simulations in SED-ML using the Kinetic Simulation Algorithm Ontology (KiSAO). While it was possible in Version 4 to define a simulation entirely using KiSAO, Version 5 now allows users to define tasks, model changes, ranges, and outputs using the ontology as well. SED-ML is supported by a growing ecosystem of investigators, model languages, and software tools, including various languages for constraint-based, kinetic, qualitative, rule-based, and spatial models, and many simulation tools, visual editors, model repositories, and validators. Additional information about SED-ML is available at https://sed-ml.org/.
Subjects
Computer Simulation, Programming Languages, Software, Algorithms, Biological Models, Humans, Computational Biology/methods
ABSTRACT
MOTIVATION: LibSBGN is a software library for reading, writing and manipulating Systems Biology Graphical Notation (SBGN) maps stored using the recently developed SBGN-ML file format. The library (available in C++ and Java) makes it easy for developers to add SBGN support to their tools, whereas the file format facilitates the exchange of maps between compatible software applications. The library also supports validation of maps, which simplifies the task of ensuring compliance with the detailed SBGN specifications. With this effort we hope to increase the adoption of SBGN in bioinformatics tools, ultimately enabling more researchers to visualize biological knowledge in a precise and unambiguous manner. AVAILABILITY AND IMPLEMENTATION: Milestone 2 was released in December 2011. Source code, example files and binaries are freely available under the terms of either the LGPL v2.1+ or Apache v2.0 open source licenses from http://libsbgn.sourceforge.net. CONTACT: sbgn-libsbgn@lists.sourceforge.net.
Subjects
Computational Biology/methods, Software, Systems Biology, Programming Languages
ABSTRACT
The fermentation of milk to yoghurt using Lactobacillus delbrueckii subsp. bulgaricus in co-culture with Streptococcus thermophilus is hallmarked by the breakdown of lactose to organic acids such as lactate. This leads to a substantial decrease in pH, both in the medium and in the cytosol. The latter impairs metabolic activities due to the pH-dependence of enzymes, which compromises microbial growth. To quantitatively elucidate the impact of the acidification on metabolism of L. bulgaricus in an integrated way, we have developed a proton-dependent computational model of lactose metabolism and casein degradation based on experimental data. The model accounts for the influence of pH on enzyme activities as well as cellular growth and proliferation of the bacterial population. We used a machine learning approach to quantify the cell volume throughout fermentation. Simulation results show a decrease in metabolic flux with acidification of the cytosol. Additionally, the validated model predicts a similar metabolic behaviour within a wide range of non-limiting substrate concentrations. This computational model provides a deeper understanding of the intricate relationships between metabolic activity and acidification and paves the way for further optimization of yoghurt production under industrial settings.
Subjects
Lactobacillus delbrueckii, Lactobacillus delbrueckii/metabolism, Lactose, Carbohydrate Metabolism, Fermentation, Hydrogen-Ion Concentration
ABSTRACT
Genome-scale metabolic models are frequently used in computational biology. They offer an integrative view on the metabolic network of an organism without the need to know kinetic information in detail. However, the huge solution space which comes with the analysis of genome-scale models by using, e.g., Flux Balance Analysis (FBA) poses a problem, since it is hard to thoroughly investigate and often only an arbitrarily selected individual flux distribution is discussed as an outcome of FBA. Here, we introduce a new approach to inspect the solution space and we compare it with other approaches, namely Flux Variability Analysis (FVA) and CoPE-FBA, using several different genome-scale models of lactic acid bacteria. We examine the extent to which different types of experimental data limit the solution space and how the robustness of the system increases as a result. We find that our new approach to inspect the solution space is a good complementary method that offers additional insights into the variance of biological phenotypes and can help to prevent wrong conclusions in the analysis of FBA results.
ABSTRACT
EnzymeML is an XML-based data exchange format that supports the comprehensive documentation of enzymatic data by describing reaction conditions, time courses of substrate and product concentrations, the kinetic model, and the estimated kinetic constants. EnzymeML is based on the Systems Biology Markup Language, which was extended by implementing the STRENDA Guidelines. An EnzymeML document serves as a container to transfer data between experimental platforms, modeling tools, and databases. EnzymeML supports the scientific community by introducing a standardized data exchange format to make enzymatic data findable, accessible, interoperable, and reusable according to the FAIR data principles. An application programming interface in Python supports the integration of software tools for data acquisition, data analysis, and publication. The feasibility of a seamless data flow using EnzymeML is demonstrated by creating an EnzymeML document from a structured spreadsheet or from a STRENDA DB database entry, by kinetic modeling using the modeling platform COPASI, and by uploading to the enzymatic reaction kinetics database SABIO-RK.
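The kinds of information listed above (a time course of substrate and product, a kinetic model, and kinetic constants) can be illustrated with a small simulation. The sketch below is not the EnzymeML Python interface and does not read or write EnzymeML documents. It assumes an irreversible Michaelis-Menten rate law as the kinetic model, integrates it with a classical fourth-order Runge-Kutta scheme, and uses hypothetical values for the initial substrate concentration, vmax and km.

```python
import math


def michaelis_menten_rate(substrate, vmax, km):
    """Irreversible Michaelis-Menten rate law: v = vmax * S / (km + S)."""
    return vmax * substrate / (km + substrate)


def simulate_time_course(s0, vmax, km, t_end, steps):
    """Integrate dS/dt = -v(S) with classical RK4; product follows from mass balance."""
    dt = t_end / steps
    substrate = s0
    course = [(0.0, s0, 0.0)]  # rows of (time, substrate, product)
    for i in range(1, steps + 1):
        k1 = -michaelis_menten_rate(substrate, vmax, km)
        k2 = -michaelis_menten_rate(substrate + 0.5 * dt * k1, vmax, km)
        k3 = -michaelis_menten_rate(substrate + 0.5 * dt * k2, vmax, km)
        k4 = -michaelis_menten_rate(substrate + dt * k3, vmax, km)
        substrate += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        course.append((i * dt, substrate, s0 - substrate))
    return course


# Hypothetical parameters in arbitrary but consistent units.
S0, VMAX, KM = 10.0, 1.0, 2.0
course = simulate_time_course(S0, VMAX, KM, t_end=10.0, steps=200)
t_final, s_final, p_final = course[-1]
print(f"t={t_final:.1f}  S={s_final:.4f}  P={p_final:.4f}")
```

A quick sanity check is the integrated rate equation km * ln(S0/S) + (S0 - S) = vmax * t, which the simulated end point should satisfy closely.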
Subjects
Software, Biocatalysis, Factual Databases
ABSTRACT
Computational simulation experiments increasingly inform modern biological research, and bring with them the need to provide ways to annotate, archive, share and reproduce the experiments performed. These simulations increasingly require extensive collaboration among modelers, experimentalists, and engineers. The Minimum Information About a Simulation Experiment (MIASE) guidelines outline the information needed to share simulation experiments. SED-ML is a computer-readable format for the information outlined by MIASE, created as a community project and supported by many investigators and software tools. The first versions of SED-ML focused on deterministic and stochastic simulations of models. Level 1 Version 4 of SED-ML substantially expands these capabilities to cover additional types of models, model languages, parameter estimations, simulations and analyses of models, and analyses and visualizations of simulation results. To facilitate consistent practices across the community, Level 1 Version 4 also more clearly describes the use of SED-ML constructs, and includes numerous concrete validation rules. SED-ML is supported by a growing ecosystem of investigators, model languages, and software tools, including eight languages for constraint-based, kinetic, qualitative, rule-based, and spatial models, over 20 simulation tools, visual editors, model repositories, and validators. Additional information about SED-ML is available at https://sed-ml.org/.
Subjects
Language, Programming Languages, Ecosystem, Biological Models, Systems Biology
ABSTRACT
MOTIVATION: Model exchange in systems and synthetic biology has been standardized for computers with the Systems Biology Markup Language (SBML) and CellML, but specialized software is needed for the generation of models in these formats. Text-based model definition languages allow researchers to create models simply, and then export them to a common exchange format. Modular languages allow researchers to create and combine complex models more easily. We saw a use for a modular text-based language, together with a translation library to allow other programs to read the models as well. SUMMARY: The Antimony language provides a way for a researcher to use simple text statements to create, import, and combine biological models, allowing complex models to be built from simpler models, and provides a special syntax for the creation of modular genetic networks. The libAntimony library allows other software packages to import these models and convert them either to SBML or their own internal format. AVAILABILITY: The Antimony language specification and the libAntimony library are available under a BSD license from http://antimony.sourceforge.net/.
Subjects
Information Storage and Retrieval/methods, Software, Factual Databases, Internet, Systems Biology
ABSTRACT
This document defines Version 0.3 Markup Language (ML) support for the Systems Biology Graphical Notation (SBGN), a set of three complementary visual languages developed for biochemists, modelers, and computer scientists. SBGN aims at representing networks of biochemical interactions in a standard, unambiguous way to foster efficient and accurate representation, visualization, storage, exchange, and reuse of information on all kinds of biological knowledge, from gene regulation, to metabolism, to cellular signaling. SBGN is defined neutrally to programming languages and software encoding; however, it is oriented primarily towards allowing models to be encoded using XML, the eXtensible Markup Language. The notable changes from the previous version include the addition of attributes to better specify metadata about maps, as well as support for multiple maps, sub-maps, colors, and annotations. These changes enable a more efficient exchange of data to other commonly used systems biology formats (e.g., BioPAX and SBML) and between tools supporting SBGN (e.g., CellDesigner, Newt, Krayon, SBGN-ED, STON, cd2sbgnml, and MINERVA). More details on SBGN and related software are available at http://sbgn.org. With this effort, we hope to increase the adoption of SBGN in bioinformatics tools, ultimately enabling more researchers to visualize biological knowledge in a precise and unambiguous manner.
Subjects
Programming Languages, Systems Biology, Computational Biology, Metadata, Biological Models, Software
ABSTRACT
Biological models often contain elements that have inexact numerical values, since they are based on values that are stochastic in nature or data that contains uncertainty. The Systems Biology Markup Language (SBML) Level 3 Core specification does not include an explicit mechanism to include inexact or stochastic values in a model, but it does provide a mechanism for SBML packages to extend the Core specification and add additional syntactic constructs. The SBML Distributions package for SBML Level 3 adds the necessary features to allow models to encode information about the distribution and uncertainty of values underlying a quantity.
Subjects
Programming Languages, Systems Biology, Documentation, Language, Biological Models, Software
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
MOTIVATION: Simulations are an essential tool when analyzing biochemical networks. Researchers and developers seeking to refine simulation tools or develop new ones would benefit greatly from being able to compare their simulation results. SUMMARY: We present an approach to compare simulation results between several SBML-capable simulators and provide a website for the community to share simulation results. AVAILABILITY: The website with simulation results and additional material can be found at http://sys-bio.org/sbwWiki/compare. The software used to generate the simulation results is available on the website for download.