ABSTRACT
Life science researchers use computational models to articulate and test hypotheses about the behavior of biological systems. Semantic annotation is a critical component for enhancing the interoperability and reusability of such models as well as for the integration of the data needed for model parameterization and validation. Encoded as machine-readable links to knowledge resource terms, semantic annotations describe the computational or biological meaning of what models and data represent. These annotations help researchers find and repurpose models, accelerate model composition and enable knowledge integration across model repositories and experimental data stores. However, realizing the potential benefits of semantic annotation requires the development of model annotation standards that adhere to a community-based annotation protocol. Without such standards, tool developers must account for a variety of annotation formats and approaches, a situation that can become prohibitively cumbersome and which can defeat the purpose of linking model elements to controlled knowledge resource terms. Currently, no consensus protocol for semantic annotation exists among the larger biological modeling community. Here, we report on the landscape of current annotation practices among the COmputational Modeling in BIology NEtwork community and provide a set of recommendations for building a consensus approach to semantic annotation.
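As an illustration of what "machine-readable links to knowledge resource terms" can look like, here is a minimal sketch in Python. It assumes an annotation can be reduced to a triple of model element, qualifier, and term URI. The `Annotation` class, the `elements_for_term` helper, and the model element identifiers are hypothetical and are not taken from any COMBINE specification; the qualifier strings and identifiers.org URIs are only examples of the kind of values commonly used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One semantic annotation: a model element linked to a knowledge-resource term."""
    element: str    # hypothetical "model#element" identifier
    qualifier: str  # relation between element and term (example values only)
    term: str       # URI of the knowledge-resource term


def elements_for_term(annotations, term):
    """Return the sorted model elements annotated with the given term URI."""
    return sorted({a.element for a in annotations if a.term == term})


# Illustrative data: two hypothetical models that both refer to glucose.
GLUCOSE = "https://identifiers.org/CHEBI:17234"
GLYCOLYSIS = "https://identifiers.org/GO:0006096"

annotations = [
    Annotation("modelA#glc", "bqbiol:is", GLUCOSE),
    Annotation("modelB#species_3", "bqbiol:is", GLUCOSE),
    Annotation("modelB#pathway_1", "bqbiol:isVersionOf", GLYCOLYSIS),
]

# Elements from different models become discoverable through one shared term.
glucose_elements = elements_for_term(annotations, GLUCOSE)
```

Because both models point at the same term URI, a search by that URI finds the corresponding elements regardless of the names each model uses internally. This lookup only works if annotations follow a shared format, which is the motivation for a consensus protocol.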
Subject(s)
Biological Science Disciplines; Computational Biology/methods; Computer Simulation; Databases, Factual; Semantics; Humans; Software
ABSTRACT
Public health research and epidemiological and clinical studies are necessary to understand the COVID-19 pandemic and to take appropriate action. Since early 2020, numerous research projects have therefore been initiated in Germany as well. However, due to the large amount of information, it is currently difficult to get an overview of the diverse research activities and their results. Based on the "Federated research data infrastructure for personal health data" (NFDI4Health) initiative, the "COVID-19 task force" is able to create easier access to SARS-CoV-2- and COVID-19-related clinical, epidemiological, and public health research data. To this end, the FAIR data principles (findable, accessible, interoperable, reusable) are taken into account, which should allow results to be communicated more quickly. The core work of the task force includes the creation of a study portal with metadata, selected instruments, other study documents, and study results, as well as a search engine for preprint publications. Additional contributions include a concept for the linkage between research and routine data, a service for improved handling of image data, and the application of a standardized analysis routine for harmonized quality assessment. This infrastructure, currently being established, will facilitate the findability and handling of German COVID-19 research. The developments initiated in the context of the NFDI4Health COVID-19 task force are reusable for further research topics, as the challenges addressed are generic to the findability and handling of research data.
Subject(s)
Biomedical Research/trends; COVID-19; Information Dissemination; Germany; Humans; Metadata; Pandemics; SARS-CoV-2
ABSTRACT
The FAIRDOMHub is a repository for publishing FAIR (Findable, Accessible, Interoperable and Reusable) Data, Operating procedures and Models (https://fairdomhub.org/) for the Systems Biology community. It is a web-accessible repository for storing and sharing systems biology research assets. It enables researchers to organize, share and publish data, models and protocols, interlink them in the context of the systems biology investigations that produced them, and to interrogate them via API interfaces. By using the FAIRDOMHub, researchers can achieve more effective exchange with geographically distributed collaborators during projects, ensure results are sustained and preserved and generate reproducible publications that adhere to the FAIR guiding principles of data stewardship.
Subject(s)
Databases, Factual; Systems Biology/methods; Carbon/metabolism; Data Curation; Information Dissemination; Metabolic Networks and Pathways; Research
ABSTRACT
BACKGROUND: With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. These constitute various separate components required to reproduce a given published scientific result. RESULTS: We describe the Open Modeling EXchange format (OMEX). Together with the use of other standard formats from the Computational Modeling in Biology Network (COMBINE), OMEX is the basis of the COMBINE Archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE Archive consists of files encoded in COMBINE standards whenever possible, but may include additional files defined by an Internet Media Type. Several tools that support the COMBINE Archive are available, either as independent libraries or embedded in modeling software. CONCLUSIONS: The COMBINE Archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also helps in building activity logs and audit trails. We anticipate that the COMBINE Archive will become a significant help for modellers, as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
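The container layout described above (a ZIP file holding a manifest that lists the archive content, plus the model files) can be sketched with Python's standard library. This is a minimal sketch under stated assumptions, not a reference implementation: the manifest namespace and format URIs follow the identifiers.org convention as we understand it from the OMEX specification, the helper names `build_manifest`, `build_archive`, and `read_manifest` are hypothetical, and the optional metadata file is omitted. The `<sbml/>` payload is a placeholder, not a valid model.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace and format URIs (identifiers.org convention).
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
OMEX_FORMAT = "http://identifiers.org/combine.specifications/omex"
SBML_FORMAT = "http://identifiers.org/combine.specifications/sbml"


def build_manifest(entries):
    """Serialize (location, format) pairs as manifest.xml bytes."""
    ET.register_namespace("", OMEX_NS)
    root = ET.Element(f"{{{OMEX_NS}}}omexManifest")
    for location, fmt in entries:
        ET.SubElement(root, f"{{{OMEX_NS}}}content",
                      {"location": location, "format": fmt})
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_archive(files):
    """Build an archive in memory.

    files maps an archive path to a (format URI, payload bytes) pair.
    The manifest lists the archive itself, the manifest, and every file.
    """
    entries = [(".", OMEX_FORMAT), ("./manifest.xml", OMEX_NS)]
    entries += [(f"./{name}", fmt) for name, (fmt, _) in files.items()]
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.xml", build_manifest(entries))
        for name, (_, payload) in files.items():
            zf.writestr(name, payload)
    return buf.getvalue()


def read_manifest(archive_bytes):
    """Return the (location, format) pairs recorded in an archive's manifest."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        root = ET.fromstring(zf.read("manifest.xml"))
    return [(c.get("location"), c.get("format")) for c in root]


# Placeholder payload standing in for a real SBML model file.
archive = build_archive({"model.xml": (SBML_FORMAT, b"<sbml/>")})
manifest_entries = read_manifest(archive)
```

Writing `archive` to a file with the `.omex` extension would give a single exchangeable file. In practice, one of the existing COMBINE Archive libraries mentioned in the abstract is likely a better choice than hand-rolled code, since it would also handle the metadata file and validation.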
Subject(s)
Computational Biology/methods; Computer Simulation; Databases, Nucleic Acid; Software; Archives; Humans; Information Storage and Retrieval; Internet
ABSTRACT
SABIO-RK (http://sabio.h-its.org/) is a web-accessible database storing comprehensive information about biochemical reactions and their kinetic properties. SABIO-RK offers standardized data manually extracted from the literature and data directly submitted from lab experiments. The database content includes kinetic parameters in relation to biochemical reactions and their biological sources with no restriction on any particular set of organisms. Additionally, kinetic rate laws and corresponding equations as well as experimental conditions are represented. All the data are manually curated and annotated by biological experts, supported by automated consistency checks. SABIO-RK can be accessed via web-based user interfaces or automatically via web services that allow direct data access by other tools. Both interfaces support the export of the data together with its annotations in SBML (Systems Biology Markup Language), e.g. for import in modelling tools.
Subject(s)
Biochemical Phenomena; Databases, Factual; Enzymes/metabolism; Internet; Kinetics; User-Computer Interface
ABSTRACT
INTRODUCTION: NFDI4Health is a consortium funded by the German Research Foundation to make structured health data findable and accessible internationally according to the FAIR principles. Its goal is to bring data users and Data Holding Organizations (DHOs) together. It mainly considers DHOs conducting epidemiological and public health studies or clinical trials. METHODS: Local Data Hubs (LDHs) are provided for such DHOs to connect decentralized local research data management within their organizations with the option of publishing shareable metadata via centralized NFDI4Health services such as the German Central Health Study Hub. The LDH platform is based on FAIRDOM SEEK and provides a complete, flexible, and locally controlled data and information management platform for health research data. A tailored NFDI4Health metadata schema (MDS) for studies and their corresponding resources has been developed, which is fully supported by the LDH software, e.g. for metadata transfer to other NFDI4Health services. RESULTS: The SEEK platform has been technically enhanced to support extended metadata structures tailored to the needs of the user communities, in addition to the existing metadata structuring of SEEK. CONCLUSION: With the LDH and the MDS, NFDI4Health provides all DHOs with a standardized, free and open-source research data management platform for the FAIR exchange of structured health data.
Subject(s)
Metadata; Germany; Humans; Data Management; Information Dissemination; Software
ABSTRACT
INTRODUCTION: The Local Data Hub (LDH) is a platform for FAIR sharing of medical research (meta-)data. To promote the use of the LDH in different research communities, it is important to understand the domain-specific needs and the solutions currently used for data organization, and to provide support for seamless uploads to an LDH. In this work, we analyze the use case of microneurography, which is an electrophysiological technique for analyzing neural activity. METHODS: After performing a requirements analysis in dialogue with microneurography researchers, we propose a concept mapping and a workflow for researchers to transform and upload their metadata. Further, we implemented a semi-automatic upload extension to odMLtables, a template-based tool for handling metadata in the electrophysiological community. RESULTS: The open-source implementation enables the odML-to-LDH concept mapping, allows data anonymization from within the tool, and supports the creation of custom-made summaries of the underlying data sets. DISCUSSION: This completes a first step towards integrating improved FAIR processes into the research laboratory's daily workflow. In future work, we will extend this approach to other use cases to promote the use of LDHs in a larger research community.
Subject(s)
Metadata; Humans; Information Dissemination/methods; Information Storage and Retrieval/methods
ABSTRACT
Adhering to FAIR principles (findability, accessibility, interoperability, reusability) ensures sustainability and reliable exchange of data and metadata. Research communities need common infrastructures and information models to collect, store, manage and work with data and metadata. The German initiative NFDI4Health created a metadata schema and an infrastructure integrating existing platforms based on different information models and standards. To ensure system compatibility and enhance data integration possibilities, we mapped the Investigation-Study-Assay (ISA) model to Fast Healthcare Interoperability Resources (FHIR). We present the mapping in FHIR logical models, the resulting network of FHIR resources, and the challenges that we encountered. These challenges mainly related to ISA's generic nature and to the different structures and datatypes used in ISA and FHIR. Mapping ISA to FHIR is feasible but requires further analyses of example data and adaptations to better specify target FHIR elements and to enable possible automated conversions from ISA to FHIR.
Subject(s)
Drugs, Generic; Health Facilities; Humans; Metadata; Delivery of Health Care
ABSTRACT
The German initiative "National Research Data Infrastructure for Personal Health Data" (NFDI4Health) focuses on research data management in health research. It aims to foster and develop harmonized informatics standards for public health, epidemiological studies, and clinical trials, facilitating access to relevant data and metadata standards. This publication lists syntactic and semantic data standards of potential use for NFDI4Health and beyond, based on interdisciplinary meetings and workshops, mappings of study questionnaires and the NFDI4Health metadata schema, and a literature search. Included are 7 syntactic, 32 semantic and 9 combined syntactic and semantic standards. In addition, 101 ISO standards from ISO/TC 215 Health Informatics and ISO/TC 276 Biotechnology were identified as potentially relevant. The work emphasizes the use of standards for epidemiological and health research data to ensure interoperability as well as compatibility with NFDI4Health, its use cases, and (inter-)national efforts within these sectors. The goal is to foster collaborative and inter-sectoral work in health research and to initiate a debate around the potential of using common standards.
Subject(s)
Health Information Interoperability; Humans; Metadata; Germany; Health Records, Personal; Data Management
ABSTRACT
Open and practical exchange, dissemination, and reuse of specimens and data have become a fundamental requirement for life sciences research. The quality of the data obtained, and thus of the findings and knowledge derived, is significantly influenced by the quality of the samples, the experimental methods, and the data analysis. Therefore, comprehensive and precise documentation of the pre-analytical conditions, the analytical procedures, and the data processing is essential for assessing the validity of the research results. With the increasing importance of the exchange, reuse, and sharing of data and samples, procedures are required that enable cross-organizational documentation, traceability, and non-repudiation. At present, this information on the provenance of samples and data is mostly sparse, incomplete, or incoherent. Since there is no uniform framework, this information is usually provided only within the organization and not in an interoperable form. At the same time, the collection and sharing of biological and environmental specimens increasingly require definition and documentation of benefit sharing and compliance with regulatory requirements rather than consideration of purely scientific needs. In this publication, we present an ongoing standardization effort to provide trustworthy machine-actionable documentation of the lineage of data and specimens. We would like to invite experts from the biotechnology and biomedical fields to further contribute to the standard.
ABSTRACT
This special issue of the Journal of Integrative Bioinformatics contains updated specifications of COMBINE standards in systems and synthetic biology. The 2022 special issue presents three updates to the standards: CellML 2.0.1, SBML Level 3 Package: Spatial Processes, Version 1, Release 1, and Synthetic Biology Open Language (SBOL) Version 3.1.0. This document can also be used to identify the latest specifications for all COMBINE standards. In addition, this editorial provides a brief overview of the COMBINE 2022 meeting in Berlin.
Subject(s)
Computational Biology; Synthetic Biology; Programming Languages; Software
ABSTRACT
The use of computational modeling to describe and analyze biological systems is at the heart of systems biology. Model structures, simulation descriptions and numerical results can be encoded in structured formats, but there is an increasing need to provide an additional semantic layer. Semantic information adds meaning to components of structured descriptions to help identify and interpret them unambiguously. Ontologies are one of the tools frequently used for this purpose. We describe here three ontologies created specifically to address the needs of the systems biology community. The Systems Biology Ontology (SBO) provides semantic information about the model components. The Kinetic Simulation Algorithm Ontology (KiSAO) supplies information about existing algorithms available for the simulation of systems biology models, their characterization and interrelationships. The Terminology for the Description of Dynamics (TEDDY) categorizes dynamical features of the simulation results and general systems behavior. The provision of semantic information extends a model's longevity and facilitates its reuse. It provides useful insight into the biology of modeled processes, and may be used to make informed decisions on subsequent simulation experiments.
Subject(s)
Computational Biology; Semantics; Systems Biology; Vocabulary, Controlled; Algorithms; Computer Simulation; Information Storage and Retrieval; Models, Biological
ABSTRACT
The future development of personalized medicine depends on a vast exchange of data from different sources, as well as harmonized integrative analysis of large-scale clinical health and sample data. Computational-modelling approaches play a key role in the analysis of the underlying molecular processes and pathways that characterize human biology, but they also lead to a more profound understanding of the mechanisms and factors that drive diseases; hence, they allow personalized treatment strategies that are guided by central clinical questions. However, despite the growing popularity of computational-modelling approaches in different stakeholder communities, there are still many hurdles to overcome before they can be implemented in routine clinical practice. In particular, the integration of heterogeneous data of multiple types and from multiple sources is a challenging task that requires clear guidelines, which must also comply with high ethical and legal standards. Here, we discuss in detail the most relevant computational models for personalized medicine, which can be considered best-practice guidelines for application in clinical care. We define specific challenges and provide applicable guidelines and recommendations for study design, data acquisition, and operation as well as for model validation and clinical translation and other research areas.
ABSTRACT
In this white paper, we describe the founding of a new ELIXIR Community - the Systems Biology Community - and its proposed future contributions to both ELIXIR and the broader community of systems biologists in Europe and worldwide. The Community believes that the infrastructure aspects of systems biology - databases, (modelling) tools and standards development, as well as training and access to cloud infrastructure - are not only appropriate components of the ELIXIR infrastructure, but will prove key components of ELIXIR's future support of advanced biological applications and personalised medicine. By way of a series of meetings, the Community identified seven key areas for its future activities, reflecting both future needs and previous and current activities within ELIXIR Platforms and Communities. These are: overcoming barriers to the wider uptake of systems biology; linking new and existing data to systems biology models; interoperability of systems biology resources; further development and embedding of systems medicine; provisioning of modelling as a service; building and coordinating capacity building and training resources; and supporting industrial embedding of systems biology. A set of objectives for the Community has been identified under four main headline areas: Standardisation and Interoperability, Technology, Capacity Building and Training, and Industrial Embedding. These are grouped into short-term (3-year), mid-term (6-year) and long-term (10-year) objectives.
Subject(s)
Systems Biology; Europe; Databases, Factual
ABSTRACT
The German Central Health Study Hub COVID-19 is an online service that offers bundled access to COVID-19-related studies conducted in Germany. It combines metadata and other information from epidemiological, public health and clinical studies into a single data repository for FAIR data access. In addition to study characteristics, the system also allows easy access to study documents, as well as instruments for data collection. Study metadata and survey instruments are decomposed into individual data items and semantically enriched to improve findability. Data from existing clinical trial registries (DRKS, ClinicalTrials.gov and WHO ICTRP) are merged with manually collected and entered epidemiological and public health studies. More than 850 studies are listed as of September 2021.
Subject(s)
COVID-19; Germany; Humans; Metadata; SARS-CoV-2; Surveys and Questionnaires
ABSTRACT
This special issue of the Journal of Integrative Bioinformatics contains updated specifications of COMBINE standards in systems and synthetic biology. The 2021 special issue presents four updates of standards: Synthetic Biology Open Language Visual Version 2.3, Synthetic Biology Open Language Visual Version 3.0, Simulation Experiment Description Markup Language Level 1 Version 4, and OMEX Metadata specification Version 1.2. This document can also be consulted to identify the latest specifications of all COMBINE standards.
Subject(s)
Computational Biology; Synthetic Biology; Computer Simulation; Metadata; Programming Languages; Software
ABSTRACT
COVID-19 poses a major challenge to individuals and societies around the world. Yet, it is difficult to obtain a good overview of studies across different medical fields of research such as clinical trials, epidemiology, and public health. Here, we describe a consensus metadata model to facilitate structured searches of COVID-19 studies and resources, along with its implementation in three linked complementary web-based platforms. A relational database serves as the central study metadata hub and ensures compatibility with common trial registries (e.g. ICTRP) and standards like HL7 FHIR, CDISC ODM, and DataCite. The Central Search Hub was developed as a single-page application; the other two components, with additional frontends, are based on the SEEK platform and MICA, respectively. These platforms have different features concerning cohort browsing, item browsing, and access to documents and other study resources to meet divergent user needs. In this way, we aim to promote transparent and harmonized COVID-19 research.
Subject(s)
COVID-19; Epidemiologic Studies; Humans; Metadata; Registries; SARS-CoV-2
ABSTRACT
SUMMARY: The XML-based Systems Biology Markup Language (SBML) has emerged as a standard for storage, communication and interchange of models in systems biology. As a machine-readable format, XML is difficult for humans to read and understand. Many tools are available that visualize the reaction pathways stored in SBML files, but many components, e.g. unit declarations, complex kinetic equations or links to MIRIAM resources, are often not made visible in these diagrams. For a broader understanding of the models, support in scientific writing and error detection, a human-readable report of the complete model is needed. We present SBML2LaTeX, a Java-based stand-alone program to fill this gap. A convenient web service allows users to directly convert SBML to various formats, including DVI, LaTeX and PDF, and provides many settings for customization. AVAILABILITY: Source code, documentation and a web service are freely available at http://www.ra.cs.uni-tuebingen.de/software/SBML2LaTeX.