Results 1 - 20 of 27
1.
EMBO J ; 42(23): e115008, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37964598

ABSTRACT

The main goals and challenges for the life science communities in the Open Science framework are to increase reuse and sustainability of data resources, software tools, and workflows, especially in large-scale data-driven research and computational analyses. Here, we present key findings, procedures, effective measures and recommendations for generating and establishing sustainable life science resources based on the collaborative, cross-disciplinary work done within the EOSC-Life (European Open Science Cloud for Life Sciences) consortium. Bringing together 13 European life science research infrastructures, it has laid the foundation for an open, digital space to support biological and medical research. Using lessons learned from 27 selected projects, we describe the organisational, technical, financial and legal/ethical challenges that represent the main barriers to sustainability in the life sciences. We show how EOSC-Life provides a model for sustainable data management according to FAIR (findability, accessibility, interoperability, and reusability) principles, including solutions for sensitive and industry-related resources, by means of cross-disciplinary training and the sharing of best practices. Finally, we illustrate how data harmonisation and collaborative work facilitate interoperability of tools, data and solutions, and lead to a better understanding of concepts, semantics and functionalities in the life sciences.


Subjects
Biological Science Disciplines, Biomedical Research, Software, Workflow
2.
Nat Rev Genet ; 20(11): 693-701, 2019 11.
Article in English | MEDLINE | ID: mdl-31455890

ABSTRACT

Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition of this challenge, 21 European countries recently signed a declaration to transnationally share data on at least 1 million human genomes by 2022. In this Roadmap, we identify the challenges of data sharing across borders and demonstrate that European research infrastructures are well-positioned to support the rapid implementation of widespread genomic data access.


Subjects
Biomedical Research, Human Genome, Human Genome Project, Europe, Humans
4.
Bioinformatics ; 37(12): 1781-1782, 2021 07 19.
Article in English | MEDLINE | ID: mdl-33031499

ABSTRACT

MOTIVATION: Since its launch in 2010, Identifiers.org has become an important tool for the annotation and cross-referencing of Life Science data. In 2016, we established the Compact Identifier (CID) scheme (prefix:accession) to generate globally unique identifiers for data resources using their locally assigned accession identifiers. Since then, we have developed and improved services to support the growing need to create, reference and resolve CIDs, in systems ranging from human-readable text to cloud-based e-infrastructures, by providing high availability and low-latency cloud-based services, backed by a high-quality, manually curated resource. RESULTS: We describe a set of services that can be used to construct and resolve CIDs in Life Sciences and beyond. We have developed a new front end for accessing the Identifiers.org registry data and APIs to simplify integration of Identifiers.org CID services with third-party applications. We have also deployed the new Identifiers.org infrastructure in a commercial cloud environment, bringing our services closer to the data. AVAILABILITY AND IMPLEMENTATION: https://identifiers.org.
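
To make the prefix:accession scheme concrete, the following is a minimal Python sketch of how a Compact Identifier might be assembled, split, and turned into a resolvable link. It is not the Identifiers.org client: the helper names are hypothetical, and the URL shape (the resolver address followed directly by the CID) is an assumption built on the site address given in the abstract.

```python
# Assumption: the resolver accepts <base>/<prefix>:<accession>.
RESOLVER_BASE = "https://identifiers.org/"


def make_cid(prefix: str, accession: str) -> str:
    """Join a namespace prefix and a local accession into a CID."""
    if not prefix or not accession:
        raise ValueError("both prefix and accession are required")
    return f"{prefix}:{accession}"


def parse_cid(cid: str) -> tuple[str, str]:
    """Split a CID at the first colon into (prefix, accession).

    Splitting at the first colon keeps accessions that themselves
    contain a colon (for example 'GO:0006915') intact.
    """
    prefix, sep, accession = cid.partition(":")
    if not sep or not prefix or not accession:
        raise ValueError(f"not a compact identifier: {cid!r}")
    return prefix, accession


def resolution_url(cid: str) -> str:
    """Build a resolver URL for a syntactically valid CID."""
    parse_cid(cid)  # validates the shape, raises ValueError otherwise
    return RESOLVER_BASE + cid
```

For example, `resolution_url(make_cid("taxonomy", "9606"))` yields `https://identifiers.org/taxonomy:9606`; whether a given prefix is actually registered can only be checked against the registry itself.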


Subjects
Biological Science Disciplines, Cloud Computing, Humans
5.
Brief Bioinform ; 20(2): 540-550, 2019 03 22.
Article in English | MEDLINE | ID: mdl-30462164

ABSTRACT

Life science researchers use computational models to articulate and test hypotheses about the behavior of biological systems. Semantic annotation is a critical component for enhancing the interoperability and reusability of such models as well as for the integration of the data needed for model parameterization and validation. Encoded as machine-readable links to knowledge resource terms, semantic annotations describe the computational or biological meaning of what models and data represent. These annotations help researchers find and repurpose models, accelerate model composition and enable knowledge integration across model repositories and experimental data stores. However, realizing the potential benefits of semantic annotation requires the development of model annotation standards that adhere to a community-based annotation protocol. Without such standards, tool developers must account for a variety of annotation formats and approaches, a situation that can become prohibitively cumbersome and which can defeat the purpose of linking model elements to controlled knowledge resource terms. Currently, no consensus protocol for semantic annotation exists among the larger biological modeling community. Here, we report on the landscape of current annotation practices among the COmputational Modeling in BIology NEtwork community and provide a set of recommendations for building a consensus approach to semantic annotation.


Subjects
Biological Science Disciplines, Computational Biology/methods, Computer Simulation, Factual Databases, Semantics, Humans, Software
6.
PLoS Biol ; 15(6): e2001414, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28662064

ABSTRACT

In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.


Subjects
Biological Science Disciplines/methods, Computational Biology/methods, Data Mining/methods, Software Design, Software, Biological Science Disciplines/statistics & numerical data, Biological Science Disciplines/trends, Computational Biology/trends, Data Mining/statistics & numerical data, Data Mining/trends, Factual Databases/statistics & numerical data, Factual Databases/trends, Forecasting, Humans, Internet
7.
Nucleic Acids Res ; 44(D1): D38-47, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26538599

ABSTRACT

Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR (the European infrastructure for biological information), that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.


Subjects
Computational Biology, Registries, Data Curation, Software
8.
Nucleic Acids Res ; 43(Database issue): D542-8, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25414348

ABSTRACT

BioModels (http://www.ebi.ac.uk/biomodels/) is a repository of mathematical models of biological processes. A large set of models is curated to verify both correspondence to the biological process that the model seeks to represent, and reproducibility of the simulation results as described in the corresponding peer-reviewed publication. Many models submitted to the database are annotated, cross-referencing their components to external resources such as database records, and terms from controlled vocabularies and ontologies. BioModels comprises two main branches: one is composed of models derived from literature, while the second is generated through automated processes. BioModels currently hosts over 1200 models derived directly from the literature, as well as in excess of 140,000 models automatically generated from pathway resources. This represents an approximate 60-fold growth for literature-based model numbers alone, since BioModels' first release a decade ago. This article describes updates to the resource over this period, which include changes to the user interface, the annotation profiles of models in the curation pipeline, major infrastructure changes, the ability to perform online simulations and the availability of model content in Linked Data form. We also outline planned improvements to cope with a diverse array of new challenges.


Subjects
Factual Databases, Biological Models, Computer Simulation, Internet
9.
Bioinformatics ; 31(11): 1875-7, 2015 Jun 01.
Article in English | MEDLINE | ID: mdl-25638809

ABSTRACT

MOTIVATION: On the semantic web, in life sciences in particular, data is often distributed via multiple resources. Each of these sources is likely to use its own Internationalized Resource Identifier (IRI) for conceptually the same resource or database record. The lack of correspondence between identifiers introduces a barrier when executing federated SPARQL queries across life science data. RESULTS: We introduce a novel SPARQL-based service to enable on-the-fly integration of life science data. This service uses the identifier patterns defined in the Identifiers.org Registry to generate a plurality of identifier variants, which can then be used to match source identifiers with target identifiers. We demonstrate the utility of this identifier integration approach by answering queries across major producers of life science Linked Data. AVAILABILITY AND IMPLEMENTATION: The SPARQL-based identifier conversion service is available without restriction at http://identifiers.org/services/sparql.
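
The variant-generation idea can be sketched outside SPARQL. The Python below is a rough illustration only, not the service's implementation: the template table is a hand-written stand-in for the Identifiers.org Registry patterns, and the three URI forms listed are illustrative assumptions rather than values taken from the abstract.

```python
# Hypothetical pattern table: for each collection, the URI forms that
# different providers are assumed to use for the same record.
URI_TEMPLATES = {
    "uniprot": [
        "http://identifiers.org/uniprot/{id}",
        "http://purl.uniprot.org/uniprot/{id}",
        "http://bio2rdf.org/uniprot:{id}",
    ],
}


def identifier_variants(uri: str) -> list[str]:
    """Return every known URI form for the record that `uri` names.

    The local identifier is recovered by matching the input against
    each template; if nothing matches, the input is returned alone.
    """
    for templates in URI_TEMPLATES.values():
        for template in templates:
            head, tail = template.split("{id}")
            long_enough = len(uri) > len(head) + len(tail)
            if long_enough and uri.startswith(head) and uri.endswith(tail):
                local_id = uri[len(head):len(uri) - len(tail)]
                return [t.format(id=local_id) for t in templates]
    return [uri]
```

A federated query could then test a source identifier against each variant in turn, which is the matching step the abstract describes.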


Subjects
Factual Databases, Biological Science Disciplines, Internet, Semantics, Systems Integration
10.
Nucleic Acids Res ; 40(Database issue): D580-6, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22140103

ABSTRACT

The Minimum Information Required in the Annotation of Models Registry (http://www.ebi.ac.uk/miriam) provides unique, perennial and location-independent identifiers for data used in the biomedical domain. At its core is a shared catalogue of data collections, for each of which an individual namespace is created, and extensive metadata recorded. This namespace allows the generation of Uniform Resource Identifiers (URIs) to uniquely identify any record in a collection. Moreover, various services are provided to facilitate the creation and resolution of the identifiers. Since its launch in 2005, the system has evolved in terms of the structure of the identifiers provided, the software infrastructure, the number of data collections recorded, as well as the scope of the Registry itself. We describe here the new parallel identification scheme and the updated supporting software infrastructure. We also introduce the new Identifiers.org service (http://identifiers.org) that is built upon the information stored in the Registry and which provides directly resolvable identifiers, in the form of Uniform Resource Locators (URLs). The flexibility of the identification scheme and resolving system allows its use in many different fields, where unambiguous and perennial identification of data entities is necessary.
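
The two parallel identifier forms can be illustrated with a small converter. This Python sketch assumes the location-independent form is written `urn:miriam:<namespace>:<id>` with the local identifier percent-encoded, and the resolvable form `http://identifiers.org/<namespace>/<id>`; both shapes, the function names, and the example namespaces are assumptions for illustration and are not spelled out in the abstract.

```python
from urllib.parse import quote, unquote

URL_BASE = "http://identifiers.org/"


def urn_to_url(urn: str) -> str:
    """Convert an assumed MIRIAM-style URN into a resolvable URL."""
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[:2] != ["urn", "miriam"]:
        raise ValueError(f"not a MIRIAM-style URN: {urn!r}")
    namespace, encoded_id = parts[2], parts[3]
    if not namespace or not encoded_id:
        raise ValueError(f"missing namespace or identifier: {urn!r}")
    return f"{URL_BASE}{namespace}/{unquote(encoded_id)}"


def url_to_urn(url: str) -> str:
    """Convert a resolvable URL back into the assumed URN form."""
    if not url.startswith(URL_BASE):
        raise ValueError(f"unexpected base in {url!r}")
    namespace, sep, local_id = url[len(URL_BASE):].partition("/")
    if not sep or not namespace or not local_id:
        raise ValueError(f"missing namespace or identifier: {url!r}")
    # safe="" also encodes ':' so the URN keeps exactly four fields
    return f"urn:miriam:{namespace}:{quote(local_id, safe='')}"
```

Under these assumptions, `urn:miriam:uniprot:P12345` and `http://identifiers.org/uniprot/P12345` name the same record, and only the second can be followed directly in a browser.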


Subjects
Factual Databases, Registries, Computational Biology, Internet, Software
11.
Drug Discov Today ; 28(4): 103510, 2023 04.
Article in English | MEDLINE | ID: mdl-36716952

ABSTRACT

The FAIR (findable, accessible, interoperable and reusable) principles are data management and stewardship guidelines aimed at increasing the effective use of scientific research data. Adherence to these principles in managing data assets in pharmaceutical research and development (R&D) offers pharmaceutical companies the potential to maximise the value of such assets, but the endeavour is costly and challenging. We describe the 'FAIR-Decide' framework, which aims to guide decision-making on the retrospective FAIRification of existing datasets by using business analysis techniques to estimate costs and expected benefits. This framework supports decision-making on FAIRification in the pharmaceutical R&D industry and can be integrated into a company's data management strategy.


Subjects
Drug Industry, Research, Retrospective Studies, Data Management, Pharmaceutical Preparations
12.
J Biomed Semantics ; 14(1): 6, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37264430

ABSTRACT

BACKGROUND: The Findable, Accessible, Interoperable and Reusable (FAIR) Principles explicitly require the use of FAIR vocabularies, but what precisely constitutes a FAIR vocabulary remains unclear. Being able to define FAIR vocabularies, identify features of FAIR vocabularies, and provide assessment approaches against the features can guide the development of vocabularies. RESULTS: We differentiate data, data resources and vocabularies used for FAIR, examine the application of the FAIR Principles to vocabularies, align their requirements with the Open Biomedical Ontologies principles, and propose FAIR Vocabulary Features (FVFs). We also design assessment approaches for FAIR vocabularies by mapping the FVFs to existing FAIR assessment indicators. Finally, we demonstrate how they can be used for evaluating and improving vocabularies using exemplary biomedical vocabularies. CONCLUSIONS: Our work proposes features of FAIR vocabularies and corresponding indicators for assessing the FAIR levels of different types of vocabularies, identifies use cases for vocabulary engineers, and guides the evolution of vocabularies.


Subjects
Biological Ontologies, Controlled Vocabulary, Vocabulary
13.
Sci Data ; 10(1): 291, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37208349

ABSTRACT

The COVID-19 pandemic has highlighted the need for FAIR (Findable, Accessible, Interoperable, and Reusable) data more than any other scientific challenge to date. We developed a flexible, multi-level, domain-agnostic FAIRification framework, providing practical guidance to improve the FAIRness for both existing and future clinical and molecular datasets. We validated the framework in collaboration with several major public-private partnership projects, demonstrating and delivering improvements across all aspects of FAIR and across a variety of datasets and their contexts. We therefore managed to establish the reproducibility and far-reaching applicability of our approach to FAIRification tasks.


Subjects
COVID-19, Datasets as Topic, Humans, Pandemics, Public-Private Sector Partnerships, Reproducibility of Results
14.
Sci Data ; 10(1): 292, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37208467

ABSTRACT

The notion that data should be Findable, Accessible, Interoperable and Reusable, according to the FAIR Principles, has become a global norm for good data stewardship and a prerequisite for reproducibility. Nowadays, FAIR guides data policy actions and professional practices in the public and private sectors. Despite such global endorsements, however, the FAIR Principles are aspirational, remaining elusive at best, and intimidating at worst. To address the lack of practical guidance, and help with capability gaps, we developed the FAIR Cookbook, an open, online resource of hands-on recipes for "FAIR doers" in the Life Sciences. Created by researchers and data management professionals in academia, (bio)pharmaceutical companies and information service industries, the FAIR Cookbook covers the key steps in a FAIRification journey, the levels and indicators of FAIRness, the maturity model, the technologies, the tools and the standards available, as well as the skills required, and the challenges of achieving and improving data FAIRness. Part of the ELIXIR ecosystem, and recommended by funders, the FAIR Cookbook is open to contributions of new recipes.

15.
BMC Bioinformatics ; 13: 101, 2012 May 16.
Article in English | MEDLINE | ID: mdl-22591039

ABSTRACT

BACKGROUND: Computing accurate nucleic acid melting temperatures has become a crucial step for the efficiency and the optimisation of numerous molecular biology techniques such as in situ hybridization, PCR, antigene targeting, and microarrays. MELTING is free, open-source software that computes the enthalpy, entropy and melting temperature of nucleic acids. MELTING 4.2 was able to handle several types of hybridization such as DNA/DNA, RNA/RNA, DNA/RNA and provided corrections to melting temperatures due to the presence of sodium. The program can use either an approximative approach or a more accurate Nearest-Neighbor approach. RESULTS: Two new versions of the MELTING software have been released. MELTING 4.3 is a direct update of version 4.2, integrating newly available thermodynamic parameters for inosine, a modified adenine base with a universal base capacity, and incorporates a correction for magnesium. MELTING 5 is a complete reimplementation which allows much greater flexibility and extensibility. It incorporates all the thermodynamic parameters and corrections provided in MELTING 4.x and introduces a large set of thermodynamic formulae and parameters, to facilitate the calculation of melting temperatures for perfectly matching sequences, mismatches, bulge loops, CNG repeats, dangling ends, inosines, locked nucleic acids, 2-hydroxyadenines and azobenzenes. It also includes temperature corrections for monovalent ions (sodium, potassium, Tris), magnesium ions and commonly used denaturing agents such as formamide and DMSO. CONCLUSIONS: MELTING is a useful and very flexible tool for predicting melting temperatures using approximative formulae or Nearest-Neighbor approaches, where one can select different sets of Nearest-Neighbor parameters, corrections and formulae. Both versions are freely available at http://sourceforge.net/projects/melting/ and at http://www.ebi.ac.uk/compneur-srv/melting/ under the terms of the GPL license.
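
For readers unfamiliar with what an approximative melting-temperature estimate looks like, here is a minimal Python sketch of the textbook Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is usually quoted only for short oligonucleotides. This is a substituted illustration, not one of MELTING's formulae: it ignores salt, magnesium and denaturant corrections, and the function name is hypothetical.

```python
def wallace_tm(sequence: str) -> int:
    """Estimate the melting temperature (deg C) of a short DNA oligo.

    Uses the Wallace rule: 2 degrees per A or T, 4 degrees per G or C.
    No ion or denaturant corrections are applied.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unsupported bases: {sorted(unexpected)}")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count
```

For the eight-base oligo `ATGCATGC` (four A/T and four G/C) the rule gives 2*4 + 4*4 = 24. A Nearest-Neighbor calculation would instead sum stacking enthalpies and entropies over adjacent base pairs, which is why it needs the parameter sets the abstract mentions.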


Subjects
Nucleic Acid Denaturation, Nucleic Acids/chemistry, Software, Temperature, DNA/chemistry, Nucleic Acid Hybridization/methods, RNA/chemistry, Thermodynamics, User-Computer Interface
16.
Mol Syst Biol ; 7: 543, 2011 Oct 25.
Article in English | MEDLINE | ID: mdl-22027554

ABSTRACT

The use of computational modeling to describe and analyze biological systems is at the heart of systems biology. Model structures, simulation descriptions and numerical results can be encoded in structured formats, but there is an increasing need to provide an additional semantic layer. Semantic information adds meaning to components of structured descriptions to help identify and interpret them unambiguously. Ontologies are one of the tools frequently used for this purpose. We describe here three ontologies created specifically to address the needs of the systems biology community. The Systems Biology Ontology (SBO) provides semantic information about the model components. The Kinetic Simulation Algorithm Ontology (KiSAO) supplies information about existing algorithms available for the simulation of systems biology models, their characterization and interrelationships. The Terminology for the Description of Dynamics (TEDDY) categorizes dynamical features of the simulation results and general systems behavior. The provision of semantic information extends a model's longevity and facilitates its reuse. It provides useful insight into the biology of modeled processes, and may be used to make informed decisions on subsequent simulation experiments.


Subjects
Computational Biology, Semantics, Systems Biology, Controlled Vocabulary, Algorithms, Computer Simulation, Information Storage and Retrieval, Biological Models
17.
Drug Discov Today ; 27(8): 2080-2085, 2022 08.
Article in English | MEDLINE | ID: mdl-35595012

ABSTRACT

Despite the intuitive value of adopting the Findable, Accessible, Interoperable, and Reusable (FAIR) principles in both academic and industrial sectors, challenges exist in resourcing, balancing long- versus short-term priorities, and achieving technical implementation. This situation is exacerbated by the unclear mechanisms by which costs and benefits can be assessed when decisions on FAIR are made. Scientific and research and development (R&D) leadership need reliable evidence of the potential benefits and information on effective implementation mechanisms and remediating strategies. In this article, we describe procedures for cost-benefit evaluation, and identify best-practice approaches to support the decision-making process involved in FAIR implementation.


Subjects
Drug Discovery, Cost-Benefit Analysis
19.
F1000Res ; 9: 136, 2020.
Article in English | MEDLINE | ID: mdl-32308977

ABSTRACT

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.


Subjects
Biological Science Disciplines, Computational Biology, Semantic Web, Data Mining, Metadata, Reproducibility of Results
20.
Sci Data ; 5: 180029, 2018 05 08.
Article in English | MEDLINE | ID: mdl-29737976

ABSTRACT

Most biomedical data repositories issue locally unique accession numbers, but do not provide globally unique, machine-resolvable, persistent identifiers for their datasets, as required by publishers wishing to implement data citation in accordance with widely accepted principles. Local accessions may, however, be prefixed with a namespace identifier, providing global uniqueness. Such "compact identifiers" have been widely used in biomedical informatics to support global resource identification with local identifier assignment. We report here on our project to provide robust support for machine-resolvable, persistent compact identifiers in biomedical data citation, by harmonizing the Identifiers.org and N2T.net (Name-To-Thing) meta-resolvers and extending their capabilities. Identifiers.org services hosted at the European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), and N2T.net services hosted at the California Digital Library (CDL), can now resolve any given identifier from over 600 source databases to its original source on the Web, using a common registry of prefix-based redirection rules. We believe these services will be of significant help to publishers and others implementing persistent, machine-resolvable citation of research data.
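
A registry of prefix-based redirection rules can be pictured as a lookup table from namespace prefix to a provider URL template. The Python sketch below is a toy model under that assumption: the two rules and their URL templates are illustrative placeholders, not entries copied from the shared Identifiers.org and N2T.net registry, and the function name is hypothetical.

```python
# Hypothetical rule table: prefix -> provider URL template.
REDIRECT_RULES = {
    "taxonomy": "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id={id}",
    "pdb": "https://www.rcsb.org/structure/{id}",
}


def resolve(compact_id: str, rules: dict[str, str] = REDIRECT_RULES) -> str:
    """Map a compact identifier to its provider URL via prefix rules.

    Prefix lookup is case-insensitive; the accession is passed
    through unchanged.
    """
    prefix, sep, accession = compact_id.partition(":")
    if not sep or not accession:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    template = rules.get(prefix.lower())
    if template is None:
        raise KeyError(f"no redirection rule for prefix {prefix!r}")
    # str.replace avoids str.format surprises if an accession has braces
    return template.replace("{id}", accession)
```

Because both meta-resolvers would consult the same table, `resolve("pdb:2gc4")` gives the same target whichever resolver a citation points at, which is the point of the harmonization described above.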
