Results 1 - 20 of 72
1.
PeerJ Comput Sci ; 10: e1781, 2024.
Article in English | MEDLINE | ID: mdl-38855229

ABSTRACT

FAIR Digital Object (FDO) is an emerging concept, highlighted by the European Open Science Cloud (EOSC) as a potential candidate for building an ecosystem of machine-actionable research outputs. In this work we systematically evaluate FDO and its implementations as a global distributed object system, using five different conceptual frameworks that cover interoperability, middleware, FAIR principles, EOSC requirements and the FDO guidelines themselves. We compare the FDO approach with established Linked Data practices and the existing Web architecture, and provide a brief history of the Semantic Web while discussing why these technologies may have been difficult to adopt for FDO purposes. We conclude with recommendations for both the Linked Data and FDO communities to further their adaptation and alignment.

2.
Learn Health Syst ; 8(1): e10365, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38249839

ABSTRACT

Open and practical exchange, dissemination, and reuse of specimens and data have become a fundamental requirement for life sciences research. The quality of the data obtained, and thus of the findings and knowledge derived from them, is significantly influenced by the quality of the samples, the experimental methods, and the data analysis. Comprehensive and precise documentation of the pre-analytical conditions, the analytical procedures, and the data processing is therefore essential for assessing the validity of research results. With the increasing importance of the exchange, reuse, and sharing of data and samples, procedures are required that enable cross-organizational documentation, traceability, and non-repudiation. At present, information on the provenance of samples and data is mostly sparse, incomplete, or incoherent. Since there is no uniform framework, this information is usually only provided within an organization and not interoperably. At the same time, the collection and sharing of biological and environmental specimens increasingly require definition and documentation of benefit sharing and compliance with regulatory requirements, rather than consideration of purely scientific needs. In this publication, we present an ongoing standardization effort to provide trustworthy, machine-actionable documentation of the lineage of data and specimens. We invite experts from the biotechnology and biomedical fields to contribute further to the standard.

3.
EMBO J ; 42(23): e115008, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37964598

ABSTRACT

The main goals and challenges for the life science communities in the Open Science framework are to increase the reuse and sustainability of data resources, software tools, and workflows, especially in large-scale, data-driven research and computational analyses. Here, we present key findings, procedures, effective measures and recommendations for generating and establishing sustainable life science resources, based on the collaborative, cross-disciplinary work done within the EOSC-Life (European Open Science Cloud for Life Sciences) consortium. By bringing together 13 European life science research infrastructures, EOSC-Life has laid the foundation for an open, digital space to support biological and medical research. Using lessons learned from 27 selected projects, we describe the organisational, technical, financial and legal/ethical challenges that represent the main barriers to sustainability in the life sciences. We show how EOSC-Life provides a model for sustainable data management according to the FAIR (findability, accessibility, interoperability, and reusability) principles, including solutions for sensitive and industry-related resources, by means of cross-disciplinary training and the sharing of best practices. Finally, we illustrate how data harmonisation and collaborative work facilitate the interoperability of tools, data and solutions, and lead to a better understanding of concepts, semantics and functionalities in the life sciences.


Subject(s)
Biological Science Disciplines, Biomedical Research, Software, Workflow
4.
Sci Data ; 10(1): 756, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37919302

ABSTRACT

Biological science produces "big data" in varied formats, which necessitates computational tools to process, integrate, and analyse them. Researchers using computational biology tools range from those who use computers mainly for communication to those who write their own analysis code. We examine differences in how researchers conceptualise the same data, which we call "subjective data models". We interviewed 22 people with biological experience and varied levels of computational experience, and found that many had fluid subjective data models that changed depending on circumstance. Surprisingly, results did not cluster around participants' computational experience levels. People did not consistently map entities from abstract data models to the real-world entities in files, and some data identifier formats were easier to infer meaning from than others. The real-world implications are: 1) software engineers should design interfaces for task performance, emulating popular user interfaces, rather than targeting professional backgrounds; 2) when insufficient context is provided, people may guess what data mean, whether or not they are correct, emphasising the importance of contextual metadata in removing the need for erroneous guesswork.

5.
J Biomed Semantics ; 14(1): 6, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37264430

ABSTRACT

BACKGROUND: The Findable, Accessible, Interoperable and Reusable (FAIR) Principles explicitly require the use of FAIR vocabularies, but what precisely constitutes a FAIR vocabulary remains unclear. Being able to define FAIR vocabularies, identify their features, and provide assessment approaches against those features can guide the development of vocabularies. RESULTS: We differentiate the data, data resources and vocabularies used for FAIR, examine the application of the FAIR Principles to vocabularies, align their requirements with the Open Biomedical Ontologies principles, and propose FAIR Vocabulary Features (FVFs). We also design assessment approaches for FAIR vocabularies by mapping the FVFs to existing FAIR assessment indicators. Finally, we demonstrate how they can be used for evaluating and improving vocabularies, using exemplary biomedical vocabularies. CONCLUSIONS: Our work proposes features of FAIR vocabularies and corresponding indicators for assessing the FAIR levels of different types of vocabularies, identifies use cases for vocabulary engineers, and guides the evolution of vocabularies.


Subject(s)
Biological Ontologies, Controlled Vocabulary, Vocabulary
6.
Sci Data ; 10(1): 291, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37208349

ABSTRACT

The COVID-19 pandemic has highlighted the need for FAIR (Findable, Accessible, Interoperable, and Reusable) data more than any other scientific challenge to date. We developed a flexible, multi-level, domain-agnostic FAIRification framework that provides practical guidance for improving the FAIRness of both existing and future clinical and molecular datasets. We validated the framework in collaboration with several major public-private partnership projects, demonstrating and delivering improvements across all aspects of FAIR and across a variety of datasets and their contexts. We thereby established the reproducibility and far-reaching applicability of our approach to FAIRification tasks.


Subject(s)
COVID-19, Datasets as Topic, Humans, Pandemics, Public-Private Sector Partnerships, Reproducibility of Results
7.
Curr Protoc ; 3(2): e682, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36809564

ABSTRACT

Many trainers and organizations are passionate about sharing their training material. Sharing training material has several benefits, such as providing a record of recognition as an author, offering inspiration to other trainers, enabling researchers to discover training resources for their personal learning path, and improving the training resource landscape through data-driven gap analysis from the bioinformatics community. In this article, we present a series of protocols for using the ELIXIR online training registry Training eSupport System (TeSS). TeSS provides a one-stop shop for trainers and trainees to discover online information and content, including training materials, events, and interactive tutorials. For trainees, we provide protocols for registering and logging in, and for searching and filtering content. For trainers and organizations, we show how to manually or automatically register training events and materials. Following these protocols will help promote training events and add to a growing catalog of materials, concomitantly increasing the FAIRness of training materials and events. Training registries like TeSS use a scraping mechanism to aggregate training resources from many providers when they have been annotated using Bioschemas specifications. Finally, we describe how to enrich training resources to allow more efficient sharing of structured metadata, such as prerequisites, target audience, and learning outcomes, using the Bioschemas specifications. As more training events and materials are aggregated in TeSS, searching the registry for specific events and materials becomes crucial. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC.
Basic Protocol 1: Searching for training events and materials in TeSS
Support Protocol: Integrating TeSS widgets on your website
Basic Protocol 2: Logging in to TeSS using an institutional account
Alternate Protocol: Creating and logging in to a TeSS account
Basic Protocol 3: Manual registration of training events in TeSS
Basic Protocol 4: Manual registration of training materials in TeSS
Basic Protocol 5: Registration of a content provider in TeSS
Basic Protocol 6: Automated harvesting of training events and materials in TeSS.


Subject(s)
Computational Biology, Research Personnel, Humans
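The TeSS protocols above rely on training resources being annotated with Bioschemas markup so that the registry's scraper can harvest them. As a minimal sketch, the snippet below builds a JSON-LD description of a training resource in the Bioschemas/schema.org style; the property names and values are illustrative assumptions, so check the current Bioschemas TrainingMaterial profile before relying on them.

```python
import json

# Illustrative JSON-LD description of a training resource, loosely following
# the Bioschemas TrainingMaterial profile (schema.org vocabulary). Property
# names and example values here are assumptions, not the authoritative profile.
training_material = {
    "@context": "https://schema.org/",
    "@type": "LearningResource",
    "name": "Introduction to Workflow Management",
    "description": "A hands-on tutorial on building reproducible pipelines.",
    "keywords": "workflows, reproducibility, FAIR",
    "audience": {"@type": "Audience", "audienceType": "PhD students"},
    "teaches": ["Compose a simple analysis workflow"],
}

def to_jsonld(doc: dict) -> str:
    """Serialise the description, e.g. for embedding in a page's script tag."""
    return json.dumps(doc, indent=2)

print(to_jsonld(training_material))
```

A registry scraper would then extract this block from the provider's page and index the declared name, keywords, and audience.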
8.
Drug Discov Today ; 28(4): 103510, 2023 04.
Article in English | MEDLINE | ID: mdl-36716952

ABSTRACT

The FAIR (findable, accessible, interoperable and reusable) principles are data management and stewardship guidelines aimed at increasing the effective use of scientific research data. Adherence to these principles in managing data assets in pharmaceutical research and development (R&D) offers pharmaceutical companies the potential to maximise the value of such assets, but the endeavour is costly and challenging. We describe the 'FAIR-Decide' framework, which aims to guide decision-making on the retrospective FAIRification of existing datasets by using business analysis techniques to estimate costs and expected benefits. This framework supports decision-making on FAIRification in the pharmaceutical R&D industry and can be integrated into a company's data management strategy.


Subject(s)
Drug Industry, Research, Retrospective Studies, Data Management, Pharmaceutical Preparations
9.
Drug Discov Today ; 27(8): 2080-2085, 2022 08.
Article in English | MEDLINE | ID: mdl-35595012

ABSTRACT

Despite the intuitive value of adopting the Findable, Accessible, Interoperable, and Reusable (FAIR) principles in both the academic and industrial sectors, challenges exist in resourcing, balancing long- versus short-term priorities, and achieving technical implementation. This situation is exacerbated by the unclear mechanisms by which costs and benefits can be assessed when decisions on FAIR are made. Scientific and research and development (R&D) leaders need reliable evidence of the potential benefits, as well as information on effective implementation mechanisms and remediating strategies. In this article, we describe procedures for cost-benefit evaluation and identify best-practice approaches to support the decision-making process involved in FAIR implementation.


Subject(s)
Drug Discovery, Cost-Benefit Analysis
11.
F1000Res ; 11, 2022.
Article in English | MEDLINE | ID: mdl-36742342

ABSTRACT

In this white paper, we describe the founding of a new ELIXIR Community - the Systems Biology Community - and its proposed future contributions to both ELIXIR and the broader community of systems biologists in Europe and worldwide. The Community believes that the infrastructure aspects of systems biology - databases, (modelling) tools and standards development, as well as training and access to cloud infrastructure - are not only appropriate components of the ELIXIR infrastructure, but will prove key components of ELIXIR's future support of advanced biological applications and personalised medicine. By way of a series of meetings, the Community identified seven key areas for its future activities, reflecting both future needs and previous and current activities within ELIXIR Platforms and Communities. These are: overcoming barriers to the wider uptake of systems biology; linking new and existing data to systems biology models; interoperability of systems biology resources; further development and embedding of systems medicine; provisioning of modelling as a service; building and coordinating capacity building and training resources; and supporting industrial embedding of systems biology. A set of objectives for the Community has been identified under four main headline areas: Standardisation and Interoperability, Technology, Capacity Building and Training, and Industrial Embedding. These are grouped into short-term (3-year), mid-term (6-year) and long-term (10-year) objectives.


Subject(s)
Systems Biology, Europe, Factual Databases
13.
F1000Res ; 10: 897, 2021.
Article in English | MEDLINE | ID: mdl-34804501

ABSTRACT

Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have brought the long-standing vision of automated workflow composition back into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the "big picture" of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and community-based actions to work toward the vision of automated workflow development in the forthcoming years. A central outcome of the workshop is a general description of the workflow life cycle in six stages: 1) scientific question or hypothesis, 2) conceptual workflow, 3) abstract workflow, 4) concrete workflow, 5) production workflow, and 6) scientific results. The transitions between stages are facilitated by diverse tools and methods, usually incorporating domain knowledge in some form. Formal semantic domain modelling is hard and often a bottleneck for the application of semantic technologies. However, life science communities have made considerable progress here in recent years and are continuously improving, renewing interest in the application of semantic technologies for workflow exploration, composition and instantiation.
Combined with systematic benchmarking with reference data and large-scale deployment of production-stage workflows, such technologies enable a more systematic process of workflow development than we know today. We believe that this can lead to more robust, reusable, and sustainable workflows in the future.


Subject(s)
Biological Science Disciplines, Computational Biology, Benchmarking, Software, Workflow
14.
F1000Res ; 10: 324, 2021.
Article in English | MEDLINE | ID: mdl-36873457

ABSTRACT

Artificial Intelligence (AI) is increasingly used within plant science, yet it is far from being routinely and effectively implemented in this domain. Particularly relevant to the development of novel food and agricultural technologies is the development of validated, meaningful and usable ways to integrate, compare and visualise large, multi-dimensional datasets from different sources and scientific approaches. After a brief summary of the reasons for the interest in data science and AI within plant science, the paper identifies and discusses eight key challenges in data management that must be addressed to further unlock the potential of AI in crop and agronomic research, and particularly the application of Machine Learning (ML), which holds much promise for this domain.

15.
Bioinformatics ; 37(12): 1781-1782, 2021 07 19.
Article in English | MEDLINE | ID: mdl-33031499

ABSTRACT

MOTIVATION: Since its launch in 2010, Identifiers.org has become an important tool for the annotation and cross-referencing of Life Science data. In 2016, we established the Compact Identifier (CID) scheme (prefix: accession) to generate globally unique identifiers for data resources using their locally assigned accession identifiers. Since then, we have developed and improved services to support the growing need to create, reference and resolve CIDs, in systems ranging from human-readable text to cloud-based e-infrastructures, by providing high-availability, low-latency cloud-based services, backed by a high-quality, manually curated resource. RESULTS: We describe a set of services that can be used to construct and resolve CIDs in the Life Sciences and beyond. We have developed a new front end for accessing the Identifiers.org registry data and APIs to simplify the integration of Identifiers.org CID services with third-party applications. We have also deployed the new Identifiers.org infrastructure in a commercial cloud environment, bringing our services closer to the data. AVAILABILITY AND IMPLEMENTATION: https://identifiers.org.


Subject(s)
Biological Science Disciplines, Cloud Computing, Humans
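The Compact Identifier scheme described above concatenates a registered namespace prefix and a local accession as `prefix:accession`, which the Identifiers.org resolver redirects to the hosting resource. As a minimal sketch of that convention, the snippet below parses a CID and builds its resolution URL; it does not perform the HTTP redirect itself, and the example namespace is only assumed to be registered.

```python
def parse_cid(cid: str) -> tuple:
    """Split a Compact Identifier of the form prefix:accession."""
    prefix, sep, accession = cid.partition(":")
    if not sep or not prefix or not accession:
        raise ValueError("not a compact identifier: %r" % cid)
    return prefix, accession

def resolution_url(cid: str) -> str:
    """Build an https://identifiers.org URL for a CID. Actual resolution is
    an HTTP redirect performed by the registry, not by this sketch."""
    prefix, accession = parse_cid(cid)
    return "https://identifiers.org/%s:%s" % (prefix, accession)

# "taxonomy" is used here as an example namespace prefix.
print(resolution_url("taxonomy:9606"))
```

Dereferencing the printed URL in a browser or HTTP client is what triggers the registry's redirect to the current data provider.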
16.
Bioinformatics ; 36(10): 3290-3291, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32044952

ABSTRACT

SUMMARY: Dispersed across the Internet is an abundance of disparate, disconnected training information, making it hard for researchers to find training opportunities that are relevant to them. To address this issue, we have developed a new platform-TeSS-which aggregates geographically distributed information and presents it in a central, feature-rich portal. Data are gathered automatically from content providers via bespoke scripts. These resources are cross-linked with related data and tools registries, and made available via a search interface, a data API and through widgets. AVAILABILITY AND IMPLEMENTATION: https://tess.elixir-europe.org.


Subject(s)
Biological Science Disciplines, Software, Humans, Internet, Research Personnel
17.
Gigascience ; 8(11), 2019 11 01.
Article in English | MEDLINE | ID: mdl-31675414

ABSTRACT

BACKGROUND: The automation of data analysis in the form of scientific workflows has become a widely adopted practice in many fields of research. Computationally driven data-intensive experiments using workflows enable automation, scaling, adaptation, and provenance support. However, there are still several challenges associated with the effective sharing, publication, and reproducibility of such workflows due to the incomplete capture of provenance and lack of interoperability between different technical (software) platforms. RESULTS: Based on best-practice recommendations identified from the literature on workflow design, sharing, and publishing, we define a hierarchical provenance framework to achieve uniformity in provenance and support comprehensive and fully re-executable workflows equipped with domain-specific information. To realize this framework, we present CWLProv, a standard-based format to represent any workflow-based computational analysis to produce workflow output artefacts that satisfy the various levels of provenance. We use open source community-driven standards, interoperable workflow definitions in Common Workflow Language (CWL), structured provenance representation using the W3C PROV model, and resource aggregation and sharing as workflow-centric research objects generated along with the final outputs of a given workflow enactment. We demonstrate the utility of this approach through a practical implementation of CWLProv and evaluation using real-life genomic workflows developed by independent groups. CONCLUSIONS: The underlying principles of the standards utilized by CWLProv enable semantically rich and executable research objects that capture computational workflows with retrospective provenance such that any platform supporting CWL will be able to understand the analysis, reuse the methods for partial reruns, or reproduce the analysis to validate the published findings.


Subject(s)
Genomics, Theoretical Models, Workflow, Humans, Software
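The CWLProv abstract combines CWL workflow definitions with structured provenance in the W3C PROV model. As a hedged illustration of what a retrospective provenance record looks like, and not CWLProv's actual serialisation, the snippet below encodes a workflow run (activity) generating an output file (entity) in a PROV-JSON-style dictionary; all identifiers are hypothetical.

```python
import json

# Minimal retrospective provenance record in the spirit of W3C PROV-JSON.
# Identifiers, timestamps, and structure are illustrative only; CWLProv's
# real research-object output is richer and standard-conformant.
prov_record = {
    "prefix": {"ex": "https://example.org/"},
    "activity": {
        "ex:run-42": {
            "prov:startTime": "2019-01-01T10:00:00",
            "prov:endTime": "2019-01-01T10:05:00",
        }
    },
    "entity": {
        "ex:aligned.bam": {"prov:label": "aligned reads"}
    },
    "wasGeneratedBy": {
        # Links the output entity to the activity that produced it.
        "_:g1": {"prov:entity": "ex:aligned.bam", "prov:activity": "ex:run-42"}
    },
}

print(json.dumps(prov_record, indent=2))
```

Capturing such a record alongside the workflow definition is what lets another platform trace which run produced which output when validating published findings.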
18.
Methods Mol Biol ; 2049: 285-314, 2019.
Article in English | MEDLINE | ID: mdl-31602618

ABSTRACT

Computational systems biology involves integrating heterogeneous datasets in order to generate models. These models can assist with the understanding and prediction of biological phenomena. Generating datasets and integrating them into models involves a wide range of scientific expertise. As a result, these datasets are often collected by one set of researchers and exchanged with other researchers for constructing the models. For this process to run smoothly, the data and models must be FAIR: findable, accessible, interoperable, and reusable. For data and models to be FAIR, they must be structured in consistent and predictable ways, and described sufficiently for other researchers to understand them. Furthermore, these data and models must be shared with other researchers, with appropriately controlled sharing permissions, before and after publication. In this chapter we explore the different data and model standards that assist with structuring, describing, and sharing. We also highlight the popular standards and sharing databases within computational systems biology.


Subject(s)
Data Management/methods, Systems Biology/methods, Computational Biology, Factual Databases
19.
Sci Data ; 6(1): 169, 2019 09 10.
Article in English | MEDLINE | ID: mdl-31506435

ABSTRACT

In recent years, improvements in software and hardware performance have made biomolecular simulations a mature tool for the study of biological processes. Simulation length and the size and complexity of the analyzed systems make simulations both complementary to and compatible with other bioinformatics disciplines. However, the characteristics of the software packages used for simulation have prevented the adoption of technologies accepted in other bioinformatics fields, such as automated deployment systems, workflow orchestration, or the use of software containers. We present here a comprehensive exercise to bring biomolecular simulations to the "bioinformatics way of working". The exercise has led to the development of the BioExcel Building Blocks (BioBB) library. BioBBs are built as Python wrappers to provide an interoperable architecture. BioBBs have been integrated into a chain of common software management tools to generate data ontologies, documentation, installation packages, software containers and ways of integrating with workflow managers, making them usable in most computational environments.
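The wrapper architecture this abstract describes gives every building block a uniform interface around a command-line tool, which is what makes the blocks composable by workflow managers. The sketch below illustrates that pattern in generic form; it is NOT the real BioBB API, and all class, method, and flag names are hypothetical (`echo` stands in for a simulation executable so the sketch runs anywhere).

```python
import shlex
import subprocess

class BuildingBlock:
    """Illustrative sketch of the wrapper pattern: each block exposes the
    same interface (input path, output path, properties dict) around a
    command-line tool. Names here are hypothetical, not the BioBB library."""

    tool = "echo"  # stand-in executable so the sketch is runnable anywhere

    def __init__(self, input_path, output_path, properties=None):
        self.input_path = input_path
        self.output_path = output_path
        self.properties = properties or {}

    def cmd(self):
        # Uniform translation of the properties dict into CLI flags.
        flags = ["--%s=%s" % (k, v) for k, v in sorted(self.properties.items())]
        return [self.tool, self.input_path] + flags

    def launch(self):
        """Run the wrapped tool, capturing stdout into the output path."""
        with open(self.output_path, "w") as out:
            return subprocess.run(self.cmd(), stdout=out).returncode

block = BuildingBlock("structure.pdb", "out.txt", {"forcefield": "amber99"})
print(shlex.join(block.cmd()))
```

Because every block exposes the same `cmd`/`launch` shape, a workflow manager can chain blocks by wiring one block's output path to the next block's input path without knowing anything tool-specific.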
