Abstract
Research data is accumulating rapidly, and with it the challenge of making science fully reproducible. As a consequence, high-quality management of scientific data has become a global priority. The FAIR (Findable, Accessible, Interoperable, and Reusable) principles provide practical guidelines for maximizing the value of research data; however, processing data using workflows (systematic executions of a series of computational tools) is equally important for good data management. The FAIR principles have recently been adapted to research software (the FAIR4RS Principles) to promote the reproducibility and reusability of any type of research software. Here, we propose a set of 10 quick tips, drafted by experienced workflow developers, that will help researchers apply the FAIR4RS principles to workflows. The tips are arranged according to the FAIR acronym, clarifying the purpose of each tip with respect to the FAIR4RS principles. Altogether, these tips can be seen as practical guidelines for workflow developers who aim to contribute to more reproducible and sustainable computational science and to positively impact the open science and FAIR community.
Abstract
BACKGROUND: Ontologies have become an essential asset in the bioinformatics toolbox, and a number of ontology access resources are now available, for example, the EBI Ontology Lookup Service (OLS) and the NCBO BioPortal. However, these resources differ substantially in mode, ease of access, and ontology content. This makes it relatively difficult to access each ontology source separately and map its contents to research data, and much of this effort is being replicated across different research groups. RESULTS: OntoCAT provides a seamless programming interface to query heterogeneous ontology resources, including OLS and BioPortal, as well as user-specified local OWL and OBO files. Each resource is wrapped behind easy-to-learn Java, Bioconductor/R, and REST web service commands, enabling reuse and integration of ontology software efforts despite variation in technologies. It is also available as a stand-alone MOLGENIS database and a Google App Engine application. CONCLUSIONS: OntoCAT provides a robust, configurable solution for accessing ontology terms specified locally and from remote services, is available as a stand-alone tool, and has been tested thoroughly in the ArrayExpress, MOLGENIS, EFO, and Gen2Phen phenotype use cases. AVAILABILITY: http://www.ontocat.org.
Subjects
Computational Biology/methods, Software, Vocabulary, Factual Databases, Humans, Programming Languages, User-Computer Interface, Controlled Vocabulary
Abstract
In Caenorhabditis elegans, recent advances in high-throughput quantitative analyses of natural genetic and phenotypic variation have led to a wealth of data on genotype-phenotype relations. These data have resulted in the discovery of genes with major allelic effects and in insights into the effect of natural genetic variation on a whole range of complex traits, as well as into how this variation is distributed across the genome. Despite the advances presented in specific studies, most of the data generated in these studies had yet to be made easily accessible in a form that allows meta-analysis. Not only the data in figures or tables but also the metadata should be accessible for further investigation and comparison between studies. A platform was created where all the data, phenotypic measurements, genotypes, and mappings can be stored and compared, and where new linkages within and between published studies can be discovered. WormQTL focuses on quantitative genetics in Caenorhabditis and other nematode species, whereas WormQTL(HD) quantitatively links gene expression quantitative trait loci (eQTL) in C. elegans to gene-disease associations in humans.