Search | BVS Colombia Search Portal

Sequence database versioning for command line and Galaxy bioinformatics servers.

Dooley, Damion M; Petkau, Aaron J; Van Domselaar, Gary; Hsiao, William W L.

Bioinformatics ; 32(8): 1275-7, 2016 04 15.

Article in English | MEDLINE | ID: mdl-26656932

ABSTRACT

MOTIVATION: There are various reasons for rerunning bioinformatics tools and pipelines on sequencing data, including reproducing a past result, validating a new tool or workflow against a known dataset, or tracking the impact of database changes. For identical results to be achieved, regularly updated reference sequence databases must be versioned and archived. Database administrators have tried to meet this requirement by supplying users with one-off versions of databases, but these are time-consuming to set up and are inconsistent across resources. Disk storage and data backup performance have also discouraged maintaining multiple versions of databases, since databases such as NCBI nr can consume 50 GB or more of disk space per version, with growth rates that parallel Moore's law.

RESULTS: Our end-to-end solution combines our own Kipper software package (a simple key-value large-file versioning system) with BioMAJ (software for downloading sequence databases) and Galaxy (a web-based bioinformatics data processing platform). Available versions of databases can be recalled and used by command-line and Galaxy users. The Kipper data store format makes publishing curated FASTA databases convenient, since in most cases it can store a range of versions in a file only marginally larger than the latest version.

AVAILABILITY AND IMPLEMENTATION: Kipper v1.0.0 and the Galaxy Versioned Data tool are written in Python and released as free and open source software available at https://github.com/Public-Health-Bioinformatics/kipper and https://github.com/Public-Health-Bioinformatics/versioned_data, respectively. Detailed setup instructions can be found at https://github.com/Public-Health-Bioinformatics/versioned_data/blob/master/doc/setup.md

CONTACT: Damion.Dooley@Bccdc.Ca or William.Hsiao@Bccdc.Ca

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Computational Biology , Nucleic Acid Databases , Software , User-Computer Interface

OBO Foundry in 2021: operationalizing open data principles to evaluate ontologies.

Jackson, Rebecca; Matentzoglu, Nicolas; Overton, James A; Vita, Randi; Balhoff, James P; Buttigieg, Pier Luigi; Carbon, Seth; Courtot, Melanie; Diehl, Alexander D; Dooley, Damion M; Duncan, William D; Harris, Nomi L; Haendel, Melissa A; Lewis, Suzanna E; Natale, Darren A; Osumi-Sutherland, David; Ruttenberg, Alan; Schriml, Lynn M; Smith, Barry; Stoeckert, Christian J; Vasilevsky, Nicole A; Walls, Ramona L; Zheng, Jie; Mungall, Christopher J; Peters, Bjoern.

Database (Oxford) ; 2021, 2021 10 26.

Article in English | MEDLINE | ID: mdl-34697637

ABSTRACT

Biological ontologies are used to organize, curate and interpret the vast quantities of data arising from biological experiments. While this works well when using a single ontology, integrating multiple ontologies can be problematic, as they are developed independently, which can lead to incompatibilities. The Open Biological and Biomedical Ontologies (OBO) Foundry was created to address this by facilitating the development, harmonization, application and sharing of ontologies, guided by a set of overarching principles. One challenge in reaching these goals was that the OBO principles were not originally encoded in a precise fashion, and interpretation was subjective. Here, we show how we have addressed this by formally encoding the OBO principles as operational rules and implementing a suite of automated validation checks and a dashboard for objectively evaluating each ontology's compliance with each principle. This entailed a substantial effort to curate metadata across all ontologies and to coordinate with individual stakeholders. We have applied these checks across the full OBO suite of ontologies, revealing areas where individual ontologies require changes to conform to our principles. Our work demonstrates how a sizable, federated community can be organized and evaluated on objective criteria that help improve overall quality and interoperability, which is vital for the sustenance of the OBO project and towards the overall goals of making data Findable, Accessible, Interoperable, and Reusable (FAIR). Database URL http://obofoundry.org/.

Subject(s)

Biological Ontologies , Factual Databases , Metadata

FoodOn: a harmonized food ontology to increase global food traceability, quality control and data integration.

Dooley, Damion M; Griffiths, Emma J; Gosal, Gurinder S; Buttigieg, Pier L; Hoehndorf, Robert; Lange, Matthew C; Schriml, Lynn M; Brinkman, Fiona S L; Hsiao, William W L.

NPJ Sci Food ; 2: 23, 2018.

Article in English | MEDLINE | ID: mdl-31304272

ABSTRACT

The construction of high-capacity data sharing networks to support increasing government and commercial data exchange has highlighted a key roadblock: the content of existing Internet-connected information remains siloed due to a multiplicity of local languages and data dictionaries. This lack of a digital lingua franca is obvious in the domain of human food as materials travel from their wild or farm origin, through processing and distribution chains, to consumers. A well-defined, hierarchical vocabulary connected by logical relationships (in other words, an ontology) is urgently needed to help tackle data harmonization problems that span the domains of food security, safety, quality, production, distribution, and consumer health and convenience. FoodOn (http://foodon.org) is a consortium-driven project to build a comprehensive and easily accessible global farm-to-fork ontology about food that accurately and consistently describes foods commonly known in cultures from around the world. FoodOn addresses food product terminology gaps and supports food traceability. Focusing on human and domesticated animal food description, FoodOn contains animal and plant food sources, food categories and products, and other facets like preservation processes, contact surfaces, and packaging. Much of FoodOn's vocabulary comes from transforming LanguaL, a mature and popular food indexing thesaurus, into a World Wide Web Consortium (W3C) OWL Web Ontology Language-formatted vocabulary that provides system interoperability, quality control, and software-driven intelligence. FoodOn complements other technologies facilitating food traceability, which is becoming critical in this age of increasing globalization of food networks.
