Improving the discoverability, accessibility, and citability of omics datasets: a case report.

Darlington, Yolanda F; Naumov, Alexey; McOwiti, Apollo; Kankanamge, Wasula H; Becnel, Lauren B; McKenna, Neil J

Affiliations

Darlington YF; Dan L. Duncan Comprehensive Cancer Center Biomedical Informatics Group, Baylor College of Medicine, Houston, Texas, USA.
Naumov A; Dan L. Duncan Comprehensive Cancer Center Biomedical Informatics Group, Baylor College of Medicine, Houston, Texas, USA.
McOwiti A; Dan L. Duncan Comprehensive Cancer Center Biomedical Informatics Group, Baylor College of Medicine, Houston, Texas, USA.
Kankanamge WH; Dan L. Duncan Comprehensive Cancer Center Biomedical Informatics Group, Baylor College of Medicine, Houston, Texas, USA.
Becnel LB; Dan L. Duncan Comprehensive Cancer Center Biomedical Informatics Group, Baylor College of Medicine, Houston, Texas, USA.
McKenna NJ; Clinical Data Interchange Standards Consortium (CDISC), Austin, Texas, USA.

J Am Med Inform Assoc; 24(2): 388-393, 2017-03-01.

Article in English | MEDLINE | ID: mdl-27413121

ABSTRACT

Although omics datasets represent valuable assets for hypothesis generation, model testing, and data validation, the infrastructure supporting their reuse lacks organization and consistency. Using nuclear receptor signaling transcriptomic datasets as proof of principle, we developed a model to improve the discoverability, accessibility, and citability of published omics datasets. Primary datasets were retrieved from archives, processed to extract data points, then subjected to metadata enrichment and gap filling. The resulting secondary datasets were exposed on responsive web pages to support mining of gene lists, discovery of related datasets, and single-click citation integration with popular reference managers. Automated processes were established to embed digital object identifier-driven links to the secondary datasets in associated journal articles, small molecule and gene-centric databases, and a dataset search engine. Our model creates multiple points of access to reprocessed and reannotated derivative datasets across the digital biomedical research ecosystem, promoting their visibility and usability across disparate research communities.

Subjects

Datasets as Topic; Transcriptome; Biomedical Research; Databases, Genetic; Genomics; Humans; Metadata

Keywords

citation; curation; dataset; metadata; omics; reuse

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Transcriptome / Datasets as Topic | Limit: Humans | Language: English | Year of publication: 2017 | Document type: Article
