Ontology-driven integrative analysis of omics data through Onassis.
Sci Rep
; 10(1): 703, 2020 01 20.
Article
em En
| MEDLINE
| ID: mdl-31959844
Public repositories of large-scale omics datasets represent a valuable resource for researchers. In fact, data re-analysis can either answer novel questions or provide critical data able to complement in-house experiments. However, despite the development of standards for the compilation of metadata, the identification and organization of samples still constitutes a major bottleneck hampering data reuse. We introduce Onassis, an R package within the Bioconductor environment providing key functionalities of Natural Language Processing (NLP) tools. Leveraging biomedical ontologies, Onassis greatly simplifies the association of samples from large-scale repositories to their representation in terms of ontology-based annotations. Moreover, through the use of semantic similarity measures, Onassis hierarchically organizes the datasets of interest, thus supporting the semantically aware analysis of the corresponding omics data. In conclusion, Onassis leverages NLP techniques, biomedical ontologies, and the R statistical framework, to identify, relate, and analyze datasets from public repositories. The tool was tested on various large-scale datasets, including compendia of gene expression, histone marks, and DNA methylation, illustrating how it can facilitate the integrative analysis of various omics data.
Texto completo:
1
Coleções:
01-internacional
Base de dados:
MEDLINE
Idioma:
En
Revista:
Sci Rep
Ano de publicação:
2020
Tipo de documento:
Article
País de afiliação:
Itália
País de publicação:
Reino Unido