Search | Biblioteca Virtual em Saúde (Virtual Health Library)

BugSigDB captures patterns of differential abundance across a broad range of host-associated microbial signatures.

Geistlinger, Ludwig; Mirzayi, Chloe; Zohra, Fatima; Azhar, Rimsha; Elsafoury, Shaimaa; Grieve, Clare; Wokaty, Jennifer; Gamboa-Tuz, Samuel David; Sengupta, Pratyay; Hecht, Issac; Ravikrishnan, Aarthi; Gonçalves, Rafael S; Franzosa, Eric; Raman, Karthik; Carey, Vincent; Dowd, Jennifer B; Jones, Heidi E; Davis, Sean; Segata, Nicola; Huttenhower, Curtis; Waldron, Levi.

Nat Biotechnol ; 2023 Sep 11.

Article in English | MEDLINE | ID: mdl-37697152

ABSTRACT

The literature of human and other host-associated microbiome studies is expanding rapidly, but systematic comparisons among published results of host-associated microbiome signatures of differential abundance remain difficult. We present BugSigDB, a community-editable database of manually curated microbial signatures from published differential abundance studies accompanied by information on study geography, health outcomes, host body site and experimental, epidemiological and statistical methods using controlled vocabulary. The initial release of the database contains >2,500 manually curated signatures from >600 published studies on three host species, enabling high-throughput analysis of signature similarity, taxon enrichment, co-occurrence and coexclusion and consensus signatures. These data allow assessment of microbiome differential abundance within and across experimental conditions, environments or body sites. Database-wide analysis reveals experimental conditions with the highest level of consistency in signatures reported by independent studies and identifies commonalities among disease-associated signatures, including frequent introgression of oral pathobionts into the gut.
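As an illustration of the "signature similarity" analysis mentioned in the abstract above, the following is a minimal sketch that compares two signatures as sets of taxa using the Jaccard index. The choice of the Jaccard index, the function name `jaccard`, and the example taxa are assumptions made for illustration; the abstract does not state which similarity measure the database uses.

```python
def jaccard(sig_a: set, sig_b: set) -> float:
    """Jaccard index: number of shared taxa divided by number of taxa in either signature."""
    union = sig_a | sig_b
    if not union:
        # Two empty signatures share nothing; define their similarity as 0.
        return 0.0
    return len(sig_a & sig_b) / len(union)


# Hypothetical signatures: taxa reported as differentially abundant in two studies.
study_1 = {"Fusobacterium", "Prevotella", "Streptococcus"}
study_2 = {"Fusobacterium", "Streptococcus", "Bacteroides", "Veillonella"}

similarity = jaccard(study_1, study_2)
print(f"{similarity:.2f}")
```

Here the two hypothetical signatures share two of five distinct taxa. A set-based measure like this ignores the direction of change (increased or decreased abundance), so in practice each direction would likely be compared separately.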

Obstacles to the reuse of study metadata in ClinicalTrials.gov.

Miron, Laura; Gonçalves, Rafael S; Musen, Mark A.

Sci Data ; 7(1): 443, 2020 12 18.

Article in English | MEDLINE | ID: mdl-33339830

ABSTRACT

Metadata that are structured using principled schemas and that use terms from ontologies are essential to making biomedical data findable and reusable for downstream analyses. The largest source of metadata that describes the experimental protocol, funding, and scientific leadership of clinical studies is ClinicalTrials.gov. We evaluated whether values in 302,091 trial records adhere to expected data types and use terms from biomedical ontologies, whether records contain fields required by government regulations, and whether structured elements could replace free-text elements. Contact information, outcome measures, and study design are frequently missing or underspecified. Important fields for search, such as condition and intervention, are not restricted to ontologies, and almost half of the conditions are not denoted by MeSH terms, as recommended. Eligibility criteria are stored as semi-structured free text. Enforcing the presence of all required elements, requiring values for certain fields to be drawn from ontologies, and creating a structured eligibility criteria element would improve the reusability of data from ClinicalTrials.gov in systematic reviews, meta-analyses, and matching of eligible patients to trials.

Subjects

Clinical Trials as Topic; Databases, Factual; Metadata; Research Design/standards; Datasets as Topic

ParaDB: A manually curated database containing genomic annotation for the human pathogenic fungi Paracoccidioides spp.

Aciole Barbosa, David; Menegidio, Fabiano Bezerra; Alencar, Valquíria Campos; Gonçalves, Rafael S; Silva, Juliana de Fátima Santos; Vilas Boas, Renata Ozelami; Faustino de Maria, Yara Natércia Lima; Jabes, Daniela Leite; Costa de Oliveira, Regina; Nunes, Luiz R.

PLoS Negl Trop Dis ; 13(7): e0007576, 2019 07.

Article in English | MEDLINE | ID: mdl-31306428

ABSTRACT

BACKGROUND: The genus Paracoccidioides consists of thermodimorphic fungi responsible for Paracoccidioidomycosis (PCM), a systemic mycosis reported to affect ~10 million people in Latin America. Biogeographical data subdivided the genus Paracoccidioides into five divergent subgroups, which have recently been classified as different species. Genomic sequencing of five Paracoccidioides isolates, representing each of these subgroups/species, provided an important framework for the development of post-genomic studies with these fungi. However, functional annotations of these genomes have not been submitted to manual curation and, as a result, ~60-90% of the Paracoccidioides protein-coding genes (depending on isolate/annotation) are currently described as encoding hypothetical proteins, without any further functional/structural description. PRINCIPAL FINDINGS: The present work reviews the functional assignment of Paracoccidioides genes, reducing the number of hypothetical proteins to ~25-28%. These results were compiled in a relational database called ParaDB, dedicated to the main representatives of Paracoccidioides spp. ParaDB can be accessed through a friendly graphical interface, which offers search tools based on keywords or protein/DNA sequences. All data contained in ParaDB can be partially or completely downloaded through spreadsheet, multi-fasta and GFF3-formatted files, which can be subsequently used in a variety of downstream functional analyses. Moreover, the entire ParaDB environment has been configured in a Docker service, which has been submitted to the GitHub repository, ensuring long-term data availability to researchers. This service can be downloaded and used to perform fully functional local installations of the database in alternative computing ecosystems, allowing users to conduct their data mining and analyses in a personal and stable working environment. CONCLUSIONS: These new annotations greatly reduce the number of genes identified solely as hypothetical proteins and are integrated into a dedicated database, providing resources to assist researchers in this field to conduct post-genomic studies with this group of human pathogenic fungi.

Subjects

Databases, Genetic; Genome, Fungal/genetics; Molecular Sequence Annotation; Paracoccidioides/genetics; Paracoccidioidomycosis/microbiology; Amino Acid Sequence; Base Sequence; Computers, Molecular; Ecosystem; Fungal Proteins/genetics; Humans; Latin America; Paracoccidioides/isolation & purification; Research

The variable quality of metadata about biological samples used in biomedical experiments.

Gonçalves, Rafael S; Musen, Mark A.

Sci Data ; 6: 190021, 2019 02 19.

Article in English | MEDLINE | ID: mdl-30778255

ABSTRACT

We present an analytical study of the quality of metadata about samples used in biomedical experiments. The metadata under analysis are stored in two well-known databases: BioSample, a repository managed by the National Center for Biotechnology Information (NCBI), and BioSamples, a repository managed by the European Bioinformatics Institute (EBI). We tested whether 11.4 M sample metadata records in the two repositories are populated with values that fulfill the stated requirements for such values. Our study revealed multiple anomalies in the metadata. Most metadata field names and their values are not standardized or controlled. Even simple binary or numeric fields are often populated with inadequate values of different data types. By clustering metadata field names, we discovered there are often many distinct ways to represent the same aspect of a sample. Overall, the metadata we analyzed reveal that there is a lack of principled mechanisms to enforce and validate metadata requirements. The significant aberrancies that we found in the metadata are likely to impede search and secondary use of the associated datasets.

Subjects

Biological Specimen Banks; Metadata/standards; Data Accuracy
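The kind of check described in the abstract above, testing whether binary or numeric fields hold values of the expected type, can be sketched as follows. This is a minimal illustration only: the field names, the sample records, and the accepted boolean vocabulary are all hypothetical and are not taken from BioSample, BioSamples, or the authors' analysis.

```python
def is_boolean(value: str) -> bool:
    """Accept a small, assumed vocabulary of boolean spellings."""
    return value.strip().lower() in {"true", "false", "yes", "no"}


def is_numeric(value: str) -> bool:
    """Accept any string that Python can parse as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical sample metadata records (all values stored as strings).
records = [
    {"age": "34", "smoker": "no"},
    {"age": "adult", "smoker": "N/A"},
    {"age": "7.5", "smoker": "TRUE"},
]

# Expected type check for each (hypothetical) field.
checks = {"age": is_numeric, "smoker": is_boolean}

# Collect, per field, the values that fail the expected type check.
invalid = {
    field: [r[field] for r in records if not check(r[field])]
    for field, check in checks.items()
}
print(invalid)
```

Note that `float` also parses strings such as "nan" and "inf", so a stricter check may be needed where those are not meaningful values.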

An ontology-driven tool for structured data acquisition using Web forms.

Gonçalves, Rafael S; Tu, Samson W; Nyulas, Csongor I; Tierney, Michael J; Musen, Mark A.

J Biomed Semantics ; 8(1): 26, 2017 Aug 01.

Article in English | MEDLINE | ID: mdl-28764813

ABSTRACT

BACKGROUND: Structured data acquisition is a common task that is widely performed in biomedicine. However, current solutions for this task are far from providing a means to structure data in such a way that it can be automatically employed in decision making (e.g., in our example application domain of clinical functional assessment, for determining eligibility for disability benefits) based on conclusions derived from acquired data (e.g., assessment of impaired motor function). To use data in these settings, we need it structured in a way that can be exploited by automated reasoning systems, for instance, in the Web Ontology Language (OWL), the de facto ontology language for the Web. RESULTS: We tackle the problem of generating Web-based assessment forms from OWL ontologies, and aggregating input gathered through these forms as an ontology of "semantically-enriched" form data that can be queried using an RDF query language, such as SPARQL. We developed an ontology-based structured data acquisition system, which we present through its specific application to the clinical functional assessment domain. We found that data gathered through our system is highly amenable to automatic analysis using queries. CONCLUSIONS: We demonstrated how ontologies can be used to help structure Web-based forms and to semantically enrich the data elements of the acquired structured data. The ontologies associated with the enriched data elements enable automated inferences and provide a rich vocabulary for performing queries.

Subjects

Biological Ontologies; Information Storage and Retrieval/methods; Internet; Software

The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments.

Gonçalves, Rafael S; O'Connor, Martin J; Martínez-Romero, Marcos; Egyedi, Attila L; Willrett, Debra; Graybeal, John; Musen, Mark A.

Semant Web ISWC ; 10588: 103-110, 2017 Oct.

Article in English | MEDLINE | ID: mdl-32219223

ABSTRACT

The Center for Expanded Data Annotation and Retrieval (CEDAR) aims to revolutionize the way that metadata describing scientific experiments are authored. The software we have developed, the CEDAR Workbench, is a suite of Web-based tools and REST APIs that allows users to construct metadata templates, to fill in templates to generate high-quality metadata, and to share and manage these resources. The CEDAR Workbench provides a versatile, REST-based environment for authoring metadata that are enriched with terms from ontologies. The metadata are available as JSON, JSON-LD, or RDF for easy integration in scientific applications and reusability on the Web. Users can leverage our APIs for validating and submitting metadata to external repositories. The CEDAR Workbench is freely available and open-source.

The OWL Reasoner Evaluation (ORE) 2015 Competition Report.

Parsia, Bijan; Matentzoglu, Nicolas; Gonçalves, Rafael S; Glimm, Birte; Steigmiller, Andreas.

J Autom Reason ; 59(4): 455-482, 2017.

Article in English | MEDLINE | ID: mdl-30069067

ABSTRACT

The OWL Reasoner Evaluation competition is an annual competition (with an associated workshop) that pits OWL 2 compliant reasoners against each other on various standard reasoning tasks over naturally occurring problems. The 2015 competition was the third of its sort and had 14 reasoners competing in six tracks comprising three tasks (consistency, classification, and realisation) over two profiles (OWL 2 DL and EL). In this paper, we discuss the design, execution and results of the 2015 competition with particular attention to lessons learned for benchmarking, comparative experiments, and future competitions.
