1.
Bioinformatics; 38(24): 5466-5468, 2022 Dec 13.
Article in English | MEDLINE | ID: mdl-36303318

ABSTRACT

MOTIVATION: A global medical crisis like the coronavirus disease 2019 (COVID-19) pandemic requires interdisciplinary and highly collaborative research from all over the world. One of the key challenges for collaborative research is the lack of interoperability among heterogeneous data sources. Interoperability, standardization and mapping of datasets are necessary for data analysis and for applying advanced algorithms, for example when developing personalized risk prediction models. RESULTS: To ensure interoperability and compatibility among COVID-19 datasets, we present a common data model (CDM) built from 11 different COVID-19 datasets from various geographical locations. The current version of the CDM holds 4639 data variables related to COVID-19, including basic patient information (age, biological sex and diagnosis) as well as disease-specific variables such as anosmia and dyspnea. Each variable in the data model is associated with a specific data type, variable mappings, value ranges, data units and data encodings that can be used to standardize any dataset. Moreover, its compatibility with established data standards such as OMOP and FHIR makes the CDM well suited to COVID-19 data interoperability. AVAILABILITY AND IMPLEMENTATION: The CDM is available in a public repository at https://github.com/Fraunhofer-SCAI-Applied-Semantics/COVID-19-Global-Model. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
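As a rough illustration of how a CDM entry carrying a data type, variable mappings, a value range, a unit and an encoding could be used to standardize one record, here is a minimal Python sketch. It does not reproduce the published CDM or its repository: the class `CdmVariable`, the function `standardize`, the dataset key `site_a` and the column names are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class CdmVariable:
    """One hypothetical CDM entry (names and fields are assumptions)."""
    name: str                                   # harmonized variable name
    dtype: Callable[[Any], Any]                 # target data type, e.g. int
    unit: Optional[str] = None                  # data unit, if any
    value_range: Optional[Tuple[float, float]] = None  # allowed (min, max)
    source_names: Dict[str, str] = field(default_factory=dict)  # dataset -> local column
    encoding: Dict[Any, Any] = field(default_factory=dict)      # local code -> CDM value


def standardize(record: Dict[str, Any], dataset: str,
                variables: List[CdmVariable]) -> Dict[str, Any]:
    """Map one source record onto the CDM variables defined for `dataset`."""
    out: Dict[str, Any] = {}
    for var in variables:
        local = var.source_names.get(dataset)
        if local is None or local not in record:
            continue  # this dataset does not provide the variable
        raw = record[local]
        value = var.dtype(var.encoding.get(raw, raw))  # decode, then cast
        if var.value_range is not None:
            low, high = var.value_range
            if not low <= value <= high:
                raise ValueError(f"{var.name}={value} outside [{low}, {high}]")
        out[var.name] = value
    return out


# Hypothetical CDM entries and a hypothetical source record.
AGE = CdmVariable("age", int, unit="years", value_range=(0, 120),
                  source_names={"site_a": "patient_age"})
DYSPNEA = CdmVariable("dyspnea", bool,
                      source_names={"site_a": "sob"},
                      encoding={"Y": True, "N": False})

if __name__ == "__main__":
    print(standardize({"patient_age": "54", "sob": "Y"}, "site_a", [AGE, DYSPNEA]))
```

Keeping the mapping, encoding and range check on the variable definition rather than in per-dataset scripts is one way to let a single model description drive the standardization of any source dataset.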


Subject(s)
COVID-19, Humans, Algorithms, Pandemics
2.
Bioinformatics; 38(15): 3850-3852, 2022 Aug 02.
Article in English | MEDLINE | ID: mdl-35652780

ABSTRACT

MOTIVATION: The importance of clinical data in understanding the pathophysiology of complex disorders has prompted the launch of multiple initiatives designed to generate patient-level data from various modalities. While these studies can reveal important disease-relevant findings, each study captures different yet complementary aspects and modalities which, when combined, give a more comprehensive picture of disease etiology. Achieving this, however, requires a global integration of data across studies, which is challenging given the lack of interoperability among cohort datasets. RESULTS: Here, we present the Data Steward Tool (DST), an application that allows for semi-automatic semantic integration of clinical data into ontologies, global data models and data standards. We demonstrate the applicability of the tool in the field of dementia research by establishing a Clinical Data Model (CDM) in this domain. The CDM currently consists of 277 common variables covering demographics (e.g. age and gender), diagnostics, neuropsychological tests and biomarker measurements. The DST combined with this disease-specific data model shows how interoperability between multiple, heterogeneous dementia datasets can be achieved. AVAILABILITY AND IMPLEMENTATION: The DST source code and Docker images are available at https://github.com/SCAI-BIO/data-steward and https://hub.docker.com/r/phwegner/data-steward, respectively. Furthermore, the DST is hosted at https://data-steward.bio.scai.fraunhofer.de/data-steward. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Dementia, Semantics, Humans, Software, Dementia/diagnosis
3.
Cerebellum; 2023 Mar 31.
Article in English | MEDLINE | ID: mdl-37002505

ABSTRACT

With SCAview, we present a prompt and comprehensive tool that enables scientists to browse large datasets of the most common spinocerebellar ataxias intuitively and without technical effort. The basic concept is data visualization, with graphical handling and filtering to select, define and compare subgroups. Several plot types are provided to visualize all data points resulting from the selected attributes. The underlying synthetic cohort is based on clinical data from five different European and US longitudinal multicenter cohorts in spinocerebellar ataxia types 1, 2, 3 and 6 (SCA1, 2, 3 and 6), comprising > 1400 patients with > 5500 visits overall. First, we developed a common data model to integrate the clinical, demographic and characterizing data of each source cohort. Second, the available datasets from each cohort were mapped onto the data model. Third, we created a synthetic cohort based on the cleaned dataset. With SCAview, we demonstrate the feasibility of mapping cohort data from different sources onto a common data model. The resulting browser-based visualization tool, with fully graphical handling of the data, offers researchers the unique possibility to visualize relationships and distributions of clinical data, to define subgroups and to investigate them further without any technical effort. Access to SCAview can be requested via the Ataxia Global Initiative and is free of charge.

4.
J Alzheimers Dis; 99(4): 1409-1423, 2024.
Article in English | MEDLINE | ID: mdl-38759012

ABSTRACT

Background: Despite numerous past endeavors for the semantic harmonization of Alzheimer's disease (AD) cohort studies, an automatic tool has yet to be developed. Objective: As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic harmonization tool. Methods: We created a common data model (CDM) through cross-mapping data from 20 cohorts, three CDMs, and ontology terms, which was then used to fine-tune a BioBERT model. Finally, we evaluated the model using three previously unseen cohorts and compared its performance to a string-matching baseline model. Results: Here, we present our AD-Mapper interface for automatic harmonization of AD cohort studies, which outperformed a string-matching baseline on previously unseen cohort studies. We showcase our CDM comprising 1218 unique variables. Conclusion: AD-Mapper leverages semantic similarities in naming conventions across cohorts to improve mapping performance.
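The abstract does not say which string-matching method served as the baseline. As a rough sketch of what such a baseline could look like, the Python snippet below scores a cohort variable name against candidate CDM terms with the standard library's `difflib.SequenceMatcher`. The functions `normalize` and `best_match`, the threshold of 0.6 and the example terms are assumptions made for illustration, not the authors' implementation.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional, Tuple


def normalize(name: str) -> str:
    """Lowercase a variable name and turn punctuation into single spaces."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in name)
    return " ".join(cleaned.split())


def best_match(source_name: str, cdm_terms: Iterable[str],
               threshold: float = 0.6) -> Tuple[Optional[str], float]:
    """Return the most similar CDM term and its score.

    The term is None when no candidate reaches `threshold`
    (an arbitrary cut-off chosen for this sketch).
    """
    query = normalize(source_name)
    best_term: Optional[str] = None
    best_score = 0.0
    for term in cdm_terms:
        score = SequenceMatcher(None, query, normalize(term)).ratio()
        if score > best_score:
            best_term, best_score = term, score
    if best_score < threshold:
        return None, best_score
    return best_term, best_score


if __name__ == "__main__":
    # Hypothetical CDM terms and cohort column name.
    terms = ["age", "sex", "mmse total score"]
    print(best_match("MMSE_total", terms))
```

A purely character-based score like this cannot link names that share meaning but not spelling, which is the gap a fine-tuned language model is meant to close.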


Subject(s)
Alzheimer Disease, Semantics, Alzheimer Disease/diagnosis, Humans, Cohort Studies
5.
Heliyon; 9(11): e21502, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38027969

ABSTRACT

Objectives: Knowledge graphs and ontologies in the biomedical domain provide rich contextual knowledge for a variety of challenges. Employing this knowledge in knowledge-driven NLP tasks such as gene-disease association prediction is a promising way to increase the predictive power of a model. Methods: We investigated the effect of infusing the embeddings of two aligned ontologies into NLP models as prior knowledge. We evaluated the performance of different models on some large-scale gene-disease association datasets and compared it with that of a model that does not incorporate contextualized knowledge (BERT). Results: The experiments demonstrated that the knowledge-infused model slightly outperforms BERT after only a small number of bridges have been created. This indicates that incorporating cross-references across ontologies can enhance the performance of base models without the need for more complex and costly training. However, further research is needed to explore the generalizability of the model. Based on the trend observed in the experiments, we expect that adding more bridges would bring further improvement. In addition, applying state-of-the-art knowledge graph embedding methods to a joint graph built by connecting OGG and DOID with bridges also yielded promising results. Conclusion: Our work shows that allowing language models to leverage structured knowledge from ontologies brings clear advantages in performance. Moreover, the annotation stage presented in this paper remains of reasonable complexity.

6.
Database (Oxford); 2023 Dec 02.
Article in English | MEDLINE | ID: mdl-38041858

ABSTRACT

Because Alzheimer's disease (AD) is one of the leading causes of dementia in the population, it is imperative to discern exactly why it has a strong molecular association with beta-amyloid and tau. Although a clear understanding of the etiology and pathogenesis of AD remains elusive, scientists worldwide have dedicated significant efforts to discovering the molecular interactions linked to its pathological characteristics and to potential treatments. Knowledge representations, such as domain ontologies encompassing our current understanding of AD, could greatly assist disease research. This paper describes the construction and application of the integrated Alzheimer's Disease Ontology (ADO), which combines selected concepts from the former version of the ADO and the Alzheimer's Disease Mapping Ontology (ADMO). In addition to the existing entities available from these knowledge models, essential knowledge about AD from public sources, such as newly discovered risk factor genes and novel treatments, was also integrated. The ADO can also be leveraged in text mining scenarios, given that it is conceptually enriched with domain-specific knowledge and relations. The integrated ADO consists of 39 855 total axioms. The ontology covers many aspects of the AD domain, including risk factor genes, clinical features, treatments and experimental models. The ontology complies with the Open Biological and Biomedical Ontology principles and was accepted by the OBO Foundry. In this paper, we illustrate the role of the presented ontology in extracting textual information from the SCAIView database and report key measures in an ADO-based corpus. Database URL: https://academic.oup.com/database.


Subject(s)
Alzheimer Disease, Biological Ontologies, Humans, Alzheimer Disease/genetics, Data Mining
7.
Bioinform Adv; 3(1): vbad033, 2023.
Article in English | MEDLINE | ID: mdl-37016683

ABSTRACT

Motivation: Epilepsy is a multifaceted complex disorder that requires a precise understanding of its classification, diagnosis, treatment and underlying disease mechanism. Although scattered resources on epilepsy are available, comprehensive and structured knowledge is missing. To promote multidisciplinary knowledge exchange and facilitate advances in clinical management, especially in pre-clinical research, a disease-specific ontology is necessary. The presented ontology is designed to enable better interconnection between members of the scientific community in the epilepsy domain. Results: The Epilepsy Ontology (EPIO) is an assembly of structured knowledge on various aspects of epilepsy, developed according to Basic Formal Ontology (BFO) and Open Biological and Biomedical Ontology (OBO) Foundry principles. Concepts and definitions are collected from the latest International League against Epilepsy (ILAE) classification, domain-specific ontologies and scientific literature. The ontology consists of 1879 classes and 28 151 axioms (2171 declaration axioms, 2219 logical axioms) covering several aspects of epilepsy. It is intended to be used for data management and text mining purposes. Availability and implementation: The current release of the ontology is publicly available under a Creative Commons 4.0 License and shared via http://purl.obolibrary.org/obo/epso.owl. It is a community-based effort assembling various facets of this complex disease. The ontology is also deposited in BioPortal at https://bioportal.bioontology.org/ontologies/EPIO. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
