1.
Sci Data ; 11(1): 363, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605048

ABSTRACT

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.


Subject(s)
Biological Science Disciplines; Knowledge Bases; Pattern Recognition, Automated; Algorithms; Translational Research, Biomedical
2.
J Biomed Inform ; 142: 104368, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37086959

ABSTRACT

BACKGROUND: Causal feature selection is essential for estimating effects from observational data. Identifying confounders is a crucial step in this process. Traditionally, researchers employ content-matter expertise and literature review to identify confounders. Uncontrolled confounding from unidentified confounders threatens validity, conditioning on intermediate variables (mediators) weakens estimates, and conditioning on common effects (colliders) induces bias. Additionally, without special treatment, erroneous conditioning on variables combining roles introduces bias. However, the vast literature is growing exponentially, making it infeasible to assimilate this knowledge. To address these challenges, we introduce a novel knowledge graph (KG) application enabling causal feature selection by combining computable literature-derived knowledge with biomedical ontologies. We present a use case of our approach specifying a causal model for estimating the total causal effect of depression on the risk of developing Alzheimer's disease (AD) from observational data.
METHODS: We extracted computable knowledge from a literature corpus using three machine reading systems and inferred missing knowledge using logical closure operations. Using a KG framework, we mapped the output to target terminologies and combined it with ontology-grounded resources. We translated epidemiological definitions of confounder, collider, and mediator into queries for searching the KG and summarized the roles played by the identified variables. We compared the results with output from a complementary method and published observational studies and examined a selection of confounding and combined role variables in-depth.
RESULTS: Our search identified 128 confounders, including 58 phenotypes, 47 drugs, 35 genes, 23 collider, and 16 mediator phenotypes. However, only 31 of the 58 confounder phenotypes were found to behave exclusively as confounders, while the remaining 27 phenotypes played other roles. Obstructive sleep apnea emerged as a potential novel confounder for depression and AD. Anemia exemplified a variable playing combined roles.
CONCLUSION: Our findings suggest combining machine reading and KG could augment human expertise for causal feature selection. However, the complexity of causal feature selection for depression with AD highlights the need for standardized field-specific databases of causal variables. Further work is needed to optimize KG search and transform the output for human consumption.
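The step of translating the epidemiological definitions of confounder, collider, and mediator into graph searches can be illustrated with a minimal Python sketch. It assumes the KG has been reduced to a list of direct (cause, effect) edges and uses one-hop definitions only. The function name `causal_roles`, the variables A to D, and the edges are hypothetical and are not taken from the paper, which queried a KG rather than Python sets.

```python
from collections import defaultdict


def causal_roles(edges, exposure, outcome):
    """Classify variables by their role relative to an exposure-outcome pair.

    edges: iterable of (cause, effect) pairs (an assumed, simplified input).
    Returns a dict mapping each variable to the set of roles it plays.
    """
    causes = defaultdict(set)   # effect -> set of direct causes
    effects = defaultdict(set)  # cause  -> set of direct effects
    for cause, effect in edges:
        causes[effect].add(cause)
        effects[cause].add(effect)

    skip = {exposure, outcome}
    groups = {
        # Confounder: a common cause of the exposure and the outcome.
        "confounder": (causes[exposure] & causes[outcome]) - skip,
        # Collider: a common effect of the exposure and the outcome.
        "collider": (effects[exposure] & effects[outcome]) - skip,
        # Mediator: caused by the exposure and itself a cause of the outcome.
        "mediator": (effects[exposure] & causes[outcome]) - skip,
    }

    roles = defaultdict(set)
    for role, variables in groups.items():
        for variable in variables:
            roles[variable].add(role)
    return dict(roles)


# Toy edges with placeholder variables (illustrative only, not real findings).
toy_edges = [
    ("A", "depression"), ("A", "AD"),   # A causes both
    ("depression", "B"), ("AD", "B"),   # B is caused by both
    ("depression", "C"), ("C", "AD"),   # C lies on the causal path
    ("D", "depression"), ("D", "AD"), ("depression", "D"),  # D combines roles
]
roles = causal_roles(toy_edges, exposure="depression", outcome="AD")
```

In this toy graph, D is returned with more than one role, which mirrors the "combined role" situation the abstract describes: a variable that is both a confounder and a mediator cannot be handled by simple adjustment alone.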


Subject(s)
Alzheimer Disease; Humans; Depression; Pattern Recognition, Automated; Causality; Risk Factors
3.
J Biomed Inform ; 117: 103719, 2021 May.
Article in English | MEDLINE | ID: mdl-33716168

ABSTRACT

INTRODUCTION: Drug safety research asks causal questions but relies on observational data. Confounding bias threatens the reliability of studies using such data. The successful control of confounding requires knowledge of variables called confounders affecting both the exposure and outcome of interest. However, causal knowledge of dynamic biological systems is complex and challenging. Fortunately, computable knowledge mined from the literature may hold clues about confounders. In this paper, we tested the hypothesis that incorporating literature-derived confounders can improve causal inference from observational data.
METHODS: We introduce two methods (semantic vector-based and string-based confounder search) that query literature-derived information for confounder candidates to control, using SemMedDB, a database of computable knowledge mined from the biomedical literature. These methods search SemMedDB for confounders by applying semantic constraint search for indications treated by the drug (exposure) and that are also known to cause the adverse event (outcome). We then include the literature-derived confounder candidates in statistical and causal models derived from free-text clinical notes. For evaluation, we use a reference dataset widely used in drug safety containing labeled pairwise relationships between drugs and adverse events and attempt to rediscover these relationships from a corpus of 2.2 M NLP-processed free-text clinical notes. We employ standard adjustment and causal inference procedures to predict and estimate causal effects by informing the models with varying numbers of literature-derived confounders and instantiating the exposure, outcome, and confounder variables in the models with dichotomous EHR-derived data. Finally, we compare the results from applying these procedures with naive measures of association (χ² and reporting odds ratio) and with each other.
RESULTS AND CONCLUSIONS: We found semantic vector-based search to be superior to string-based search at reducing confounding bias. However, the effect of including more rather than fewer literature-derived confounders was inconclusive. We recommend using targeted learning estimation methods that can address treatment-confounder feedback, where confounders also behave as intermediate variables, and engaging subject-matter experts to adjudicate the handling of problematic covariates.
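Two ideas from the methods above lend themselves to a short Python sketch: the constraint search for conditions that the drug treats and that also cause the adverse event, and the naive association measures (χ² and reporting odds ratio) computed from a 2x2 table. Everything here is an assumption made for illustration: the triple layout, the predicate labels `TREATS` and `CAUSES`, the function names, and the toy values. It is closest in spirit to the string-based search and does not reproduce the semantic vector-based method.

```python
def confounder_candidates(triples, drug, event):
    """Return conditions that the drug treats and that also cause the event.

    triples: iterable of (subject, predicate, object) tuples (assumed layout).
    """
    treated = {o for s, p, o in triples if s == drug and p == "TREATS"}
    causes_event = {s for s, p, o in triples if p == "CAUSES" and o == event}
    return treated & causes_event


def reporting_odds_ratio(a, b, c, d):
    """Reporting odds ratio for a 2x2 table.

    a: drug and event, b: drug without event,
    c: event without drug, d: neither.
    Raises ZeroDivisionError if b or c is zero.
    """
    return (a * d) / (b * c)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Placeholder triples and counts (illustrative only, not real data).
toy_triples = [
    ("drug_x", "TREATS", "condition_1"),
    ("drug_x", "TREATS", "condition_2"),
    ("condition_1", "CAUSES", "event_y"),
    ("condition_3", "CAUSES", "event_y"),
]
candidates = confounder_candidates(toy_triples, "drug_x", "event_y")
ror = reporting_odds_ratio(20, 80, 10, 90)
chi2 = chi_square_2x2(20, 80, 10, 90)
```

Only `condition_1` satisfies both constraints in the toy data, so it is the single candidate that would be passed on as an adjustment variable.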


Subject(s)
Models, Theoretical; Pharmacovigilance; Bias; Causality; Reproducibility of Results
4.
AMIA Jt Summits Transl Sci Proc ; 2020: 403-412, 2020.
Article in English | MEDLINE | ID: mdl-32477661

ABSTRACT

This paper introduces a database derived from Structured Product Labels (SPLs). SPLs are legally mandated snapshots containing information on all drugs released to market in the United States. Since publication is not required for pre-trial findings, we hypothesize that SPLs may contain knowledge absent in the literature, and hence "novel." SemMedDB is an existing database of computable knowledge derived from the literature. If SPL content could be similarly transformed, novel clinically relevant assertions in the SPLs could be identified through comparison with SemMedDB. After we derive a database (containing 4,297,481 assertions), we compare the extracted content with SemMedDB for recent FDA drug approvals. We find that novelty between the SPLs and the literature is nuanced, due to the redundancy of SPLs. Highlighting areas for improvement and future work, we conclude that SPLs contain a wealth of novel knowledge relevant to research and complementary to the literature.
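The comparison described above, between SPL-derived assertions and those already in SemMedDB, can be pictured as a set difference over normalized triples. The following Python sketch assumes both sources have been reduced to (subject, predicate, object) tuples with matching vocabularies. The function name and the toy assertions are hypothetical, and the paper's actual comparison procedure may differ.

```python
from collections import Counter


def novel_assertions(spl_assertions, literature_assertions):
    """Return SPL assertions absent from the literature, with repeat counts.

    Both inputs are iterables of (subject, predicate, object) tuples.
    The count shows how often an assertion recurs across labels, which is
    one simple way to expose redundancy among SPLs.
    """
    literature = set(literature_assertions)
    counts = Counter(spl_assertions)
    return {triple: n for triple, n in counts.items() if triple not in literature}


# Placeholder assertions (illustrative only, not real data).
spl = [
    ("drug_x", "TREATS", "condition_1"),
    ("drug_x", "CAUSES", "event_y"),
    ("drug_x", "CAUSES", "event_y"),  # same assertion repeated in another label
]
literature = [("drug_x", "TREATS", "condition_1")]
novel = novel_assertions(spl, literature)
```

Counting repeats rather than returning a bare set keeps the redundancy visible, since one assertion copied across many labels counts as a single novel finding.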

5.
AMIA Annu Symp Proc ; 2016: 1920-1929, 2016.
Article in English | MEDLINE | ID: mdl-28269951

ABSTRACT

Observational data recorded in the Electronic Health Record (EHR) can help us better understand the effects of therapeutic agents in routine clinical practice. As such data were not collected for research purposes, their reuse for research must compensate for additional information that may bias analyses and lead to faulty conclusions. Confounding is present when factors aside from the given predictor(s) affect the response of interest. However, these additional factors may not be known at the outset. In this paper, we present a scalable literature-based confounding variable discovery method for biomedical research applications with pharmacovigilance as our use case. We hypothesized that statistical models, adjusted with literature-derived confounders, will more accurately identify causative drug-adverse drug event (ADE) relationships. We evaluated our method with a curated reference standard, and found a pattern of improved performance (approximately 5%) in two out of three models for gastrointestinal bleeding (pre-adjusted Area Under Curve ≥ 0.6).
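For readers unfamiliar with the evaluation metric, the sketch below computes an area under the curve from labels and model scores, assuming the abstract refers to the area under the ROC curve. It uses the pairwise-ranking definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name and the toy labels and scores are hypothetical and are not results from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: sequence of 0/1 reference-standard labels.
    scores: sequence of model scores, aligned with labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Placeholder labels and scores (illustrative only).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

Comparing this value for a model before and after adjustment is one way to express a performance change of the kind the abstract reports.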


Subject(s)
Confounding Factors, Epidemiologic; Electronic Health Records; Pharmacovigilance; Area Under Curve; Biomedical Research; Drug-Related Side Effects and Adverse Reactions; Humans; Models, Theoretical
...