Using computable knowledge mined from the literature to elucidate confounders for EHR-based pharmacovigilance.
Malec, Scott A; Wei, Peng; Bernstam, Elmer V; Boyce, Richard D; Cohen, Trevor.
Affiliation
  • Malec SA; University of Pittsburgh School of Medicine, Department of Biomedical Informatics, Pittsburgh, PA, United States. Electronic address: sam413@pitt.edu.
  • Wei P; The University of Texas MD Anderson Cancer Center, Department of Biostatistics, Houston, TX, United States.
  • Bernstam EV; University of Texas Health Science Center at Houston, School of Biomedical Informatics, Houston, TX, United States.
  • Boyce RD; University of Pittsburgh School of Medicine, Department of Biomedical Informatics, Pittsburgh, PA, United States.
  • Cohen T; University of Washington, Department of Biomedical Informatics and Medical Education, Seattle, WA, United States.
J Biomed Inform; 117: 103719, 2021 May.
Article in English | MEDLINE | ID: mdl-33716168
ABSTRACT

INTRODUCTION:

Drug safety research asks causal questions but relies on observational data. Confounding bias threatens the reliability of studies using such data. Successful control of confounding requires knowledge of confounders, that is, variables that affect both the exposure and the outcome of interest. However, causal knowledge of dynamic biological systems is complex and difficult to obtain. Fortunately, computable knowledge mined from the literature may hold clues about confounders. In this paper, we tested the hypothesis that incorporating literature-derived confounders can improve causal inference from observational data.

METHODS:

We introduce two methods (semantic vector-based and string-based confounder search) that query literature-derived information for confounder candidates to control for, using SemMedDB, a database of computable knowledge mined from the biomedical literature. These methods search SemMedDB for confounders by applying semantic constraint search for indications that are treated by the drug (exposure) and are also known to cause the adverse event (outcome). We then include the literature-derived confounder candidates in statistical and causal models derived from free-text clinical notes. For evaluation, we use a reference dataset widely used in drug safety that contains labeled pairwise relationships between drugs and adverse events, and we attempt to rediscover these relationships from a corpus of 2.2 million NLP-processed free-text clinical notes. We employ standard adjustment and causal inference procedures to predict and estimate causal effects, informing the models with varying numbers of literature-derived confounders and instantiating the exposure, outcome, and confounder variables in the models with dichotomous EHR-derived data. Finally, we compare the results from applying these procedures with naive measures of association (χ² and reporting odds ratio) and with each other.

RESULTS AND CONCLUSIONS:

We found semantic vector-based search to be superior to string-based search at reducing confounding bias. However, the effect of including more rather than fewer literature-derived confounders was inconclusive. We recommend using targeted learning estimation methods that can address treatment-confounder feedback, where confounders also behave as intermediate variables, and engaging subject-matter experts to adjudicate the handling of problematic covariates.
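
The two ideas named in the Methods, searching for confounders by semantic constraint and computing naive measures of association, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy triples, and the placeholder concept names (`drug_x`, `condition_a`, `event_y`) are assumptions made for illustration, and the exact-match lookup is only loosely analogous to the string-based search; the semantic vector-based search is not shown. The reporting odds ratio and the χ² statistic use the standard formulas for a 2x2 table of dichotomous exposure and outcome counts.

```python
from typing import Iterable, Set, Tuple

# A literature-derived predication as a (subject, predicate, object) triple.
# TREATS and CAUSES are used here as example predicate labels (assumption).
Triple = Tuple[str, str, str]


def find_confounder_candidates(
    triples: Iterable[Triple], drug: str, outcome: str
) -> Set[str]:
    """Return concepts that the drug TREATS and that also CAUSE the outcome."""
    triples = list(triples)  # allow two passes even if a generator is given
    treated_by_drug = {
        obj for subj, pred, obj in triples if subj == drug and pred == "TREATS"
    }
    causes_of_outcome = {
        subj for subj, pred, obj in triples if pred == "CAUSES" and obj == outcome
    }
    return treated_by_drug & causes_of_outcome


def reporting_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """ROR for a 2x2 table.

    a: exposed, with outcome      b: exposed, without outcome
    c: unexposed, with outcome    d: unexposed, without outcome
    """
    return (a * d) / (b * c)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


if __name__ == "__main__":
    # Hypothetical toy predications, not taken from SemMedDB.
    toy_triples = [
        ("drug_x", "TREATS", "condition_a"),
        ("drug_x", "TREATS", "condition_b"),
        ("condition_a", "CAUSES", "event_y"),
        ("condition_c", "CAUSES", "event_y"),
    ]
    print(find_confounder_candidates(toy_triples, "drug_x", "event_y"))
    print(reporting_odds_ratio(20, 80, 10, 90))
    print(chi_square_2x2(20, 80, 10, 90))
```

In this sketch only `condition_a` satisfies both constraints, so it is the single confounder candidate. The naive measures ignore such candidates entirely, which is why the abstract contrasts them with adjusted and causal estimates.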

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Pharmacovigilance / Models, Theoretical Study type: Prognostic_studies Language: En Publication year: 2021 Document type: Article
