Leveraging error-prone algorithm-derived phenotypes: Enhancing association studies for risk factors in EHR data.
Lu, Yiwen; Tong, Jiayi; Chubak, Jessica; Lumley, Thomas; Hubbard, Rebecca A; Xu, Hua; Chen, Yong.
Affiliation
  • Lu Y; Center for Health AI and Synthesis of Evidence (CHASE), Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; The Graduate Group in Applied Mathematics and Computational Science, School of Arts and Sciences, Univers
  • Tong J; Center for Health AI and Synthesis of Evidence (CHASE), Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
  • Chubak J; Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA.
  • Lumley T; Department of Statistics, Faculty of Science, University of Auckland, Auckland, New Zealand.
  • Hubbard RA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Penn Institute for Biomedical Informatics (IBI), Philadelphia, PA, USA.
  • Xu H; Department of Biomedical Informatics and Data Science, Yale University, New Haven, CT, USA.
  • Chen Y; Center for Health AI and Synthesis of Evidence (CHASE), Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; The Graduate Group in Applied Mathematics and Computational Science, School of Arts and Sciences, Univers
J Biomed Inform; 157: 104690, 2024 Jul 14.
Article in En | MEDLINE | ID: mdl-39004110
ABSTRACT

OBJECTIVES:

It has become increasingly common for multiple computable phenotypes from electronic health records (EHR) to be developed for a given phenotype. However, EHR-based association studies often focus on a single phenotype. In this paper, we develop a method aiming to simultaneously make use of multiple EHR-derived phenotypes for reduction of bias due to phenotyping error and improved efficiency of phenotype/exposure associations.

MATERIALS AND METHODS:

The proposed method combines multiple algorithm-derived phenotypes with a small set of validated outcomes to reduce bias and improve estimation accuracy and efficiency. The performance of our method was evaluated through simulation studies and real-world application to an analysis of colon cancer recurrence using EHR data from Kaiser Permanente Washington.

RESULTS:

In settings where there was no single surrogate performing uniformly better than all others in terms of both sensitivity and specificity, our method achieved substantial bias reduction compared to using a single algorithm-derived phenotype. Our method also led to higher estimation efficiency by up to 30% compared to an estimator that used only one algorithm-derived phenotype.
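
The setting described above, in which no single surrogate is uniformly better on both sensitivity and specificity, can be illustrated with a minimal sketch. This is not the authors' method or code: the function names (`sens_spec`, `dominant_surrogate`), the surrogate labels, and the toy data are all hypothetical, and the sketch only checks for dominance among surrogates on a small validated subset.

```python
from typing import Dict, List, Optional, Tuple


def sens_spec(truth: List[int], surrogate: List[int]) -> Tuple[float, float]:
    """Sensitivity and specificity of a binary surrogate against validated outcomes.

    Assumes the validated subset contains at least one case and one non-case.
    """
    pairs = list(zip(truth, surrogate))
    tp = sum(1 for t, s in pairs if t == 1 and s == 1)
    fn = sum(1 for t, s in pairs if t == 1 and s == 0)
    tn = sum(1 for t, s in pairs if t == 0 and s == 0)
    fp = sum(1 for t, s in pairs if t == 0 and s == 1)
    return tp / (tp + fn), tn / (tn + fp)


def dominant_surrogate(
    truth: List[int], surrogates: Dict[str, List[int]]
) -> Optional[str]:
    """Return a surrogate that is at least as good as every other surrogate on
    both sensitivity and specificity, or None if no such surrogate exists."""
    metrics = {name: sens_spec(truth, s) for name, s in surrogates.items()}
    for name, (se, sp) in metrics.items():
        if all(
            se >= other_se and sp >= other_sp
            for other, (other_se, other_sp) in metrics.items()
            if other != name
        ):
            return name
    return None


# Hypothetical validated subset: 4 true cases, 6 true non-cases.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
surrogates = {
    "algorithm_A": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],  # catches every case, two false positives
    "algorithm_B": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # misses half the cases, no false positives
}

for name, values in surrogates.items():
    se, sp = sens_spec(truth, values)
    print(f"{name}: sensitivity={se:.2f}, specificity={sp:.2f}")

print("dominant surrogate:", dominant_surrogate(truth, surrogates))
```

In this toy example, one algorithm has higher sensitivity and the other has higher specificity, so neither dominates. That is the kind of situation in which the abstract reports that combining surrogates reduced bias relative to relying on a single one.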

DISCUSSION:

Simulation studies and application to real-world data demonstrated the effectiveness of our method in integrating multiple phenotypes, thereby enhancing bias reduction, statistical accuracy and efficiency.

CONCLUSIONS:

Our method combines information across multiple surrogates using a statistically efficient seemingly unrelated regression framework. Our method provides a robust alternative to single-surrogate-based bias correction, especially in contexts lacking information on which surrogate is superior.

Full text: 1 Database: MEDLINE Language: En Publication year: 2024 Document type: Article