Automated Family Histories Significantly Improve Risk Prediction in an EHR.
Huang, Xiayuan; Kleiman, Ross; Page, David; Hebbring, Scott.
Affiliation
  • Huang X; University of Wisconsin-Madison, Madison, Wisconsin, United States.
  • Kleiman R; University of Wisconsin-Madison, Madison, Wisconsin, United States.
  • Page D; Duke University, Durham, North Carolina, United States.
  • Hebbring S; Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States.
AMIA Jt Summits Transl Sci Proc ; 2024: 221-229, 2024.
Article in English | MEDLINE | ID: mdl-38827091
ABSTRACT
We recently demonstrated that electronically constructed family pedigrees (e-pedigrees) have great value in epidemiologic research using electronic health record (EHR) data. Prior to this work, it was already well accepted that family health history is a major predictor for a wide spectrum of diseases, reflecting shared effects of genetics, environment, and lifestyle. With the widespread digitalization of patient data via EHRs, there is an unprecedented opportunity to use machine learning algorithms to better predict disease risk. Although predictive models have previously been constructed for a few important diseases, we currently know very little about how accurately the risk for most diseases can be predicted. It is further unknown whether incorporating e-pedigrees into machine learning can improve the value of these models. In this study, we devised a family pedigree-driven high-throughput machine learning pipeline to simultaneously predict risks for thousands of diagnosis codes using thousands of input features. Models were built to predict future disease risk for three time windows using both Logistic Regression and XGBoost. For example, using XGBoost without e-pedigrees, we achieved average areas under the receiver operating characteristic curve (AUCs) of 0.82, 0.77, and 0.71 for 1, 6, and 24 months, respectively. When e-pedigree features were added to the XGBoost pipeline, AUCs increased to 0.83, 0.79, and 0.74 for the same three time periods, respectively. E-pedigrees similarly improved the predictions when using Logistic Regression. These results emphasize the potential value of incorporating family health history via e-pedigrees into machine learning with no further human time.
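
As a reading aid for the AUC comparison reported in the abstract, the following is a minimal sketch of how an AUC can be computed and compared for two sets of risk scores. It is not the authors' pipeline: the function name `roc_auc`, the toy labels, and both score lists are hypothetical and chosen only for illustration. The sketch uses the pairwise-ranking definition of AUC, the probability that a randomly chosen patient with the diagnosis receives a higher score than a randomly chosen patient without it.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive example has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = diagnosis observed in the prediction window.
labels = [1, 0, 1, 0, 1, 0]
# Hypothetical scores from a model without family-history features.
baseline_scores = [0.9, 0.6, 0.4, 0.3, 0.7, 0.5]
# Hypothetical scores from a model that also uses family-history features.
with_family_scores = [0.9, 0.6, 0.55, 0.3, 0.7, 0.5]

auc_baseline = roc_auc(labels, baseline_scores)   # 7 of 9 pairs ranked correctly
auc_family = roc_auc(labels, with_family_scores)  # 8 of 9 pairs ranked correctly
print(f"baseline AUC: {auc_baseline:.2f}, with family features: {auc_family:.2f}")
```

In a setting like the one described, such a value would be computed per diagnosis code and per time window and then averaged; the quadratic pair loop here is for clarity, not for use at that scale.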

Full text: 1 Database: MEDLINE Language: English Publication year: 2024 Document type: Article
