A conditional multi-label model to improve prediction of a rare outcome: An illustration predicting autism diagnosis.
Huang, Wei A; Engelhard, Matthew; Coffman, Marika; Hill, Elliot D; Weng, Qin; Scheer, Abby; Maslow, Gary; Henao, Ricardo; Dawson, Geraldine; Goldstein, Benjamin A.
Affiliation
  • Huang WA; Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA; AI Health, Duke University School of Medicine, Durham, North Carolina, USA.
  • Engelhard M; Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA; AI Health, Duke University School of Medicine, Durham, North Carolina, USA.
  • Coffman M; Department of Psychiatry, Duke University School of Medicine, Durham, NC, USA.
  • Hill ED; Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA; AI Health, Duke University School of Medicine, Durham, North Carolina, USA.
  • Weng Q; Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA.
  • Scheer A; Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA.
  • Maslow G; Department of Psychiatry, Duke University School of Medicine, Durham, NC, USA; Department of Pediatrics, Duke University School of Medicine, Durham, NC, USA.
  • Henao R; Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA; AI Health, Duke University School of Medicine, Durham, North Carolina, USA.
  • Dawson G; Department of Psychiatry, Duke University School of Medicine, Durham, NC, USA; Department of Pediatrics, Duke University School of Medicine, Durham, NC, USA.
  • Goldstein BA; Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA; AI Health, Duke University School of Medicine, Durham, North Carolina, USA; Department of Pediatrics, Duke University School of Medicine, Durham, NC, USA. Electronic address: ben.goldstein@duke.edu.
J Biomed Inform; 157: 104711, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39182632
ABSTRACT

OBJECTIVE:

This study aimed to develop a novel approach that uses routinely collected electronic health record (EHR) data to improve the prediction of a rare event. We illustrate the approach by improving early prediction of an autism diagnosis, a low-prevalence outcome, by leveraging correlations between autism and other neurodevelopmental conditions (NDCs).

METHODS:

To achieve this, we introduced a conditional multi-label model that merges conditional learning and multi-label methodologies. The conditional learning approach breaks a hard task into more manageable pieces, one per stage, and the multi-label approach uses information from related neurodevelopmental conditions to learn predictive latent features. We forecast autism diagnosis by age 5.5 years using data from the first 18 months of life, and we analyzed correlations in feature importance to explore how the feature space aligns across conditions.
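The abstract does not spell out the architecture, so the following is only a minimal sketch of one way the two ideas could be combined, not the authors' implementation. It assumes a two-stage decomposition (stage 1 predicts any NDC, stage 2 predicts each condition among children with an NDC) and a small shared-hidden-layer network for the multi-label part. The names `fit_net`, `predict_net`, and `noisy_label`, the simulated data, and all hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Clip to keep np.exp from overflowing on extreme logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def fit_net(X, Y, hidden=8, lr=0.5, epochs=300, seed=0):
    """Fit a one-hidden-layer network with one sigmoid output per label.

    The hidden layer is shared by all labels, so related labels shape
    the same latent features (the multi-label part of the sketch).
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # shared latent features
        P = sigmoid(H @ W2 + b2)           # per-label probabilities
        G = (P - Y) / len(X)               # cross-entropy gradient w.r.t. logits
        dH = (G @ W2.T) * (1.0 - H ** 2)   # backpropagate through tanh
        W2 -= lr * (H.T @ G)
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def predict_net(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)

# Simulated data (assumption): column 0 is the rare target label,
# columns 1-2 are related, more common conditions driven by the same signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 5))
latent = X[:, 0] + 0.5 * X[:, 1]

def noisy_label(threshold):
    return (latent + rng.normal(size=400) > threshold).astype(float)

Y = np.column_stack([noisy_label(1.5), noisy_label(0.8), noisy_label(0.8)])

# Stage 1 (conditional part): does the child have any of the conditions?
any_ndc = Y.max(axis=1, keepdims=True)
stage1 = fit_net(X, any_ndc)

# Stage 2: multi-label model fitted only on children with some condition.
has_ndc = any_ndc[:, 0] == 1
stage2 = fit_net(X[has_ndc], Y[has_ndc])

# Chain rule: P(target) = P(any NDC) * P(target | any NDC).
p_any = predict_net(stage1, X)[:, 0]
p_target_given_any = predict_net(stage2, X)[:, 0]
p_target = p_any * p_target_given_any
```

In this sketch the second stage sees a much less imbalanced problem than the original one, which is the intuition behind splitting the task; whether the published model factorizes the probability in exactly this way is an assumption.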

RESULTS:

In an analysis of health records from 18,156 children, we generated a model that predicted a future autism diagnosis with moderate performance (AUROC = 0.76). The proposed conditional multi-label method significantly improved predictive performance, reaching an AUROC of 0.80 (p < 0.001). Further examination showed that the conditional approach and the multi-label approach each provided marginal lift on their own compared with a one-stage, one-label approach. We also demonstrated the generalizability and applicability of this method using simulated data with high correlation between the feature vectors for different labels.
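For readers less familiar with the metric, AUROC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition; the function name `auroc` and the toy inputs are illustrative, and the abstract does not state how the reported p-value was obtained, so no significance test is shown.

```python
import numpy as np

def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true]
    neg = scores[~y_true]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy example: 3 of the 4 positive-negative pairs are ordered correctly.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

The pairwise comparison uses memory proportional to the number of positives times the number of negatives, which is fine for illustration but not for very large cohorts.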

CONCLUSION:

Our findings underscore the effectiveness of the developed conditional multi-label model for early prediction of an autism diagnosis. The study introduces a versatile strategy applicable to prediction tasks in which the target population is limited but underlying features or etiology are shared among related groups.

Full text: 1 Databases: MEDLINE Main subject: Autistic Disorder / Electronic Health Records Limits: Child / Child, preschool / Female / Humans / Infant / Male Language: English Journal: J Biomed Inform Journal subject: MEDICAL INFORMATICS Publication year: 2024 Document type: Article Country of affiliation: United States
