Development and external validation of deep learning clinical prediction models using variable-length time series data.
Bashiri, Fereshteh S; Carey, Kyle A; Martin, Jennie; Koyner, Jay L; Edelson, Dana P; Gilbert, Emily R; Mayampurath, Anoop; Afshar, Majid; Churpek, Matthew M.
Affiliation
  • Bashiri FS; Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States.
  • Carey KA; Department of Medicine, University of Chicago, Chicago, IL 60637, United States.
  • Martin J; Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States.
  • Koyner JL; Department of Medicine, University of Chicago, Chicago, IL 60637, United States.
  • Edelson DP; Department of Medicine, University of Chicago, Chicago, IL 60637, United States.
  • Gilbert ER; Department of Medicine, Loyola University, Chicago, IL 60153, United States.
  • Mayampurath A; Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States.
  • Afshar M; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, United States.
  • Churpek MM; Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States.
J Am Med Inform Assoc; 31(6): 1322-1330, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38679906
ABSTRACT

OBJECTIVES:

To compare and externally validate popular deep learning model architectures and data transformation methods for variable-length time series data in 3 clinical tasks (clinical deterioration, severe acute kidney injury [AKI], and suspected infection).

MATERIALS AND METHODS:

This multicenter retrospective study included admissions at 2 medical centers that spanned 2007-2022. Distinct datasets were created for each clinical task, with 1 site used for training and the other for testing. Three feature engineering methods (normalization, standardization, and piece-wise linear encoding with decision trees [PLE-DTs]) and 3 architectures (long short-term memory/gated recurrent unit [LSTM/GRU], temporal convolutional network, and time-distributed wrapper with convolutional neural network [TDW-CNN]) were compared in each clinical task. Model discrimination was evaluated using the area under the precision-recall curve (AUPRC) and the area under the receiver operating characteristic curve (AUROC).
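
The three feature transformations named above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the heart-rate values, and the bin edges are hypothetical, and the piece-wise linear encoding follows one commonly described formulation in which each bin contributes 0 (value below the bin), 1 (value above the bin), or the fractional position inside the bin. In PLE-DT the bin edges would come from decision-tree split thresholds fitted on training data; here they are simply hard-coded.

```python
from statistics import mean, pstdev

def normalize(values, lo, hi):
    """Min-max scale values to [0, 1] using bounds taken from training data."""
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def standardize(values, mu, sigma):
    """Center and scale values using a training-set mean and standard deviation."""
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def piecewise_linear_encode(x, edges):
    """Encode a scalar against sorted bin edges b_0 < b_1 < ... < b_T.

    Returns T components, one per bin. The first and last bins are left
    unclipped so values outside the edge range stay distinguishable.
    """
    n_bins = len(edges) - 1
    encoded = []
    for t in range(n_bins):
        left, right = edges[t], edges[t + 1]
        if x < left and t > 0:
            encoded.append(0.0)          # x lies entirely below this bin
        elif x >= right and t < n_bins - 1:
            encoded.append(1.0)          # x lies entirely above this bin
        else:
            encoded.append((x - left) / (right - left))  # position inside bin
    return encoded

# Hypothetical training values for one vital sign (illustrative only).
train_heart_rate = [60.0, 80.0, 100.0]
mu, sigma = mean(train_heart_rate), pstdev(train_heart_rate)

scaled = normalize([80.0], min(train_heart_rate), max(train_heart_rate))
z_scores = standardize([80.0], mu, sigma)

# Hypothetical bin edges standing in for decision-tree split thresholds.
ple = piecewise_linear_encode(15.0, [0.0, 10.0, 20.0, 40.0])
```

Normalization and standardization each yield one number per feature, whereas the piece-wise linear encoding expands a feature into one component per bin, so the model input width grows with the number of bins.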

RESULTS:

The study comprised 373 825 admissions for training and 256 128 admissions for testing. LSTM/GRU and TDW-CNN models tied, with each obtaining the highest mean AUPRC in 2 tasks, and LSTM/GRU had the highest mean AUROC across all tasks (deterioration 0.81, AKI 0.92, infection 0.87). PLE-DT with LSTM/GRU achieved the highest AUPRC in all tasks.
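
For readers unfamiliar with the two discrimination metrics, the following sketch computes both on a toy example. It is an illustration under stated assumptions rather than the study's evaluation code: AUROC is computed as the fraction of positive-negative pairs ranked correctly (ties count as half), AUPRC is approximated by average precision, and the labels and scores are made up. The abstract does not say which AUPRC estimator the authors used.

```python
def auroc(labels, scores):
    """Probability that a random positive case outscores a random negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case.

    Assumes scores are distinct; tied scores would need extra handling.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / n_pos

# Made-up labels (1 = event) and model scores, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

toy_auroc = auroc(toy_labels, toy_scores)            # 3 of 4 pairs ranked correctly
toy_ap = average_precision(toy_labels, toy_scores)   # (1/1 + 2/3) / 2
```

Average precision depends on how common the event is, while AUROC does not, which is one reason the two metrics can rank models differently when outcomes are rare.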

DISCUSSION:

When externally validated in 3 clinical tasks, the LSTM/GRU model architecture with PLE-DT transformed data demonstrated the highest AUPRC in all tasks. Multiple models achieved similar performance when evaluated using AUROC.

CONCLUSION:

The LSTM architecture performs as well as or better than some newer architectures, and PLE-DT may enhance the AUPRC in variable-length time series data for predicting clinical outcomes during external validation.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Deep Learning Limits: Female / Humans / Male / Middle aged Language: En Journal: J Am Med Inform Assoc Journal subject: MEDICAL INFORMATICS Year of publication: 2024 Document type: Article Affiliation country: United States
