AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data.
Yuan, Han; Xie, Feng; Ong, Marcus Eng Hock; Ning, Yilin; Chee, Marcel Lucas; Saffari, Seyed Ehsan; Abdullah, Hairil Rizal; Goldstein, Benjamin Alan; Chakraborty, Bibhas; Liu, Nan.
Affiliation
  • Yuan H; Duke-NUS Medical School, National University of Singapore, Singapore.
  • Xie F; Duke-NUS Medical School, National University of Singapore, Singapore.
  • Ong MEH; Duke-NUS Medical School, National University of Singapore, Singapore; Department of Emergency Medicine, Singapore General Hospital, Singapore; Health Services Research Centre, Singapore Health Services, Singapore.
  • Ning Y; Duke-NUS Medical School, National University of Singapore, Singapore.
  • Chee ML; Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Australia.
  • Saffari SE; Duke-NUS Medical School, National University of Singapore, Singapore.
  • Abdullah HR; Duke-NUS Medical School, National University of Singapore, Singapore; Department of Anaesthesiology, Singapore General Hospital, Singapore.
  • Goldstein BA; Duke-NUS Medical School, National University of Singapore, Singapore; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States.
  • Chakraborty B; Duke-NUS Medical School, National University of Singapore, Singapore; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States; Department of Statistics and Data Science, National University of Singapore, Singapore.
  • Liu N; Duke-NUS Medical School, National University of Singapore, Singapore; Health Services Research Centre, Singapore Health Services, Singapore; Institute of Data Science, National University of Singapore, Singapore. Electronic address: liu.nan@duke-nus.edu.sg.
J Biomed Inform; 129: 104072, 2022 May.
Article in English | MEDLINE | ID: mdl-35421602
ABSTRACT

BACKGROUND:

Medical decision-making impacts both individual and public health. Clinical scores are commonly used among various decision-making models to determine the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. However, its current framework still leaves room for improvement when addressing unbalanced data of rare events.

METHODS:

Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. Baseline techniques for performance comparison included the original AutoScore, full logistic regression, stepwise logistic regression, least absolute shrinkage and selection operator (LASSO), full random forest, and random forest with a reduced number of variables. These models were evaluated based on their area under the curve (AUC) in the receiver operating characteristic analysis and their balanced accuracy (i.e., the mean of sensitivity and specificity). Using a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and the baseline approaches on the prediction of inpatient mortality.
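The two evaluation metrics named above can be illustrated with a minimal Python sketch. It is not taken from the AutoScore-Imbalance software; the function names `balanced_accuracy` and `pairwise_auc` and the toy labels are assumptions made for illustration only.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def pairwise_auc(y_true, scores):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties between an event score and a non-event score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy rare-event data (hypothetical): 2 events among 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10  # a model that never predicts the event

plain_accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
print(plain_accuracy)                              # 0.8
print(balanced_accuracy(y_true, always_negative))  # 0.5
```

On this toy data, a model that never predicts the event reaches 80% plain accuracy but only 0.5 balanced accuracy, which is why balanced accuracy is the more informative summary when events are rare.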

RESULTS:

AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839), while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.801). The AutoScore-Imbalance sub-model (using a down-sampling algorithm) yielded an AUC of 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. Furthermore, AutoScore-Imbalance obtained the highest balanced accuracy of 0.757 (0.702-0.805), compared to 0.698 (0.643-0.753) by the original AutoScore and the maximum of 0.720 (0.664-0.769) by other baseline models.
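The abstract does not describe the down-sampling algorithm used by that sub-model. As a generic illustration only, random down-sampling of the majority class might be sketched as follows; the function name `downsample_majority`, the `ratio` parameter, and the toy data are hypothetical and are not the authors' implementation.

```python
import random


def downsample_majority(rows, labels, ratio=1.0, seed=0):
    """Randomly drop majority-class rows until roughly
    ratio * (minority count) of them remain.

    rows and labels are parallel lists; labels are 0/1.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep_n = min(len(majority), int(round(ratio * len(minority))))
    kept = rng.sample(majority, keep_n)
    idx = sorted(minority + kept)  # preserve the original row order
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Toy example (hypothetical): 5 events among 100 records.
labels = [1] * 5 + [0] * 95
rows = list(range(100))
small_rows, small_labels = downsample_majority(rows, labels, ratio=1.0)
print(len(small_labels), sum(small_labels))  # 10 5
```

Down-sampling of this kind would normally be applied to the training data only, so that performance is still measured on data with the original event rate.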

CONCLUSIONS:

We have developed an interpretable tool to handle clinical data imbalance, presented its structure, and demonstrated its superiority over baselines. The AutoScore-Imbalance tool can be applied to highly unbalanced datasets to gain further insight into rare medical events and facilitate real-world clinical decision-making.

Full text: 1 | Database: MEDLINE | Main subject: Algorithms / Machine Learning | Type of study: Prognostic_studies / Risk_factors_studies | Language: En | Publication year: 2022 | Document type: Article