Decision analysis framework for predicting no-shows to appointments using machine learning algorithms.

Deina, Carolina; Fogliatto, Flavio S; da Silveira, Giovani J C; Anzanello, Michel J

Affiliation

Deina C; Department of Industrial Engineering, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha, 99, 5° Andar, Porto Alegre, 90035-190, Brazil. caroldeina@gmail.com.
Fogliatto FS; Department of Industrial Engineering, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha, 99, 5° Andar, Porto Alegre, 90035-190, Brazil.
da Silveira GJC; Haskayne School of Business, University of Calgary, 2500 University Dr NW, Calgary, AB, T2N 1N4, Canada.
Anzanello MJ; Department of Industrial Engineering, Federal University of Rio Grande do Sul, Av. Osvaldo Aranha, 99, 5° Andar, Porto Alegre, 90035-190, Brazil.

BMC Health Serv Res ; 24(1): 37, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-38183029

ABSTRACT

BACKGROUND:

No-shows at medical appointments have significant adverse effects on healthcare systems and their clients. Using machine learning to predict no-shows allows managers to target strategies such as overbooking and reminders at the patients most likely to miss appointments, making better use of resources.

METHODS:

In this study, we propose a detailed analytical framework for predicting no-shows while addressing imbalanced datasets. The framework includes a novel use of z-fold cross-validation, performed twice during the modeling process, to improve model robustness and generalization. We also introduce Symbolic Regression (SR) as a classification algorithm and Instance Hardness Threshold (IHT) as a resampling technique, and compare their performance with that of other classification algorithms, such as K-Nearest Neighbors (KNN) and Support Vector Machine (SVM), and other resampling techniques, such as Random Under Sampling (RUS), Synthetic Minority Oversampling Technique (SMOTE) and NearMiss-1. We validated the framework using two attendance datasets from Brazilian hospitals with no-show rates of 6.65% and 19.03%.
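As a rough illustration of how resampling can sit inside cross-validation, the sketch below builds stratified folds, repeats the split, and applies random undersampling (RUS) to the training portion only, so each held-out fold keeps its original class imbalance. It is not the authors' procedure: the function names, the fold count, the way the split is repeated, and the synthetic labels are all assumptions made for illustration, and IHT and SR are not implemented here.

```python
import random
from collections import Counter


def stratified_folds(labels, n_folds, rng):
    """Split indices into n_folds folds that each keep roughly the class ratio."""
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def random_undersample(indices, labels, rng):
    """Randomly drop indices from larger classes until all classes are equal in size."""
    by_class = {}
    for idx in indices:
        by_class.setdefault(labels[idx], []).append(idx)
    target = min(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(rng.sample(members, target))
    return balanced


def repeated_cv_splits(labels, n_folds=5, n_repeats=2, seed=0):
    """Yield (repeat, fold, balanced_train_indices, test_indices) tuples."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        folds = stratified_folds(labels, n_folds, rng)
        for k, test_idx in enumerate(folds):
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            # Resample the training part only; the test fold stays imbalanced.
            yield repeat, k, random_undersample(train_idx, labels, rng), test_idx


# Synthetic labels (hypothetical): 1 = no-show, 0 = attended, 19% no-shows.
labels = [1] * 19 + [0] * 81
splits = list(repeated_cv_splits(labels))
print(Counter(labels[i] for i in splits[0][2]))  # class counts in one balanced training set
```

Resampling after the split, rather than before it, is a deliberate choice in this sketch: it keeps the performance estimate on each held-out fold tied to the class ratio the model would actually face.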

RESULTS:

From the academic perspective, our study is the first to propose using SR and IHT to predict patient no-shows. Our findings indicate that SR and IHT outperformed the other techniques. IHT in particular excelled when combined with every classification algorithm and led to low variability in the performance metrics. Our sensitivity results also exceeded those reported in the literature, with values above 0.94 for both datasets.
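For readers less familiar with the metric, sensitivity is the share of actual no-shows that the model flags, TP / (TP + FN). The following minimal sketch computes it for several validation runs and summarizes the spread across them. The predictions are made-up values for illustration only and do not come from the study.

```python
from statistics import mean, stdev


def sensitivity(y_true, y_pred, positive=1):
    """Return TP / (TP + FN) for the class marked as positive (here, no-show)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true")
    return tp / (tp + fn)


# Hypothetical (true labels, predicted labels) pairs, one per validation run.
runs = [
    ([1, 1, 0, 0, 1], [1, 1, 0, 1, 1]),
    ([1, 0, 0, 1, 1], [1, 0, 0, 0, 1]),
    ([1, 1, 1, 1, 0], [1, 1, 1, 0, 0]),
]
scores = [sensitivity(t, p) for t, p in runs]

# Mean and sample standard deviation across runs; a small deviation
# indicates that performance is stable from one run to the next.
summary = (mean(scores), stdev(scores))
print(scores, summary)
```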

CONCLUSION:

This is the first study to use SR and IHT methods to predict patient no-shows and the first to propose performing z-fold cross-validation twice. Our study highlights the importance of not relying on only a few validation runs for imbalanced datasets, as doing so may lead to biased results and an inadequate analysis of the generalization and stability of the models obtained during the training stage.

Subjects

Algorithms; Benchmarking; Humans; Brazil; Machine Learning; Decision Support Techniques

Keywords

Classification algorithms; Healthcare environments; Imbalanced dataset; Machine learning; Missed appointments; Resampling techniques

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Algorithms / Benchmarking Study type: Health_economic_evaluation / Prognostic_studies / Risk_factors_studies Limits: Humans Country/Region as subject: South America / Brazil Language: En Publication year: 2024 Document type: Article
