COVID-19 Likelihood Meter: a machine learning approach to COVID-19 screening for Indonesian health workers
Shreyash Sonthalia; Muhammad Aji Muharrom; Levana L. Sani; Olivia Herlinda; Adrianna Bella; Dimitri Swashtika; Panji Hadisoemarto; Diah Saminarsih; Nurul Luntungan; Astrid Irwanto; Akmal Taher; Joseph L. Greenstein.
Affiliation
  • Shreyash Sonthalia; Nalagenetics Pte Ltd, Singapore, Singapore
  • Muhammad Aji Muharrom; Nalagenetics Pte Ltd, Singapore, Singapore
  • Levana L. Sani; Nalagenetics Pte Ltd, Singapore, Singapore
  • Olivia Herlinda; Center for Indonesia's Strategic Development Initiatives (CISDI)
  • Adrianna Bella; Center for Indonesia's Strategic Development Initiatives (CISDI)
  • Dimitri Swashtika; Center for Indonesia's Strategic Development Initiatives (CISDI)
  • Panji Hadisoemarto; Department of Public Health, Faculty of Medicine, Padjajaran University, Bandung, Indonesia
  • Diah Saminarsih; World Health Organization, Geneva, Switzerland
  • Nurul Luntungan; Center for Indonesia's Strategic Development Initiatives (CISDI)
  • Astrid Irwanto; Nalagenetics Pte Ltd, Singapore, Singapore
  • Akmal Taher; Department of Urology, Cipto Mangunkusumo Hospital, Universitas Indonesia
  • Joseph L. Greenstein; Johns Hopkins University, Baltimore, MD
Preprint in English | PREPRINT-MEDRXIV | ID: ppmedrxiv-21265021
ABSTRACT
The COVID-19 pandemic poses a heightened risk to health workers, especially in low- and middle-income countries such as Indonesia. Because mass RT-PCR testing of health workers is difficult to implement, high-performing and cost-effective methods are needed to identify COVID-19-positive health workers and protect those at the forefront of the pandemic response. This study investigated the use of machine learning classifiers to predict the risk of COVID-19 positivity (by RT-PCR) from a survey specific to health workers. Machine learning tools can enhance COVID-19 screening capacity in high-risk populations such as health workers, in settings where cost limits access to adequate testing and screening supplies. We built two sets of COVID-19 Likelihood Meter (CLM) models: one trained and tested on data from a broad population of health workers in Jakarta and Semarang (full model), and one trained on health workers from Jakarta only (Jakarta model) and tested on an independent population of Semarang health workers. Model performance was assessed with the area under the receiver-operating-characteristic curve (AUC), average precision (AP), and the Brier score (BS). Shapley additive explanations (SHAP) were used to analyze feature importance. The final dataset included 3979 health workers. For the full model, the random forest was selected as the algorithm of choice. It achieved a cross-validation mean AUC of 0.818 ± 0.022 and AP of 0.449 ± 0.028, and performed well during testing, with AUC and AP of 0.831 and 0.428, respectively. The random forest model was well calibrated, with a low mean Brier score of 0.122 ± 0.004.
A random forest classifier was the best performing model during cross-validation for the Jakarta dataset, with AUC of 0.824 ± 0.008, AP of 0.397 ± 0.019, and BS of 0.102 ± 0.007, but the extra trees classifier was selected as the model of choice due to better generalizability to the test set. The performance of the extra trees model, when tested on the independent set of Semarang health workers, was AUC of 0.672 and AP of 0.508. Our models yielded high predictive performance and may have the potential to be utilized as both a COVID-19 screening tool and a method to identify health workers at greatest risk of COVID-19 positivity, and therefore most in need of testing.
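For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of how AUC, AP, and the Brier score can be computed from predicted probabilities and RT-PCR labels. It is not the authors' code: the function names and the four-sample toy data are hypothetical, and the sketch assumes binary labels (1 = positive) with no tied scores for AP.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean precision at the rank of each positive, ranking by descending score."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / hits


# Hypothetical toy data, NOT taken from the study.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, y_prob)
ap = average_precision(y_true, y_prob)
bs = brier_score(y_true, y_prob)
print(f"AUC={auc:.3f}  AP={ap:.3f}  BS={bs:.3f}")
```

AUC and AP both reward ranking positives above negatives, but AP is more sensitive to class imbalance, which may be one reason the reported AP values are much lower than the AUC values. The Brier score instead measures how close the predicted probabilities are to the observed outcomes.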
License
cc_no
Full text: 1 Collection: 09-preprints Database: PREPRINT-MEDRXIV Study type: Observational_studies / Prognostic_studies / Rct Language: En Year: 2021 Document type: Preprint