Using random forest and biomarkers for differentiating COVID-19 and Mycoplasma pneumoniae infections.
Sci Rep; 14(1): 22673, 2024 09 30.
Article in En | MEDLINE | ID: mdl-39349769
ABSTRACT
The COVID-19 pandemic has underscored the critical need for precise diagnostic methods to distinguish between similar respiratory infections, such as COVID-19 and Mycoplasma pneumoniae (MP). Identifying key biomarkers and utilizing machine learning techniques, such as random forest analysis, can significantly improve diagnostic accuracy. We conducted a retrospective analysis of clinical and laboratory data from 214 patients with acute respiratory infections, collected between October 2022 and October 2023 at the Second Hospital of Nanping. The study population was categorized into three groups: COVID-19 positive (n = 52), MP positive (n = 140), and co-infected (n = 22). Key biomarkers, including C-reactive protein (CRP), procalcitonin (PCT), interleukin-6 (IL-6), and white blood cell (WBC) counts, were evaluated. Correlation analyses were conducted to assess relationships between biomarkers within each group. Random forest analysis was applied to evaluate the discriminative power of these biomarkers. The random forest model demonstrated high classification performance, with area under the ROC curve (AUC) scores of 0.86 (95% CI 0.70-0.97) for COVID-19, 0.79 (95% CI 0.64-0.92) for MP, 0.69 (95% CI 0.50-0.87) for co-infections, and 0.90 (95% CI 0.83-0.95) for the micro-average ROC. Additionally, the precision-recall curve for the random forest classifier showed a micro-average AUC of 0.80 (95% CI 0.69-0.91). Confusion matrices highlighted the model's accuracy (0.77) and biomarker relationships. The SHAP feature importance analysis indicated that age (0.27), CRP (0.25), IL-6 (0.14), and PCT (0.14) were the most significant predictors. The integration of computational methods, particularly random forest analysis, in evaluating clinical and biomarker data presents a promising approach for enhancing diagnostic processes for infectious diseases.
Our findings support the use of specific biomarkers in differentiating between COVID-19 and MP, potentially leading to more targeted and effective diagnostic strategies. This study underscores the potential of machine learning techniques in improving disease classification in the era of precision medicine.
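The abstract reports one AUC per class plus a micro-average AUC for a three-class problem. The sketch below illustrates how such figures are commonly derived, assuming one-vs-rest scoring (the abstract does not state the exact procedure). The function names, class labels as strings, and the four-patient probability table are all hypothetical and are not study data.

```python
from typing import Dict, List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_aucs(
    y_true: Sequence[str],
    proba: Sequence[Sequence[float]],
    classes: Sequence[str],
) -> Dict[str, float]:
    """Per-class one-vs-rest AUCs plus a micro-average over pooled decisions."""
    result: Dict[str, float] = {}
    pooled_labels: List[int] = []
    pooled_scores: List[float] = []
    for k, name in enumerate(classes):
        # Binarize: the current class is positive, every other class is negative.
        labels = [1 if y == name else 0 for y in y_true]
        scores = [row[k] for row in proba]
        result[name] = roc_auc(labels, scores)
        # Micro-averaging pools every (sample, class) pair into one binary task.
        pooled_labels.extend(labels)
        pooled_scores.extend(scores)
    result["micro"] = roc_auc(pooled_labels, pooled_scores)
    return result


# Hypothetical toy input: predicted class probabilities for four patients.
classes = ["COVID-19", "MP", "co-infected"]
y_true = ["COVID-19", "MP", "MP", "co-infected"]
proba = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.4, 0.5, 0.1],
    [0.3, 0.4, 0.3],
]
aucs = one_vs_rest_aucs(y_true, proba, classes)
for name, value in aucs.items():
    print(f"{name}: {value:.3f}")
```

Because micro-averaging pools all sample-class pairs, it weights the result toward the larger groups, which may be one reason a micro-average can exceed every per-class value, as it does in the reported figures.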
Database: MEDLINE
Main subject: Pneumonia, Mycoplasma / C-Reactive Protein / Biomarkers / Machine Learning / Procalcitonin / COVID-19
Limits: Adult / Aged / Female / Humans / Male / Middle Aged
Language: En
Journal: Sci Rep
Year: 2024
Document type: Article
Affiliation country: China
Country of publication: United Kingdom