ABSTRACT
BACKGROUND: Timely diagnosis is crucial for sepsis treatment. Current machine learning (ML) models suffer from high complexity and limited applicability. We therefore created an ML model using only complete blood count (CBC) diagnostics.

METHODS: We collected non-intensive care unit (non-ICU) data from a German tertiary care centre (January 2014 to December 2021). Using patient age, sex, and CBC parameters (haemoglobin, platelets, mean corpuscular volume, white and red blood cells), we trained a boosted random forest to predict sepsis with ICU admission. Two external validations were conducted using data from another German tertiary care centre and the Medical Information Mart for Intensive Care IV database (MIMIC-IV). Using the subset of laboratory orders that also included procalcitonin (PCT), an analogous model was trained with PCT as an additional feature.

RESULTS: After exclusions, 1 381 358 laboratory requests were available, of which 2016 were from sepsis cases. The CBC model achieved an area under the receiver operating characteristic curve (AUROC) of 0.872 (95% CI, 0.857-0.887). External validations yielded AUROCs of 0.805 (95% CI, 0.787-0.824) for University Medicine Greifswald and 0.845 (95% CI, 0.837-0.852) for MIMIC-IV. The model including PCT achieved a significantly higher AUROC (0.857; 95% CI, 0.836-0.877) than PCT alone (0.790; 95% CI, 0.759-0.821; P < 0.001).

CONCLUSIONS: Our results demonstrate that routine CBC results, when combined with ML, could significantly improve the diagnosis of sepsis. The CBC model can facilitate early sepsis prediction in non-ICU patients and remained robust in external validations. Its implementation in clinical decision support systems has strong potential to provide an essential time advantage and increase patient safety.
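
To illustrate the metric reported above, the following minimal Python sketch computes an AUROC together with a percentile-bootstrap 95% confidence interval. It is not the authors' implementation: the abstract does not state how the intervals were derived, so the bootstrap procedure is an assumption, and the function names `auroc` and `bootstrap_ci` are hypothetical. Labels are assumed to be coded 1 for sepsis and 0 otherwise.

```python
import random


def auroc(labels, scores):
    """AUROC from the Mann-Whitney U statistic; tied scores get average ranks."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    # Rank all scores in ascending order, averaging ranks within tie groups.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUROC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples containing only one class; AUROC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy, synthetic data for illustration only (not study data).
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.30, 0.90]
toy_auc = auroc(toy_labels, toy_scores)
toy_ci = bootstrap_ci(toy_labels, toy_scores, n_boot=500)
```

With a sepsis prevalence as low as the one reported here, a resampling scheme stratified by class would be a reasonable alternative to the plain resampling sketched above.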