Evaluating Prognostic Bias of Critical Illness Severity Scores Based on Age, Sex, and Primary Language in the United States: A Retrospective Multicenter Study.

Liu, Xiaoli; Shen, Max; Lie, Margaret; Zhang, Zhongheng; Liu, Chao; Li, Deyu; Mark, Roger G; Zhang, Zhengbo; Celi, Leo Anthony

Affiliations

Liu X; Center for Artificial Intelligence in Medicine, The General Hospital of PLA, Beijing, China.
Shen M; School of Biological Science and Medical Engineering, Beihang University, Beijing, China.
Lie M; Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA.
Zhang Z; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA.
Liu C; Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA.
Li D; Department of Emergency Medicine, Key Laboratory of Precision Medicine in Diagnosis and Monitoring Research of Zhejiang Province, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China.
Mark RG; Department of Critical Care Medicine, The First Medical Center, The General Hospital of PLA, Beijing, China.
Zhang Z; School of Biological Science and Medical Engineering, Beihang University, Beijing, China.
Celi LA; Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA.

Crit Care Explor ; 6(1): e1033, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38239408

ABSTRACT

OBJECTIVES:

Although illness severity scoring systems are widely used to support clinical decision-making and assess ICU performance, their potential bias across different age, sex, and primary language groups has not been well-studied. We aimed to identify potential bias of Sequential Organ Failure Assessment (SOFA) and Acute Physiology and Chronic Health Evaluation (APACHE) IVa scores via large ICU databases.

DESIGN, SETTING, AND PATIENTS:

This multicenter, retrospective study was conducted using data from the Medical Information Mart for Intensive Care (MIMIC) and eICU Collaborative Research Database. SOFA and APACHE IVa scores were obtained from ICU admission. Hospital mortality was the primary outcome. Discrimination (area under receiver operating characteristic [AUROC] curve) and calibration (standardized mortality ratio [SMR]) were assessed for all subgroups.
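The two metrics named above can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes each encounter already carries a predicted probability of hospital death (the hypothetical field `risk`) and an observed outcome (`died`), and the toy records are invented purely for illustration. AUROC is computed as the fraction of death/survivor pairs in which the non-survivor has the higher predicted risk, and SMR as observed deaths divided by expected deaths, so an SMR above 1 means mortality was underestimated.

```python
from collections import defaultdict


def auroc(scores, died):
    """Discrimination: probability that a randomly chosen non-survivor
    has a higher score than a randomly chosen survivor (ties count 0.5).
    Pairwise O(n^2) form, fine for illustration only."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def smr(predicted_risk, died):
    """Calibration: observed deaths / expected deaths, where expected
    deaths is the sum of predicted probabilities."""
    expected = sum(predicted_risk)
    observed = sum(1 for d in died if d)
    return observed / expected


def by_subgroup(records, key):
    """Compute AUROC and SMR separately for each value of `key`."""
    groups = defaultdict(list)
    for r in records:
        groups[r[key]].append(r)
    out = {}
    for name, rows in groups.items():
        risk = [r["risk"] for r in rows]
        died = [r["died"] for r in rows]
        out[name] = {"auroc": auroc(risk, died), "smr": smr(risk, died)}
    return out


# Invented toy data (hypothetical field names, not from MIMIC or eICU).
records = [
    {"sex": "F", "risk": 0.1, "died": 0},
    {"sex": "F", "risk": 0.4, "died": 1},
    {"sex": "F", "risk": 0.3, "died": 0},
    {"sex": "F", "risk": 0.2, "died": 1},
    {"sex": "M", "risk": 0.5, "died": 1},
    {"sex": "M", "risk": 0.2, "died": 0},
    {"sex": "M", "risk": 0.3, "died": 0},
]

for group, metrics in by_subgroup(records, "sex").items():
    print(group, metrics)
```

Comparing these per-subgroup values against those of the overall cohort is the kind of check the study describes; significance testing and confidence intervals are omitted from the sketch.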

INTERVENTIONS:

Not applicable.

MEASUREMENTS AND MAIN RESULTS:

A total of 196,310 patient encounters were studied. Discrimination for both scores was worse in older patients than in younger patients and in female patients than in male patients. In MIMIC, discrimination of SOFA was worse in patients whose primary language was not English than in English speakers (AUROC 0.726 vs. 0.783, p < 0.0001). Calibration assessed via SMR showed statistically significant underestimation of mortality, relative to the overall cohort, in the oldest patients for both SOFA and APACHE IVa, in female patients for SOFA (SMR 1.09), and in non-English primary language patients for SOFA in MIMIC (SMR 1.38).

CONCLUSIONS:

Differences in discrimination and calibration of two scores across varying age, sex, and primary language groups suggest illness severity scores are prone to bias in mortality predictions. Caution must be taken when using them for quality benchmarking and decision-making among diverse real-world populations.

Keywords

bias evaluation; calibration; discrimination; hospital mortality; illness severity scores

Full text: 1 Database: MEDLINE Study type: Clinical_trials / Observational_studies / Prognostic_studies / Risk_factors_studies Language: En Journal: Crit Care Explor Year: 2024 Document type: Article Affiliation country: China