RESUMO
Racial and ethnic minorities bear a disproportionate burden of type 2 diabetes (T2D) and its complications, with social determinants of health (SDoH) recognized as key drivers of these disparities. Implementing efficient and effective social needs management strategies is crucial. We propose a machine learning analytic pipeline to calculate the individualized polysocial risk score (iPsRS), which can identify T2D patients at high social risk for hospitalization, incorporating explainable AI techniques and algorithmic fairness optimization. We use electronic health records (EHR) data from T2D patients in the University of Florida Health Integrated Data Repository, incorporating both contextual SDoH (e.g., neighborhood deprivation) and person-level SDoH (e.g., housing instability). After fairness optimization across racial and ethnic groups, the iPsRS achieved a C statistic of 0.71 in predicting 1-year hospitalization. Our iPsRS can fairly and accurately screen patients with T2D who are at increased social risk for hospitalization.
Assuntos
Diabetes Mellitus Tipo 2 , Hospitalização , Determinantes Sociais da Saúde , Adulto , Idoso , Feminino , Humanos , Masculino , Pessoa de Meia-Idade , Diabetes Mellitus Tipo 2/epidemiologia , Registros Eletrônicos de Saúde , Etnicidade , Florida/epidemiologia , Hospitalização/estatística & dados numéricos , Aprendizado de Máquina , Medição de Risco/métodos , Fatores de Risco , Grupos RaciaisRESUMO
Background: Racial and ethnic minority groups and individuals facing social disadvantages, which often stem from their social determinants of health (SDoH), bear a disproportionate burden of type 2 diabetes (T2D) and its complications. It is crucial to implement effective social risk management strategies at the point of care. Objective: To develop an electronic health records (EHR)-based machine learning (ML) analytical pipeline to address unmet social needs associated with hospitalization risk in patients with T2D. Methods: We identified real-world patients with T2D from the EHR data from University of Florida (UF) Health Integrated Data Repository (IDR), incorporating both contextual SDoH (e.g., neighborhood deprivation) and individual-level SDoH (e.g., housing instability). The 2015-2020 data were used for training and validation and 2021-2022 data for independent testing. We developed a machine learning analytic pipeline, namely individualized polysocial risk score (iPsRS), to identify high social risk associated with hospitalizations in T2D patients, along with explainable AI (XAI) and fairness optimization. Results: The study cohort included 10,192 real-world patients with T2D, with a mean age of 59 years and 58% female. Of the cohort, 50% were non-Hispanic White, 39% were non-Hispanic Black, 6% were Hispanic, and 5% were other races/ethnicities. Our iPsRS, including both contextual and individual-level SDoH as input factors, achieved a C statistic of 0.72 in predicting 1-year hospitalization after fairness optimization across racial and ethnic groups. The iPsRS showed excellent utility for capturing individuals at high hospitalization risk because of SDoH, that is, the actual 1-year hospitalization rate in the top 5% of iPsRS was 28.1%, ~13 times as high as the bottom decile (2.2% for 1-year hospitalization rate). Conclusion: Our ML pipeline iPsRS can fairly and accurately screen for patients who have increased social risk leading to hospitalization in real word patients with T2D.