Multimorbidity in middle-aged women and COVID-19: binary data clustering for unsupervised binning of rare multimorbidity features and predictive modeling.
Benny, Dayana; Giacobini, Mario; Costa, Giuseppe; Gnavi, Roberto; Ricceri, Fulvio.
Affiliation
  • Benny D; Centre for Biostatistics, Epidemiology, and Public Health, Department of Clinical and Biological Sciences, University of Turin, Orbassano, Turin, 10043, Piedmont, Italy. dayana.benny@unito.it.
  • Giacobini M; Modeling and Data Science, Department of Mathematics, University of Turin, Via Carlo Alberto 10, Turin, 10123, Piedmont, Italy. dayana.benny@unito.it.
  • Costa G; Data Analysis and Modeling Unit, Department of Veterinary Sciences, University of Turin, Turin, Italy.
  • Gnavi R; Centre for Biostatistics, Epidemiology, and Public Health, Department of Clinical and Biological Sciences, University of Turin, Orbassano, Turin, 10043, Piedmont, Italy.
  • Ricceri F; Unit of Epidemiology, Regional Health Service, Local Health Unit Torino 3, Grugliasco, Turin, Italy.
BMC Med Res Methodol ; 24(1): 95, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38658821
ABSTRACT

BACKGROUND:

Multimorbidity is typically associated with poorer health-related quality of life in mid-life, and women have an elevated likelihood of developing it. We address data sparsity in non-prevalent features by clustering the binary data of various rare medical conditions in a cohort of middle-aged women. This study aims to improve understanding of how multimorbidity affects COVID-19 severity by clustering rare medical conditions and combining them with prevalent features for predictive modeling. The insights gained can guide the development of targeted interventions and improved management strategies for individuals with multiple health conditions.

METHODS:

The study focuses on a cohort of 4477 female patients (aged 45-60) in Piedmont, Italy, and uses multimorbidity data drawn from their medical histories for 2015 to 2019, before the COVID-19 pandemic. COVID-19 severity is determined by the hospitalization status of the patients from February to May 2020. Each patient profile in the dataset is represented as a binary vector in which each feature denotes the presence or absence of a specific medical condition. Clustering the sparse medical data generates newly engineered features, each a bin of rare features, which are combined with the prevalent features for predictive modeling of COVID-19 severity.
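
The binning step can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the distance measure, or the prevalence cutoff, so everything below is an assumption made for illustration: single-linkage agglomerative clustering of feature columns under Jaccard distance, a hypothetical cutoff of 0.5, a logical OR to form each bin, and a toy matrix of six patients and five conditions. The helper names are hypothetical as well.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two binary columns (lists of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 1.0


def split_by_prevalence(X, threshold):
    """Split column indices into (prevalent, rare) by share of positive patients."""
    n_patients = len(X)
    prevalent, rare = [], []
    for j in range(len(X[0])):
        share = sum(row[j] for row in X) / n_patients
        (prevalent if share >= threshold else rare).append(j)
    return prevalent, rare


def cluster_columns(X, cols, n_bins):
    """Single-linkage agglomerative clustering of feature columns (assumed method)."""
    columns = {j: [row[j] for row in X] for j in cols}
    clusters = [[j] for j in cols]

    def linkage(pair):
        # Single linkage: distance between the two closest member columns.
        return min(
            jaccard_distance(columns[a], columns[b])
            for a in clusters[pair[0]]
            for b in clusters[pair[1]]
        )

    while len(clusters) > n_bins:
        i, k = min(combinations(range(len(clusters)), 2), key=linkage)
        clusters[i] = clusters[i] + clusters[k]  # merge the closest pair
        del clusters[k]  # k > i, so index i is unaffected
    return clusters


def bin_features(X, prevalent, clusters):
    """Keep prevalent columns; replace each rare cluster with a logical OR."""
    return [
        [row[j] for j in prevalent] + [int(any(row[j] for j in c)) for c in clusters]
        for row in X
    ]


# Toy data (not from the study): rows are patients, columns are conditions.
X = [
    [1, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]

prevalent, rare = split_by_prevalence(X, threshold=0.5)  # hypothetical cutoff
clusters = cluster_columns(X, rare, n_bins=2)
binned = bin_features(X, prevalent, clusters)

for row in binned:
    print(row)
```

In this toy case the five input columns collapse to three: one prevalent condition plus two bins of co-occurring rare conditions. Using OR to form a bin is one plausible choice; a count of positive conditions per bin would be another.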

RESULTS:

From sparse data consisting of 174 input features, we created a low-dimensional feature matrix of 17 features. Machine learning algorithms were applied to the reduced, sparsity-free data to predict the COVID-19 hospital admission outcome. The models performed as follows: Logistic Regression (accuracy 0.72, AUC 0.77, F1-score 0.69), Linear Discriminant Analysis (accuracy 0.7, AUC 0.77, F1-score 0.67), and AdaBoost (accuracy 0.7, AUC 0.77, F1-score 0.68).
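
For readers unfamiliar with the three reported metrics, the sketch below computes accuracy, F1-score, and AUC from scratch on made-up labels and scores. It assumes a binary outcome (1 = hospitalized), a 0.5 decision threshold, and the plain binary F1; the abstract does not state the threshold or whether the F1-score was averaged across classes, and none of the numbers below come from the study.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.7]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.2f} F1={f1:.2f} AUC={auc:.4f}")
```

Accuracy and F1 depend on the chosen threshold, while AUC depends only on how the scores rank patients, which is one reason the three models above can share an AUC yet differ slightly on the other two metrics.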

CONCLUSION:

Mapping higher-dimensional data to a low-dimensional space can result in information loss, but reducing sparsity can benefit machine learning modeling by improving predictive ability. In this study, we addressed the issue of data sparsity in electronic health records and created a model that incorporates both prevalent and rare medical conditions, leading to more accurate and effective predictive modeling. The identification of complex associations between multimorbidity and the severity of COVID-19 highlights potential areas of focus for future research, including long COVID and intervention efforts.

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Multimorbidity / SARS-CoV-2 / COVID-19 Limits: Female / Humans / Middle aged Country/Region as subject: Europe Language: En Journal: BMC Med Res Methodol Journal subject: MEDICINE Year: 2024 Document type: Article Affiliation country: Italy
