Building gender-specific sexually transmitted infection risk prediction models using CatBoost algorithm and NHANES data.
Hu, Mengjie; Peng, Han; Zhang, Xuan; Wang, Lefeng; Ren, Jingjing.
Affiliation
  • Hu M; Department of General Practice, First Affiliated Hospital, Zhejiang University School of Medicine, 310003, Hangzhou, China.
  • Peng H; Clinical Research Institute, Zhejiang Provincial People's Hospital (Affiliated People's Hospital of Hangzhou Medical College), Hangzhou, China.
  • Zhang X; Department of Cardiology, The First Affiliated Hospital, Zhejiang University School of Medicine, 310003, Hangzhou, China.
  • Wang L; Kidney Disease Center, the First Affiliated Hospital, College of Medicine, Zhejiang University, 310003, Hangzhou, China.
  • Ren J; Department of General Practice, First Affiliated Hospital, Zhejiang University School of Medicine, 310003, Hangzhou, China. 3204092@zju.edu.cn.
BMC Med Inform Decis Mak ; 24(1): 24, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38267946
ABSTRACT
BACKGROUND AND AIMS:

Sexually transmitted infections (STIs) are a significant global public health challenge because of their high incidence and the severe consequences that can follow when early intervention is neglected. Research shows an upward trend in the absolute number of STI cases and disability-adjusted life years (DALYs), with the age-standardized rate (ASR) of syphilis, chlamydia, trichomoniasis, and genital herpes increasing from 2010 to 2019. Machine learning (ML) offers significant advantages in disease prediction, and several studies have explored its potential for STI prediction. This study aims to build male-specific and female-specific STI risk prediction models with the CatBoost algorithm, using data from the National Health and Nutrition Examination Survey (NHANES) for training and validation, with subgroup analysis performed for each STI. The female subgroup also includes human papillomavirus (HPV) infection.

METHODS:

The study used NHANES data to build male-specific and female-specific STI risk prediction models with the CatBoost algorithm. Data came from 12,053 participants aged 18 to 59 years, with general demographic characteristics and sexual behavior questionnaire responses used as features. The Adaptive Synthetic Sampling Approach (ADASYN) algorithm was used to address class imbalance, and 15 machine learning algorithms were evaluated before CatBoost was selected. The SHAP (SHapley Additive exPlanations) method was used to improve interpretability by identifying the importance of each feature in the model's STI risk predictions.
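The oversampling step can be illustrated with a minimal sketch of the ADASYN idea: minority-class rows that are surrounded by more majority-class neighbours receive more synthetic rows, and each synthetic row is an interpolation between a minority row and one of its minority-class neighbours. This is a simplified, standard-library illustration and not the authors' implementation; the function name `adasyn` and the parameters `k`, `beta`, and `seed` are assumptions made for the example.

```python
import math
import random


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority-class rows (simplified ADASYN sketch)."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(min_idx)
    n_maj = len(y) - n_min
    # Number of synthetic rows needed to reach the requested balance level.
    total = int((n_maj - n_min) * beta)
    if total <= 0 or n_min < 2:
        return []

    def nearest(i, candidates, count):
        # Indices of the `count` candidates closest to row i, excluding i.
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:count]

    # r_i: share of majority-class rows among the k nearest neighbours.
    ratios = []
    for i in min_idx:
        neigh = nearest(i, range(len(X)), k)
        ratios.append(sum(y[j] != minority for j in neigh) / len(neigh))

    norm = sum(ratios)
    if norm == 0:
        # No minority row has majority neighbours: spread rows evenly.
        weights = [1 / n_min] * n_min
    else:
        weights = [r / norm for r in ratios]

    synthetic = []
    for i, w in zip(min_idx, weights):
        min_neigh = nearest(i, min_idx, k)
        for _ in range(round(w * total)):
            j = rng.choice(min_neigh)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic
```

Because each minority row's share is rounded separately, the number of generated rows can differ slightly from the target. In practice, oversampling of this kind is normally applied to the training split only, so that validation data contain no synthetic rows.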

RESULTS:

Among males, the CatBoost classifier achieved AUC values of 0.9995, 0.9948, 0.9923, 0.9996, and 0.9769 for predicting chlamydia, genital herpes, genital warts, gonorrhea, and overall STI infection, respectively. Among females, it achieved AUC values of 0.9971, 0.972, 0.9765, 1, 0.9485, and 0.8819 for predicting chlamydia, genital herpes, genital warts, gonorrhea, HPV, and overall STI infection, respectively. Having sex with a new partner (per year), the number of times having sex without a condom (per year), and the lifetime number of female vaginal sex partners were the top three predictors of overall STI risk in males. Similarly, ever having anal sex with a man, age, and the lifetime number of male vaginal sex partners were the top three predictors of overall STI risk in females.
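For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen infected participant receives a higher predicted risk than a randomly chosen uninfected one, so an AUC of 1 means every positive case is ranked above every negative case. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; library implementations typically use a sort-based equivalent.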

CONCLUSIONS:

This study demonstrated the effectiveness of the CatBoost classifier in predicting STI risks among both male and female populations. The SHAP algorithm revealed key predictors for each infection, highlighting consistent demographic characteristics and sexual behaviors across different STIs. These insights can guide targeted prevention strategies and interventions to alleviate the impact of STIs on public health.
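The feature attributions that SHAP reports are based on Shapley values. As a rough illustration of the underlying idea, the sketch below computes exact Shapley values for a tiny model by averaging each feature's marginal contribution over all subsets of the other features, with "absent" features set to a baseline value. The function name `shapley_values`, the baseline convention, and the toy model are assumptions for the example; SHAP libraries use far more efficient algorithms for tree models such as CatBoost, and this is not the authors' code.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to `baseline`.

    Enumerates every feature subset, so it is only practical for a
    handful of features.
    """
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value; the rest use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical linear risk score with three features.
def toy_model(v):
    return 2 * v[0] + 3 * v[1] - v[2]


print(shapley_values(toy_model, [1, 1, 1], [0, 0, 0]))
```

The attributions always sum to the difference between the prediction at `x` and the prediction at the baseline, which is the property that makes per-feature rankings like the "top three predictors" comparable across features.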

Full text: 1 | Database: MEDLINE | Main subject: Warts / Gonorrhea / Herpes Genitalis / Sexually Transmitted Diseases / Papillomavirus Infections | Type of study: Etiology_studies / Prognostic_studies / Qualitative_research / Risk_factors_studies | Language: En | Journal: BMC Med Inform Decis Mak / BMC med. inform. decis. mak. (Online) / BMC medical informatics and decision making (Online) | Journal subject: MEDICAL INFORMATICS | Year: 2024 | Document type: Article