ABSTRACT
OBJECTIVES: To enhance the identification of individuals at risk of developing clinically significant kidney stones. METHODS: In this study, data from the Fasa Adults Cohort Study were analyzed to explore factors linked to symptomatic and clinically significant kidney stone disease. After cleaning, 10,128 participants with 103 variables were studied. One outcome variable (presence of symptomatic kidney stones) and 102 predictor variables from surveys and tests were assessed. Five machine learning (ML) algorithms (SVM, RF, KNN, GBM, XGB) were applied to examine kidney stone factors, with performance comparisons made. Data balancing was done using SMOTE, and metrics like accuracy, precision, sensitivity, specificity, F1 score, and AUC were evaluated for each algorithm. RESULTS: The XGB model outperformed others with an AUC of 0.60, while RF, GBM, SVM, and KNN had AUC values of 0.58, 0.57, 0.54, and 0.52, respectively. RF, GBM, and XGB showed good accuracy at 0.81, 0.81, and 0.77, respectively. Top predictors for kidney stones were serum creatinine, salt intake, hospitalization history, sleep duration, and BUN levels. CONCLUSIONS: ML models show promise in evaluating an individual's risk of developing painful kidney stones and recommending early lifestyle changes to reduce this risk. Further research can enhance predictive accuracy and tailor interventions for better prevention/management.
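The threshold-based metrics listed in the abstract above can all be derived from the four cells of a confusion matrix. The following is a minimal sketch, not the authors' code; the function name `classification_metrics` and the toy counts are assumptions made only for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, sensitivity, specificity and F1
    from confusion-matrix counts (illustrative helper, not from the study)."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts (NOT the study's data): 1,000 people, 150 with stones.
toy = classification_metrics(tp=15, fp=40, tn=810, fn=135)
for name, value in toy.items():
    print(f"{name}: {value:.3f}")
```

With these toy counts, accuracy stays above 0.8 even though only 10% of the true cases are detected, which is one reason accuracy is usually read alongside sensitivity and AUC when the outcome is imbalanced.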
Subjects
Algorithms, Kidney Calculi, Machine Learning, Humans, Kidney Calculi/diagnosis, Female, Male, Adult, Middle Aged, Cohort Studies, Risk Factors
Subjects
Heart Failure, Machine Learning, Patient Readmission, Humans, Heart Failure/mortality, Heart Failure/therapy, Patient Readmission/statistics & numerical data, Retrospective Studies, Male, Female, Risk Assessment/methods, Aged, Risk Factors, Prognosis, Middle Aged
ABSTRACT
BACKGROUND: Factors contributing to the development of hypertension exhibit significant variations across countries and regions. Our objective was to predict individuals at risk of developing hypertension within a 5-year period in a rural Middle Eastern area. METHODS: This longitudinal study utilized data from the Fasa Adults Cohort Study (FACS). The study initially included 10,118 participants aged 35-70 years in rural districts of Fasa, Iran, with a follow-up of 3,000 participants after 5 years using random sampling. A total of 160 variables were included in the machine learning (ML) models, and feature scaling and one-hot encoding were employed for data processing. Ten supervised ML algorithms were utilized, namely logistic regression (LR), support vector machine (SVM), random forest (RF), Gaussian naive Bayes (GNB), linear discriminant analysis (LDA), k-nearest neighbors (KNN), gradient boosting machine (GBM), extreme gradient boosting (XGB), CatBoost (CAT), and light gradient boosting machine (LGBM). Hyperparameter tuning was performed using various combinations of hyperparameters to identify the optimal model. Synthetic Minority Over-sampling Technique (SMOTE) was used to balance the training data, and feature selection was conducted using SHapley Additive exPlanations (SHAP). RESULTS: Out of 2,288 participants who met the criteria, 251 individuals (10.9%) were diagnosed with new hypertension. The LGBM model (determined to be the optimal model) with the top 30 features achieved an AUC of 0.67, an F1-score of 0.23, and an AUC-PR of 0.26. The top three predictors of hypertension were baseline systolic blood pressure (SBP), gender, and waist-to-hip ratio (WHR), with AUCs of 0.66, 0.58, and 0.63, respectively. Hematuria in urine tests and family history of hypertension ranked fourth and fifth.
CONCLUSION: ML models have the potential to be valuable decision-making tools in evaluating the need for early lifestyle modification or medical intervention in individuals at risk of developing hypertension.
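The core idea behind SMOTE, which the abstract above applies to the training data, is to create synthetic minority-class rows by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea in plain Python; the function `smote_sketch`, its parameters, and the toy rows are assumptions for illustration and do not reproduce the study's pipeline. In practice a maintained implementation (for example the `SMOTE` class in the imbalanced-learn library) would normally be used.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic rows by interpolating each chosen minority row
    toward one of its k nearest minority neighbours (illustrative sketch)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows with two scaled features (not study data).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote_sketch(minority, n_new=6, k=2)
print(len(new_rows), "synthetic rows generated")
```

Because every synthetic row is a convex combination of two real minority rows, the new points stay inside the region already occupied by the minority class. Oversampling only the training split, as the abstract describes, keeps synthetic rows out of the evaluation data.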
Subjects
Hypertension, Adult, Humans, Blood Pressure, Bayes Theorem, Cohort Studies, Follow-Up Studies, Longitudinal Studies, Hypertension/diagnosis, Hypertension/epidemiology, Machine Learning
ABSTRACT
BACKGROUND: Heart failure (HF) is a global problem, affecting more than 26 million people worldwide. This study evaluated the performance of 10 machine learning (ML) algorithms and chose the best algorithm to predict mortality and readmission of HF patients by using The Fasa Registry on Systolic HF (FaRSH) database. HYPOTHESIS: ML algorithms may better identify patients at increased risk of HF readmission or death with demographic and clinical data. METHODS: Through comprehensive evaluation, the best-performing model was used for prediction. Finally, all the trained models were applied to the test data, which included 20% of the total data. For the final evaluation and comparison of the models, five metrics were used: accuracy, F1-score, sensitivity, specificity and Area Under Curve (AUC). RESULTS: Ten ML algorithms were evaluated. The CatBoost (CAT) algorithm uses a series of decision tree models to create a nonlinear model, and this CAT algorithm performed the best of the 10 models studied. According to the three final outcomes from this study, which involved 2488 participants, 366 (14.7%) of the patients were readmitted to the hospital, 97 (3.9%) of the patients died within 1 month of the follow-up, and 342 (13.7%) of the patients died within 1 year of the follow-up. The most significant variables to predict the events were length of stay in the hospital, hemoglobin level, and family history of MI. CONCLUSIONS: The ML-based risk stratification tool was able to assess the risk of 5-year all-cause mortality and readmission in patients with HF. ML could provide an explicit explanation of individualized risk prediction and give physicians an intuitive understanding of the influence of critical features in the model.
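The abstract above evaluates the trained models on a held-out 20% of the data. One common way to build such a split is to hold out the same fraction of each outcome class; whether the authors stratified their split is not stated, so the stratification here is an assumption, as are the function name `stratified_holdout` and the random seed. The toy labels only mirror the reported readmission counts (366 of 2488).

```python
import random


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class
    (illustrative sketch; stratification is an assumption)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels sized like the readmission outcome: 366 events among 2488 patients.
labels = [1] * 366 + [0] * (2488 - 366)
train_idx, test_idx = stratified_holdout(labels)
print(len(train_idx), "training rows,", len(test_idx), "test rows")
```

Stratifying keeps the event rate in the test set close to the overall rate, which matters for rare outcomes such as the 3.9% one-month mortality reported above.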
Subjects
Heart Failure, Patient Readmission, Humans, Retrospective Studies, Heart Failure/diagnosis, Heart Failure/therapy, Machine Learning, Risk Factors
ABSTRACT
INTRODUCTION: The application of machine learning (ML) is increasingly growing in biomedical sciences. This study aimed to evaluate factors associated with type 2 diabetes mellitus (T2DM) and compare the performance of ML methods in identifying individuals with the disease in an Iranian setting. METHODS: Using the baseline data from Fasa Adult Cohort Study (FACS) and in a sex-stratified manner, we studied factors associated with T2DM by applying seven different ML methods including Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbours (KNN), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGB) and Bagging classifier (BAG). We further compared the performance of these methods; for each algorithm, accuracy, precision, sensitivity, specificity, F1 score, and Area Under Curve (AUC) were calculated. RESULTS: 10,112 participants were recruited between 2014 and 2016, of whom 1246 had T2DM at baseline. 4566 (45%) participants were males, aged between 35 and 70 years. For males, age, sugar consumption, and history of hospitalization were the most weighted variables regarding their importance in screening for T2DM using the GBM model, respectively; these variables were sugar consumption, urine blood, and age for females. GBM outperformed other models for both males and females with AUC of 0.75 (0.69-0.82) and 0.76 (0.71-0.80), and F1 score of 0.33 (0.27-0.39) and 0.42 (0.38-0.46), respectively. GBM also showed a sensitivity of 0.24 (0.19-0.29) and a specificity of 0.98 (0.96-1.0) in males and a sensitivity of 0.38 (0.34-0.42) and specificity of 0.92 (0.89-0.95) in females. Notably, close performance characteristics were detected among other ML models. CONCLUSIONS: GBM model might achieve better performance in screening for T2DM in a south Iranian population.
Subjects
Diabetes Mellitus, Type 2, Adult, Female, Male, Humans, Middle Aged, Aged, Diabetes Mellitus, Type 2/diagnosis, Cohort Studies, Iran/epidemiology, Algorithms, Machine Learning, Dietary Sugars
ABSTRACT
OBJECTIVE: Accurate prediction of the morbidity and mortality outcomes of traumatic brain injury patients is still challenging. In the present study, we aimed to compare the predictive value of the Richmond and Rotterdam scoring systems as two novel computed tomography-based predictive models. METHODS: We retrospectively analyzed 1400 subjects who suffered from severe traumatic brain injury and were admitted to Emtiaz Hospital, a tertiary referral trauma center in Shiraz, south of Iran, from January 2018 to December 2019. We evaluated the 1-month results, considering two primary factors: mortality and morbidity. The patients' condition was the basis for this assessment. We conducted a logistic regression analysis to determine the association between scoring systems and outcomes. To determine the optimal threshold value, we utilized the receiver operating characteristic curve model. RESULTS: The mean age of participants was 36.61 ± 17.58 years. Concerning predicting the mortality rate, the area under the curve (AUC) for the Rotterdam score was relatively low at 0.64 (95% confidence interval: 0.60, 0.67), while the Richmond score had a higher AUC of 0.74 (0.71, 0.77), which demonstrated the superiority of this scoring system. Moreover, the Richmond score was more accurate for predicting 1-month morbidity, with an AUC of 0.71 (0.69, 0.74) versus 0.62 (0.59, 0.65). CONCLUSIONS: The Richmond scoring system demonstrated more accurate predictions for the present outcomes. The simplicity and predictive value of the Richmond score make this system an ideal option for use in emergency settings and centers with high patient loads.
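The abstract above compares two scores by AUC and picks an optimal threshold from the ROC curve. The sketch below shows one way both quantities can be computed for an integer score. The abstract does not say which criterion defined "optimal"; Youden's J (sensitivity + specificity - 1) is assumed here as a common choice. The function names and the toy scores and labels are hypothetical and are not the study's data.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score,
    classifying score >= threshold as positive."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, tp / n_pos, 1 - fp / n_neg))
    return points


def auc_rank(scores, labels):
    """AUC as the probability that a random positive case scores higher than
    a random negative case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(scores, labels):
    """Threshold maximising Youden's J (assumed criterion, see lead-in)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)[0]


# Hypothetical CT scores (1-6) and 1-month mortality labels (1 = died).
scores = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6]
labels = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
print("AUC:", round(auc_rank(scores, labels), 3))
print("Youden threshold:", youden_threshold(scores, labels))
```

Running the same two functions on each scoring system over the same patients gives directly comparable AUCs, which is the kind of comparison reported in the results.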
Subjects
Brain Injuries, Traumatic, Humans, Young Adult, Adult, Middle Aged, Retrospective Studies, Brain Injuries, Traumatic/diagnostic imaging, Tomography, X-Ray Computed/methods, Morbidity, Tertiary Care Centers, Prognosis
ABSTRACT
Predicting treatment outcomes in traumatic brain injury (TBI) patients is challenging worldwide. The present study aimed to achieve the most accurate machine learning (ML) algorithms to predict the outcomes of TBI treatment by evaluating demographic features, laboratory data, imaging indices, and clinical features. We used data from 3347 patients admitted to a tertiary trauma centre in Iran from 2016 to 2021. After the exclusion of incomplete data, 1653 patients remained. We used ML algorithms such as random forest (RF) and decision tree (DT) with ten-fold cross-validation to develop the best prediction model. Our findings reveal that among different variables included in this study, the motor component of the Glasgow coma scale, the condition of pupils, and the condition of cisterns were the most reliable features for predicting in-hospital mortality, while the patients' age takes the place of cisterns condition when considering the long-term survival of TBI patients. Also, we found that the RF algorithm is the best model to predict the short-term mortality of TBI patients. However, the generalized linear model (GLM) algorithm showed the best performance (with an accuracy rate of 82.03 ± 2.34) in predicting the long-term survival of patients. Our results showed that using appropriate markers and with further development, ML has the potential to predict TBI patients' survival in the short- and long-term.
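Ten-fold cross-validation, as used in the abstract above, partitions the data into ten folds and tests on each fold in turn while training on the other nine. The sketch below shows only the splitting and the summary step. Reading the reported "82.03 ± 2.34" as a mean and standard deviation across folds is an assumption, and the function names and the example fold scores are hypothetical; the sample size of 1653 is taken from the abstract.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that partition range(n_samples)
    into k shuffled folds (illustrative sketch)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def summarize(fold_scores):
    """Return the mean and sample standard deviation of per-fold scores."""
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)


# 1653 patients remained after exclusion, per the abstract.
splits = list(k_fold_indices(1653, k=10))
print([len(test) for _, test in splits])

# Hypothetical per-fold accuracies, only to show the summary step.
mean_acc, sd_acc = summarize([0.80, 0.82, 0.84])
print(f"{mean_acc:.2f} ± {sd_acc:.2f}")
```

Each patient appears in exactly one test fold, so every record contributes once to the evaluation and nine times to training.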