Results 1 - 12 of 12
1.
Diabetes Metab Syndr ; 17(12): 102919, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38091881

ABSTRACT

BACKGROUND AND OBJECTIVE: Diabetic retinopathy (DR) is a global health concern among diabetic patients. The objective of this study was to propose an explainable machine learning (ML)-based system for predicting the risk of DR. MATERIALS AND METHODS: This study utilized publicly available cross-sectional data from a Chinese cohort of 6374 respondents. We employed Boruta and least absolute shrinkage and selection operator (LASSO)-based feature selection methods to identify the common predictors of DR. Using the identified predictors, we trained and optimized four widely applicable models (artificial neural network, support vector machine, random forest, and extreme gradient boosting (XGBoost)) to predict patients with DR. Moreover, SHapley Additive exPlanations (SHAP) was adopted to show the contribution of each predictor of DR to the prediction. RESULTS: Combining the Boruta and LASSO methods revealed that community, TCTG, HDLC, BUN, FPG, HbA1c, weight, and duration were the most important predictors of DR. The XGBoost-based model outperformed the other models, with an accuracy of 90.01%, precision of 91.80%, recall of 97.91%, F1 score of 94.86%, and AUC of 0.850. Moreover, the SHAP method showed that HbA1c, community, FPG, TCTG, duration, and UA1b were the influential predictors of DR. CONCLUSION: The proposed integrated system will be helpful as a tool for selecting significant predictors and can identify patients in China who are at high risk of DR at an early stage.


Subjects
Diabetes Mellitus , Diabetic Retinopathy , Humans , Diabetic Retinopathy/diagnosis , Diabetic Retinopathy/epidemiology , Diabetic Retinopathy/etiology , Cross-Sectional Studies , Algorithms , Machine Learning , Risk Factors
2.
PLoS One ; 18(8): e0289613, 2023.
Article in English | MEDLINE | ID: mdl-37616271

ABSTRACT

BACKGROUND AND OBJECTIVES: Hypertension (HTN), a major global health concern, is a leading cause of cardiovascular disease, premature death, and disability worldwide. It is important to develop an automated system to diagnose HTN at an early stage. Therefore, this study devised a machine learning (ML) system for predicting patients at risk of developing HTN in Ethiopia. MATERIALS AND METHODS: The HTN data were collected in Ethiopia and included 612 respondents with 27 factors. We employed a Boruta-based feature selection method to identify the important risk factors of HTN. Four well-known models [logistic regression, artificial neural network, random forest, and extreme gradient boosting (XGB)] were developed on the training set to predict HTN patients using the selected risk factors. The performance of the models was evaluated by accuracy, precision, recall, F1-score, and area under the curve (AUC) on the testing set. Additionally, the SHapley Additive exPlanations (SHAP) method, one of the explainable artificial intelligence (XAI) methods, was used to investigate the associated predictive risk factors of HTN. RESULTS: The overall prevalence of HTN was 21.2%. This study showed that the XGB-based model was the most appropriate model for predicting patients at risk of HTN, achieving an accuracy of 88.81%, precision of 89.62%, recall of 97.04%, F1-score of 93.18%, and AUC of 0.894. XGB with SHAP analysis revealed that age, weight, fat, income, body mass index, diabetes mellitus, salt, history of HTN, drinking, and smoking were the associated risk factors for developing HTN. CONCLUSIONS: The proposed framework provides an effective tool for accurately predicting individuals in Ethiopia who are at risk of developing HTN at an early stage and may help with early prevention and individualized treatment.
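
The F1-score reported in this abstract can be cross-checked against the reported precision and recall, because F1 is their harmonic mean. The following is a minimal sketch of that check; the function name `f1_from_precision_recall` is illustrative and is not taken from the study.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision (89.62%) and recall (97.04%) as reported for the XGB model above.
f1 = f1_from_precision_recall(0.8962, 0.9704)
print(f"F1 = {f1:.4f}")
```

The result agrees with the reported 93.18% to two decimal places.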


Subjects
Hypertension , Humans , Cross-Sectional Studies , Ethiopia/epidemiology , Hypertension/diagnosis , Hypertension/epidemiology , Algorithms , Machine Learning , Risk Factors
3.
Health Syst (Basingstoke) ; 12(2): 243-254, 2023.
Article in English | MEDLINE | ID: mdl-37234468

ABSTRACT

This study identified the risk factors for type 2 diabetes (T2D) and proposed a machine learning (ML) technique for predicting T2D. The risk factors for T2D were identified by multiple logistic regression (MLR) using the p-value (p<0.05). Then, five ML-based techniques (logistic regression, naïve Bayes, J48, multilayer perceptron, and random forest (RF)) were employed to predict T2D. This study utilized two publicly available datasets derived from the National Health and Nutrition Examination Survey, 2009-2010 and 2011-2012. The 2009-2010 dataset included 4922 respondents, of whom 387 were T2D patients, whereas the 2011-2012 dataset included 4936 respondents, of whom 373 were T2D patients. This study identified six risk factors (age, education, marital status, SBP, smoking, and BMI) for 2009-2010 and nine risk factors (age, race, marital status, SBP, DBP, direct cholesterol, physical activity, smoking, and BMI) for 2011-2012. The RF-based classifier obtained 95.9% accuracy, 95.7% sensitivity, 95.3% F-measure, and 0.946 area under the curve.

4.
PLoS One ; 17(10): e0276718, 2022.
Article in English | MEDLINE | ID: mdl-36301890

ABSTRACT

BACKGROUND AND OBJECTIVE: Low birth weight (LBW) is a major risk factor for child mortality and morbidity during infancy (0-3 years) and early childhood (3-8 years) in low and lower-middle-income countries, including Bangladesh. LBW is a vital public health concern in Bangladesh. The objective of the research was to investigate the socioeconomic inequality in the prevalence of LBW among singleton births and to identify the significantly associated determinants of singleton LBW in Bangladesh. MATERIALS AND METHODS: The data utilized in this research were derived from the latest nationally representative Bangladesh Demographic and Health Survey, 2017-18, and included a total of 2327 respondents. The concentration index (C-index) and concentration curve were used to investigate the socioeconomic inequality in LBW among singleton newborns. Additionally, an adjusted binary logistic regression model was utilized to calculate adjusted odds ratios and p-values (<0.05) and so identify the significant determinants of LBW. RESULTS: The overall prevalence of LBW among singleton births in Bangladesh was 14.27%. We observed that LBW rates were inequitably distributed across the socioeconomic groups (C-index: -0.096, 95% confidence interval: [-0.175, -0.016], P = 0.029), with a higher concentration of LBW infants among mothers in the lowest wealth quintile (poorest). Regression analysis revealed that maternal age, region, maternal education level, wealth index, height, age at first birth, and the child's survival status (alive or dead) at the time of the survey were significantly associated determinants of LBW in Bangladesh. CONCLUSION: In this study, socioeconomic disparity in the prevalence of singleton LBW was evident in Bangladesh. The incidence of LBW might be reduced by improving the socioeconomic status of poor families, paying special attention to mothers who have no education and live in low-income households in the eastern divisions (e.g., Sylhet, Chittagong). Governments, agencies, and non-governmental organizations should address the multifaceted issues and implement preventive programs and policies in Bangladesh to reduce LBW.
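
The concentration index used in this abstract is commonly computed as twice the covariance between the health outcome and each individual's fractional rank in the wealth distribution, divided by the mean of the outcome. The sketch below follows that standard covariance form on toy data. It is an illustration only: the data are hypothetical, the function name `concentration_index` is invented here, ties in wealth are not given special treatment, and the survey weighting the authors would have needed is ignored.

```python
from statistics import mean


def concentration_index(outcome, wealth):
    """C = 2 * cov(outcome, fractional wealth rank) / mean(outcome)."""
    n = len(outcome)
    # Order individuals from poorest to richest.
    order = sorted(range(n), key=lambda i: wealth[i])
    h = [outcome[i] for i in order]
    # Fractional rank of each individual in the wealth distribution.
    r = [(k + 0.5) / n for k in range(n)]
    mu_h, mu_r = mean(h), mean(r)
    cov = sum((hi - mu_h) * (ri - mu_r) for hi, ri in zip(h, r)) / n
    return 2 * cov / mu_h


# Hypothetical toy data (not from the survey): 1 = LBW, 0 = not LBW.
wealth = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
lbw = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
c = concentration_index(lbw, wealth)
print(f"C-index = {c:.3f}")
```

A negative value, as in the abstract's -0.096, indicates that the outcome is concentrated among poorer households; a value of zero indicates no wealth-related inequality.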


Subjects
Infant, Low Birth Weight , Mothers , Infant , Infant, Newborn , Child , Female , Child, Preschool , Humans , Prevalence , Bangladesh/epidemiology , Social Class , Risk Factors , Socioeconomic Factors , Birth Weight
5.
Diabetes Metab Syndr ; 15(5): 102263, 2021.
Article in English | MEDLINE | ID: mdl-34482122

ABSTRACT

AIMS: This research work presented a comparative study of machine learning (ML) with two objectives: (i) determination of the risk factors of diabetic nephropathy (DN) based on principal component analysis (PCA) at different cutoffs; (ii) prediction of DN patients using ML-based techniques. METHODS: The combination of PCA and ML-based techniques was implemented to select the best features at different PCA cutoff values and to choose the optimal PCA cutoff at which the ML-based techniques give the highest accuracy. These optimum features were fed into six ML-based techniques: linear discriminant analysis, support vector machine (SVM), logistic regression, k-nearest neighbors, naïve Bayes, and artificial neural network. The leave-one-out cross-validation protocol was executed, and the performance of the ML-based techniques was compared using accuracy and area under the curve (AUC). RESULTS: The data utilized in this work consist of 133 respondents, including 73 DN patients with an average age of 69.6±10.2 years; 54.2% of the DN patients were female. Our findings illustrate that PCA combined with the SVM-RBF classifier yields 88.7% accuracy and 0.91 AUC at a PCA cutoff of 0.96. CONCLUSIONS: This study also suggests that PCA combined with the SVM-RBF classifier may correctly classify DN patients with the highest accuracy when compared to the models published in the existing research. Prospective studies are warranted to further validate the applicability of our model in clinical settings.
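
The leave-one-out cross-validation protocol mentioned in this abstract holds out each respondent once and trains on all the others, so a dataset of 133 respondents requires 133 model fits. The sketch below illustrates only the protocol. The classifier is a stand-in 1-nearest-neighbour rule, not the study's PCA plus SVM-RBF pipeline, and the data and function names are hypothetical.

```python
def loocv_accuracy(X, y, fit_predict):
    """Leave-one-out CV: hold out each sample once and train on the rest."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        correct += fit_predict(X_train, y_train, X[i]) == y[i]
    return correct / len(X)


def one_nn(X_train, y_train, x):
    """Stand-in classifier (NOT the study's SVM-RBF): 1-nearest neighbour."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = min(range(len(X_train)), key=lambda j: sq_dist(X_train[j], x))
    return y_train[nearest]


# Hypothetical toy data: two well-separated groups of three samples each.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]]
y = [0, 0, 0, 1, 1, 1]
acc = loocv_accuracy(X, y, one_nn)
print(f"LOOCV accuracy = {acc:.2f}")
```

Any classifier with the same `fit_predict(X_train, y_train, x)` shape could be swapped in for `one_nn`.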


Subjects
Bayes Theorem , Diabetes Mellitus, Type 2/complications , Diabetic Nephropathies/diagnosis , Machine Learning , Principal Component Analysis , Risk Assessment/methods , Support Vector Machine , Case-Control Studies , Diabetic Nephropathies/etiology , Female , Follow-Up Studies , Humans , Male , Middle Aged , Pilot Projects , Prognosis , Reproducibility of Results
6.
PLoS One ; 16(6): e0253172, 2021.
Article in English | MEDLINE | ID: mdl-34138925

ABSTRACT

AIMS: Malnutrition is a major health issue among Bangladeshi under-five (U5) children. Children are malnourished if the calories and proteins they take in through their diet are not sufficient for their growth and maintenance. The goal of the research was to use machine learning (ML) algorithms to identify the risk factors for malnutrition (stunting, wasting, and underweight) and to predict it. METHODS: This work utilized malnutrition data derived from the Bangladesh Demographic and Health Survey conducted in 2014. The selected dataset consisted of 7079 children with 13 factors. The potential risk factors for malnutrition were identified by logistic regression (LR). Moreover, three ML classifiers (support vector machine (SVM), random forest (RF), and LR) were implemented for predicting malnutrition, and the performance of these ML algorithms was assessed on the basis of accuracy. RESULTS: The average prevalence of stunting, wasting, and underweight was 35.4%, 15.4%, and 32.8%, respectively. LR identified five risk factors for stunting and underweight, as well as four factors for wasting. Results illustrated that RF classified stunted, wasted, and underweight children most accurately, obtaining the highest accuracy of 88.3% for stunted, 87.7% for wasted, and 85.7% for underweight. CONCLUSION: This research focused on the identification and prediction of major risk factors for stunting, wasting, and underweight using ML algorithms, which will aid policymakers in reducing malnutrition among Bangladesh's U5 children.


Subjects
Growth Disorders/etiology , Malnutrition/etiology , Thinness/etiology , Wasting Syndrome/etiology , Age Factors , Algorithms , Bangladesh , Child, Preschool , Diet , Female , Growth Disorders/epidemiology , Humans , Infant , Machine Learning , Male , Malnutrition/epidemiology , Prevalence , Risk Factors , Socioeconomic Factors , Thinness/epidemiology , Wasting Syndrome/epidemiology
7.
Diabetes Metab Syndr ; 15(3): 877-884, 2021.
Article in English | MEDLINE | ID: mdl-33892404

ABSTRACT

BACKGROUND AND AIMS: Hypertension has become a major public health issue as the prevalence and risk of premature death and disability among adults due to hypertension have increased globally. The main objective is to characterize the risk factors of hypertension among adults in Bangladesh using machine learning (ML) algorithms. MATERIALS AND METHODS: The hypertension data were derived from the Bangladesh Demographic and Health Survey, 2017-18, which included 6965 people aged 35 and above. Two of the most promising risk factor identification methods, namely least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVMRFE), were implemented to detect the critical risk factors of hypertension. Additionally, four well-known ML algorithms, namely artificial neural network, decision tree, random forest, and gradient boosting (GB), were used to predict hypertension. The performance of these algorithms was evaluated by accuracy, precision, recall, F-measure, and area under the curve (AUC). RESULTS: The results show that age, BMI, wealth index, working status, and marital status for LASSO, and age, BMI, marital status, diabetes, and region for SVMRFE, appear to be the top five significant risk factors for hypertension. Our findings reveal that the SVMRFE-GB combination gives the maximum accuracy (66.98%), recall (97.92%), F-measure (78.99%), and AUC (0.669) compared to the others. CONCLUSION: The GB-based algorithm proved the best performer for predicting hypertension at an early stage in Bangladesh. Therefore, this study strongly suggests that policymakers use the SVMRFE-GB combination when making judgments about controlling hypertension, to save time and reduce cost for Bangladeshi adults.


Subjects
Algorithms , Databases, Factual , Hypertension/epidemiology , Machine Learning , Neural Networks, Computer , Adult , Aged , Bangladesh/epidemiology , Female , Follow-Up Studies , Humans , Male , Middle Aged , Prognosis , Risk Factors
8.
Diabetes Metab Syndr ; 14(3): 217-219, 2020.
Article in English | MEDLINE | ID: mdl-32193086

ABSTRACT

BACKGROUND AND AIMS: Diabetes has been recognized as a continuing health challenge for the twenty-first century, both in developed and developing countries including Bangladesh. The main objective of this study is to use machine learning (ML)-based classifiers for automated detection and classification of diabetes. METHODS: The diabetes dataset was taken from the Bangladesh Demographic and Health Survey, 2011, and comprised 1569 respondents, 127 of whom had diabetes. Two statistical tests, the independent t-test for continuous variables and the chi-square test for categorical variables, were used to determine the risk factors of diabetes. Six ML-based classifiers, namely support vector machine, random forest, linear discriminant analysis, logistic regression, k-nearest neighbors, and bagged classification and regression tree (Bagged CART), were adopted to predict and classify diabetes. RESULTS: Our findings show that 11 of the 15 factors are significantly associated with diabetes. Bagged CART provides the highest accuracy and area under the curve, 94.3% and 0.600, respectively. CONCLUSIONS: Bagged CART appears to be a very supportive computational resource for classifying diabetes and would be very helpful to doctors in making decisions to control diabetes in Bangladesh.
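
When reading the accuracy reported in this abstract, it may help to know the accuracy of a trivial classifier that labels every respondent as non-diabetic. The arithmetic below is a sketch of that baseline and is not part of the study. It assumes the abstract means that 127 of the 1569 respondents had diabetes.

```python
# Counts as read from the abstract (assumed: 127 of 1569 respondents diabetic).
n_total = 1569
n_diabetic = 127

# Accuracy of a trivial classifier that labels every respondent non-diabetic.
baseline_accuracy = (n_total - n_diabetic) / n_total
print(f"Majority-class baseline accuracy: {baseline_accuracy:.1%}")
```

Under that assumption the baseline is roughly 92%, which gives some context for the reported 94.3% accuracy alongside an area under the curve of 0.600.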


Subjects
Automation , Diabetes Mellitus/classification , Diabetes Mellitus/diagnosis , Health Surveys/methods , Machine Learning , Adult , Area Under Curve , Bangladesh/epidemiology , Demography , Diabetes Mellitus/epidemiology , Diabetes Mellitus, Type 2/classification , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/epidemiology , Discriminant Analysis , Factor Analysis, Statistical , Female , Humans , Logistic Models , Male , Middle Aged , Reproducibility of Results , Support Vector Machine , Treatment Outcome
9.
Health Inf Sci Syst ; 8(1): 7, 2020 Dec.
Article in English | MEDLINE | ID: mdl-31949894

ABSTRACT

BACKGROUND AND OBJECTIVES: Diabetes is a chronic disease characterized by high blood sugar. It may cause many complications, such as stroke, kidney failure, and heart attack. About 422 million people worldwide were affected by diabetes in 2014, and the figure is projected to reach 642 million by 2040. The main objective of this study is to develop a machine learning (ML)-based system for predicting diabetic patients. MATERIALS AND METHODS: Logistic regression (LR) is used to identify the risk factors for diabetes based on p value and odds ratio (OR). We adopted four classifiers, namely naïve Bayes (NB), decision tree (DT), Adaboost (AB), and random forest (RF), to predict diabetic patients. Three partition protocols (K2, K5, and K10) were also adopted, and each protocol was repeated for 20 trials. The performance of these classifiers was evaluated using accuracy (ACC) and area under the curve (AUC). RESULTS: We used a diabetes dataset derived from the National Health and Nutrition Examination Survey, conducted in 2009-2012. The dataset consists of 6561 respondents, with 657 diabetic patients and 5904 controls. The LR model demonstrates that 7 of the 14 factors (age, education, BMI, systolic BP, diastolic BP, direct cholesterol, and total cholesterol) are risk factors for diabetes. The overall ACC of the ML-based system is 90.62%. The combination of LR-based feature selection and the RF-based classifier gives 94.25% ACC and 0.95 AUC for the K10 protocol. CONCLUSION: The combination of LR-based feature selection and the RF-based classifier performs best. This combination will be very helpful for predicting diabetic patients.
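
The partition protocols in this abstract can be sketched as repeated k-fold cross-validation. The code below assumes that K2, K5, and K10 denote 2-, 5-, and 10-fold splits and that each of the 20 trials reshuffles the data; the abstract does not spell this out, so both points are assumptions. The function names are hypothetical.

```python
import random


def kfold_indices(n, k, seed):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def repeated_kfold(n, k, trials):
    """Yield (trial, train_idx, test_idx) for `trials` reshuffled k-fold runs."""
    for t in range(trials):
        folds = kfold_indices(n, k, seed=t)
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield t, train_idx, test_idx


n = 6561  # respondents reported in the abstract
splits = list(repeated_kfold(n, k=10, trials=20))
print(f"{len(splits)} train/test splits for K10 with 20 trials")
```

Accuracy and AUC would then be averaged over all splits of a protocol, which is one common way to report a single figure per protocol.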

10.
J Glob Health ; 8(1): 010417, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29740501

ABSTRACT

BACKGROUND: Child and neonatal mortality is a serious problem in Bangladesh. The main objective of this study was to determine the most significant socio-economic factors (covariates) between the years 2011 and 2014 that influence neonatal and child mortality and to suggest plausible policy proposals. METHODS: We modeled neonatal and child mortality as a categorical dependent variable (child alive vs dead), with 16 covariates used as independent variables, using the χ2 statistic and multiple logistic regression (MLR) based on maximum likelihood estimation. FINDINGS: Using the MLR, for neonatal mortality, diarrhea showed the highest positive coefficient (β = 1.130; P < 0.010), making it the most significant covariate for both 2011 and 2014. The corresponding odds ratios were 0.323 for both years. The second most significant covariate in 2011 was birth order between 2-6 years (β = 0.744; P < 0.001), while father's education showed a negative correlation (β = -0.910; P < 0.050). In general, 10 covariates in 2011 and 5 covariates in 2014 were significant, so there was an improvement in socio-economic conditions for neonatal mortality. For child mortality, birth order between 2-6 years and 7 and above years showed the highest positive coefficients (β = 1.042; P < 0.010) and (β = 1.285; P < 0.050) for 2011. The corresponding odds ratios were 2.835 and 3.614, respectively. Father's education showed the highest coefficient (β = 0.770; P < 0.050), making it the significant covariate for 2014, and the corresponding odds ratio was 2.160. In general, 6 covariates in 2011 and 4 covariates in 2014 were significant, so there was also an improvement in socio-economic conditions for child mortality. CONCLUSIONS: In 2014, mother's age and father's education were still significant covariates for child mortality. This study allows policy makers to make appropriate decisions to reduce neonatal and child mortality in Bangladesh.
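
In logistic regression, the odds ratio for a covariate is the exponential of its coefficient. The sketch below applies that relationship to the child-mortality coefficients quoted in this abstract; the dictionary labels are shorthand introduced here for illustration.

```python
import math

# Coefficients (beta) and odds ratios as reported in the abstract.
reported = {
    "child mortality, birth order 2-6 (2011)": (1.042, 2.835),
    "child mortality, birth order 7+ (2011)": (1.285, 3.614),
    "child mortality, father's education (2014)": (0.770, 2.160),
}

for label, (beta, odds_ratio) in reported.items():
    # The odds ratio implied by a logistic regression coefficient is exp(beta).
    print(f"{label}: exp({beta}) = {math.exp(beta):.3f}, reported OR = {odds_ratio}")
```

These three pairs agree to within rounding. For the neonatal-mortality diarrhea covariate, exp(1.130) is about 3.096, whereas the reported odds ratio of 0.323 matches exp(-1.130); the abstract does not explain the difference.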


Subjects
Child Mortality/trends , Infant Mortality/trends , Adolescent , Adult , Bangladesh/epidemiology , Child, Preschool , Educational Status , Fathers/statistics & numerical data , Female , Humans , Infant , Infant, Newborn , Male , Maternal Age , Middle Aged , Risk Factors , Socioeconomic Factors , Young Adult
11.
J Med Syst ; 42(5): 92, 2018 Apr 10.
Article in English | MEDLINE | ID: mdl-29637403

ABSTRACT

Diabetes mellitus is a group of metabolic diseases in which blood sugar levels are too high. About 8.8% of the world was diabetic in 2017. It is projected that this will reach nearly 10% by 2045. The major challenge is that applying machine learning-based classifiers to such data sets for risk stratification leads to lower performance. Thus, our objective is to develop an optimized and robust machine learning (ML) system under the assumption that missing values or outliers, if replaced by a median configuration, will yield higher risk stratification accuracy. This ML-based risk stratification is designed, optimized, and evaluated, where the features are extracted and optimized using six feature selection techniques (random forest, logistic regression, mutual information, principal component analysis, analysis of variance, and Fisher discriminant ratio) and combined with ten different types of classifiers (linear discriminant analysis, quadratic discriminant analysis, naïve Bayes, Gaussian process classification, support vector machine, artificial neural network, Adaboost, logistic regression, decision tree, and random forest) under the hypothesis that replacing both missing values and outliers with computed medians will improve the risk stratification accuracy. The Pima Indian diabetic dataset (768 patients: 268 diabetic and 500 controls) was used. Our results demonstrate that replacing the missing values and outliers by group median and median values, respectively, and then combining random forest feature selection with random forest classification yields an accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve of 92.26%, 95.96%, 79.72%, 91.14%, 91.20%, and 0.93, respectively. This is an improvement of 10% over previously developed techniques published in the literature. The system was validated for its stability and reliability. The RF-based model showed the best performance when outliers were replaced by median values.
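
The preprocessing described in this abstract, replacing missing values by the group median and outliers by the median, can be sketched as follows. The abstract does not state how outliers were detected, so the 1.5 × IQR rule used here is an assumption, and the readings, labels, and function names are hypothetical.

```python
from statistics import median, quantiles


def replace_missing_with_group_median(values, groups):
    """Fill None entries with the median of the observed values in the same group."""
    medians = {
        g: median(v for v, gg in zip(values, groups) if gg == g and v is not None)
        for g in set(groups)
    }
    return [medians[g] if v is None else v for v, g in zip(values, groups)]


def replace_outliers_with_median(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the overall median.

    The k * IQR fence is an assumed outlier rule, not one given in the abstract.
    """
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    med = median(values)
    return [med if (v < low or v > high) else v for v in values]


# Hypothetical glucose-like readings; None marks a missing value.
glucose = [85, None, 90, 150, None, 160, 155, 900]
label = [0, 0, 0, 1, 1, 1, 1, 1]  # 0 = control, 1 = diabetic
filled = replace_missing_with_group_median(glucose, label)
cleaned = replace_outliers_with_median(filled)
print(cleaned)
```

Filling by group keeps each imputed value typical of its own class, while the outlier step pulls extreme readings back to the centre of the whole distribution.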


Subjects
Diabetes Mellitus/classification , Diabetes Mellitus/epidemiology , Machine Learning , Adult , Age Distribution , Artificial Intelligence , Bayes Theorem , Blood Glucose , Blood Pressure , Body Weights and Measures , Data Interpretation, Statistical , Decision Support Techniques , Female , Humans , Indians, North American , Male , Middle Aged , Neural Networks, Computer , Reproducibility of Results , Sex Distribution , United States
12.
PLoS One ; 12(12): e0189677, 2017.
Article in English | MEDLINE | ID: mdl-29261760

ABSTRACT

Birth weight, length, and circumferences of the head, chest, and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using the partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery, and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (standardized β = -0.29 to -0.19; p<0.001). Scatter plots of the scores of the first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than ordinary least squares regression and provided the effect measure of the covariates with greater accuracy, as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when there are multiple outcome measurements in the same study, when the covariates are correlated, or when both situations exist in a dataset.


Subjects
Mothers , Rural Population , Adult , Bangladesh , Humans , Infant, Newborn , Least-Squares Analysis