1.
Lipids Health Dis ; 23(1): 152, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773573

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) is a chronic neurodegenerative disorder that poses a substantial economic burden. The random forest algorithm is effective in predicting AD; however, the key factors influencing AD onset remain unclear. This study aimed to analyze the key lipoprotein and metabolite factors influencing AD onset using machine-learning methods. It provides new insights for researchers and medical personnel to understand AD and provides a reference for the early diagnosis, treatment, and prevention of AD. METHODS: A total of 603 participants, including controls and patients with AD with complete lipoprotein and metabolite data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database between 2005 and 2016, were enrolled. Random forest, Lasso regression, and CatBoost algorithms were employed to rank and filter 213 lipoprotein and metabolite variables. Variables with consistently high importance rankings from any two methods were incorporated into the models. Finally, the variables selected from the three methods, together with the participants' age, sex, and marital status, were used to construct a random forest predictive model. RESULTS: Fourteen lipoprotein and metabolite variables were screened using the three methods; together with the participants' age, sex, and marital status, 17 variables were included in the AD prediction model. The optimal random forest model was constructed with "mtry" set to 3 and "ntree" set to 300. The model exhibited an accuracy of 71.01%, a sensitivity of 79.59%, a specificity of 65.28%, and an AUC (95% CI) of 0.724 (0.645-0.804). When Mean Decrease Accuracy and Mean Decrease Gini were used to rank the variables, age, phospholipids to total lipids ratio in intermediate-density lipoproteins (IDL_PL_PCT), and creatinine were among the top five. CONCLUSIONS: Age, IDL_PL_PCT, and creatinine levels play crucial roles in AD onset. Regular monitoring of lipoproteins and their metabolites in older individuals is important for early AD diagnosis and prevention.
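
The selection rule described in the METHODS above (keep a variable when at least two of the three ranking methods place it highly) can be illustrated with a minimal sketch. The abstract does not state the rank cutoff, so `top_k` is an assumption, and the function name, the placeholder variables `var_a` to `var_d`, and the rankings themselves are hypothetical illustrations rather than data from the study.

```python
from collections import Counter


def select_by_agreement(rankings, top_k, min_votes=2):
    """Return variables that appear in the top_k of at least min_votes rankings."""
    votes = Counter()
    for ranked in rankings.values():
        # Each method votes once for every variable in its own top_k.
        votes.update(set(ranked[:top_k]))
    # Sort so the result is deterministic regardless of dict ordering.
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical rankings (most important first); NOT results from the study.
rankings = {
    "random_forest": ["age", "IDL_PL_PCT", "creatinine", "var_a"],
    "lasso": ["IDL_PL_PCT", "var_b", "age", "var_c"],
    "catboost": ["creatinine", "var_c", "var_d", "age"],
}

selected = select_by_agreement(rankings, top_k=3)
print(selected)
```

Raising `min_votes` to 3 would require agreement among all three methods, which is stricter than the any-two rule the abstract reports.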


Subject(s)
Alzheimer Disease , Lipoproteins , Machine Learning , Humans , Alzheimer Disease/diagnosis , Alzheimer Disease/blood , Alzheimer Disease/metabolism , Female , Male , Aged , Lipoproteins/blood , Aged, 80 and over , Algorithms , Biomarkers/blood
2.
BMC Med Inform Decis Mak ; 24(1): 24, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38267946

ABSTRACT

BACKGROUND AND AIMS: Sexually transmitted infections (STIs) are a significant global public health challenge due to their high incidence rate and potential for severe consequences when early intervention is neglected. Research shows an upward trend in absolute cases and disability-adjusted life year (DALY) numbers of STIs, with syphilis, chlamydia, trichomoniasis, and genital herpes exhibiting an increasing trend in age-standardized rate (ASR) from 2010 to 2019. Machine learning (ML) presents significant advantages in disease prediction, with several studies exploring its potential for STI prediction. The objective of this study is to build male-based and female-based STI risk prediction models based on the CatBoost algorithm using data from the National Health and Nutrition Examination Survey (NHANES) for training and validation, with sub-group analysis performed on each STI. The female sub-group also includes human papillomavirus (HPV) infection. METHODS: The study utilized data from the NHANES program to build male-based and female-based STI risk prediction models using the CatBoost algorithm. Data were collected from 12,053 participants aged 18 to 59 years, with general demographic characteristics and sexual behavior questionnaire responses included as features. The Adaptive Synthetic Sampling Approach (ADASYN) algorithm was used to address data imbalance, and 15 machine learning algorithms were evaluated before ultimately selecting the CatBoost algorithm. The SHAP method was employed to enhance interpretability by identifying feature importance in the model's STI risk prediction. RESULTS: The CatBoost classifier achieved AUC values of 0.9995, 0.9948, 0.9923, 0.9996, and 0.9769 for predicting chlamydia, genital herpes, genital warts, gonorrhea, and overall STIs among males. The CatBoost classifier achieved AUC values of 0.9971, 0.972, 0.9765, 1, 0.9485, and 0.8819 for predicting chlamydia, genital herpes, genital warts, gonorrhea, HPV, and overall STIs among females. Having sex with a new partner per year, times having sex without a condom per year, and the number of female vaginal sex partners per lifetime were identified as the top three predictors of overall STI risk in males. Similarly, ever having anal sex with a man, age, and the number of male vaginal sex partners per lifetime were identified as the top three predictors of overall STI risk in females. CONCLUSIONS: This study demonstrated the effectiveness of the CatBoost classifier in predicting STI risks among both male and female populations. The SHAP algorithm revealed key predictors for each infection, highlighting consistent demographic characteristics and sexual behaviors across different STIs. These insights can guide targeted prevention strategies and interventions to alleviate the impact of STIs on public health.
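
To make the AUC figures above easier to read, here is a minimal sketch of one standard way to compute an AUC: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. It is not the study's evaluation code; the function name and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data; NOT from the study.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

Under this definition, an AUC of exactly 1 means every positive case in the evaluation data was scored above every negative case.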


Subject(s)
Gonorrhea , Herpes Genitalis , Papillomavirus Infections , Sexually Transmitted Diseases , Warts , Female , Male , Humans , Adolescent , Young Adult , Adult , Middle Aged , Nutrition Surveys , Sexually Transmitted Diseases/epidemiology , Algorithms
3.
Sensors (Basel) ; 23(15), 2023 Jul 28.
Article in English | MEDLINE | ID: mdl-37571525

ABSTRACT

The internal structure of wind turbines is intricate and precise, and challenging working conditions often give rise to various operational faults. This study aims to address the limitations of traditional machine learning algorithms in wind turbine fault detection and the imbalance of positive and negative samples in the fault detection dataset. To achieve the real-time detection of wind turbine group faults and to capture wind turbine fault state information, an enhanced ASL-CatBoost algorithm is proposed. Additionally, a crawling animal search algorithm that incorporates the Tent chaotic mapping and t-distribution mutation strategy is introduced to address the sensitivity of the ASL-CatBoost algorithm toward hyperparameters and the difficulty of manual hyperparameter setting. The effectiveness of the proposed hyperparameter optimization strategy, termed the TtRSA algorithm, is demonstrated through a comparison with traditional intelligent optimization algorithms using 11 benchmark test functions. When applied to the hyperparameter optimization of the ASL-CatBoost algorithm, the TtRSA-ASL-CatBoost algorithm exhibits notable enhancements in accuracy, recall, and other performance measures compared with the ASL-CatBoost algorithm and other ensemble learning algorithms. The experimental results affirm that the proposed model improvement strategy effectively enhances the classification recognition rate of wind turbine fault detection.
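
The abstract does not say how the Tent chaotic mapping is used inside TtRSA. A common use of such maps in search algorithms is to spread the initial candidates over the search range, and the sketch below illustrates only that idea. It uses the textbook tent map x -> mu * min(x, 1 - x); the value `mu=1.99` (chosen slightly below 2 because the mu = 2 map tends to collapse to zero in binary floating point), the seed `x0`, the function names, and the learning-rate bounds are all assumptions, not details from the paper.

```python
def tent_sequence(x0, n, mu=1.99):
    """Iterate the tent map x -> mu * min(x, 1 - x), returning n values in (0, 1)."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    values = []
    x = x0
    for _ in range(n):
        x = mu * min(x, 1.0 - x)
        values.append(x)
    return values


def init_population(size, lower, upper, x0=0.37):
    """Scale a tent-map sequence into [lower, upper] as initial candidate values."""
    return [lower + v * (upper - lower) for v in tent_sequence(x0, size)]


# Hypothetical search range for a single hyperparameter (e.g. a learning rate).
population = init_population(size=8, lower=0.01, upper=0.3)
print(population)
```

Compared with seeding from a pseudo-random generator, a chaotic sequence is deterministic for a given `x0`, which makes runs reproducible without a separate seed.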

4.
Front Endocrinol (Lausanne) ; 14: 1292167, 2023.
Article in English | MEDLINE | ID: mdl-38047114

ABSTRACT

Objective: To screen for predictive obesity factors in overweight populations using an optimal and interpretable machine learning algorithm. Methods: This cross-sectional study was conducted between June 2011 and January 2012. The participants were randomly selected using a simple random sampling technique. Seven commonly used machine learning methods were employed to construct obesity risk prediction models. A total of 5,236 Chinese participants from Ningde City, Fujian Province, Southeast China, participated in this study. The best model was selected through appropriate verification and validation and suitably explained. Subsequently, a minimal set of significant predictors was identified. The Shapley additive explanation force plot was used to illustrate the model at the individual level. Results: Machine learning models for predicting obesity demonstrated strong performance, with CatBoost emerging as the most effective in both model validity and net clinical benefit. Specifically, the CatBoost algorithm yielded the highest scores, registering 0.91 in the training set and 0.83 in the test set. This was further corroborated by the area under the curve (AUC) metrics, where CatBoost achieved 0.95 for the training set and 0.87 for the test set. In five-fold cross-validation, the AUC for the CatBoost model ranged between 0.84 and 0.91, with a mean AUC of 0.87 ± 0.022. Key predictors identified within these models included waist circumference, hip circumference, female gender, and systolic blood pressure. Conclusion: CatBoost may be the best machine learning method for prediction. Combining Shapley additive explanations and machine learning methods can be effective in identifying disease risk factors for prevention and control.
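
The five-fold cross-validation summary above (a range plus a mean ± spread) can be sketched as follows. This is not the study's pipeline: the splitting function is a plain illustration, the per-fold AUC values are hypothetical placeholders, and treating the "±" figure as a sample standard deviation is an assumption, since the abstract does not say which measure of spread was used.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and return k (train_indices, test_indices) pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


splits = k_fold_indices(n_samples=20, k=5)

# Hypothetical per-fold AUCs; NOT values reported by the study.
fold_aucs = [0.84, 0.86, 0.87, 0.88, 0.91]
mean_auc = statistics.mean(fold_aucs)
sd_auc = statistics.stdev(fold_aucs)  # sample standard deviation (assumed measure)
print(f"AUC range {min(fold_aucs)}-{max(fold_aucs)}, mean {mean_auc:.2f} ± {sd_auc:.3f}")
```

Each sample lands in the test fold exactly once, so the per-fold scores are computed on disjoint held-out subsets.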


Subject(s)
Obesity , Overweight , Adult , Female , Humans , Overweight/diagnosis , Overweight/epidemiology , Cross-Sectional Studies , Obesity/diagnosis , Obesity/epidemiology , Algorithms , Machine Learning
5.
Bioengineering (Basel) ; 9(10), 2022 Sep 30.
Article in English | MEDLINE | ID: mdl-36290485

ABSTRACT

Metal-organic frameworks (MOFs) have been widely researched as drug delivery systems due to their intrinsic porous structures. Herein, machine learning (ML) technologies were applied for the screening of MOFs with high drug loading capacity. To achieve this, first, a comprehensive dataset was gathered, including 40 data points from more than 100 different publications. The organic linkers, metal ions, and the functional groups, as well as the surface area and the pore volume of the investigated MOFs, were chosen as the model's inputs, and the output was the ibuprofen (IBU) loading capacity. Thereafter, various machine learning algorithms, namely support vector regression (SVR), random forest (RF), adaptive boosting (AdaBoost), and categorical boosting (CatBoost), were employed to predict the ibuprofen loading capacity of MOFs. Coefficients of determination (R²) of 0.70, 0.72, 0.66, and 0.76 were obtained for the SVR, RF, AdaBoost, and CatBoost approaches, respectively. Among all the algorithms, CatBoost was the most reliable, exhibiting superior performance regarding the sparse matrices and categorical features. Shapley additive explanations (SHAP) analysis was employed to explore the impact of the eigenvalues of the model's outputs. Our initial results indicate that this methodology is a well-generalized, straightforward, and cost-effective method that can be applied not only for the prediction of IBU loading capacity, but also in many other biomaterials projects.
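
The coefficient of determination used to compare the four regressors above is conventionally defined as R² = 1 - SS_res / SS_tot. A minimal sketch of that standard formula follows; the function name and the toy numbers are hypothetical and unrelated to the MOF dataset.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values; NOT data from the study.
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0])
baseline = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])
print(perfect, baseline)
```

A perfect fit gives 1, while a model that always predicts the mean of the targets gives 0, so the reported values between 0.66 and 0.76 sit between those two reference points.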
