Results 1 - 20 of 357
1.
Cereb Cortex ; 34(4)2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38679476

ABSTRACT

Spinocerebellar ataxia type 12 is a hereditary neurodegenerative illness commonly found in India. However, there is no established noninvasive automatic system for its diagnosis or for the identification of imaging biomarkers. This work proposes a novel four-phase machine learning-based diagnostic framework to find spinocerebellar ataxia type 12 disease-specific atrophic brain regions and to distinguish spinocerebellar ataxia type 12 from healthy controls using a real structural magnetic resonance imaging dataset. First, each brain region is represented in terms of statistics of coefficients obtained using a 3D discrete wavelet transform. Second, a set of relevant regions is selected using a graph network-based method. Third, a kernel support vector machine is used to capture nonlinear relationships among the voxels of a brain region. Finally, the linear relationship among the brain regions is captured to build a decision model that distinguishes spinocerebellar ataxia type 12 from healthy controls using regularized logistic regression. A classification accuracy of 95% and an F1-score (the harmonic mean of precision and recall) of 94.92% are achieved. The proposed framework provides relevant regions responsible for the atrophy. The importance of each region is captured using SHapley Additive exPlanations values. We also performed a statistical analysis to find volumetric changes in the spinocerebellar ataxia type 12 group compared to healthy controls. The promising result of the proposed framework shows that clinicians can use it for early and timely diagnosis of spinocerebellar ataxia type 12.


Subjects
Biomarkers , Brain , Magnetic Resonance Imaging , Spinocerebellar Ataxias , Support Vector Machine , Humans , Magnetic Resonance Imaging/methods , Spinocerebellar Ataxias/diagnostic imaging , Spinocerebellar Ataxias/genetics , Spinocerebellar Ataxias/diagnosis , Brain/diagnostic imaging , Brain/pathology , Brain/metabolism , Biomarkers/analysis , Male , Female , Adult , Logistic Models , Middle Aged , Atrophy
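
The abstract above describes the F1-score as the harmonic mean of precision and recall. A minimal sketch of that definition follows; the function names and the example counts are hypothetical and are not taken from the study.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the arithmetic.
p, r = precision_recall(tp=8, fp=2, fn=2)
example_f1 = f1_score(p, r)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a classifier cannot reach a high F1-score by maximizing precision or recall alone.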
2.
Pancreatology ; 24(3): 404-423, 2024 May.
Article in English | MEDLINE | ID: mdl-38342661

ABSTRACT

Pancreatic cancer is a digestive tract cancer with a high mortality rate. Despite the wide range of available treatments and improvements in surgery, chemotherapy, and radiation therapy, the five-year prognosis for individuals diagnosed with pancreatic cancer remains poor. Research is still needed to determine whether immunotherapy may be used to treat pancreatic cancer. The goals of our research were to understand the tumor microenvironment of pancreatic cancer, to find a useful biomarker for assessing patient prognosis, and to investigate its biological relevance. In this paper, machine learning methods such as random forest were combined with weighted gene co-expression networks to screen hub immune-related genes (hub-IRGs). A LASSO regression model was then used for further selection, yielding eight hub-IRGs. Based on the hub-IRGs, we created a prognostic risk prediction model for PAAD that can stratify patients accurately and produce a prognostic risk score (IRG_Score) for each patient. In the raw data set and the validation data set, the five-year area under the curve (AUC) for this model was 0.9 and 0.7, respectively. Shapley additive explanation (SHAP) was used to portray the importance of the factors influencing prognostic risk prediction from a machine learning perspective and to identify the most influential gene (or clinical factor). The five most important factors were TRIM67, CORT, PSPN, SCAMP5, and RFXAP, all of which are genes. In summary, the eight hub-IRGs had accurate risk prediction performance and biological significance, which was validated in other cancers. The SHAP results helped to clarify the molecular mechanism of pancreatic cancer.


Subjects
Pancreatic Neoplasms , Humans , Area Under Curve , Gene Regulatory Networks , Immunotherapy , Machine Learning , Tumor Microenvironment , Membrane Proteins
3.
Br J Clin Pharmacol ; 90(3): 691-699, 2024 03.
Article in English | MEDLINE | ID: mdl-37845041

ABSTRACT

AIMS: Heart failure with reduced ejection fraction (HFrEF) poses significant challenges for clinicians and researchers, owing to its multifaceted aetiology and complex treatment regimens. In light of this, artificial intelligence methods offer an innovative approach to identifying relationships within complex clinical datasets. Our study aims to explore the potential for machine learning algorithms to provide deeper insights into datasets of HFrEF patients. METHODS: To this end, we analysed a cohort of 386 HFrEF patients who had been initiated on sodium-glucose co-transporter-2 inhibitor treatment and had completed a minimum of a 6-month follow-up. RESULTS: In traditional frequentist statistical analyses, patients receiving the highest doses of beta-blockers (BBs) (chi-square test, P = .036) and those newly initiated on sacubitril-valsartan (chi-square test, P = .023) showed better outcomes. However, none of these pharmacological features stood out as independent predictors of improved outcomes in the Cox proportional hazards model. In contrast, when employing eXtreme Gradient Boosting (XGBoost) algorithms in conjunction with the data using Shapley additive explanations (SHAP), we identified several models with significant predictive power. The XGBoost algorithm inherently accommodates non-linear distribution, multicollinearity and confounding. Within this framework, pharmacological categories like 'newly initiated treatment with sacubitril/valsartan' and 'BB dose escalation' emerged as strong predictors of long-term outcomes. CONCLUSIONS: In this manuscript, we not only emphasize the strengths of this machine learning approach but also discuss its potential limitations and the risk of identifying statistically significant yet clinically irrelevant predictors.


Subjects
Heart Failure , Humans , Heart Failure/drug therapy , Heart Failure/chemically induced , Tetrazoles/adverse effects , Artificial Intelligence , Stroke Volume , Machine Learning
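
Several abstracts in this listing rely on Shapley additive explanations to attribute a model's prediction to its input features. The following is a minimal sketch of the underlying Shapley value computation by exhaustive enumeration of feature coalitions. It assumes that a feature left out of a coalition is replaced by a baseline value, and it uses a hypothetical toy model; the SHAP software used in these studies applies faster, model-specific estimators rather than this brute-force form.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for a single instance x.

    Assumption of this sketch: a feature absent from a coalition
    takes its value from `baseline`.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with coalition features taken from x, the rest from baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features):
    """Hypothetical model with one interaction term (not from any study above)."""
    a, b, c = features
    return 2.0 * a + 1.0 * b + 0.5 * a * c


x = [1.0, 3.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

In this toy case the interaction term is shared equally between the two features involved, and the attributions sum to the difference between the prediction for `x` and the prediction for the baseline. The enumeration grows exponentially with the number of features, which is why it is practical only for small illustrations.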
4.
Dement Geriatr Cogn Disord ; : 1-11, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38776891

ABSTRACT

INTRODUCTION: The prevalence of cognitive impairment and dementia in the older population is increasing, so early detection of cognitive decline is essential for effective intervention. METHODS: This study included 2,288 participants with normal cognitive function from the Ma'anshan Healthy Aging Cohort Study. Forty-two potential predictors, including demographic characteristics, chronic diseases, lifestyle factors, anthropometric indices, physical function, and baseline cognitive function, were selected based on clinical importance and previous research. The dataset was partitioned into training, validation, and test sets in proportions of 60%, 20%, and 20%, respectively. Recursive feature elimination was used for feature selection, followed by six machine learning algorithms that were employed for model development. The performance of the models was evaluated using area under the curve (AUC), specificity, sensitivity, and accuracy. Moreover, SHapley Additive exPlanations (SHAP) was used to assess the interpretability of the final selected model and to gain insights into the impact of features on the prediction outcomes. SHAP force plots were produced to show the application of the prediction model at the individual level. RESULTS: The final predictive model based on the Naive Bayes algorithm achieved an AUC of 0.820 (95% CI, 0.773-0.887) on the test set, outperforming other algorithms. The top ten influential features in the model included baseline Mini-Mental State Examination (MMSE), education, self-reported economic status, collective or social activities, Pittsburgh sleep quality index (PSQI), body mass index, systolic blood pressure, diastolic blood pressure, instrumental activities of daily living, and age. The model demonstrated the potential to identify older adults at a higher risk of cognitive impairment within 3 years. CONCLUSION: The predictive model developed in this study contributes to the early detection of cognitive impairment in older adults by primary healthcare staff in community settings.
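
The 60%/20%/20% partition described in the METHODS above can be sketched as a shuffled index split. The seed, the rounding rule (fractions rounded down, with the remainder assigned to the test set) and the function name are assumptions of this sketch; the abstract does not report them, nor whether the split was stratified.

```python
import random


def split_indices(n, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle indices 0..n-1 and cut them into train/validation/test lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # local RNG, so global random state is untouched
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# 2,288 is the cohort size reported in the abstract.
train, val, test = split_indices(2288)
```

Keeping the validation set separate from the test set allows model selection among the six algorithms without reusing the data on which the final AUC is reported.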

5.
Environ Sci Technol ; 58(29): 13035-13046, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38982681

ABSTRACT

Gaseous nitrous acid (HONO) is identified as a critical precursor of hydroxyl radicals (OH), influencing atmospheric oxidation capacity and the formation of secondary pollutants. However, large uncertainties persist regarding its formation and elimination mechanisms, impeding accurate simulation of HONO levels using chemical models. In this study, a deep neural network (DNN) model was established based on routine air quality data (O3, NO2, CO, and PM2.5) and meteorological parameters (temperature, relative humidity, solar zenith angle, and season) collected from four typical megacity clusters in China. The model exhibited robust performance on both the training sets [slope = 1.0, r2 = 0.94, root mean squared error (RMSE) = 0.29 ppbv] and two independent test sets (slope = 1.0, r2 = 0.79, and RMSE = 0.39 ppbv), demonstrated excellent capability in reproducing the spatiotemporal variations of HONO, and outperformed an observation-constrained box model incorporating newly proposed HONO formation mechanisms. Nitrogen dioxide (NO2) was identified as the most impactful feature for HONO prediction using the SHapley Additive exPlanation (SHAP) approach, highlighting the importance of NO2 conversion in HONO formation. The DNN model was further employed to predict the future change of HONO levels in different NOx abatement scenarios; HONO is expected to decrease by 27-44% in summer as the result of a 30-50% NOx reduction. These results suggest a dual effect of NOx emission abatement, leading not only to a reduction of O3 and nitrate precursors but also to a decrease in HONO levels and hence in primary radical production rates (PROx). In summary, this study demonstrates the feasibility of using a deep learning approach to predict HONO concentrations, offering a promising supplement to traditional chemical models. Additionally, stringent NOx abatement would be beneficial for the joint alleviation of O3 and secondary PM2.5.


Subjects
Air Pollutants , Deep Learning , Nitrous Acid , Nitrous Acid/chemistry , Air Pollutants/analysis , China , Environmental Monitoring/methods , Air Pollution
6.
J Intensive Care Med ; 39(5): 465-476, 2024 May.
Article in English | MEDLINE | ID: mdl-37964547

ABSTRACT

BACKGROUND: Sepsis-associated acute kidney injury (SA-AKI) is a critical condition with significant clinical implications, yet there is a need for a predictive model that can reliably assess the risk of its development. This study is undertaken to bridge a gap in healthcare by creating a predictive model for SA-AKI with the goal of empowering healthcare providers with a tool that can revolutionize patient care and ultimately lead to improved outcomes. METHODS: A cohort of 615 patients afflicted with sepsis, who were admitted to the intensive care unit, underwent random stratification into 2 groups: a training set (n = 435) and a validation set (n = 180). Subsequently, a multivariate logistic regression model, imbued with nonzero coefficients via LASSO regression, was meticulously devised for the prognostication of SA-AKI. This model was thoughtfully rendered in the form of a nomogram. The salience of individual risk factors was assessed and ranked employing SHapley Additive exPlanations (SHAP). Recursive partition analysis was performed to stratify the risk of patients with sepsis. RESULTS: Among the panoply of clinical variables examined, hypertension, diabetes mellitus, C-reactive protein, procalcitonin (PCT), activated partial thromboplastin time, and platelet count emerged as robust and independent determinants of SA-AKI. The receiver operating characteristic curve analysis for SA-AKI risk discrimination in the training set and validation set yielded area under the curve estimates of 0.843 (95% CI: 0.805 to 0.882) and 0.834 (95% CI: 0.775 to 0.893), respectively. Notably, PCT exhibited the most conspicuous influence on the model's predictive capacity. Furthermore, statistically significant disparities were observed in the incidence of SA-AKI and the 28-day survival rate across high-risk, medium-risk, and low-risk cohorts (P < .05). CONCLUSION: The composite predictive model, amalgamating the quintet of SA-AKI predictors, holds significant promise in facilitating the identification of high-risk patient subsets.


Subjects
Acute Kidney Injury , Sepsis , Humans , ROC Curve , Intensive Care Units , Logistic Models , Procalcitonin , Acute Kidney Injury/etiology , Acute Kidney Injury/epidemiology , Sepsis/complications , Sepsis/epidemiology , Retrospective Studies
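
The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the labels and scores are hypothetical, and the quadratic loop is meant for illustration rather than for large cohorts.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = event, 0 = no event) and model scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two groups.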
7.
Network ; : 1-38, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38511557

ABSTRACT

Interpretable machine learning models are instrumental in disease diagnosis and clinical decision-making, shedding light on relevant features. In this study, Boruta, SHAP (SHapley Additive exPlanations), and BorutaShap were employed for feature selection, each contributing to the identification of crucial features. These selected features were then used to train six machine learning algorithms (LR, SVM, ETC, AdaBoost, RF, and LGBM) on diverse medical datasets obtained from public sources after rigorous preprocessing. The performance of each feature selection technique was evaluated across multiple ML models, assessing accuracy, precision, recall, and F1-score metrics. Among these, SHAP showed superior performance, achieving average accuracies of 80.17%, 85.13%, 90.00%, and 99.55% across the diabetes, cardiovascular, statlog, and thyroid disease datasets, respectively. Notably, LGBM emerged as the most effective algorithm, with an average accuracy of 91.00% for most disease states. Moreover, SHAP enhanced the interpretability of the models, providing valuable insights into the underlying mechanisms driving disease diagnosis. This comprehensive study contributes significant insights into feature selection techniques and machine learning algorithms for disease diagnosis, benefiting researchers and practitioners in the medical field. Further exploration of feature selection methods and algorithms holds promise for advancing disease diagnosis methodologies, paving the way for more accurate and interpretable diagnostic models.

8.
Mol Ther ; 31(8): 2543-2551, 2023 08 02.
Article in English | MEDLINE | ID: mdl-37271991

ABSTRACT

5-methylcytosine (m5C) is a critical post-transcriptional modification that is widely present in various kinds of RNAs and is crucial to fundamental biological processes. By correctly identifying the m5C-methylation sites on RNA, clinicians can more clearly comprehend the precise function of these m5C sites in different biological processes. Due to their effectiveness and affordability, computational methods have received greater attention over the last few years for the identification of methylation sites in various species. To precisely identify RNA m5C locations in five different species, namely Homo sapiens, Arabidopsis thaliana, Mus musculus, Drosophila melanogaster, and Danio rerio, we proposed a more effective and accurate model named m5C-pred. To create m5C-pred, five distinct feature encoding techniques were combined to extract features from the RNA sequence, and then we used SHapley Additive exPlanations to choose the best features among them, followed by XGBoost as a classifier. We applied the Optuna optimization framework to quickly and efficiently determine the best hyperparameters. Finally, the proposed model was evaluated using independent test datasets, and we compared the results with the previous methods. Our approach, m5C-pred, is anticipated to be useful for accurately identifying m5C sites, outperforming the currently available state-of-the-art techniques.


Subjects
Drosophila melanogaster , RNA , Animals , Mice , RNA/genetics , Drosophila melanogaster/genetics , Base Sequence
9.
Mol Divers ; 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38200203

ABSTRACT

Cyclooxygenase-2 (COX-2) inhibitors are nonsteroidal anti-inflammatory drugs that treat inflammation, pain and fever. This study determined the interaction mechanisms of COX-2 inhibitors and the molecular properties needed to design new drug candidates. Using machine learning and explainable AI methods, the inhibition activity of 1488 molecules was modelled, and essential properties were identified. These properties included aromatic rings, nitrogen-containing functional groups and aliphatic hydrocarbons. They affected the water solubility, hydrophobicity and binding affinity of COX-2 inhibitors. The binding mode, stability and ADME properties of 16 ligands bound to the Cyclooxygenase active site of COX-2 were investigated by molecular docking, molecular dynamics simulation and MM-GBSA analysis. The results showed that ligand 339,222 was the most stable and effective COX-2 inhibitor. It inhibited prostaglandin synthesis by disrupting the protein conformation of COX-2. It had good ADME properties and high clinical potential. This study demonstrated the potential of machine learning and bioinformatics methods in discovering COX-2 inhibitors.

10.
Metab Brain Dis ; 39(1): 29-42, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38153584

ABSTRACT

Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition characterized by altered brain connectivity and function. In this study, we employed advanced bioinformatics and explainable AI to analyze gene expression associated with ASD, using data from five GEO datasets. Among 351 neurotypical controls and 358 individuals with autism, we identified 3,339 Differentially Expressed Genes (DEGs) with an adjusted p-value (≤ 0.05). A subsequent meta-analysis pinpointed 342 DEGs (adjusted p-value ≤ 0.001), including 19 upregulated and 10 down-regulated genes across all datasets. Shared genes, pathogenic single nucleotide polymorphisms (SNPs), chromosomal positions, and their impact on biological pathways were examined. We identified potential biomarkers (HOXB3, NR2F2, MAPK8IP3, PIGT, SEMA4D, and SSH1) through text mining, meriting further investigation. Additionally, we shed light on the roles of RPS4Y1 and KDM5D genes in neurogenesis and neurodevelopment. Our analysis detected 1,286 SNPs linked to ASD-related conditions, of which 14 high-risk SNPs were located on chromosomes 10 and X. We highlighted potential missense SNPs associated with FGFR inhibitors, suggesting that it may serve as a promising biomarker for responsiveness to targeted therapies. Our explainable AI model identified the MID2 gene as a potential ASD biomarker. This research unveils vital genes and potential biomarkers, providing a foundation for novel gene discovery in complex diseases.


Subjects
Autism Spectrum Disorder , Autistic Disorder , Humans , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Biomarkers , Brain , Genomics , Minor Histocompatibility Antigens , Histone Demethylases
11.
J Med Internet Res ; 26: e55913, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38758578

ABSTRACT

BACKGROUND: Suicide is the second-leading cause of death among adolescents and is associated with clusters of suicides. Despite numerous studies on this preventable cause of death, the focus has primarily been on single nations and traditional statistical methods. OBJECTIVE: This study aims to develop a predictive model for adolescent suicidal thinking using multinational data sets and machine learning (ML). METHODS: We used data from the Korea Youth Risk Behavior Web-based Survey with 566,875 adolescents aged between 13 and 18 years and conducted external validation using the Youth Risk Behavior Survey with 103,874 adolescents and Norway's University National General Survey with 19,574 adolescents. Several tree-based ML models were developed, and feature importance and Shapley additive explanations values were analyzed to identify risk factors for adolescent suicidal thinking. RESULTS: When trained on the Korea Youth Risk Behavior Web-based Survey data from South Korea, the XGBoost model reported an area under the receiver operating characteristic (AUROC) curve of 90.06% (95% CI 89.97-90.16), displaying superior performance compared to other models. For external validation using the Youth Risk Behavior Survey data from the United States and the University National General Survey from Norway, the XGBoost model achieved AUROCs of 83.09% and 81.27%, respectively. Across all data sets, XGBoost consistently outperformed the other models with the highest AUROC score, and was selected as the optimal model. In terms of predictors of suicidal thinking, feelings of sadness and despair were the most influential, accounting for 57.4% of the impact, followed by stress status at 19.8%. These were followed by age (5.7%), household income (4%), academic achievement (3.4%), sex (2.1%), and others, which contributed less than 2% each. CONCLUSIONS: This study used ML by integrating diverse data sets from 3 countries to address adolescent suicide. The findings highlight the important role of emotional health indicators in predicting suicidal thinking among adolescents. Specifically, sadness and despair were identified as the most significant predictors, followed by stressful conditions and age. These findings emphasize the critical need for early diagnosis and prevention of mental health issues during adolescence.


Subjects
Machine Learning , Suicidal Ideation , Humans , Adolescent , Female , Male , Republic of Korea , Algorithms , Cohort Studies , Adolescent Behavior/psychology , Suicide/statistics & numerical data , Suicide/psychology , Norway , Surveys and Questionnaires , Risk Factors , Risk-Taking
12.
BMC Med Inform Decis Mak ; 24(1): 40, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38326769

ABSTRACT

BACKGROUND: Deep learning has demonstrated significant advancements across various domains. However, its implementation in specialized areas, such as medical settings, remains approached with caution. In these high-stake environments, understanding the model's decision-making process is critical. This study assesses the performance of different pretrained Bidirectional Encoder Representations from Transformers (BERT) models and delves into understanding its decision-making within the context of medical image protocol assignment. METHODS: Four different pre-trained BERT models (BERT, BioBERT, ClinicalBERT, RoBERTa) were fine-tuned for the medical image protocol classification task. Word importance was measured by attributing the classification output to every word using a gradient-based method. Subsequently, a trained radiologist reviewed the resulting word importance scores to assess the model's decision-making process relative to human reasoning. RESULTS: The BERT model came close to human performance on our test set. The BERT model successfully identified relevant words indicative of the target protocol. Analysis of important words in misclassifications revealed potential systematic errors in the model. CONCLUSIONS: The BERT model shows promise in medical image protocol assignment by reaching near human level performance and identifying key words effectively. The detection of systematic errors paves the way for further refinements to enhance its safety and utility in clinical settings.


Subjects
Natural Language Processing , Problem Solving , Humans
13.
BMC Med Inform Decis Mak ; 24(1): 106, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38649879

ABSTRACT

OBJECTIVES: This study aims to build a machine learning (ML) model to predict the recurrence probability for postoperative non-lactating mastitis (NLM) using the Random Forest (RF) and XGBoost algorithms. Such a model can help identify the risk of NLM recurrence and guide clinical treatment planning. METHODS: This study was conducted on inpatients who were admitted to the Mammary Department of Shuguang Hospital affiliated to Shanghai University of Traditional Chinese Medicine between July 2019 and December 2021. Inpatient follow-up was completed in December 2022. Ten features were selected in this study to build the ML model: age, body mass index (BMI), number of abortions, presence of inverted nipples, extent of breast mass, white blood cell count (WBC), neutrophil to lymphocyte ratio (NLR), albumin-globulin ratio (AGR), triglyceride (TG), and presence of intraoperative discharge. We used two ML approaches (RF and XGBoost) to build models and predict the NLM recurrence risk of female patients. In total, 258 patients were randomly divided into a training set and a test set in a 75%-25% proportion. The model performance was evaluated based on Accuracy, Precision, Recall, F1-score and AUC. The Shapley Additive Explanations (SHAP) method was used to interpret the model. RESULTS: There were 48 (18.6%) NLM patients who experienced recurrence during the follow-up period. Ten features were selected in this study to build the ML model. For the RF model, BMI is the most important influencing factor, and for the XGBoost model it is intraoperative discharge. The results of tenfold cross-validation suggest that both the RF model and the XGBoost model have good predictive performance, but the XGBoost model performs better than the RF model in our study. The trends of SHAP values of all features in our models are consistent with the trends of these features' clinical presentation. The inclusion of these ten features in the model is necessary to build practical prediction models for recurrence. CONCLUSIONS: The results of tenfold cross-validation and SHAP values suggest that the models have predictive ability. The trend of SHAP values provides auxiliary validation of our models and gives them more clinical significance.


Subjects
Machine Learning , Mastitis , Recurrence , Humans , Female , Adult , Middle Aged , Postoperative Complications , China
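
The tenfold cross-validation mentioned above can be sketched as a partition of shuffled indices into ten folds, each used once for testing while the other nine are used for training. The abstract does not say whether cross-validation was run on all 258 patients or only on the training portion, nor whether folds were stratified by recurrence; applying it to all 258 indices, the seed and the function name are assumptions of this sketch.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs covering indices 0..n-1."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


# 258 is the number of patients reported in the abstract.
splits = k_fold_indices(258, k=10)
```

Averaging a metric over the ten held-out folds gives a less split-dependent estimate than a single 75%-25% partition, which matters for a cohort of this size with only 48 recurrences.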
14.
Sensors (Basel) ; 24(11)2024 May 29.
Article in English | MEDLINE | ID: mdl-38894306

ABSTRACT

The recent advancements in autonomous driving come with the associated cybersecurity issue of compromising networks of autonomous vehicles (AVs), motivating the use of AI models for detecting anomalies on these networks. In this context, the usage of explainable AI (XAI) for explaining the behavior of these anomaly detection AI models is crucial. This work introduces a comprehensive framework to assess black-box XAI techniques for anomaly detection within AVs, facilitating the examination of both global and local XAI methods to elucidate the decisions made by XAI techniques that explain the behavior of AI models classifying anomalous AV behavior. By considering six evaluation metrics (descriptive accuracy, sparsity, stability, efficiency, robustness, and completeness), the framework evaluates two well-known black-box XAI techniques, SHAP and LIME, involving applying XAI techniques to identify primary features crucial for anomaly classification, followed by extensive experiments assessing SHAP and LIME across the six metrics using two prevalent autonomous driving datasets, VeReMi and Sensor. This study advances the deployment of black-box XAI methods for real-world anomaly detection in autonomous driving systems, contributing valuable insights into the strengths and limitations of current black-box XAI methods within this critical domain.

15.
Sensors (Basel) ; 24(10)2024 May 17.
Article in English | MEDLINE | ID: mdl-38794052

ABSTRACT

Recently, explainability in machine and deep learning has become an important area in the field of research as well as interest, both due to the increasing use of artificial intelligence (AI) methods and understanding of the decisions made by models. The explainability of artificial intelligence (XAI) is due to the increasing consciousness in, among other things, data mining, error elimination, and learning performance by various AI algorithms. Moreover, XAI will allow the decisions made by models in problems to be more transparent as well as effective. In this study, models from the 'glass box' group of Decision Tree, among others, and the 'black box' group of Random Forest, among others, were proposed to understand the identification of selected types of currant powders. The learning process of these models was carried out to determine accuracy indicators such as accuracy, precision, recall, and F1-score. It was visualized using Local Interpretable Model Agnostic Explanations (LIMEs) to predict the effectiveness of identifying specific types of blackcurrant powders based on texture descriptors such as entropy, contrast, correlation, dissimilarity, and homogeneity. Bagging (Bagging_100), Decision Tree (DT0), and Random Forest (RF7_gini) proved to be the most effective models in the framework of currant powder interpretability. The measures of classifier performance in terms of accuracy, precision, recall, and F1-score for Bagging_100, respectively, reached values of approximately 0.979. In comparison, DT0 reached values of 0.968, 0.972, 0.968, and 0.969, and RF7_gini reached values of 0.963, 0.964, 0.963, and 0.963. These models achieved classifier performance measures of greater than 96%. In the future, XAI using agnostic models can be an additional important tool to help analyze data, including food products, even online.


Subjects
Algorithms , Artificial Intelligence , Machine Learning , Powders , Ribes , Powders/chemistry , Ribes/chemistry , Decision Trees
16.
J Clin Ultrasound ; 52(3): 305-314, 2024.
Article in English | MEDLINE | ID: mdl-38149658

ABSTRACT

OBJECTIVES: A radiomics-based eXtreme gradient boosting (XGBoost) model was developed to differentiate benign thyroid nodules from malignant thyroid nodules and to prevent unnecessary thyroid biopsies, including positive and negative effects. METHODS: The study retrospectively evaluated a data set of ultrasound images of thyroid nodules in patients who initially received ultrasound-guided fine-needle aspiration biopsy (FNAB) for diagnostic purposes. According to ACR TI-RADS, a total of five ultrasound feature categories and the maximum size of the nodule were determined by four radiologists. A radiomics score was developed by the LASSO algorithm from the ultrasound-based radiomics features. An interpretative method based on Shapley additive explanation (SHAP) was developed. XGBoost was compared with ACR TI-RADS for its diagnostic performance and FNAB rate and was compared with six other machine learning models to evaluate the model performance. RESULTS: Finally, 191 thyroid nodules were examined from 177 patients. The radiomics score was calculated using 8 features, which were selected among 789 candidate features generated from the ultrasound images. The model yielded an AUC of 93% in the training cohort and 92% in the test cohort. It outperformed traditional machine learning models in assessing the nature of thyroid nodules. Compared with ACR TI-RADS, the FNAB rate decreased from 34% to 30% in training and from 35% to 41% in test. CONCLUSIONS: The proposed radiomics-based XGBoost model could distinguish benign and malignant thyroid nodules, thereby significantly reducing the number of unnecessary FNABs. It was effective in making preoperative decisions and managing selected patients using the SHAP visual interpretation tools.


Subjects
Thyroid Nodule , Humans , Thyroid Nodule/diagnostic imaging , Thyroid Nodule/pathology , Retrospective Studies , Radiomics , Diagnosis, Differential , Ultrasonography/methods , Biopsy, Fine-Needle
17.
J Stroke Cerebrovasc Dis ; 33(7): 107729, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38657830

ABSTRACT

BACKGROUND: Acute kidney injury (AKI) is not only a complication but also a serious threat to patients with cerebral infarction (CI). This study aimed to explore the application of interpretable machine learning algorithms in predicting AKI in patients with cerebral infarction. METHODS: The study included 3920 patients with CI admitted to the Intensive Care Unit and Emergency Medicine of the Central Hospital of Lishui City, Zhejiang Province. Nine machine learning techniques, including XGBoost, logistic regression, LightGBM, random forest (RF), AdaBoost, GaussianNB (GNB), Multi-Layer Perceptron (MLP), support vector machine (SVM), and k-nearest neighbors (KNN) classification, were used to develop a predictive model for AKI in these patients. SHapley Additive exPlanations (SHAP) analysis provided visual explanations for each patient. Finally, model effectiveness was assessed using metrics such as average precision (AP), sensitivity, specificity, accuracy, F1 score, precision-recall (PR) curve, calibration plot, and decision curve analysis (DCA). RESULTS: The XGBoost model performed best in both the internal and the external validation sets, with AUCs of 0.940 and 0.887, respectively. The five most important variables in the model were, in order, glomerular filtration rate, low-density lipoprotein, total cholesterol, hemiplegia, and serum potassium. CONCLUSION: This study demonstrates the potential of interpretable machine learning algorithms in predicting AKI in patients with CI.
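
The AUC values reported above can be read as the probability that a randomly chosen patient who developed AKI receives a higher predicted risk than a randomly chosen patient who did not. The following is a minimal sketch of that pairwise definition; the function name, scores, and labels are hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical predicted risks for four patients (1 = AKI, 0 = no AKI).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of patients; it is used here only because it mirrors the definition directly.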


Subjects
Acute Kidney Injury , Cerebral Infarction , Intensive Care Units , Machine Learning , Predictive Value of Tests , Humans , Acute Kidney Injury/diagnosis , Acute Kidney Injury/blood , Acute Kidney Injury/therapy , Male , Female , Aged , Middle Aged , Cerebral Infarction/diagnosis , Cerebral Infarction/etiology , Risk Factors , Risk Assessment , China/epidemiology , Prognosis , Reproducibility of Results , Aged, 80 and over , Decision Support Techniques , Retrospective Studies , Diagnosis, Computer-Assisted
18.
J Environ Manage ; 360: 121189, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38759553

ABSTRACT

Pyrolysis, a thermochemical conversion approach for transforming plastic waste into energy, has tremendous potential to manage the exponentially increasing volume of plastic waste. However, understanding the process kinetics is fundamental to engineering a sustainable process. Conventional analysis techniques do not provide insights into the influence of feedstock characteristics on the process kinetics. The present study exemplifies the efficacy of using machine learning for predictive modeling of the pyrolysis of waste plastics to understand the complex interrelations of predictor variables and their influence on activation energy. The activation energy for pyrolysis of waste plastics was evaluated using the Random Forest, XGBoost, CatBoost, and AdaBoost regression models. Feature selection based on the multicollinearity of the data and hyperparameter tuning of the models using RandomizedSearchCV were conducted. The Random Forest model outperformed the other models, with a coefficient of determination (R2) of 0.941, a root mean square error (RMSE) of 14.69, and a mean absolute error (MAE) of 8.66 for the testing dataset. The explainable artificial intelligence-based feature importance plot and the summary plot of the Shapley additive explanations identified fixed carbon content, carbon content, ash content, and degree of conversion as significant parameters of the model, in the order fixed carbon > carbon > ash content > degree of conversion. The present study highlights the potential of machine learning as a powerful tool to understand the influence of the characteristics of plastic waste and the degree of conversion on the activation energy of the process, which is essential for designing large-scale operations and future scale-up of the process.
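
The three regression measures quoted in this abstract (R2, RMSE, MAE) follow standard definitions, sketched below under the assumption that they were computed in the usual way on the testing dataset. The function name and the numbers are hypothetical and are not activation energies from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Coefficient of determination, RMSE and MAE for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "rmse": rmse, "mae": mae}


# Hypothetical observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
scores = regression_metrics(observed, predicted)
```

RMSE penalizes large individual errors more heavily than MAE, which is consistent with the abstract's RMSE (14.69) exceeding its MAE (8.66).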


Subjects
Artificial Intelligence , Plastics , Pyrolysis , Plastics/chemistry , Machine Learning , Models, Theoretical
19.
J Environ Manage ; 354: 120309, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38377759

ABSTRACT

Land subsidence induced by coal mining (MLS) poses a huge threat to the ecological environment and to the safety of buildings, roads, and other infrastructure in mining areas. However, the prediction and evaluation of MLS is relatively complex, and the reliability of the prediction results is closely related to factors such as the professional knowledge and engineering experience of researchers. This paper aims to combine five intelligent optimization algorithms, namely ant lion optimizer (ALO), bald eagle search (BES), bird swarm algorithm (BSA), Harris hawks optimization (HHO), and sparrow search algorithm (SSA), with the machine learning model of gradient boosting with categorical features support (CatBoost) to predict MLS. To achieve this goal, five hybrid models based on CatBoost were developed, and their prediction accuracy and reliability were compared and analyzed. The prediction performance of the hybrid models improved significantly over that of the single model, with the SSA-CatBoost model showing the most obvious improvement (from R2 = 0.927 to 0.965, RMSE = 0.541 to 0.377, MAE = 0.386 to 0.297, VAF = 92.720 to 95.837). The importance and predictive contribution of all input features to the predicted labels were studied with the Shapley method. The research results indicate that hybrid model technology is a reliable MLS prediction method. This study can help mining technicians use machine learning methods to study the degree of MLS damage to the surface environment and provide scientific advance prediction and evaluation for the protection and management of the ecological environment in mining areas and the formulation of safety production measures.
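
The Shapley method mentioned above attributes a prediction to input features by averaging each feature's marginal contribution over all orderings in which features could be added. The sketch below computes exact Shapley values for a tiny toy value function; it assumes nothing about how the authors computed theirs (practical tools approximate this for real models), and the feature names and value function are hypothetical.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values by enumerating every ordering of the features.

    `value` maps a frozenset of features to a number. The cost grows
    factorially, so this is only practical for a handful of features.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            extended = coalition | {f}
            # Marginal contribution of f when added to the current coalition.
            totals[f] += value(extended) - value(coalition)
            coalition = extended
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_value(coalition):
    """Hypothetical model output when only the given features are 'known'."""
    v = 0.0
    if "depth" in coalition:
        v += 2.0
    if "thickness" in coalition:
        v += 1.0
    if "depth" in coalition and "thickness" in coalition:
        v += 1.0  # interaction term, split equally between the two features
    return v


phi = shapley_values(["depth", "thickness"], toy_value)
```

By construction the values sum to the difference between the value of the full feature set and that of the empty set, which is the property that makes Shapley attributions additive.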


Subjects
Algorithms , Engineering , Reproducibility of Results , Environment , Knowledge
20.
Plant Foods Hum Nutr ; 79(1): 209-218, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38340238

ABSTRACT

The active ingredient group is a prominent feature reflecting the inherent characteristics of plant-based functional foods. Chinese hawthorn leaf (CHL), a tea substitute possessing intrinsic nutritional properties in anti-hyperlipidemia, was first found to be adulterated with Malus doumeri leaf (MDL) owing to similar commercial labels. In this context, the above-mentioned two contrasting species were explored through phytochemical profiling and activity assessment. The amelioration effect of CHL on free fatty acids-elicited lipid deposition in HepG2 cells was significantly better than that of MDL. Molecular networking-based metabolic profiles identified 68 and 67 components in CHL and MDL, with 33 shared components. Extreme gradient boosting (XGBoost) algorithm with outstanding performance was selected to screen candidate components contributing to hypolipidemic activity, and the output was later interpreted by Shapley additive explanations (SHAP) method. Twelve and eight components were separately screened as hyperlipidemic inhibitors in CHL and MDL, while only four constituents were shared. The bioactivity evaluation of selected ingredients and combinations further confirmed their anti-hyperlipidemia capacity. These findings emphasized the feasibility of filtering bioactivity-related compounds using interpretable machine learning approaches and illustrated that related species may contain different hypolipidemic contributors, even if shared constituents existed.


Subjects
Crataegus , Malus , Functional Food , Plant Leaves , China