Results 1 - 20 of 51
1.
Plant Cell Environ ; 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39166340

ABSTRACT

Mesophyll conductance ($g_{\mathrm{m}}$) describes the efficiency with which $\mathrm{CO}_2$ moves from substomatal cavities to chloroplasts. Despite the stipulated importance of leaf architecture in affecting $g_{\mathrm{m}}$, there remains considerable ambiguity about how and whether leaf anatomy influences $g_{\mathrm{m}}$. Here, we employed nonlinear machine-learning models to assess the relationship between 10 leaf architecture traits and $g_{\mathrm{m}}$. These models used leaf architecture traits as predictors and achieved excellent predictability of $g_{\mathrm{m}}$. Dissection of the importance of leaf architecture traits in the models indicated that cell wall thickness and chloroplast area exposed to internal airspace have a large impact on interspecific variation in $g_{\mathrm{m}}$. Additionally, other leaf architecture traits, such as leaf thickness, leaf density and chloroplast thickness, emerged as important predictors of $g_{\mathrm{m}}$. We also found significant differences in the predictability between models trained on different plant functional types. Therefore, by moving beyond simple linear and exponential models, our analyses demonstrated that a larger suite of leaf architecture traits drives differences in $g_{\mathrm{m}}$ than has been previously acknowledged. These findings pave the way for modulating $g_{\mathrm{m}}$ by strategies that modify its leaf architecture determinants.

2.
Allergy ; 79(8): 2173-2185, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38995241

ABSTRACT

BACKGROUND: There is evidence that global anthropogenic climate change may be impacting floral phenology and the temporal and spatial characteristics of aero-allergenic pollen. Given the extent of current and future climate uncertainty, there is a need to strengthen predictive pollen forecasts. METHODS: The study aims to use CatBoost (CB) and deep learning (DL) models for predicting the daily total pollen concentration up to 14 days in advance for 23 cities, covering all five continents. The model inputs include recent pollen concentrations (1, 2 and 4 weeks), past environmental explanatory variables, and the projected future values of those variables. RESULTS: The best pollen forecasts for the 7th forecast day are obtained for Mexico City (R²(DL_7) ≈ 0.7) and Santiago (R²(DL_7) ≈ 0.8), while the weakest are obtained for Brisbane (R²(DL_7) ≈ 0.4) and Seoul (R²(DL_7) ≈ 0.1). Globally, the five most important environmental variables in determining the daily total pollen concentrations are, in decreasing order: the past daily total pollen concentration, future 2 m temperature, past 2 m temperature, past soil temperature at 28-100 cm depth, and past soil temperature at 0-7 cm depth. City-related clusters of the most similar distribution of feature importance values of the environmental variables change only slightly on consecutive forecast days for Caxias do Sul, Cape Town, Brisbane, and Mexico City, while they often change for Sydney, Santiago, and Busan. CONCLUSIONS: This new knowledge of the ecological relationships behind the importance of the most influential variables in pollen forecast models, across clusters, cities and forecast days, is important for developing and improving the accuracy of airborne pollen forecasts.


Subject(s)
Allergens , Forecasting , Pollen , Pollen/immunology , Forecasting/methods , Humans , Climate Change , Models, Theoretical , Environmental Monitoring/methods
3.
BMC Gastroenterol ; 24(1): 267, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39148020

ABSTRACT

PURPOSE: Irritable bowel syndrome (IBS) is a diagnosis defined by gastrointestinal (GI) symptoms like abdominal pain and changes associated with defecation. The condition is classified as a disorder of the gut-brain interaction (DGBI), and patients with IBS commonly experience psychological distress. The present study focuses on this distress, defined from reports of fatigue, anxiety, depression, sleep disturbances, and performance on cognitive tests. The aim was to investigate the joint contribution of these features of psychological distress in predicting IBS versus healthy controls (HCs) and to disentangle clinically meaningful subgroups of IBS patients. METHODS: IBS patients (n = 49) and HCs (n = 28) completed the Chalder Fatigue Scale (CFQ), the Hospital Anxiety and Depression Scale (HADS), and the Bergen Insomnia Scale (BIS), and performed tests of memory function and attention from the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). An initial exploratory data analysis was followed by supervised (Random Forest) and unsupervised (K-means) classification procedures. RESULTS: The exploratory data analysis showed that the group of IBS patients obtained significantly more severe scores than HCs on all included measures, with the strongest pairwise correlation between fatigue and a quality measure of sleep disturbances. The supervised classification model correctly predicted membership in the IBS group in 80% of the cases in a test set of unseen data. Two methods for calculating feature importance in the test set gave mental and physical fatigue and anxiety the strongest weights. An unsupervised procedure with K = 3 showed that one cluster contained 24% of the patients and all but two HCs. In the two other clusters, the IBS members were overall more impaired, with the following differences. One of the two clusters showed more severe cognitive problems and anxiety symptoms than the other, which experienced more severe problems related to the quality of sleep and fatigue. The three clusters did not differ on a severity measure of IBS or on age. CONCLUSION: The results showed that psychological distress is an integral component of IBS symptomatology. The study should inspire future longitudinal studies to further dissect clinical patterns of IBS to improve the assessment and personalized treatment for this and other patient groups defined as disorders of the gut-brain interaction. The project is registered at https://classic.clinicaltrials.gov/ct2/show/NCT04296552 (20/05/2019).


Subject(s)
Anxiety , Brain-Gut Axis , Depression , Fatigue , Irritable Bowel Syndrome , Machine Learning , Psychological Distress , Humans , Female , Male , Irritable Bowel Syndrome/psychology , Irritable Bowel Syndrome/physiopathology , Irritable Bowel Syndrome/complications , Adult , Anxiety/psychology , Anxiety/diagnosis , Middle Aged , Fatigue/psychology , Fatigue/diagnosis , Fatigue/physiopathology , Fatigue/etiology , Depression/psychology , Depression/diagnosis , Sleep Wake Disorders/psychology , Sleep Wake Disorders/physiopathology , Sleep Wake Disorders/diagnosis , Case-Control Studies , Neuropsychological Tests , Stress, Psychological/psychology , Stress, Psychological/diagnosis
4.
Environ Sci Technol ; 58(26): 11492-11503, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38904357

ABSTRACT

Soil organic carbon (SOC) plays a vital role in global carbon cycling and sequestration, underpinning the need for a comprehensive understanding of its distribution and controls. This study explores the importance of various covariates on SOC spatial distribution at both local (up to 1.25 km) and continental (USA) scales using a deep learning approach. Our findings highlight the significant role of terrain attributes in predicting SOC concentration distribution, with terrain contributing approximately one-third of the overall prediction at the local scale. At the continental scale, climate is only 1.2 times more important than terrain in predicting SOC distribution, whereas at the local scale, the structural pattern of terrain is 14 and 2 times more important than climate and vegetation, respectively. We underscore that terrain attributes, while being integral to the SOC distribution at all scales, are stronger predictors at the local scale with explicit spatial arrangement information. While this observational study does not assess causal mechanisms, our analysis nonetheless presents a nuanced perspective about SOC spatial distribution, which suggests disparate predictors of SOC at local and continental scales. The insights gained from this study have implications for improved SOC mapping, decision support tools, and land management strategies, aiding in the development of effective carbon sequestration initiatives and enhancing climate mitigation efforts.


Subject(s)
Carbon , Climate , Soil , Soil/chemistry , Carbon Cycle , Carbon Sequestration
5.
Article in English | MEDLINE | ID: mdl-38985398

ABSTRACT

This study presents a methodology for predicting the duration of surgical procedures using Machine Learning (ML). The methodology incorporates a new set of predictors emphasizing the significance of surgical team dynamics and composition, including experience, familiarity, social behavior, and gender diversity. By applying ML techniques to a comprehensive dataset of over 77,000 surgeries, we achieved a 24% improvement in the mean absolute error (MAE) over a model that mimics the current approach of the decision maker. Our results also underscore the critical role of surgeon experience and team composition dynamics in enhancing prediction accuracy. These advancements can lead to more efficient operational planning and resource allocation in hospitals, potentially reducing downtime in operating rooms and improving healthcare delivery.
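The 24% figure above is a relative reduction in mean absolute error. As a minimal sketch of that arithmetic, the following uses invented durations rather than data from the study, and the function names are hypothetical:

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mae, model_mae):
    """Fraction by which the model's MAE is lower than the baseline's."""
    return (baseline_mae - model_mae) / baseline_mae


# Illustrative surgery durations in minutes (invented for this example).
actual = [60, 90, 120, 45]
baseline = [80, 70, 100, 65]  # stand-in for the planner's current estimates
model = [70, 80, 110, 55]     # stand-in for the ML predictions

baseline_mae = mean_absolute_error(actual, baseline)
model_mae = mean_absolute_error(actual, model)
improvement = relative_improvement(baseline_mae, model_mae)
print(f"baseline MAE={baseline_mae}, model MAE={model_mae}, improvement={improvement:.0%}")
```

With these toy numbers the improvement is 50%; a reported 24% would correspond to a model MAE equal to 0.76 times the baseline MAE.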

6.
Scand J Public Health ; : 14034948241249519, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38860312

ABSTRACT

AIMS: We contribute to the methodological literature on the assessment of health inequalities by applying an algorithmic approach to evaluate the capabilities of socioeconomic variables in predicting the prevalence of non-communicable diseases in a Norwegian health survey. METHODS: We use data from the seventh survey of the population-based Tromsø Study (2015-2016), including 11,074 women and 10,009 men aged 40 years and above. We apply the random forest algorithm to predict four non-communicable disease outcomes (heart attack, cancer, diabetes and stroke) based on information on a number of social root causes and health behaviours. We evaluate our results using the classification error, the mean decrease in accuracy, and partial dependence statistics. RESULTS: Results suggest that education, household income and occupation contribute to a variable extent to predicting non-communicable disease outcomes. Prediction misclassification ranges between 25.1% and 35.4% depending on the non-communicable disease under study. Partial dependences reveal mostly expected health gradients, with some examples of complex functional relationships. Out-of-sample model validation shows that predictions translate to new data input. CONCLUSIONS: Algorithmic modelling can provide additional empirical detail and metrics for evaluating heterogeneous inequalities in morbidity. The extent to which education, income and occupation contribute to predicting binary non-communicable disease outcomes depends on both the non-communicable disease and the socioeconomic indicator. Partial dependences reveal that social gradients in non-communicable disease outcomes vary in shape between combinations of non-communicable disease outcome and socioeconomic status indicator. Misclassification rates highlight the extent of variation within socioeconomic groups, suggesting that future studies may improve predictive accuracy by exploring further subpopulation heterogeneity.

7.
Behav Res Methods ; 56(6): 6067-6081, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38453828

ABSTRACT

Conventionally, event-related potential (ERP) analysis relies on the researcher to identify the sensors and time points where an effect is expected. However, this approach is prone to bias and may limit the ability to detect unexpected effects or to investigate the full range of the electroencephalography (EEG) signal. Data-driven approaches circumvent this limitation; however, the multiple comparison problem and the statistical correction thereof affect both the sensitivity and specificity of the analysis. In this study, we present SHERPA, a novel approach based on explainable artificial intelligence (XAI) designed to provide the researcher with a straightforward and objective method to find relevant latency ranges and electrodes. SHERPA comprises a convolutional neural network (CNN) for classifying the conditions of the experiment and SHapley Additive exPlanations (SHAP) as a post hoc explainer to identify the important temporal and spatial features. A classical EEG face perception experiment is employed to validate the approach by comparing it to the established researcher- and data-driven approaches. Likewise, SHERPA identified an occipital cluster close to the temporal coordinates expected for the N170 effect. Most importantly, SHERPA allows quantifying the relevance of an ERP for a psychological mechanism by calculating an "importance score". Hence, SHERPA suggests the presence of a negative selection process at the early and later stages of processing. In conclusion, our new method not only offers an analysis approach suitable in situations with limited prior knowledge of the effect in question but also an increased sensitivity capable of distinguishing neural processes with high precision.


Subject(s)
Electroencephalography , Evoked Potentials , Humans , Electroencephalography/methods , Evoked Potentials/physiology , Adult , Artificial Intelligence , Female , Male , Neural Networks, Computer , Young Adult , Brain/physiology , Signal Processing, Computer-Assisted
8.
Entropy (Basel) ; 26(7)2024 Jun 22.
Article in English | MEDLINE | ID: mdl-39056900

ABSTRACT

Rapid and precise detection of significant data streams within a network is crucial for efficient traffic management. This study leverages the TabNet deep learning architecture to identify large-scale flows, known as elephant flows, by analyzing the information in the 5-tuple fields of the initial packet header. The results demonstrate that employing a TabNet model can accurately identify elephant flows right at the start of the flow and makes it possible to reduce the number of flow table entries by up to 20 times while still effectively managing 80% of the network traffic through individual flow entries. The model was trained and tested on a comprehensive dataset from a campus network, demonstrating its robustness and potential applicability to varied network environments.

9.
Rev Cardiovasc Med ; 24(11): 330, 2023 Nov.
Article in English | MEDLINE | ID: mdl-39076440

ABSTRACT

Background: Cardiovascular diseases (CVD) remain the predominant global cause of mortality, with both low and high temperatures increasing CVD-related mortalities. Climate change impacts human health directly through temperature fluctuations and indirectly via factors like disease vectors. Elevated and reduced temperatures have been linked to increases in CVD-related hospitalizations and mortality, with various studies worldwide confirming the significant health implications of temperature variations and air pollution on cardiovascular outcomes. Methods: A database of daily Emergency Room admissions at the Giovanni XIII Polyclinic in Bari (Southern Italy) was developed, spanning from 2013 to 2019, including weather and air quality data. A Random Forest (RF) supervised machine learning model was used to simulate the trend of hospital admissions for CVD. The Seasonal and Trend decomposition using Loess (STL) decomposition model separated the trend component, while cross-validation techniques were employed to prevent overfitting. Model performance was assessed using specific metrics and error analysis. Additionally, the SHapley Additive exPlanations (SHAP) method, a feature importance technique within the eXplainable Artificial Intelligence (XAI) framework, was used to identify the feature importance. Results: An R² of 0.97 and a Mean Absolute Error of 0.36 admissions were achieved by the model. Atmospheric pressure, minimum temperature, and carbon monoxide were found to collectively contribute about 74% to the model's predictive power, with atmospheric pressure being the dominant factor at 37%. Conclusions: This research underscores the significant influence of weather-climate variables on cardiovascular diseases. The identified key climate factors provide a practical framework for policymakers and healthcare professionals to mitigate the adverse effects of climate change on CVD and devise preventive strategies.
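To make the SHAP step above concrete, here is a minimal sketch of exact Shapley attribution by enumerating feature coalitions. It is not the authors' pipeline: the toy model, its coefficients, the input values and the zero baseline are all assumptions made for illustration, and replacing "missing" features with a baseline value is only one common convention. Normalizing absolute attributions into percentage shares, as done at the end, is likewise an assumed way of arriving at figures such as "37%"; the abstract does not say how its percentages were computed.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one prediction.

    Features outside a coalition are replaced by their baseline value
    (an assumed definition of a "missing" feature).
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model over (pressure, minimum temperature, carbon monoxide).
def toy_model(features):
    pressure, t_min, co = features
    return 2.0 * pressure + 1.0 * t_min + 0.5 * co + 0.25 * pressure * t_min


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

# Assumed normalization: share of total absolute attribution per feature.
total = sum(abs(p) for p in phi)
shares = [abs(p) / total for p in phi]
print(phi, shares)
```

The attributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split equally between the two features involved. Enumeration is exponential in the number of features, so practical tools rely on approximations or model-specific algorithms.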

10.
Front Big Data ; 7: 1298029, 2024.
Article in English | MEDLINE | ID: mdl-38562649

ABSTRACT

Introduction: Studies from different parts of the world have shown that some comorbidities are associated with fatal cases of COVID-19. However, the prevalence rates of comorbidities differ around the world; therefore, their contribution to COVID-19 mortality differs as well. Socioeconomic factors may influence the prevalence of comorbidities; therefore, they may also influence COVID-19 mortality. Methods: This study conducted feature analysis using two supervised machine learning classification algorithms, Random Forest and XGBoost, to examine the comorbidities and level of economic inequalities associated with fatal cases of COVID-19 in Mexico. The dataset used was collected by the National Epidemiology Center from February 2020 to November 2022, and includes more than 20 million observations and 40 variables describing the characteristics of the individuals who underwent COVID-19 testing or treatment. In addition, socioeconomic inequalities were measured using the normalized marginalization index calculated by the National Population Council and the deprivation index calculated by NASA. Results: The analysis shows that diabetes and hypertension were the main comorbidities defining the mortality of COVID-19; furthermore, socioeconomic inequalities were also important characteristics defining the mortality. Similar features were found with Random Forest and XGBoost. Discussion: It is imperative to implement programs aimed at reducing inequalities as well as preventable comorbidities to make the population more resilient to future pandemics. The results apply to regions or countries with similar levels of inequality or comorbidity prevalence.

11.
Sci Rep ; 14(1): 5905, 2024 03 11.
Article in English | MEDLINE | ID: mdl-38467662

ABSTRACT

To explore a robust tool for advancing digital breeding practices through an artificial intelligence-driven phenotype prediction expert system, we undertook a thorough analysis of 11 non-linear regression models. Our investigation specifically emphasized the significance of Support Vector Regression (SVR) and SHapley Additive exPlanations (SHAP) in predicting soybean branching. By using branching data (phenotype) of 1918 soybean accessions and 42 k SNP (Single Nucleotide Polymorphism) polymorphic data (genotype), this study systematically compared 11 non-linear regression AI models, including four deep learning models (DBN (deep belief network) regression, ANN (artificial neural network) regression, Autoencoders regression, and MLP (multilayer perceptron) regression) and seven machine learning models (SVR (support vector regression), XGBoost (eXtreme Gradient Boosting) regression, Random Forest regression, LightGBM regression, GPs (Gaussian processes) regression, Decision Tree regression, and Polynomial regression). After evaluation with four metrics, R2 (R-squared), MAE (Mean Absolute Error), MSE (Mean Squared Error), and MAPE (Mean Absolute Percentage Error), it was found that the SVR, Polynomial Regression, DBN, and Autoencoder outperformed other models and could obtain a better prediction accuracy when they were used for phenotype prediction. In the assessment of deep learning approaches, we used the SVR model as an example, conducting analyses on feature importance and gene ontology (GO) enrichment to provide comprehensive support. After comprehensively comparing four feature importance algorithms, namely Variable Ranking, Permutation, SHAP, and Correlation Matrix, no notable distinction was observed in the feature importance ranking scores across the four algorithms, but the SHAP value could provide rich information on genes with negative contributions, and SHAP importance was chosen for feature selection. The results of this study offer valuable insights into AI-mediated plant breeding, addressing challenges faced by traditional breeding programs. The method developed has broad applicability in phenotype prediction, minor QTL (quantitative trait loci) mining, and plant smart-breeding systems, contributing significantly to the advancement of AI-based breeding practices and transitioning from experience-based to data-based breeding.


Subject(s)
Artificial Intelligence , Glycine max , Glycine max/genetics , Plant Breeding , Algorithms , Benchmarking
12.
J Affect Disord ; 352: 87-100, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38360368

ABSTRACT

BACKGROUND: Suicide has been recognized as a major global public health issue. Depressed adolescents are more prone to experiencing it. We explore risk factors and their differences on suicidal ideation and suicide attempts to further enhance our understanding of suicidal behavior. METHODS: 2343 depressed adolescents aged 12-18 from 9 provinces/cities in China participated in this cross-sectional study. We utilized a decision tree model, incorporating 32 factors encompassing participants' suicidal behavior. The feature importance of each factor was measured using Gini coefficients. RESULTS: The decision tree model demonstrated a good fit with high accuracy (SI = 0.86, SA = 0.85) and F-score (SI = 0.85, SA = 0.83). The predictive importance of each factor varied between groups with suicidal ideation and with suicide attempts. The most significant risk factor in both groups was depression (SI = 16.7%, SA = 19.8%). However, factors such as academic stress (SI = 7.2%, SA = 1.6%), hopelessness (SI = 9.1%, SA = 5.0%), and age (SI = 7.1%, SA = 3.2%) were more closely associated with suicidal ideation than suicide attempts. Factors related to schooling status (SI = 3.5%, SA = 10.1%), total years of education (SI = 2.6%, SA = 8.6%), and loneliness (SI = 2.3%, SA = 7.4%) were relatively more important in the suicide attempt stage compared to suicidal ideation. LIMITATIONS: The cross-sectional design limited the ability to capture changes in suicidal behavior among depressed adolescents over time. Possible bias may exist in the measurement of suicidal ideation. CONCLUSION: The relative importance of each risk factor for suicidal ideation and attempted suicide varies. These findings provide further empirical evidence for understanding suicide behavior. Targeted treatment measures should be taken for different stages of suicide in clinical interventions.
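The abstract above reports importance "measured using Gini coefficients". In decision trees this usually means the decrease in Gini impurity produced by splits on a feature; assuming that reading, the quantity for a single split can be sketched as follows. The labels are invented and the function names are hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(parent, left, right):
    """Weighted drop in Gini impurity when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Invented labels: 1 = outcome present, 0 = outcome absent.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left = [1, 1, 1, 0]   # e.g. samples above a threshold on some factor
right = [1, 0, 0, 0]  # e.g. samples below that threshold
decrease = impurity_decrease(parent, left, right)
print(decrease)
```

A feature's overall importance is then typically the sum of such decreases over all nodes that split on it, weighted by the share of samples reaching each node and normalized so that all features sum to 100%.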


Subject(s)
Suicidal Ideation , Suicide, Attempted , Humans , Adolescent , Cross-Sectional Studies , Risk Factors , Decision Trees
13.
Bioresour Technol ; 399: 130519, 2024 May.
Article in English | MEDLINE | ID: mdl-38437964

ABSTRACT

This study developed six machine learning models to predict the biochar properties from the dry torrefaction of lignocellulosic biomass by using biomass characteristics and torrefaction conditions as input variables. After optimization, gradient boosting machines were the optimal model, with the highest coefficient of determination ranging from 0.89 to 0.94. Torrefaction conditions exhibited a higher relative contribution to the yield and higher heating value (HHV) of biochar than biomass characteristics. Temperature was the dominant contributor to the elemental and proximate composition and the yield and HHV of biochar. Feature importance and SHapley Additive exPlanations revealed the effect of each influential factor on the target variables and the interactions between these factors in torrefaction. Software that can accurately predict the element, yield, and HHV of biochar was developed. These findings provide a comprehensive understanding of the key factors and their interactions influencing the torrefaction process and biochar properties.


Subject(s)
Charcoal , Machine Learning , Biomass , Temperature
14.
Chemosphere ; 352: 141472, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38382719

ABSTRACT

Wastewater Treatment Plants (WWTPs) present complex biochemical processes of high variability and difficult prediction. This study presents an innovative approach using Machine Learning (ML) models to predict wastewater quality parameters. In particular, the models are applied to datasets from both a simulated wastewater treatment plant (WWTP), using DHI WEST software (WEST WWTP), and a real-world WWTP database from Santa Catarina Brewery AMBEV, located in Lages/SC - Brazil (AMBEV WWTP). A distinctive aspect is the evaluation of predictive performance in continuous data scenarios and the impact of changes in WWTP operations on predictive model performance, including changes in plant layout. For both plants, three different scenarios were addressed, and the quality of predictions by random forest (RF), support vector machine (SVM), and multilayer perceptron (MLP) models were evaluated. The prediction quality by the MLP model reached an R² of 0.72 for TN prediction in the WEST WWTP output, and the RF model better adapted to the real data of the AMBEV WWTP, despite the significant discrepancy observed between the real and the predicted data. Techniques such as Partial Dependence Plots (PDP) and Permutation Importance (PI) were used to assess the importance of features, particularly in the simulated WEST tool scenario, showing a strong correlation of prediction results with influent parameters related to nitrogen content. The results of this study highlight the importance of collecting and storing high-quality data and the need for information on changes in WWTP operation for predictive model performance. These contributions advance the understanding of predictive modeling for wastewater quality and provide valuable insights for future practice in wastewater treatment.
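For readers less familiar with the coefficient of determination quoted above, the following is a minimal sketch of how it is commonly computed. The numbers are invented and unrelated to either plant; `r_squared` is a hypothetical helper name.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented measurements and predictions (e.g. a concentration in mg/L).
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 3.0, 4.0, 4.5]
r2 = r_squared(actual, predicted)
print(round(r2, 2))
```

Under this definition a value of 0.72 means the residual sum of squares is 28% of the total sum of squares around the mean of the observed values.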


Subject(s)
Wastewater , Water Purification , Water Purification/methods , Machine Learning , Nitrogen/analysis , Neural Networks, Computer , Waste Disposal, Fluid/methods
15.
Res Q Exerc Sport ; : 1-13, 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-38875156

ABSTRACT

Purpose: With the popularity of recreational activities, the study aimed to develop prediction models for recreational activity participation and explore the key factors affecting participation in recreational activities. Methods: A total of 12,712 participants, excluding individuals under 20, were selected from the National Health and Nutrition Examination Survey (NHANES) from 2011 to 2018. The mean age of the sample was 46.86 years (±16.97), with a gender distribution of 6,721 males and 5,991 females. The variables included demographic, physical-related variables, and lifestyle variables. This study developed 42 prediction models using six machine learning methods, including logistic regression, Support Vector Machine (SVM), decision tree, random forest, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM). The relative importance of each variable was evaluated by permutation feature importance. Results: The results illustrated that the LightGBM was the most effective algorithm for predicting recreational activity participation (accuracy: .838, precision: .783, recall: .967, F1-score: .865, AUC: .826). In particular, prediction performance increased when the demographic and lifestyle datasets were used together. Next, as the result of the permutation feature importance based on the top models, education level and moderate-vigorous physical activity (MVPA) were found to be essential variables. Conclusion: These findings demonstrated the potential of a data-driven approach utilizing machine learning in a recreational discipline. Furthermore, this study interpreted the prediction model through feature importance analysis to overcome the limitation of machine learning interpretability.
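Permutation feature importance, as used above, measures how much a model's score drops when one feature's values are shuffled. The sketch below illustrates the idea with a hand-written stand-in for a fitted classifier and invented rows; it is not the study's models or the NHANES data, and all names are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction matches the label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(labels)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this column replaced by a shuffled value.
            permuted = [row[:col] + [v] + row[col + 1:] for row, v in zip(rows, shuffled)]
            drops.append(base - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical stand-in for a fitted model: it uses feature 0 and ignores feature 1.
def toy_model(row):
    return 1 if row[0] >= 3 else 0


rows = [[1, 7], [2, 3], [3, 9], [4, 1], [5, 5], [0, 2], [6, 8], [2, 4]]
labels = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

Shuffling the ignored feature never changes a prediction, so its importance is exactly zero, while shuffling the feature the model relies on lowers accuracy. In practice the score is computed on held-out data and averaged over repeats, as here.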

16.
Sci Rep ; 14(1): 11503, 2024 05 20.
Article in English | MEDLINE | ID: mdl-38769382

ABSTRACT

This study aimed to present a new approach to predicting delirium in patients admitted to the acute palliative care unit. To achieve this, this study employed a machine learning model to predict delirium in patients in palliative care and identified the significant features that influenced the model. A multicenter, patient-based registry cohort study was conducted in South Korea between January 1, 2019, and December 31, 2020. Delirium was identified by reviewing the medical records based on the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition. The study dataset included 165 patients with delirium among 2314 patients with advanced cancer admitted to the acute palliative care unit. Seven machine learning models, including extreme gradient boosting, adaptive boosting, gradient boosting, light gradient boosting, logistic regression, support vector machine, and random forest, were evaluated to predict delirium in patients with advanced cancer admitted to the acute palliative care unit. An ensemble approach was adopted to determine the optimal model. For k-fold cross-validation, the combination of extreme gradient boosting and random forest provided the best performance, achieving the following accuracy metrics: 68.83% sensitivity, 70.85% specificity, 69.84% balanced accuracy, and 74.55% area under the receiver operating characteristic curve. The performance on the isolated testing dataset was also validated, and the machine learning model was successfully deployed on a public website ( http://ai-wm.khu.ac.kr/Delirium/ ) to provide public access to delirium prediction results in patients with advanced cancer. Furthermore, using feature importance analysis, sex was determined to be the top contributor in predicting delirium, followed by a history of delirium, chemotherapy, smoking status, alcohol consumption, and living with family. Based on a large-scale, multicenter, patient-based registry cohort, a machine learning prediction model for delirium in patients with advanced cancer was developed in South Korea. We believe that this model will assist healthcare providers in treating patients with delirium and advanced cancer.
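The reported balanced accuracy is consistent with the usual definition as the mean of sensitivity and specificity. A short sketch of that check follows; the confusion-matrix counts in the second half are invented for illustration and are not taken from the study.

```python
def sensitivity(tp, fn):
    """True positive rate: detected cases divided by all actual cases."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: correctly cleared non-cases divided by all non-cases."""
    return tn / (tn + fp)


def balanced_accuracy(sens, spec):
    """Mean of sensitivity and specificity."""
    return (sens + spec) / 2


# Values reported in the abstract above.
reported = balanced_accuracy(0.6883, 0.7085)
print(round(reported * 100, 2))

# Invented counts, only to show the calculation from a confusion matrix.
toy = balanced_accuracy(sensitivity(tp=30, fn=10), specificity(tn=80, fp=20))
print(toy)
```

Balanced accuracy is often preferred over plain accuracy when classes are imbalanced, as with 165 delirium cases among 2314 patients, because a model that predicts "no delirium" for everyone would score about 93% on plain accuracy but only 50% on balanced accuracy.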


Subject(s)
Delirium , Machine Learning , Neoplasms , Palliative Care , Registries , Humans , Delirium/diagnosis , Delirium/etiology , Palliative Care/methods , Male , Female , Neoplasms/complications , Aged , Middle Aged , Republic of Korea/epidemiology , Cohort Studies , ROC Curve , Aged, 80 and over
17.
Psychol Res Behav Manag ; 17: 1057-1071, 2024.
Article in English | MEDLINE | ID: mdl-38505352

ABSTRACT

Background: Sleep problems are prevalent among university students, yet there is a lack of effective models to assess the risk of sleep disturbance. Artificial intelligence (AI) provides an opportunity to develop a platform for evaluating the risk. This study aims to develop and validate an AI platform to stratify the risk of experiencing sleep disturbance for university students. Methods: A total of 2243 university students were included, with 1882 students from five universities comprising the model derivation group and 361 students from two additional universities forming the external validation group. Six machine learning techniques, including extreme gradient boosting machine (eXGBM), decision tree (DT), k-nearest neighbor (KNN), random forest (RF), neural network (NN), and support vector machine (SVM), were employed to train models using the same set of features. The models' prediction performance was assessed based on discrimination and calibration, and feature importance was determined using Shapley Additive exPlanations (SHAP) analysis. Results: The prevalence of sleep disturbance was 44.69% in the model derivation group and 49.58% in the external validation group. Among the developed models, eXGBM exhibited superior performance, surpassing other models in metrics such as area under the curve (0.779, 95% CI: 0.728-0.830), accuracy (0.710), precision (0.737), F1 score (0.692), Brier score (0.193), and log loss (0.569). Calibration and decision curve analyses demonstrated favorable calibration ability and clinical net benefits, respectively. SHAP analysis identified five key features: stress score, severity of depression, vegetable consumption, age, and sedentary time. The AI platform was made available online at https://sleepdisturbancestudents-xakgzwectsw85cagdgkax9.streamlit.app/, enabling users to calculate individualized risk of sleep disturbance. Conclusion: Sleep disturbance is prevalent among university students. This study presents an AI model capable of identifying students at high risk for sleep disturbance. The AI platform offers a valuable resource to guide interventions and improve sleep outcomes for university students.
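
For readers unfamiliar with the calibration metrics quoted in this abstract, the following is a minimal sketch of how a Brier score and a log loss are computed for binary outcomes. The toy labels and probabilities are hypothetical and are not taken from the study; the function names are illustrative only.

```python
import math


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the observed 0/1 outcomes."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Hypothetical toy data (assumption, not from the abstract).
y_true = [1, 0, 1, 0]
p_pred = [0.8, 0.3, 0.6, 0.4]

brier = brier_score(y_true, p_pred)
ll = log_loss(y_true, p_pred)

# Reference point: a model that always predicts 0.5.
baseline_brier = brier_score(y_true, [0.5] * len(y_true))
baseline_ll = log_loss(y_true, [0.5] * len(y_true))
```

A constant prediction of 0.5 gives a Brier score of 0.25 and a log loss of ln 2 (about 0.693), so the reported values of 0.193 and 0.569 both fall below that uninformative reference; lower is better for both metrics.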

18.
J Appl Genet ; 65(2): 283-286, 2024 May.
Article in English | MEDLINE | ID: mdl-38170439

ABSTRACT

Best linear unbiased prediction (BLUP) is widely used in plant research to address experimental variation. For phenotypic values, BLUP accuracy is largely dependent on properly controlled experimental repetition and how variable components are outlined in the model. Thus, determining BLUP robustness implies the need to evaluate contributions from each repetition. Here, we assessed the robustness of BLUP values for simulated or empirical phenotypic datasets, where the BLUP value and each experimental repetition served as dependent and independent (feature) variables, respectively. Our technique incorporated machine learning and partial dependence. First, we compared the feature importance estimated with the neural networks. Second, we compared estimated average marginal effects of individual repetitions, calculated with a partial dependence analysis. We showed that contributions of experimental repetitions are unequal in a phenotypic dataset, suggesting that the calculated BLUP value is likely to be influenced by some repetitions more than others (such as failing to detect simulated true positive associations). To resolve disproportionate sources, variable components in the BLUP model must be further outlined.
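
The partial dependence idea mentioned in this abstract can be sketched as follows. This is not the authors' implementation: `toy_model` is a hypothetical stand-in for a fitted model, the weights and data rows are invented for illustration, and each column is assumed to represent one experimental repetition.

```python
def partial_dependence(model, rows, feature, grid):
    """Average model output when column `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value  # overwrite one repetition, keep the rest
            total += model(modified)
        curve.append(total / len(rows))
    return curve


# Hypothetical stand-in for a fitted model: an unequal weighting of
# three repetitions (assumption, for illustration only).
WEIGHTS = [0.6, 0.3, 0.1]


def toy_model(row):
    return sum(w * x for w, x in zip(WEIGHTS, row))


# Hypothetical phenotype values: one row per genotype, one column per repetition.
rows = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.0], [3.0, 3.0, 3.0]]
grid = [0.0, 1.0]

# Change in the averaged prediction per unit change in each repetition.
slopes = []
for j in range(3):
    low, high = partial_dependence(toy_model, rows, j, grid)
    slopes.append(high - low)
```

For this linear toy model the slopes recover the weights, so a repetition with a larger average marginal effect is one that the prediction leans on more heavily. With a nonlinear model such as a neural network, the curve would be evaluated over a finer grid rather than summarized by a single slope.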


Subject(s)
Machine Learning; Models, Genetic; Genotype; Linear Models; Phenotype
19.
Comput Biol Med ; 169: 107871, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38154157

ABSTRACT

BACKGROUND: During lung cancer screening, indeterminate pulmonary nodules (IPNs) are a frequent finding. We aim to predict whether IPNs are resolving or non-resolving to reduce follow-up examinations, using machine learning (ML) models. We incorporated dedicated techniques to enhance prediction explainability. METHODS: In total, 724 IPNs (size 50-500 mm³, 575 participants) from the Dutch-Belgian Randomized Lung Cancer Screening Trial were used. We implemented six ML models and 14 factors to predict nodule disappearance. Random search was applied to determine the optimal hyperparameters on the training set (579 nodules). ML models were trained using 5-fold cross-validation and tested on the test set (145 nodules). Model predictions were evaluated by utilizing the recall, precision, F1 score, and the area under the receiver operating characteristic curve (AUC). The best-performing model was used for three feature importance techniques: mean decrease in impurity (MDI), permutation feature importance (PFI), and SHapley Additive exPlanations (SHAP). RESULTS: The random forest model outperformed the other ML models with an AUC of 0.865. This model achieved a recall of 0.646, a precision of 0.816, and an F1 score of 0.721. The evaluation of feature importance achieved consistent ranking across all three methods for the most crucial factors. The MDI, PFI, and SHAP methods highlighted volume, maximum diameter, and minimum diameter as the top three factors. However, the remaining factors revealed discrepant ranking across methods. CONCLUSION: ML models effectively predict IPN disappearance using participant demographics and nodule characteristics. Explainable techniques can assist clinicians in developing understandable preliminary assessments.
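
As a quick consistency check on the metrics reported in this abstract, the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the two reported values; the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the random forest model.
f1 = f1_score(precision=0.816, recall=0.646)
print(round(f1, 3))  # 0.721
```

The result agrees with the reported F1 score of 0.721.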


Subject(s)
Lung Neoplasms; Humans; Early Detection of Cancer; Machine Learning; ROC Curve; Randomized Controlled Trials as Topic
20.
J Hazard Mater ; 471: 134426, 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38688220

ABSTRACT

Nanoplastics (NPs) aggregation determines their bioavailability and risks in natural aquatic environments, which is driven by multiple environmental and polymer factors. The back propagation artificial neural network (BP-ANN) model in machine learning (R² = 0.814) can fit the complex NPs aggregation, and the feature importance was in the order of surface charge of NPs > dissolved organic matter (DOM) > functional group of NPs > ionic strength and pH > concentration of NPs. Meta-analysis results specified that low surface charge (0 ≤ |ζ| < 10 mV) of NPs, low concentration (< 1 mg/L) and low molecular weight (< 10 kg/mol) of DOM, NPs with amino groups, high ionic strength (IS > 700 mM) and acidic solution, and high concentration (≥ 20 mg/L) of NPs with smaller size (< 100 nm) contribute to NPs aggregation, which is consistent with the prediction in machine learning. Feature interaction synergistically (e.g., DOM and pH) or antagonistically (e.g., DOM and cation potential) changed NPs aggregation. Therefore, NPs were predicted to aggregate in the dry period and estuary of Poyang Lake. Research on aggregation of NPs with different particle sizes, shapes, and functional groups, heteroaggregation of NPs with coexisting particles and aging effects should be strengthened in the future. This study supports better assessments of the fate and risks of NPs in environments.
