Results 1 - 20 of 110
1.
Sci Rep ; 14(1): 23028, 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39362913

ABSTRACT

The accurate prediction of uneven rock mass classes is crucial for intelligent operation in tunnel-boring machine (TBM) tunneling. However, the classification of rock masses presents significant challenges due to the variability and complexity of geological conditions. To address these challenges, this study introduces an innovative predictive model combining the improved EWOA (IEWOA) and the light gradient boosting machine (LightGBM). The proposed IEWOA algorithm incorporates a novel parameter l for more effective position updates during the exploration stage and utilizes sine functions during the exploitation stage to optimize the search process. Additionally, the model integrates a minority class technique enhanced with a random walk strategy (MCT-RW) to extend the boundaries of minority classes, such as Classes II, IV, and V. This approach significantly improves the recall and F1-score for these rock mass classes. The proposed methodology was rigorously evaluated against other predictive algorithms, demonstrating superior performance with an accuracy of 94.74%. This innovative model not only enhances the accuracy of rock mass classification but also contributes significantly to the intelligent and efficient construction of TBM tunnels, providing a robust solution to one of the key challenges in underground engineering.

2.
Appl Plant Sci ; 12(5): e11576, 2024.
Article in English | MEDLINE | ID: mdl-39360189

ABSTRACT

Premise: Plant functional traits are often used to describe the spectra of ecological strategies used by different species. Here, we demonstrate a machine learning approach for identifying the traits that contribute most to interspecific phenotypic divergence in a multivariate trait space. Methods: Descriptive and predictive machine learning approaches were applied to trait data for the genus Helianthus, including random forest and gradient boosting machine classifiers and recursive feature elimination. These approaches were applied at the genus level as well as within each of the three major clades within the genus to examine the variability in the major axes of trait divergence in three independent species radiations. Results: Machine learning models were able to predict species identity from functional traits with high accuracy, and differences in functional trait importance were observed between the genus and clade levels indicating different axes of phenotypic divergence. Conclusions: Applying machine learning approaches to identify divergent traits can provide insights into the predictability or repeatability of evolution through the comparison of parallel diversifications of clades within a genus. These approaches can be implemented in a range of contexts across basic and applied plant science from interspecific divergence to intraspecific variation across time, space, and environmental conditions.

3.
J Card Fail ; 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39299541

ABSTRACT

INTRODUCTION: Optimal management of outpatients with heart failure (HF) requires serially updating the estimates of their risk for adverse clinical outcomes to guide treatment. Patient-reported outcomes (PROs) are becoming increasingly used in clinical care. The purpose of this study was to determine whether inclusion of PROs can improve the risk prediction for HF hospitalization and death in ambulatory HF patients. METHODS: We included consecutive patients with HF with reduced ejection fraction (HFrEF) and HF with preserved ejection fraction (HFpEF) seen in a HF clinic between 2015 and 2019 who completed PROs as part of routine care. Cox regression with a least absolute shrinkage and selection operator (LASSO) regularization and gradient boosting machine (GBM) analyses were used to estimate risk for a combined outcome of HF hospitalization, heart transplant, left ventricular assist device implantation or death. The performance of the prediction models was evaluated with the time-dependent concordance index (Cτ). RESULTS: Among 1165 patients with HFrEF (mean age 59.1±16.1, 68% male) the median follow-up was 487 days and among 456 patients with HFpEF (mean age: 64.2±16.0 years, 55% male) the median follow-up was 494 days. Gradient boosting regression that included PROs had the best prediction performance - Cτ 0.73 for patients with HFrEF and 0.74 in patients with HFpEF, and showed very good stratification of risk by time to event analysis by quintile of risk. The Kansas City Cardiomyopathy Questionnaire overall summary score (KCCQ-12 OSS), Visual Analogue Scale (VAS) and Patient Reported Outcomes Measurement Information System (PROMIS) dimensions of Satisfaction with social roles and Physical function had high variable importance measure in the models. CONCLUSIONS: PROs improve risk prediction in both HFrEF and HFpEF, independent of traditional clinical factors. 
Routine assessment of PROs and leveraging the comprehensive data in the electronic health record in routine clinical care could help more accurately assess risk and support the intensification of treatment in patients with HF.
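As a rough illustration of what a concordance index measures, the sketch below computes Harrell's C for right-censored data in plain Python. This is a simplified stand-in and not the time-dependent Cτ reported in the abstract; the function name `harrell_c` and the toy values are hypothetical.

```python
from itertools import combinations

def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time per patient
    events : 1 if the outcome was observed, 0 if censored
    risks  : model risk score (higher = earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the
        # shorter follow-up ended in an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    return concordant / comparable

# Illustrative toy data (hypothetical, not from the study).
times = [5, 8, 12, 20]
events = [1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = harrell_c(times, events, risks)
reversed_c = harrell_c(times, events, [0.1, 0.4, 0.6, 0.9])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of comparable pairs, which gives some context for the reported values of 0.73 and 0.74.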

4.
Environ Sci Technol ; 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39271478

ABSTRACT

Granular activated carbon (GAC) adsorption is frequently used to remove recalcitrant organic micropollutants (MPs) from water. The overarching aim of this research was to develop machine learning (ML) models to predict GAC performance from adsorbent, adsorbate, and background water matrix properties. For model calibration, MP breakthrough curves were compiled and analyzed to determine the bed volumes of water that can be treated until MP breakthrough reaches ten percent of the influent MP concentration (BV10). Over 400 data points were split into training, validation, and testing sets. Seventeen variables describing MP, background water matrix, and GAC properties were explored in ML models to predict log10-transformed BV10 values. Using the ML models on the testing set, predicted BV10 values exhibited mean absolute errors of ∼0.12 log units and were highly correlated with experimentally determined values (R² ≥ 0.88). The top three drivers influencing BV10 predictions were the air-hexadecane partition coefficient and hydrogen bond acidity (Abraham parameters L and A) of the MPs and the dissolved organic carbon concentration of the GAC influent water. The model can be used to rapidly estimate the GAC bed life, select effective GAC products for a given treatment scenario, and explore the suitability of GAC treatment for remediating emerging MPs.
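To make the "log units" error concrete, the following minimal sketch computes a mean absolute error on log10-transformed BV10 values and converts 0.12 log units into a multiplicative factor. The function name `log10_mae` and the bed-volume numbers are hypothetical and are not data from the study.

```python
import math

def log10_mae(observed_bv10, predicted_bv10):
    """Mean absolute error between log10-transformed BV10 values."""
    errors = [abs(math.log10(o) - math.log10(p))
              for o, p in zip(observed_bv10, predicted_bv10)]
    return sum(errors) / len(errors)

# An error of 0.12 log units is a multiplicative factor of 10**0.12,
# about 1.32: a prediction roughly 32% above or 24% below the measurement.
factor = 10 ** 0.12

# Illustrative values only (hypothetical bed volumes).
observed = [10_000, 50_000, 100_000]
predicted = [20_000, 50_000, 50_000]
mae = log10_mae(observed, predicted)
```

In this toy case two of the three predictions are off by a factor of two, so the error is about 0.20 log units, noticeably worse than the ∼0.12 reported above.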

5.
Mar Pollut Bull ; 208: 116946, 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39293369

ABSTRACT

Maritime operations face significant challenges in environmental stewardship, particularly in managing oil discharges from tankers as mandated by the International Convention for the Prevention of Pollution from Ships (MARPOL) Annex I, Regulation 34. Traditional Oil Discharge Monitoring Equipment (ODME) methods rely on manual decision-making, often failing to accurately identify MARPOL-defined no-go zones, estimate operation completion times, and recommend course alterations during decanting operations. This study introduces a novel approach by integrating advanced machine learning techniques, namely Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), to enhance ODME operations. Specifically, these models automate the identification of no-go zones and optimize operational decisions, leading to a 99% accuracy rate in compliance with MARPOL regulations and an operational time estimation error margin of <1%. Unlike traditional methods, our approach leverages large datasets and real-time GPS (Global Positioning System) data, significantly reducing human error and enhancing both environmental compliance and operational efficiency. To our knowledge, this is the first study to specifically address the application of machine learning to decanting operations under MARPOL Annex I, marking a significant advancement in maritime environmental management.

6.
Biomedicines ; 12(8), 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39200323

ABSTRACT

(1) Background: Liver metastases (LM) are the leading cause of death in colorectal cancer (CRC) patients. Despite advancements, relapse rates remain high and current prognostic nomograms lack accuracy. Our objective is to develop an interpretable neoadjuvant algorithm based on mathematical models to accurately predict individual risk, ensuring mathematical transparency and auditability. (2) Methods: We retrospectively evaluated 86 CRC patients with LM treated with neoadjuvant systemic therapy followed by complete surgical resection. A comprehensive analysis of 155 individual patient variables was performed. Logistic regression (LR) was utilized to develop the predictive model for relapse risk through significance testing and ANOVA analysis. Due to data limitations, gradient boosting machine (GBM) and synthetic data were also used. (3) Results: The model was based on data from 74 patients (12 were excluded). After a median follow-up of 58 months, 5-year relapse-free survival (RFS) rate was 33% and 5-year overall survival (OS) rate was 60.7%. Fifteen key variables were used to train the GBM model, which showed promising accuracy (0.82), sensitivity (0.59), and specificity (0.96) in predicting relapse. Similar results were obtained when external validation was performed as well. (4) Conclusions: This model offers an alternative for predicting individual relapse risk, aiding in personalized adjuvant therapy and follow-up strategies.
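The three performance figures quoted above all derive from a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not a reconstruction of the study's results.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn : relapsed patients predicted as relapse / no relapse
    tn/fp : relapse-free patients predicted as no relapse / relapse
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of true relapses detected
        "specificity": tn / (tn + fp),   # share of non-relapses correctly cleared
    }

# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=30, fn=20, tn=45, fp=5)
```

A pattern of high specificity with moderate sensitivity, as in the abstract, means the model rarely raises a false alarm but misses a sizeable share of true relapses.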

7.
Toxicol Mech Methods ; : 1-9, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39104137

ABSTRACT

Per- and polyfluoroalkyl substances (PFASs), a class of persistent organic pollutants, have immunosuppressive effects. The evaluation of this effect has been the focus of regulatory toxicology. In this investigation, 146 PFASs (immunosuppressive or nonimmunosuppressive) and corresponding concentration gradients were collected from the literature, and their structures were characterized using Dragon descriptors. Feature importance analysis and stepwise feature elimination were used for feature selection. Three machine learning (ML) methods, namely Random Forest (RF), Extreme Gradient Boosting Machine (XGB), and Categorical Boosting Machine (CB), were utilized for model development. The model interpretability was explored by feature importance analysis and correlation analysis. The findings indicated that all three models exhibited excellent performance. Among them, the best-performing RF model had an average AUC score of 0.9720 for the testing set. The results of the feature importance analysis demonstrated that concentration, SpPosA_X, IVDE, R2s, and SIC2 were the crucial molecular features. Applicability domain analysis was also performed to determine reliable prediction boundaries for the model. In conclusion, this study is the first application of ML models to investigate the immunosuppressive activity of PFASs. The variables used in the models can help understand the mechanism of the immunosuppressive activity of PFASs, allow researchers to more effectively assess the immunosuppressive potential of a large number of PFASs, and thus better guide environmental and health risk assessment efforts.

8.
Sensors (Basel) ; 24(15), 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39124011

ABSTRACT

Load recognition has not yet been comprehensively explored in Home Energy Management Systems (HEMSs). There are gaps in current approaches to load recognition, such as enhancing appliance identification and increasing the overall performance of the load-recognition system through more robust models. To address these gaps, we propose a novel approach based on the Analysis of Variance (ANOVA) F-test combined with SelectKBest and gradient-boosting machines (GBMs) for load recognition. The proposed approach improves the feature selection and consequently aids inter-class separability. Further, we optimized GBM models, such as the histogram-based gradient-boosting machine (HistGBM), light gradient-boosting machine (LightGBM), and XGBoost (extreme gradient boosting), to create a more reliable load-recognition system. Our findings reveal that the ANOVA-GBM approach achieves greater efficiency in training time, even when compared to Principal Component Analysis (PCA) and a higher number of features. ANOVA-XGBoost is approximately 4.31 times faster than PCA-XGBoost, ANOVA-LightGBM is about 5.15 times faster than PCA-LightGBM, and ANOVA-HistGBM is 2.27 times faster than PCA-HistGBM. The general performance results expose the impact on the overall performance of the load-recognition system. Some of the key results show that the ANOVA-LightGBM pair reached 96.42% accuracy, 96.27% F1, and a Kappa index of 0.9404; the ANOVA-HistGBM combination achieved 96.64% accuracy, 96.48% F1, and a Kappa index of 0.9434; and the ANOVA-XGBoost pair attained 96.75% accuracy, 96.64% F1, and a Kappa index of 0.9452; these findings outperform rival methods from the literature. In addition, the accuracy gain of the proposed approach is prominent when compared directly with its competitors. The highest accuracy gains were 13.09, 13.31, and 13.42 percentage points (pp) for the pairs ANOVA-LightGBM, ANOVA-HistGBM, and ANOVA-XGBoost, respectively.
These significant improvements highlight the effectiveness and refinement of the proposed approach.
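The feature-selection step named above (an ANOVA F-test followed by keeping the k best features) can be sketched in plain Python as follows. This is only an illustration of the idea; the authors name SelectKBest, and scikit-learn provides this as `SelectKBest` with `f_classif`. The helper names, appliance labels and feature values below are hypothetical.

```python
def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    # Variation of samples around their own class mean (assumed non-zero).
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def select_k_best(columns, labels, k):
    """Indices of the k feature columns with the largest F-statistic."""
    scores = [anova_f(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

# Hypothetical appliance labels and three hypothetical features.
labels = ["kettle", "kettle", "kettle", "fridge", "fridge", "fridge"]
features = [
    [1.0, 1.1, 0.9, 3.0, 3.1, 2.9],  # separates the classes well
    [2.0, 3.0, 1.0, 2.0, 3.0, 1.0],  # identical in both classes
    [1.0, 2.0, 3.0, 2.0, 3.0, 4.0],  # weak separation
]
selected = select_k_best(features, labels, k=2)
```

Features whose class means differ strongly relative to their within-class spread receive a high F-statistic, which is why this filter tends to help inter-class separability before the boosting model is trained.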

9.
BMC Med Inform Decis Mak ; 24(1): 223, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39118128

ABSTRACT

BACKGROUND: There is a growing demand for advanced methods to improve the understanding and prediction of illnesses. This study focuses on Sepsis, a critical response to infection, aiming to enhance early detection and mortality prediction for Sepsis-3 patients to improve hospital resource allocation. METHODS: In this study, we developed a Machine Learning (ML) framework to predict the 30-day mortality rate of ICU patients with Sepsis-3 using the MIMIC-III database. Advanced big data extraction tools like Snowflake were used to identify eligible patients. Decision tree models and Entropy Analyses helped refine feature selection, resulting in 30 relevant features curated with clinical experts. We employed the Light Gradient Boosting Machine (LightGBM) model for its efficiency and predictive power. RESULTS: The study comprised a cohort of 9118 Sepsis-3 patients. Our preprocessing techniques significantly improved both the AUC and accuracy metrics. The LightGBM model achieved an impressive AUC of 0.983 (95% CI: [0.980-0.990]), an accuracy of 0.966, and an F1-score of 0.910. Notably, LightGBM showed a substantial 6% improvement over our best baseline model and a 14% enhancement over the best existing literature. These advancements are attributed to (I) the inclusion of the novel and pivotal feature Hospital Length of Stay (HOSP_LOS), absent in previous studies, and (II) LightGBM's gradient boosting architecture, enabling robust predictions with high-dimensional data while maintaining computational efficiency, as demonstrated by its learning curve. CONCLUSIONS: Our preprocessing methodology reduced the number of relevant features and identified a crucial feature overlooked in previous studies. The proposed model demonstrated high predictive power and generalization capability, highlighting the potential of ML in ICU settings. This model can streamline ICU resource allocation and provide tailored interventions for Sepsis-3 patients.


Subject(s)
Intensive Care Units , Machine Learning , Sepsis , Humans , Sepsis/mortality , Hospital Mortality , Male , Female , Middle Aged , Aged , Prognosis
10.
Sci Total Environ ; 948: 174462, 2024 Oct 20.
Article in English | MEDLINE | ID: mdl-38992374

ABSTRACT

This comprehensive study unveils the vast global potential of microalgae as a sustainable bioenergy source, focusing on the utilization of marginal lands and employing advanced machine learning techniques to predict biomass productivity. By identifying approximately 7.37 million square kilometers of marginal lands suitable for microalgae cultivation, this research uncovers the extensive potential of these underutilized areas, particularly within equatorial and low-latitude regions, for microalgae bioenergy development. This approach mitigates the competition for food resources and conserves freshwater supplies. Utilizing cutting-edge machine learning algorithms based on robust datasets from global microalgae cultivation experiments spanning 1994 to 2017, this study integrates essential environmental variables to map out a detailed projection of potential yields across a variety of landscapes. The analysis further delineates the bioenergy and carbon sequestration potential across two effective cultivation methods: Photobioreactors (PBRs), and Open Ponds, with PBRs showcasing exceptional productivity, with a global average daily biomass productivity of 142.81 mg L⁻¹ d⁻¹, followed by Open Ponds at 122.57 mg L⁻¹ d⁻¹. Projections based on optimal PBR conditions suggest an annual yield of 99.54 gigatons of microalgae biomass. This yield can be transformed into 64.70 gigatons of biodiesel, equivalent to 58.68 gigatons of traditional diesel, while sequestering 182.16 gigatons of CO₂, equating to approximately 4.5 times the global CO₂ emissions projected for 2023. Notably, Australia leads in microalgae biomass production, with an annual output of 16.19 gigatons, followed by significant contributions from Kazakhstan, Sudan, Brazil, the United States, and China, showcasing the diverse global potential for microalgae bioenergy across varying ecological and geographical landscapes.
Through this rigorous investigation, the study emphasizes the strategic importance of microalgae cultivation in achieving sustainable energy solutions and mitigating climate change, while also acknowledging the scalability challenges and the necessity for significant economic and energy investments.


Subject(s)
Biofuels , Biomass , Carbon Sequestration , Machine Learning , Microalgae , Microalgae/growth & development
11.
JMIR Form Res ; 8: e54097, 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-38991090

ABSTRACT

BACKGROUND: Preoperative evaluation is important, and this study explored the application of machine learning methods for anesthetic risk classification and the evaluation of the contributions of various factors. To minimize the effects of confounding variables during model training, we used a homogenous group with similar physiological states and ages undergoing similar pelvic organ-related procedures not involving malignancies. OBJECTIVE: Data on women of reproductive age (age 20-50 years) who underwent gestational or gynecological surgery between January 1, 2017, and December 31, 2021, were obtained from the National Taiwan University Hospital Integrated Medical Database. METHODS: We first performed an exploratory analysis and selected key features. We then performed data preprocessing to acquire relevant features related to preoperative examination. To further enhance predictive performance, we used the log-likelihood ratio algorithm to generate comorbidity patterns. Finally, we input the processed features into the light gradient boosting machine (LightGBM) model for training and subsequent prediction. RESULTS: A total of 10,892 patients were included. Within this data set, 9893 patients were classified as having low anesthetic risk (American Society of Anesthesiologists physical status score of 1-2), and 999 patients were classified as having high anesthetic risk (American Society of Anesthesiologists physical status score of >2). The area under the receiver operating characteristic curve of the proposed model was 0.6831. CONCLUSIONS: By combining comorbidity information and clinical laboratory data, our methodology based on the LightGBM model provides more accurate predictions for anesthetic risk classification. TRIAL REGISTRATION: Research Ethics Committee of the National Taiwan University Hospital 202204010RINB; https://www.ntuh.gov.tw/RECO/Index.action.

12.
Food Chem ; 456: 140062, 2024 Oct 30.
Article in English | MEDLINE | ID: mdl-38876073

ABSTRACT

Differences in moisture and protein content impact both nutritional value and processing efficiency of corn kernels. Near-infrared (NIR) spectroscopy can be used to estimate kernel composition, but models trained on a few environments may underestimate error rates and bias. We assembled corn samples from diverse international environments and used NIR with chemometrics and partial least squares regression (PLSR) to determine moisture and protein. The potential of five feature selection methods to improve prediction accuracy was assessed by extracting sensitive wavelengths. Gradient boosting machines (GBMs), particularly CatBoost and LightGBM, were found to effectively select crucial wavelengths for moisture (1409, 1900, 1908, 1932, 1953, 2174 nm) and protein (887, 1212, 1705, 1891, 2097, 2456 nm). SHAP plots highlighted significant wavelength contributions to model prediction. These results illustrate GBMs' effectiveness in feature engineering for agricultural and food sector applications, including developing multi-country global calibration models for moisture and protein in corn kernels.


Subject(s)
Plant Proteins , Spectroscopy, Near-Infrared , Water , Zea mays , Zea mays/chemistry , Spectroscopy, Near-Infrared/methods , Plant Proteins/analysis , Plant Proteins/chemistry , Least-Squares Analysis , Water/chemistry , Water/analysis , Seeds/chemistry
13.
Am J Transl Res ; 16(5): 1740-1748, 2024.
Article in English | MEDLINE | ID: mdl-38883341

ABSTRACT

OBJECTIVE: To identify factors influencing recurrence after percutaneous transhepatic choledochoscopic lithotripsy (PTCSL) and to develop a predictive model. METHODS: We retrospectively analyzed clinical data from 354 patients with intrahepatic and extrahepatic bile duct stones treated with PTCSL at Qinzhou First People's Hospital between February 2018 and January 2020. Patients were followed for three years and categorized into non-recurrence and recurrence groups based on postoperative outcome. Univariate analysis identified possible predictors of stone recurrence. Data were split using the gradient boosting machine (GBM) algorithm, assigning 70% as the training set and 30% as the test set. The predictive performance of the GBM model was assessed using the receiver operating characteristic (ROC) curve and calibration curve, and compared with a logistic regression model. RESULTS: Six factors were identified as significant predictors of recurrence: age, diabetes, total bilirubin, biliary stricture, number of stones, and stone diameter. The GBM model, developed based on these factors, showed high predictive accuracy. The area under the ROC curve (AUC) was 0.763 (95% CI: 0.695-0.830) for the training set and 0.709 (95% CI: 0.596-0.822) for the test set. Optimal cutoff values were 0.286 and 0.264, with sensitivities of 62.30% and 66.70%, and specificities of 77.20% and 68.50%, respectively. Calibration curves indicated good agreement between predicted probabilities and observed recurrence rates in both sets. DeLong's test revealed no significant differences between the GBM and logistic regression models in predictive performance (training set: D = 0.003, P = 0.997 > 0.05; test set: D = 0.075, P = 0.940 > 0.05). CONCLUSION: Biliary stricture, stone diameter, diabetes, stone number, age, and total bilirubin significantly influence stone recurrence after PTCSL. The GBM model, based on these factors, demonstrates robust accuracy and discrimination. 
Both GBM and logistic regression models effectively predicted stone recurrence post-PTCSL.
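The abstract reports optimal cutoffs with paired sensitivity and specificity but does not state how the cutoffs were chosen. One common criterion is Youden's J (sensitivity + specificity - 1); the sketch below assumes that criterion, and the function name, labels and scores are hypothetical.

```python
def youden_cutoff(y_true, scores):
    """Return (cutoff, sensitivity, specificity) that maximizes Youden's J.

    y_true : 1 for recurrence, 0 for non-recurrence
    scores : predicted probability of recurrence
    A patient is predicted positive when score >= cutoff.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cutoff)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        # Keep the first (lowest) cutoff on ties.
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]

# Hypothetical predictions for eight patients.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.8]
cutoff, sensitivity, specificity = youden_cutoff(y_true, scores)
```

Other criteria, such as fixing a minimum sensitivity, would generally yield a different cutoff from the same ROC curve.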

14.
Ying Yong Sheng Tai Xue Bao ; 35(5): 1321-1330, 2024 May.
Article in Chinese | MEDLINE | ID: mdl-38886431

ABSTRACT

Rapid acquisition of soil moisture content (SMC) and soil organic matter (SOM) content data is crucial for the improvement and utilization of saline-alkali farmland soil. Based on field measurements of hyperspectral reflectance and soil properties of farmland soil in the Hetao Plain, we used a competitive adaptive reweighted sampling algorithm (CARS) to screen sensitive bands after transforming the original spectral reflectance (Ref) into a standard normal variable (SNV). Strategies Ⅰ, Ⅱ, and Ⅲ used Ref, Ref-SNV, and Ref-SNV plus soil covariates (SC) and a digital elevation model (DEM), respectively, as input variables. We constructed SMC and SOM estimation models based on random forest (RF) and light gradient boosting machine (LightGBM), and then verified and compared the accuracy of the models. The results showed that after CARS screening, the sensitive bands of SMC and SOM were compressed to below 3.3% of the entire band, which effectively optimized band selection and reduced redundant spectral information. Compared with the LightGBM model, the RF model had higher accuracy in SMC and SOM estimation, and the input variable strategy Ⅲ was better than Ⅱ and Ⅰ. The introduction of auxiliary variables effectively improved the estimation ability of the model. Based on comprehensive analysis, the coefficient of determination (Rp²), root mean square error (RMSE), and relative analysis error (RPD) of the SMC estimation model validation based on strategy Ⅲ-RF were 0.63, 3.16, and 2.01, respectively. The SOM estimation models based on strategy Ⅲ-RF had Rp², RMSE, and RPD of 0.93, 1.15, and 3.52, respectively. The strategy Ⅲ-RF model was an effective method for estimating SMC and SOM. Our results could provide a new method for the rapid estimation of soil moisture and organic matter content in saline-alkali farmland.
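The three validation statistics quoted above can be computed as in the following minimal sketch. It assumes the common definition of RPD as the standard deviation of the reference values divided by the RMSE, and it uses the sample standard deviation (conventions differ). The function name and the toy values are hypothetical.

```python
import math
import statistics

def regression_report(observed, predicted):
    """R², RMSE and RPD for a set of validation predictions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": rmse,
        # Assumed definition: SD of reference values / RMSE of prediction.
        "rpd": statistics.stdev(observed) / rmse,
    }

# Hypothetical measured and estimated values, for illustration only.
observed = [10, 12, 14, 16, 18]
predicted = [11, 12, 13, 16, 19]
report = regression_report(observed, predicted)
```

Because RPD scales the error by the spread of the reference data, it allows the SMC and SOM models to be compared even though their RMSE values are in different units.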


Subject(s)
Algorithms , Organic Chemicals , Soil , Water , Soil/chemistry , Organic Chemicals/analysis , Water/analysis , Crops, Agricultural/growth & development , Crops, Agricultural/chemistry , Alkalies/analysis , Alkalies/chemistry , China , Ecosystem
15.
Sci Rep ; 14(1): 12539, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38822049

ABSTRACT

Mine water inrush is a serious threat to mine safety production. It is very important to identify water inrush source types quickly to prevent and control water damage. In this study, the aqueous chemical components Na⁺ + K⁺, Ca²⁺, Mg²⁺, Cl⁻, SO₄²⁻, and HCO₃⁻ of different aquifers in Pingdingshan coalfield were selected as the characteristic values, and the Surface water, Quaternary pore water, Carboniferous limestone karst water, Permian sandstone water, and Cambrian limestone karst water were used as the labels. An intelligent water source discrimination model is proposed by combining data mining, classification models, and reinforcement learning. As outlier data in the samples may interfere with the model recognition ability, the data distribution range was analyzed using box plots, and 20 groups of abnormal samples were excluded. The processed water chemistry data were divided into 80% learning samples and 20% test samples, and the learning samples were fed into a light gradient boosting machine (LightGBM) for training. The tree-structured Parzen estimator (TPE) obtains the optimal values of the main parameters of LightGBM in a very short time. Substituting the hyperparameters back into the model yields a 13.9% improvement in the accuracy of the model, proving the effectiveness of the TPE algorithm. To further validate the performance of the model, TPE-LightGBM is compared and analyzed with a Random Search-Multi Layer Perceptron Machine (RS-MLP) and Genetic Algorithm-Extreme Gradient Boosting Tree (GA-SVM). The accuracy of TPE-LightGBM, RS-MLP, and GA-SVM is 0.931, 0.759, 0.724 in that order, and the generalization error RMSE is 0.415, 1.05, and 1.313 in that order. The results show that TPE-LightGBM is more advantageous in water source identification and is more resistant to overfitting.
By calculating and comparing the information gain of each variable, the contribution of Ca²⁺ is the highest, so it is necessary to pay attention to the change in Ca²⁺ concentration. TPE-LightGBM's high accuracy and generalization ability have a good prospect for the identification of sudden water source types.
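The idea of ranking variables by information gain can be illustrated with the Shannon-entropy version below, evaluated for a single threshold split. This is an assumption made for illustration: gradient boosting libraries such as LightGBM typically report a loss-based gain, and the abstract does not specify the exact calculation. The function names, concentrations and threshold values are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting one feature at `threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    # Weighted entropy of the two child nodes (empty children are skipped).
    weighted = sum(len(part) / n * entropy(part)
                   for part in (left, right) if part)
    return entropy(labels) - weighted

# Hypothetical Ca2+ concentrations and water-source labels.
ca = [20, 25, 30, 80, 85, 90]
source = ["surface", "surface", "surface", "karst", "karst", "karst"]
gain_good_split = information_gain(ca, source, threshold=50)
gain_poor_split = information_gain(ca, source, threshold=22)
```

A variable whose best split separates the water sources cleanly removes most of the label uncertainty and therefore ranks highest, which is the sense in which Ca²⁺ is described as the largest contributor.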

16.
Phys Med Biol ; 69(11), 2024 May 30.
Article in English | MEDLINE | ID: mdl-38749471

ABSTRACT

Accurate diagnosis and treatment assessment of liver fibrosis face significant challenges, including inherent limitations in current techniques like sampling errors and inter-observer variability. Addressing this, our study introduces a novel machine learning (ML) framework, which integrates light gradient boosting machine and multivariate imputation by chained equations to enhance liver status assessment using biomechanical markers. Building upon our previously established multiscale mechanical characteristics in fibrotic and treated livers, this framework employs Gaussian Bayesian optimization for post-imputation, significantly improving classification performance. Our findings indicate a marked increase in the precision of liver fibrosis diagnosis and provide a novel, quantitative approach for assessing fibrosis treatment. This innovative combination of multiscale biomechanical markers with advanced ML algorithms represents a transformative step in liver disease diagnostics and treatment evaluation, with potential implications for other areas in medical diagnostics.


Subject(s)
Liver Cirrhosis , Machine Learning , Biomechanical Phenomena , Humans , Mechanical Phenomena , Bayes Theorem , Animals , Biomarkers/metabolism
17.
Mol Divers ; 28(4): 2153-2161, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38554168

ABSTRACT

Cancer is the second leading cause of death globally, so the development of effective anticancer treatments is crucial in the field of medicine. Anticancer peptides (ACPs) have shown promising therapeutic potential in cancer treatment compared to traditional methods. However, the process of identifying ACPs through experimental means is often time-intensive and expensive. To overcome this issue, we employed a machine learning-based approach for the first time to develop an anticancer model using small molecules. Anticancer small molecules (ACSMs) are compounds that have been developed to target and inhibit cancer cells. In this study, we used 10,000 compounds to develop the machine learning models using five algorithms: Random Forest (RF), Light gradient boosting machine (LightGBM), K-nearest neighbors (KNN), Decision tree (DT), and Extreme Gradient Boosting (XGB). The developed models were evaluated using the test set, and the top three models were identified (RF, LightGBM, and XGB). Furthermore, to validate the predictive performance of our models, we performed external validation using FDA-approved anticancer compounds/drugs. Following this analysis, we found that our LightGBM model correctly predicted 9 compounds as active. However, RF and XGB exhibited some limitations by predicting 8 and 7 compounds as active out of 10, respectively. These results demonstrate that, when compared to RF and XGB, the LightGBM model showcases robust prediction capabilities, achieving a superior accuracy of 79% with an AUC of 0.88. These findings provide promising insights into the potential of our approach for predicting anticancer small molecules, highlighting the role of machine learning in advancing cancer treatment research.


Subject(s)
Algorithms , Antineoplastic Agents , Machine Learning , Antineoplastic Agents/pharmacology , Antineoplastic Agents/chemistry , Humans , Small Molecule Libraries/pharmacology , Small Molecule Libraries/chemistry , Neoplasms/drug therapy , Drug Discovery/methods
18.
JMIR Form Res ; 8: e47803, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38466973

ABSTRACT

BACKGROUND: Atrial fibrillation (AF) represents a hazardous cardiac arrhythmia that significantly elevates the risk of stroke and heart failure. Despite its severity, its diagnosis largely relies on the proficiency of health care professionals. At present, the real-time identification of paroxysmal AF is hindered by the lack of automated techniques. Consequently, a highly effective machine learning algorithm specifically designed for AF detection could offer substantial clinical benefits. We hypothesized that machine learning algorithms have the potential to identify and extract features of AF with a high degree of accuracy, given the intricate and distinctive patterns present in electrocardiogram (ECG) recordings of AF. OBJECTIVE: This study aims to develop a clinically valuable machine learning algorithm that can accurately detect AF and to compare AF detection performance across different leads. METHODS: We used 12-lead ECG recordings sourced from the 2020 PhysioNet Challenge data sets. The Welch method was used to extract power spectral features of the 12-lead ECGs within a frequency range of 0.083 to 24.92 Hz. Subsequently, various machine learning techniques were evaluated and optimized to classify sinus rhythm (SR) and AF based on these power spectral features. Furthermore, we compared the effects of different frequency subbands and different lead selections on machine learning performance. RESULTS: The light gradient boosting machine (LightGBM) was found to be the most effective in classifying AF and SR, achieving an average F1-score of 0.988 across all ECG leads. Among the frequency subbands, the 0.083 to 4.92 Hz range yielded the highest F1-score of 0.985. In interlead comparisons, aVR had the highest performance (F1=0.993), with minimal differences observed between leads. CONCLUSIONS: This study successfully used machine learning methodologies, particularly the LightGBM model, to differentiate SR and AF based on power spectral features derived from 12-lead ECGs. The performance, marked by an average F1-score of 0.988 and minimal interlead variation, underscores the potential of machine learning algorithms to bolster real-time AF detection. This advancement could significantly improve patient care in intensive care units as well as facilitate remote monitoring through wearable devices, ultimately enhancing clinical outcomes.
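
The METHODS step above (Welch power spectral features restricted to a frequency band) can be sketched roughly as follows. This is a minimal, standard-library illustration under stated assumptions, not the authors' pipeline: the segment length, 50% overlap, Hann window, synthetic 5 Hz test signal, and the helper names `welch_psd` and `band_features` are all hypothetical choices. In practice a library routine such as `scipy.signal.welch` would typically replace the naive DFT used here.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def welch_psd(signal, fs, nperseg=256, overlap=0.5):
    """One-sided PSD estimate: average of windowed periodograms over overlapping segments."""
    step = max(1, int(nperseg * (1 - overlap)))
    window = hann(nperseg)
    scale = fs * sum(w * w for w in window)  # density scaling
    n_bins = nperseg // 2 + 1
    psd = [0.0] * n_bins
    count = 0
    for start in range(0, len(signal) - nperseg + 1, step):
        seg = signal[start:start + nperseg]
        mean = sum(seg) / nperseg
        seg = [(x - mean) * w for x, w in zip(seg, window)]  # remove mean, apply window
        for k in range(n_bins):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / nperseg)
                      for n, x in enumerate(seg))
            psd[k] += abs(acc) ** 2 / scale
        count += 1
    if count == 0:
        raise ValueError("signal is shorter than one segment")
    psd = [p / count for p in psd]
    # Fold negative frequencies: double every bin except DC (and Nyquist if present).
    last = n_bins - 1 if nperseg % 2 == 0 else n_bins
    for k in range(1, last):
        psd[k] *= 2
    freqs = [k * fs / nperseg for k in range(n_bins)]
    return freqs, psd


def band_features(freqs, psd, f_lo, f_hi):
    """Keep only the PSD values whose frequency lies within [f_lo, f_hi]."""
    return [p for f, p in zip(freqs, psd) if f_lo <= f <= f_hi]


# Hypothetical single-lead signal: a 5 Hz sine sampled at 100 Hz for 4 seconds.
fs = 100.0
demo = [math.sin(2 * math.pi * 5.0 * n / fs) for n in range(400)]
freqs, psd = welch_psd(demo, fs, nperseg=100)
features = band_features(freqs, psd, 0.083, 24.92)
```

The resulting `features` list plays the role of one lead's feature vector; repeating it per lead and concatenating would give an input table for a classifier.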

19.
Micromachines (Basel) ; 15(2)2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38398939

ABSTRACT

Detecting inclusions in materials at small scales is of high importance to ensure the quality, structural integrity and performance efficiency of microelectromechanical machines and products. Ultrasound waves are commonly used as a non-destructive method to find inclusions or structural flaws in a material. Mathematical continuum models can be used to enable ultrasound techniques to provide quantitative information about the change in the mechanical properties due to the presence of inclusions. In this paper, a nonlocal size-dependent poroelasticity model integrated with machine learning is developed for the description of the mechanical behaviour of spherical inclusions under uniform radial compression. The scale effects on fluid pressure and radial displacement are captured using Eringen's theory of nonlocality. The conservation of mass law is utilised for both the solid matrix and fluid content of the poroelastic material to derive the storage equation. The governing differential equations are derived by decoupling the equilibrium equation and effective stress-strain relations in the spherical coordinate system. An accurate numerical solution is obtained using the Galerkin discretisation technique and a precise integration method. A Dormand-Prince solution is also developed for comparison purposes. A light gradient boosting machine learning model in conjunction with the nonlocal model is used to extract the pattern of changes in the mechanical response of the poroelastic inclusion. The optimised hyperparameters are calculated by a grid search cross validation. The modelling estimation power is enhanced by considering nonlocal effects and applying machine learning processes, facilitating the detection of ultrasmall inclusions within a poroelastic medium at micro/nanoscales.
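
The grid search with cross validation mentioned above can be illustrated with a small, self-contained sketch. To stay dependency-free it swaps in a toy one-dimensional k-nearest-neighbour regressor in place of the light gradient boosting model used in the paper; the data, the `n_neighbors` grid, and all function names are hypothetical.

```python
import itertools


def k_fold_indices(n, k):
    """Interleaved folds: fold i holds indices i, i + k, i + 2k, ..."""
    return [list(range(i, n, k)) for i in range(k)]


def knn_predict(train_x, train_y, x, n_neighbors):
    """Toy regressor: mean target of the n_neighbors closest training points."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:n_neighbors]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cv_mse(xs, ys, params, k=5):
    """Mean squared error averaged over k held-out folds."""
    fold_errors = []
    for fold in k_fold_indices(len(xs), k):
        held = set(fold)
        tr_x = [x for i, x in enumerate(xs) if i not in held]
        tr_y = [y for i, y in enumerate(ys) if i not in held]
        sq = [(knn_predict(tr_x, tr_y, xs[i], **params) - ys[i]) ** 2 for i in fold]
        fold_errors.append(sum(sq) / len(sq))
    return sum(fold_errors) / len(fold_errors)


def grid_search(xs, ys, grid, k=5):
    """Evaluate every combination in the grid and return (best_params, best_score)."""
    names = sorted(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_mse(xs, ys, params, k)
        if best is None or score < best[1]:
            best = (params, score)
    return best


# Hypothetical data: a smooth response sampled on a regular grid.
xs = [i / 10 for i in range(30)]
ys = [x * x for x in xs]
grid = {"n_neighbors": [1, 3, 15]}
best_params, best_score = grid_search(xs, ys, grid)
print(best_params, best_score)
```

The same loop structure applies to any model with tunable hyperparameters: only `knn_predict` and the grid would change.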

20.
Heliyon ; 10(4): e25406, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38370176

ABSTRACT

Objective: This study aims to develop a predictive model using artificial intelligence to estimate the ICU length of stay (LOS) for Congenital Heart Defects (CHD) patients after surgery, improving care planning and resource management. Design: We analyze clinical data from 2240 CHD surgery patients to create and validate the predictive model. Twenty AI models are developed and evaluated for accuracy and reliability. Setting: The study is conducted in a Brazilian hospital's Cardiovascular Surgery Department, focusing on transplants and cardiopulmonary surgeries. Participants: Retrospective analysis is conducted on data from 2240 consecutive CHD patients undergoing surgery. Interventions: Ninety-three pre- and intraoperative variables are used as ICU LOS predictors. Measurements and main results: Utilizing regression and clustering methodologies for ICU LOS estimation, the Light Gradient Boosting Machine, using regression, achieved a Mean Squared Error (MSE) of 15.4, 11.8, and 15.2 days for training, testing, and unseen data, respectively. Key predictors included metrics such as "Mechanical Ventilation Duration", "Weight on Surgery Date", and "Vasoactive-Inotropic Score". Meanwhile, the clustering model, CatBoost Classifier, attained an accuracy of 0.6917 and an AUC of 0.8559 with similar key predictors. Conclusions: Patients with higher ventilation times, vasoactive-inotropic scores, anoxia time, and cardiopulmonary bypass time, and with lower weight, height, BMI, age, hematocrit, and presurgical oxygen saturation, have longer ICU stays, aligning with existing literature.
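
For readers unfamiliar with the regression metric reported above, here is a minimal sketch of how MSE is computed for a length-of-stay target. The observed and predicted values are hypothetical and unrelated to the study's data. As a general property, MSE carries the squared unit of the target, so its square root (RMSE) is the quantity expressed in days.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between observed and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the MSE, in the same unit as the target."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


# Hypothetical ICU stays in days (not from the study).
observed_days = [3.0, 5.0, 10.0, 2.0]
predicted_days = [4.0, 5.0, 7.0, 4.0]

mse = mean_squared_error(observed_days, predicted_days)
rmse = root_mean_squared_error(observed_days, predicted_days)
print(mse, rmse)
```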
