Results 1 - 20 of 873
1.
Food Chem ; 463(Pt 1): 141053, 2024 Aug 31.
Article in English | MEDLINE | ID: mdl-39241414

ABSTRACT

Near-infrared (NIR) spectroscopy has been widely used to predict multiple constituents of corn in agriculture. However, directly extracting constituent information from NIR spectra is challenging because the absorption bands are broad, overlapping, and non-specific. To address these problems and extract implicit features from raw NIR spectra to improve the performance of quantitative models, this paper proposed a one-dimensional shallow convolutional neural network (CNN) model based on an eXtreme Gradient Boosting (XGBoost) feature extraction method. The leaf node feature information in XGBoost was encoded and reconstructed to obtain the implicit features of the raw NIR spectra. A two-parametric Swish (TSwish or TS) activation function was proposed to improve the performance of the CNN, and the elastic net (EN) was applied to avoid overfitting of the CNN model. The performance of the developed XGBoost-CNN-TS-EN model was evaluated using two public NIR spectroscopy datasets of corn and soil. The determination coefficients (R²) obtained on the test set for moisture, oil, protein, and starch of corn were 0.993, 0.991, 0.998, and 0.992, respectively, and that for soil organic matter was 0.992. The XGBoost-CNN-TS-EN model exhibits superior stability, good prediction accuracy, and generalization ability, demonstrating its great potential for quantitative analysis of multiple constituents in spectroscopic applications.

2.
PeerJ ; 12: e17991, 2024.
Article in English | MEDLINE | ID: mdl-39253604

ABSTRACT

Most computational methods for predicting driver mutations have been trained using positive samples, while negative samples are typically derived from statistical methods or putative samples. The representativeness of these negative samples in capturing the diversity of passenger mutations remains to be determined. To tackle these issues, we curated a balanced dataset comprising driver mutations sourced from the COSMIC database and high-quality passenger mutations obtained from the Cancer Passenger Mutation database. Subsequently, we encoded the distinctive features of these mutations. Utilizing feature correlation analysis, we developed a cancer driver missense mutation predictor called CDMPred employing feature selection through the ensemble learning technique XGBoost. The proposed CDMPred method, utilizing the top 10 features and XGBoost, achieved an area under the receiver operating characteristic curve (AUC) value of 0.83 and 0.80 on the training and independent test sets, respectively. Furthermore, CDMPred demonstrated superior performance compared to existing state-of-the-art methods for cancer-specific and general diseases, as measured by AUC and area under the precision-recall curve. Including high-quality passenger mutations in the training data proves advantageous for CDMPred's prediction performance. We anticipate that CDMPred will be a valuable tool for predicting cancer driver mutations, furthering our understanding of personalized therapy.


Subjects
Missense Mutation, Neoplasms, Humans, Neoplasms/genetics, Computational Biology/methods, Genetic Databases, ROC Curve, Machine Learning
3.
Open Res Eur ; 4: 29, 2024.
Article in English | MEDLINE | ID: mdl-39219787

ABSTRACT

Background: Identifying stars belonging to different classes is vital in order to build up statistical samples of different phases and pathways of stellar evolution. In the era of surveys covering billions of stars, an automated method of identifying these classes becomes necessary. Methods: Many classes of stars are identified based on their emitted spectra. In this paper, we use a combination of the multi-class multi-label Machine Learning (ML) method XGBoost and the PySSED spectral-energy-distribution fitting algorithm to classify stars into nine different classes, based on their photometric data. The classifier is trained on subsets of the SIMBAD database. Particular challenges are the very high sparsity (large fraction of missing values) of the underlying data as well as the high class imbalance. We discuss the different variables available, such as photometric measurements on the one hand, and indirect predictors such as Galactic position on the other hand. Results: We show the difference in performance when excluding certain variables, and discuss in which contexts which of the variables should be used. Finally, we show that increasing the number of samples of a particular type of star significantly increases the performance of the model for that particular type, while having little to no impact on other types. The accuracy of the main classifier is ∼0.7 with a macro F1 score of 0.61. Conclusions: While the current accuracy of the classifier is not high enough to be reliably used in stellar classification, this work is an initial proof of feasibility for using ML to classify stars based on photometry.


Astronomy is at the forefront of the 'Big Data' regime, with telescopes collecting increasingly large volumes of data. The tools astronomers use to analyse and draw conclusions from these data need to be able to keep up, with machine learning providing many of the solutions. Being able to classify different astronomical objects by type helps to disentangle the astrophysics making them unique, offering new insights into how the Universe works. Here, we present how machine learning can be used to classify different kinds of stars, in order to augment large databases of the sky. This will allow astronomers to more easily extract the data they need to perform their scientific analyses.
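The abstract above reports both an accuracy (about 0.7) and a macro F1 score (0.61) under high class imbalance. As a minimal sketch of why the two can diverge, the following computes both on hypothetical toy labels. It assumes a single-label setting, which is simpler than the multi-label setup the abstract describes; the function names and data are illustrative only and do not come from the paper.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating `label` as positive and all others as negative."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: a common class "A" and a rare class "B".
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 8 + ["A", "B"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Because macro averaging weights each class equally, one missed sample of the rare class lowers macro F1 more than it lowers accuracy.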

4.
Sci Rep ; 14(1): 20716, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39237729

ABSTRACT

The evaluation of creep rupture life is complex due to its variable formation mechanism. In this paper, machine learning algorithms are applied to explore the creep rupture life span as a function of 27 physical properties to address this issue. By training several classical machine learning models and comparing their prediction performance, XGBoost is finally selected as the predictive model for creep rupture life. Moreover, we introduce an interpretable method, Shapley additive explanations (SHAP), to explain the creep rupture life predicted by the XGBoost model. The SHAP values are then calculated, and the feature importance of the creep rupture life yielded by the XGBoost model is discussed. Finally, the creep fracture life is optimized by using the chaotic sparrow optimization algorithm. We then show that our proposed method can accurately predict and optimize creep properties in a cheaper and faster way than other approaches in the experiments. The proposed method can also be used to optimize the material design across various engineering domains.
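The abstract above uses Shapley additive explanations (SHAP) to attribute a prediction to individual features. The sketch below shows the underlying idea only: exact Shapley values computed by brute-force enumeration of feature orderings for a hypothetical three-feature value function. It is not the SHAP library and not the authors' model; `toy_value` and its numbers are assumptions made for illustration, and enumeration of this kind is feasible only for a handful of features.

```python
from itertools import permutations


def shapley_values(value_fn, n_features):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    phi = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        for i in order:
            before = value_fn(frozenset(present))
            present.add(i)
            after = value_fn(frozenset(present))
            phi[i] += after - before
    return [p / len(orderings) for p in phi]


def toy_value(coalition):
    """Hypothetical model output when only the features in `coalition` are known."""
    v = 0.0
    if 0 in coalition:
        v += 10.0
    if 1 in coalition:
        v += 4.0
    if 0 in coalition and 2 in coalition:
        v += 6.0  # interaction term, shared between features 0 and 2
    return v


phi = shapley_values(toy_value, 3)
print(phi)
```

The attributions sum to the difference between the value with all features and the value with none, which is the additivity property that makes SHAP-style explanations easy to read.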

5.
Sci Rep ; 14(1): 20819, 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39242695

ABSTRACT

RNA modifications play an important role in actively controlling recently created formation in cellular regulation mechanisms, which link them to gene expression and protein. The RNA modifications have numerous alterations, presenting broad glimpses of RNA's operations and character. The modification process by the TET enzyme oxidation is the crucial change associated with cytosine hydroxymethylation. The effect of CR is an alteration in specific biochemical ways of the organism, such as gene expression and epigenetic alterations. Traditional laboratory systems that identify 5-hydroxymethylcytosine (5hmC) samples are expensive and time-consuming compared to other methods. To address this challenge, this paper proposes XGB5hmC, a machine learning method based on a robust gradient boosting algorithm (XGBoost), with different residue-based formulation methods to identify 5hmC samples. Their results were amalgamated, and six different frequency residue-based encoding features were fused to form a hybrid vector in order to enhance model discrimination capabilities. In addition, the proposed model incorporates SHAP (Shapley Additive Explanations) based feature selection to demonstrate model interpretability by highlighting the most contributory features. Among the applied machine learning algorithms, the XGBoost ensemble model using the tenfold cross-validation test achieved better results than existing state-of-the-art models. Our model reported an accuracy of 89.97%, sensitivity of 87.78%, specificity of 94.45%, F1-score of 0.8934, and MCC of 0.8764. This study highlights the potential to provide valuable insights for enhancing medical assessment and treatment protocols, representing a significant advancement in RNA modification analysis.


Subjects
5-Methylcytosine, Algorithms, Machine Learning, 5-Methylcytosine/analogs & derivatives, 5-Methylcytosine/metabolism, Humans, Cytosine/analogs & derivatives, Cytosine/metabolism
6.
Int J Biometeorol ; 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39249522

ABSTRACT

The prediction of evapotranspiration (ET0) is crucial for agricultural ecosystems, irrigation management, and environmental climate regulation. Traditional methods for predicting ET0 require a variety of meteorological parameters. However, obtaining data for these multiple parameters can be challenging, leading to inaccuracies or inability to predict ET0 using traditional methods. This affects decision-making in critical applications such as agricultural irrigation scheduling and water management, consequently impacting the development of agricultural ecosystems. This issue is particularly pronounced in economically underdeveloped regions. Therefore, this paper proposes a machine learning-based evapotranspiration estimation method adapted to evapotranspiration conditions. Compared to traditional methods, our approach relies less on the variety of meteorological parameters and yields higher prediction accuracy. Additionally, we introduce a 'region of evapotranspiration adaptability' division method, which takes into account geographical differences in ET0 prediction. This effectively mitigates the negative impact of anomalies or missing data from individual meteorological stations, making our method more suitable for practical agricultural irrigation and ecosystem water resource management. We validated our approach using meteorological data from 25 stations in Heilongjiang, China. Our results indicate that non-adjacent geographical areas, despite different climatic conditions, can have similar impacts on ET0 prediction. In summary, our method facilitates accurate ET0 prediction, offering new insights for the development of agricultural irrigation and ecosystems, and further contributes to agricultural food supply.

7.
Sci Rep ; 14(1): 20366, 2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39223239

ABSTRACT

Vitrinite reflectance (VR) is a critical measure of source rock maturity in geochemistry. Although VR is a widely accepted measure of maturity, its accurate measurement often proves challenging and costly. Rock-Eval pyrolysis offers the advantages of being cost-effective, fast, and providing accurate data. Previous studies have employed empirical equations and traditional machine learning methods using T-max data for VR prediction, but these approaches often yielded subpar results. Therefore, the quest to develop a precise method for predicting vitrinite reflectance based on Rock-Eval data becomes particularly valuable. This study presents a novel approach to predicting VR using advanced machine learning models, namely ExtraTree and XGBoost, along with new ways to prepare the data, such as winsorization for outlier treatment and principal component analysis (PCA) for dimensionality reduction. The depth and three Rock-Eval parameters (T-max, S1/TOC, and HI) were used as input variables. Three model sets were examined: Set 1, which involved both winsorization and PCA; Set 2, which only included winsorization; and Set 3, which did not include either. The results indicate that the ExtraTree model in Set 1 demonstrated the highest level of predictive accuracy, whereas Set 3 exhibited the lowest level of accuracy, confirming the methodology's effectiveness. The ExtraTree model obtained an overall R² score of 0.997, surpassing traditional methods by a significant margin. This approach improves the accuracy and dependability of VR predictions, showing significant advancements compared to conventional empirical equations and traditional machine learning methods.
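The abstract above names winsorization as its outlier treatment. A minimal sketch of the general technique follows: values beyond chosen lower and upper percentiles are clipped to those percentiles instead of being dropped. The percentile cut-offs, the interpolation rule, and the sample T-max-like numbers are assumptions for illustration; the abstract does not state which settings the authors used.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def winsorize(values, lower_q=5.0, upper_q=95.0):
    """Clip each value into the [lower_q, upper_q] percentile range of the sample."""
    s = sorted(values)
    lo, hi = percentile(s, lower_q), percentile(s, upper_q)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical measurements with one extreme outlier (900).
raw = [430, 435, 440, 445, 450, 455, 460, 465, 470, 900]
clipped = winsorize(raw, lower_q=10.0, upper_q=90.0)
print(clipped)
```

Unlike deleting outliers, this keeps the sample size unchanged, which matters when each sample is costly to measure.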

8.
Sci Rep ; 14(1): 20490, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39227405

ABSTRACT

MicroRNAs (miRNAs) are a key class of endogenous non-coding RNAs that play a pivotal role in regulating diseases. Accurately predicting the intricate relationships between miRNAs and diseases carries profound implications for disease diagnosis, treatment, and prevention. However, these prediction tasks are highly challenging due to the complexity of the underlying relationships. While numerous effective prediction models exist for validating these associations, they often encounter information distortion due to limitations in efficiently retaining information during the encoding-decoding process. Inspired by the Multi-layer Heterogeneous Graph Transformer and the XGBoost machine learning classifier, this study introduces a novel computational approach for miRNA-disease association prediction (MHXGMDA), based on a structure that pairs a multi-layer heterogeneous encoder with a machine learning decoder. First, we employ the multi-view similarity matrices as the input coding for MHXGMDA. Subsequently, we utilize the multi-layer heterogeneous encoder to capture the embeddings of miRNAs and diseases, aiming to capture the maximum amount of relevant features. Finally, the information from all layers is concatenated to serve as input to the machine learning classifier, ensuring maximal preservation of encoding details. We conducted a comprehensive comparison of seven different classifier models and ultimately selected the XGBoost algorithm as the decoder. This algorithm leverages miRNA embedding features and disease embedding features to decode and predict the association scores between miRNAs and diseases. We applied MHXGMDA to predict human miRNA-disease associations on two benchmark datasets. Experimental findings demonstrate that our approach surpasses several leading methods in terms of both the area under the receiver operating characteristic curve and the area under the precision-recall curve.


Subjects
Algorithms, Computational Biology, Machine Learning, MicroRNAs, MicroRNAs/genetics, Humans, Computational Biology/methods, Genetic Predisposition to Disease
9.
J Environ Manage ; 369: 122330, 2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39226808

ABSTRACT

Extreme meteorological events and rapid urbanization have led to serious urban flooding problems. Characterizing spatial variations in flooding susceptibility and elucidating its driving factors are essential for preventing damage from urban pluvial flooding. However, conventional methods, limited by spatial heterogeneity and the intricate mechanisms of urban flooding, have frequently lacked precision when assessing flooding susceptibility in dense urban areas. Therefore, this study proposed a novel framework for an integrated assessment of urban flood susceptibility, based on a comprehensive cascade modeling chain consisting of XGBoost, SHapley Additive exPlanations (SHAP), and Partial Dependence Plots (PDP) in combination with K-means. It aimed to recognize the specific influence of urban morphology and the spatial patterns of flooding risk agglomeration under different rainfall scenarios in high-density urban areas. The XGBoost model demonstrated enhanced accuracy and robustness relative to three other benchmark models: RF, SVR, and BPDNN. This superiority was validated during both training and independent testing in Shenzhen. The results indicated that urban 3D morphology characteristics were the dominant factors for waterlogging magnitude, accounting for 46.02% of the relative contribution. Through PDP analysis, multi-staged trends highlighted critical thresholds and interactions between significant indicators such as building congestion degree (BCD) and floor area ratio (FAR). Specifically, optimal intervals such as BCD between 0 and 0.075 coupled with FAR values between 0.5 and 1 have the potential to substantially mitigate flooding risks. These findings emphasize the need for strategic building configuration within urban planning frameworks. In terms of the spatial-temporal assessment, a significant aggregation effect of high-risk areas that are prone to prolonged or high-intensity rainfall scenarios emerged in the old urban districts. The approach in the present study provides quantitative insights into waterlogging adaptation strategies for sustainable urban planning and design.

10.
Heliyon ; 10(16): e35871, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39220969

ABSTRACT

Slope instability can cause catastrophic consequences, so slope stability analysis has been a key topic in geotechnical engineering. Traditional analysis methods are difficult to operate and time-consuming; for this reason, many researchers have carried out slope stability analysis based on AI. However, current studies only judged the importance of each factor and did not specifically quantify the correlation between factors and slope stability. For this purpose, this paper carried out a sensitivity analysis based on XGBoost and SHAP. The sensitivity analysis results of SHAP were also validated using GeoStudio software. The selected influence factors included slope height (H), slope angle (β), unit weight (γ), cohesion (c), angle of internal friction (φ), and pore water pressure coefficient (r_u). The results showed that c and γ were the most and least important influential parameters, respectively. GeoStudio simulation results showed a negative correlation between γ, β, H, r_u and slope stability, and a positive correlation between c, φ and slope stability. However, for real data, SHAP misjudged the correlation between γ and slope stability, because current AI lacks common-sense knowledge, leaving SHAP unable to effectively explain the real mechanism of slope instability. This paper therefore addressed the challenge using an a priori data-driven approach. The method provided a more reliable and accurate interpretation of the results than real samples did, especially with limited or low-quality data. In addition, the results of this method showed that the critical values of c, φ, β, H, and r_u in slope destabilization are 18 kPa, 28°, 32°, 30 m, and 0.28, respectively. These results were closer to the GeoStudio simulations than those from real samples.

11.
Accid Anal Prev ; 207: 107746, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39153425

ABSTRACT

Road traffic crashes are common occurrences that create substantial losses and hazards to society. A complex interaction of components, including drivers, vehicles, roads, and the environment, can influence the causes of these crashes. Because of this complexity, crash identification and prediction research over large-scale areas faces several obstacles, including high costs and difficult data collection. Given that the horizontal and vertical geometric alignment of roadways is crucial in highway traffic crashes, this study offers a method for large-scale road network crash risk identification based on open-source data. The methodology includes a comprehensive technique for feature extraction from horizontal curves (H-curves) and vertical curves (V-curves) and a novel way of combining the XGBoost model's attributes with the Harris Hawks Optimization (HHO) algorithm, referred to as the HHO-XGBoost model. Using this model on the road geometry-crash risk dataset developed specifically for this study, the HHO approach adaptively identifies the optimal set of XGBoost hyperparameters and yields favorable outcomes. This study creates a three-dimensional road geometry database that may be used for various road infrastructure management, operation, and safety tasks, in addition to completing a tiered "region-road-segment" risk analysis for large-scale road networks. It also offers guidance on using swarm intelligence algorithms in ensemble learning models.


Subjects
Traffic Accidents, Algorithms, Traffic Accidents/prevention & control, Traffic Accidents/statistics & numerical data, Humans, Risk Assessment/methods, Environment Design, Factual Databases
12.
Int J Legal Med ; 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39103637

ABSTRACT

Necrophagous flies, particularly blowflies, serve as vital indicators in forensic entomology and ecological studies, contributing to minimum postmortem interval estimations and environmental monitoring. The study investigates variations in the predominant cuticular hydrocarbons (CHCs) viz. n-C25, n-C27, n-C28, and n-C29 of empty puparia of Calliphora vicina Robineau-Desvoidy, 1830, (Diptera: Calliphoridae) across diverse environmental conditions, including burial, above-ground and indoor settings, over 90 days. Notable trends include a significant decrease in n-C25 concentrations in buried and above-ground conditions over time, while n-C27 concentrations decline in buried and above-ground conditions but remain stable indoors. Burial conditions show significant declines in n-C27 and n-C29 concentrations over time, indicating environmental influences. Conversely, above-ground conditions exhibit uniform declines in all hydrocarbons. Indoor conditions remain relatively stable, with weak correlations between weathering time and CHC concentrations. Additionally, machine learning techniques, specifically Extreme Gradient Boosting (XGBoost), are employed for age estimation of empty puparia, yielding accurate predictions across different outdoor and indoor conditions. These findings highlight the subtle responses of CHC profiles to environmental stimuli, underscoring the importance of considering environmental factors in forensic entomology and ecological research. The study advances the understanding of insect remnant degradation processes and their forensic implications. Furthermore, integrating machine learning with entomological expertise offers standardized methodologies for age determination, enhancing the reliability of entomological evidence in legal contexts and paving the way for future research and development.

13.
Sci Rep ; 14(1): 18452, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39117728

ABSTRACT

As artificial intelligence (AI) becomes widespread, there is increasing attention on investigating bias in machine learning (ML) models. Previous research concentrated on classification problems, with little emphasis on regression models. This paper presents an easy-to-apply and effective methodology for mitigating bias in bagging and boosting regression models, that is also applicable to any model trained through minimizing a differentiable loss function. Our methodology measures bias rigorously and extends the ML model's loss function with a regularization term to penalize high correlations between model errors and protected attributes. We applied our approach to three popular tree-based ensemble models: a random forest model (RF), a gradient-boosted model (GBT), and an extreme gradient boosting model (XGBoost). We implemented our methodology on a case study for predicting road-level traffic volume, where RF, GBT, and XGBoost models were shown to have high accuracy. Despite high accuracy, the ML models were shown to perform poorly on roads in minority-populated areas. Our bias mitigation approach reduced minority-related bias by over 50%.
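The abstract above describes extending a model's loss with a regularization term that penalizes correlation between model errors and protected attributes. The sketch below shows one possible form of such a loss: mean squared error plus a weight `lam` times the squared Pearson correlation. The exact penalty, the weight, and the toy traffic-like data are assumptions; the abstract does not give the formula, and this sketch only evaluates the loss and does not train a model.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def penalized_loss(y_true, y_pred, protected, lam=1.0):
    """MSE plus lam * corr(errors, protected)^2 (assumed penalty form)."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    return mse + lam * pearson(errors, protected) ** 2


# Hypothetical data: two prediction sets with identical MSE.
y_true = [10.0, 10.0, 10.0, 10.0]
protected = [0, 0, 1, 1]          # e.g. 1 = minority-populated area (assumed encoding)
pred_unbiased = [11.0, 9.0, 11.0, 9.0]   # errors unrelated to the protected attribute
pred_biased = [9.0, 9.0, 11.0, 11.0]     # errors track the protected attribute
print(penalized_loss(y_true, pred_unbiased, protected))
print(penalized_loss(y_true, pred_biased, protected))
```

Both prediction sets have the same plain MSE, so only the penalty term separates them, which is the behaviour such a regularizer is meant to produce.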

14.
Healthcare (Basel) ; 12(15)2024 Jul 28.
Article in English | MEDLINE | ID: mdl-39120200

ABSTRACT

The primary objective of this study was to develop a risk-based readmission prediction model using the EMR data available at discharge. This model was then validated with the LACE plus score. The study cohort consisted of about 310,000 hospital admissions of patients with cardiovascular and cerebrovascular conditions. The EMR data of the patients consisted of lab results, vitals, medications, comorbidities, and admit/discharge settings. These data served as the input to an XGBoost model v1.7.6, which was then used to predict the number of days until the next readmission. Our model achieved remarkable results, with a precision score of 0.74 (±0.03), a recall score of 0.75 (±0.02), and an overall accuracy of approximately 82% (±5%). Notably, the model demonstrated a high accuracy rate of 78.39% in identifying the patients readmitted within 30 days and 80.81% accuracy for those with readmissions exceeding six months. The model was able to outperform the LACE plus score; of the people who were readmitted within 30 days, only 47.70 percent had a LACE plus score greater than 70, and, for people with greater than 6 months, only 10.09 percent had a LACE plus score less than 30. Furthermore, our analysis revealed that the patients with a higher comorbidity burden and lower-than-normal hemoglobin levels were associated with increased readmission rates. This study opens new doors to the world of differential patient care, helping both clinical decision makers and healthcare providers make more informed and effective decisions. This model is comparatively more robust and can potentially substitute the LACE plus score in cardiovascular and cerebrovascular settings for predicting the readmission risk.

15.
J Asthma Allergy ; 17: 783-789, 2024.
Article in English | MEDLINE | ID: mdl-39157425

ABSTRACT

Asthma is a chronic inflammatory airway disease with significant burden; exacerbations can severely affect quality of life and healthcare costs. Advances in big data analysis and artificial intelligence have made it easier to predict future exacerbations more accurately. This study used an integrated dataset of Korean National Health Insurance, meteorological, air pollution, and viral data from national public databases to develop a model to predict asthma exacerbations on a daily basis in South Korea. We merged these sources and applied random forest, AdaBoost, XGBoost, and LightGBM machine learning models to compare their performances at predicting future exacerbations. Of the models, XGBoost (AUROC of 0.68 and accuracy of 0.96) and LightGBM (AUROC of 0.67 and accuracy of 0.96) were the most promising. Common important variables were the number of visits and exacerbations per year, and medical resource utilization, including the prescription of asthma medications. Comorbid diabetes, hypertension, gastroesophageal reflux, arthritis, metabolic syndrome, osteoporosis, and ischemic heart disease were also associated with elevated exacerbation risk. The models examined in this study highlight the importance of previous exacerbations, use of medical resources, and comorbidities in the prediction of future exacerbations in patients with asthma.
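The abstract above reports a high accuracy (0.96) alongside a modest AUROC (0.67 to 0.68). One way such a gap can arise, sketched here on hypothetical data, is a rare outcome: a model that never predicts the event can still be right most of the time. The abstract does not report the exacerbation rate, so the 4% prevalence and the constant-score "model" below are assumptions used purely to illustrate the two metrics.

```python
def auroc(labels, scores):
    """Probability that a random positive scores above a random negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_at(labels, scores, threshold=0.5):
    """Accuracy after turning scores into 0/1 predictions at `threshold`."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical imbalanced sample: 4 events among 100 patient-days.
labels = [0] * 96 + [1] * 4
constant_scores = [0.1] * 100   # a "model" that gives everyone the same low risk
print(accuracy_at(labels, constant_scores), auroc(labels, constant_scores))
```

AUROC depends only on how well the scores rank events above non-events, so it is less flattered by imbalance than accuracy is.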

16.
BMC Biol ; 22(1): 172, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39148051

ABSTRACT

BACKGROUND: Plenty of clinical and biomedical research has unequivocally highlighted the tremendous significance of the human microbiome in relation to human health. Identifying microbes associated with diseases is crucial for early disease diagnosis and advancing precision medicine. RESULTS: Considering that the information about changes in microbial quantities under fine-grained disease states helps to enhance a comprehensive understanding of the overall data distribution, this study introduces MSignVGAE, a framework for predicting microbe-disease sign associations using signed message propagation. MSignVGAE employs a graph variational autoencoder to model noisy signed association data and extends the multi-scale concept to enhance representation capabilities. A novel strategy for propagating signed message in signed networks addresses heterogeneity and consistency among nodes connected by signed edges. Additionally, we utilize the idea of denoising autoencoder to handle the noise in similarity feature information, which helps overcome biases in the fused similarity data. MSignVGAE represents microbe-disease associations as a heterogeneous graph using similarity information as node features. The multi-class classifier XGBoost is utilized to predict sign associations between diseases and microbes. CONCLUSIONS: MSignVGAE achieves AUROC and AUPR values of 0.9742 and 0.9601, respectively. Case studies on three diseases demonstrate that MSignVGAE can effectively capture a comprehensive distribution of associations by leveraging signed information.


Subjects
Microbiota, Humans, Computational Biology/methods, Algorithms, Disease
17.
Intern Emerg Med ; 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39141286

ABSTRACT

Sepsis triggers a harmful immune response due to infection, causing high mortality. Predicting sepsis outcomes early is vital. Despite machine learning's (ML) use in medical research, local validation within the Medical Information Mart for Intensive Care IV (MIMIC-IV) database is lacking. We aimed to devise a prognostic model, leveraging MIMIC-IV data, to predict sepsis mortality and validate it in a Chinese teaching hospital. MIMIC-IV provided patient data, split into training and internal validation sets. Four ML models, logistic regression (LR), support vector machine (SVM), deep neural networks (DNN), and extreme gradient boosting (XGBoost), were employed. Shapley additive explanations offered early and interpretable mortality predictions. Area under the ROC curve (AUROC) gauged predictive performance. Results were cross-verified in a Chinese teaching hospital. The study included 27,134 sepsis patients from MIMIC-IV and 487 from China. After comparison, 52 clinical indicators were selected for ML model development. All models exhibited excellent discriminative ability. XGBoost surpassed others, with AUROC of 0.873 internally and 0.844 externally. XGBoost outperformed other ML models (LR: 0.829; SVM: 0.830; DNN: 0.837) and clinical scores (Simplified Acute Physiology Score II: 0.728; Sequential Organ Failure Assessment: 0.728; Oxford Acute Severity of Illness Score: 0.738; Glasgow Coma Scale: 0.691). XGBoost's hospital mortality prediction achieved AUROC 0.873, sensitivity 0.818, accuracy 0.777, specificity 0.768, and F1 score 0.551. We crafted an interpretable model for sepsis death risk prediction. ML algorithms surpassed traditional scores for sepsis mortality forecast. Validation in a Chinese teaching hospital echoed these findings.
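The abstract above reports sensitivity, specificity, accuracy, and F1 together, with F1 notably lower than the others. The following sketch shows how these quantities are derived from the four cells of a confusion matrix. The counts are hypothetical, chosen only to show that F1 can be modest when the positive class is a minority even if sensitivity is high; they are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts: 100 deaths among 1000 patients.
metrics = classification_metrics(tp=80, fp=200, tn=700, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

F1 combines precision and sensitivity, so many false positives relative to true positives pull it down even when accuracy and specificity look reasonable.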

18.
Sci Rep ; 14(1): 18651, 2024 Aug 12.
Article in English | MEDLINE | ID: mdl-39134571

ABSTRACT

As cities continue to grow globally, characterizing the built environment is essential to understanding human populations, projecting energy usage, monitoring urban heat island impacts, preventing environmental degradation, and planning for urban development. Buildings are a key component of the built environment and there is currently a lack of data on building height at the global level. Current methodologies for developing building height models that utilize remote sensing are limited in scale due to the high cost of data acquisition. Other approaches that leverage 2D features are restricted based on the volume of ancillary data necessary to infer height. Here, we find, through a series of experiments covering 74.55 million buildings from the United States, France, and Germany, it is possible, with 95% accuracy, to infer building height within 3 m of the true height using footprint morphology data. Our results show that leveraging individual building footprints can lead to accurate building height predictions while not requiring ancillary data, thus making this method applicable wherever building footprints are available. The finding that it is possible to infer building height from footprint data alone provides researchers a new method to leverage in relation to various applications.
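The abstract above infers building height from footprint morphology and reports the share of predictions within 3 m of the true height. As a rough sketch of both ideas, the code below derives three simple shape descriptors from a footprint polygon and computes a within-tolerance accuracy. The choice of descriptors (area, perimeter, compactness) and all sample values are assumptions; the abstract does not list the features actually used.

```python
import math


def footprint_features(vertices):
    """Area (shoelace formula), perimeter, and compactness of a simple polygon."""
    n = len(vertices)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2.0
    compactness = 4.0 * math.pi * area / perimeter ** 2   # 1.0 for a circle
    return {"area": area, "perimeter": perimeter, "compactness": compactness}


def within_tolerance_accuracy(true_heights, predicted_heights, tol=3.0):
    """Fraction of predictions whose absolute error is at most `tol` metres."""
    hits = sum(abs(t - p) <= tol for t, p in zip(true_heights, predicted_heights))
    return hits / len(true_heights)


# Hypothetical 10 m x 20 m rectangular footprint and toy height predictions.
features = footprint_features([(0, 0), (10, 0), (10, 20), (0, 20)])
share = within_tolerance_accuracy([6.0, 9.0, 12.0, 30.0], [7.0, 13.0, 11.0, 28.0])
print(features, share)
```

Descriptors like these need only the footprint outline, which matches the abstract's point that no ancillary data are required.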

19.
Int J Gen Med ; 17: 3443-3452, 2024.
Article in English | MEDLINE | ID: mdl-39139709

ABSTRACT

Objective: This study aims to investigate the correlation of LMR and RC with in-stent restenosis (ISR), and their predictive utility for it, in patients with acute coronary syndrome (ACS) following percutaneous coronary intervention (PCI). Methods: We collected medical records of 668 patients who underwent PCI treatment from January 2022 to December 2022. Based on follow-up results (ISR defined as luminal narrowing ≥ 50% on angiography), all participants were divided into ISR and non-ISR groups. The XGBoost machine learning (ML) model was employed to identify the optimal predictive variables from a set of 31 variables. Discriminatory ability was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), while calibration of the prediction models was assessed using the Hosmer-Lemeshow (HL) test and calibration plots. Clinical utility of each model was evaluated using decision curve analysis (DCA). Results: In the XGBoost importance ranking of predictive factors, LMR and RC ranked first and fourth, respectively. The AUC of the entire XGBoost ML model was 0.8098, whereas the model using traditional stepwise backward regression, comprising five predictive factors, had an AUC of 0.706. The XGBoost model thus showed better discrimination and predictive accuracy for ISR than the traditional method. Conclusion: LMR and RC are identified as cost-effective and reliable biomarkers for predicting ISR risk in ACS patients following drug-eluting stent (DES) implantation. Incorporating them enhances the accuracy and clinical utility of ISR prediction models, offering clinicians a robust tool for risk stratification and personalized patient management.
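
Decision curve analysis, mentioned in the Methods above, compares models by net benefit across risk thresholds. The sketch below uses the standard net-benefit definition (true positives per patient minus false positives per patient weighted by the threshold odds); the function names and the toy labels and risks are assumptions, not the study's data or code.

```python
def net_benefit(labels, risks, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    if n == 0 or n != len(risks):
        raise ValueError("labels and risks must be non-empty and equal in length")
    tp = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Invented example (1 = restenosis, 0 = none) with hypothetical predicted risks.
toy_labels = [1, 0, 1, 0, 0]
toy_risks = [0.8, 0.3, 0.6, 0.7, 0.1]
for pt in (0.2, 0.5):
    print(pt, net_benefit(toy_labels, toy_risks, pt), net_benefit_treat_all(toy_labels, pt))
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero by definition.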

20.
Sensors (Basel) ; 24(15), 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39124007

ABSTRACT

Tremor, defined as an "involuntary, rhythmic, oscillatory movement of a body part", is a key feature of many neurological conditions including Parkinson's disease and essential tremor. Clinical assessment continues to be performed by visual observation with quantification on clinical scales. Methodologies for objectively quantifying tremor are promising but remain non-standardized across centers. Our center performs full-body behavioral testing with 3D motion capture for clinical and research purposes in patients with Parkinson's disease, essential tremor, and other conditions. The objective of this study was to assess the ability of several candidate processing pipelines to identify the presence or absence of tremor in kinematic data from patients with confirmed movement disorders and to compare them to expert ratings from movement disorders specialists. We curated a database of 2272 separate kinematic data recordings from our center, each of which was contemporaneously annotated as tremor present or absent by a movement disorders physician. We compared the ability of six separate processing pipelines to recreate clinician ratings based on F1 score, in addition to accuracy, precision, and recall. Performance across algorithms was generally comparable. The average F1 score was 0.84±0.02 (mean ± SD; range 0.81-0.87). The highest-performing algorithm (cross-validated F1=0.87) was a hybrid that combined engineered features adapted from an algorithm in longstanding clinical use with a modern Support Vector Machine classifier. Taken together, our results suggest the potential to update legacy clinical decision support systems to incorporate modern machine learning classifiers to create better-performing tools.


Subjects
Algorithms , Movement Disorders , Tremor , Humans , Tremor/diagnosis , Tremor/physiopathology , Movement Disorders/diagnosis , Movement Disorders/physiopathology , Parkinson Disease/diagnosis , Parkinson Disease/physiopathology , Biomechanical Phenomena , Essential Tremor/diagnosis , Essential Tremor/physiopathology , Male , Female , Middle Aged , Aged
...