Results 1 - 20 of 4,655
1.
Front Public Health ; 12: 1382354, 2024.
Article in English | MEDLINE | ID: mdl-39086805

ABSTRACT

Background: Precise prediction of out-of-pocket (OOP) costs to improve health policy design is important for governments of countries with national health insurance. Controlling the medical expenses for hypertension, one of the leading causes of stroke and ischemic heart disease, is an important issue for the Japanese government. This study aims to explore the importance of OOP costs for outpatients with hypertension. Methods: To obtain a precise prediction of the highest quartile group of OOP costs of hypertensive outpatients, we used nationwide longitudinal data, and estimated a random forest (RF) model focusing on complications with other lifestyle-related diseases and the nonlinearities of the data. Results: The results of the RF models showed that the prediction accuracy of OOP costs for hypertensive patients without activities of daily living (ADL) difficulties was slightly better than that for all hypertensive patients who continued physician visits during the past two consecutive years. Important variables of the highest quartile of OOP costs were age, diabetes or lipidemia, lack of habitual exercise, and moderate or vigorous regular exercise. Conclusion: As preventing complications of diabetes or lipidemia is important for reducing OOP costs in outpatients with hypertension, regular exercise of moderate or vigorous intensity is recommended for hypertensive patients that do not have ADL difficulty. For hypertensive patients with ADL difficulty, habitual exercise is not recommended.


Subject(s)
Health Expenditures , Hypertension , Humans , Hypertension/economics , Female , Male , Middle Aged , Japan , Aged , Health Expenditures/statistics & numerical data , Activities of Daily Living , Longitudinal Studies , Adult , Random Forest
2.
Article in English | MEDLINE | ID: mdl-39090299

ABSTRACT

Floods are among the natural hazards that have seen a rapid increase in frequency in recent decades. The damage caused by floods, including human and financial losses, poses a serious threat to human life. This study evaluates two machine learning (ML) techniques for flood susceptibility mapping (FSM) in the Gamasyab watershed in Iran. We utilized random forest (RF), support vector machine (SVM), ensemble models, and a geographic information system (GIS) to predict FSM. The application of these models involved 10 effective factors in flooding, as well as 82 flood locations integrated into the GIS. The SVM and RF models were trained and tested, followed by the implementation of resampling techniques (RT) using bootstrap and subsampling methods in three repetitions. The results highlighted the importance of elevation, slope, and precipitation as primary factors influencing flood occurrence. Additionally, the ensemble model outperformed both the RF and SVM models, achieving an area under the curve (AUC) of 0.9, a correlation coefficient (COR) of 0.79, a true skill statistic (TSS) of 0.83, and a standard deviation (SD) of 0.71 in the test phase. The tested models were adapted to available input data to map the FSM across the study watershed. These findings underscore the potential of integrating an ensemble model with GIS as an effective tool for flood susceptibility mapping.
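Of the scores reported above, the true skill statistic (TSS) is the least familiar. It is commonly defined as sensitivity plus specificity minus one. A minimal sketch in Python, assuming binary flood / non-flood labels; the confusion-matrix counts in the usage note are hypothetical and are not taken from the study:

```python
def true_skill_statistic(tp, fn, tn, fp):
    """Return TSS = sensitivity + specificity - 1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of flood locations correctly flagged
    specificity = tn / (tn + fp)  # share of non-flood locations correctly cleared
    return sensitivity + specificity - 1.0
```

For example, `true_skill_statistic(tp=45, fn=5, tn=40, fp=10)` combines a sensitivity of 0.9 with a specificity of 0.8. A TSS of 0 corresponds to no skill beyond chance, and 1 to perfect discrimination.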

3.
World J Gastroenterol ; 30(28): 3403-3417, 2024 Jul 28.
Article in English | MEDLINE | ID: mdl-39091717

ABSTRACT

BACKGROUND: There is currently a shortage of accurate, efficient, and precise predictive instruments for rectal neuroendocrine neoplasms (NENs). AIM: To develop a predictive model for individuals with rectal NENs (R-NENs) using data from a large cohort. METHODS: Data from patients with primary R-NENs were retrospectively collected from 17 large-scale referral medical centers in China. Random forest and Cox proportional hazard models were used to identify the risk factors for overall survival and progression-free survival, and two nomograms were constructed. RESULTS: A total of 1408 patients with R-NENs were included. Tumor grade, T stage, tumor size, age, and a prognostic nutritional index were important risk factors for prognosis. The GATIS score was calculated based on these five indicators. The respective C-indexes in the training set were 0.915 (95% confidence interval: 0.866-0.964) for overall survival prediction and 0.908 (95% confidence interval: 0.872-0.944) for progression-free survival prediction. According to decision curve analysis, the net benefit of the GATIS score was higher than that of a single factor. The time-dependent area under the receiver operating characteristic curve showed that the predictive power of the GATIS score was higher than that of the TNM stage and pathological grade at all time periods. CONCLUSION: The GATIS score had a good predictive effect on the prognosis of patients with R-NENs, with efficacy superior to that of the World Health Organization grade and TNM stage.


Subject(s)
Neoplasm Staging , Neuroendocrine Tumors , Nomograms , Rectal Neoplasms , Humans , Male , Female , Middle Aged , Rectal Neoplasms/mortality , Rectal Neoplasms/pathology , Rectal Neoplasms/therapy , Neuroendocrine Tumors/mortality , Neuroendocrine Tumors/pathology , Neuroendocrine Tumors/therapy , Neuroendocrine Tumors/diagnosis , Retrospective Studies , China/epidemiology , Prognosis , Aged , Risk Factors , Adult , ROC Curve , Progression-Free Survival , Neoplasm Grading , Risk Assessment/methods , Proportional Hazards Models , Predictive Value of Tests , Nutrition Assessment , East Asian People
4.
Ecotoxicol Environ Saf ; 283: 116856, 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39151373

ABSTRACT

Air pollution in industrial environments, particularly in the chrome plating process, poses significant health risks to workers due to high concentrations of hazardous pollutants. Exposure to substances like hexavalent chromium, volatile organic compounds (VOCs), and particulate matter can lead to severe health issues, including respiratory problems and lung cancer. Continuous monitoring and timely intervention are crucial to mitigate these risks. Traditional air quality monitoring methods often lack real-time data analysis and predictive capabilities, limiting their effectiveness in addressing pollution hazards proactively. This paper introduces a real-time air pollution monitoring and forecasting system specifically designed for the chrome plating industry. The system, supported by Internet of Things (IoT) sensors and AI approaches, detects a wide range of air pollutants, including NH3, CO, NO2, CH4, CO2, SO2, O3, PM2.5, and PM10, and provides real-time data on pollutant concentration levels. Data collected by the sensors are processed using LSTM, Random Forest, and Linear Regression models to predict pollution levels. The LSTM model achieved a coefficient of determination (R²) of 99 % and a mean absolute error (MAE) of 0.33 for temperature and humidity forecasting. For PM2.5, the Random Forest model outperformed others, achieving an R² of 84 % and an MAE of 10.11. The system activates factory exhaust fans to circulate air when high pollution levels are predicted to occur in the coming hours, allowing for proactive measures to improve air quality before issues arise. This innovative approach demonstrates significant advancements in industrial environmental monitoring, enabling dynamic responses to pollution and improving air quality in industrial settings.
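The two scores quoted above can be made concrete with a short sketch. It assumes that R² is the usual coefficient of determination and that MAE denotes the mean absolute error; the function names and sample readings are illustrative only and do not come from the paper.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical PM2.5 readings and forecasts, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 39.0]
mae = mean_absolute_error(observed, forecast)
r2 = r_squared(observed, forecast)
```

MAE is expressed in the units of the forecast variable, so an MAE of 10.11 for PM2.5 and one of 0.33 for temperature are not directly comparable, whereas R² is unitless.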

5.
MethodsX ; 13: 102866, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39157818

ABSTRACT

Color blindness is a generic disability whereby the affected individuals cannot benefit from the various functions provided by color that affect humans physically and psychologically. Although this disability is not fatal, it causes considerable disruption to the affected individuals' daily activities. This paper aims to develop a system for recognizing and detecting colors of clothes in images, improve accuracy by using advanced algorithms to handle lighting variations, and provide color matching recommendations to assist color-blind individuals in making informed choices when purchasing shirts. The proposed methodology for color recognition involves:
• retrieving the RGB values of a given point from the input image and converting them into HSV values.
• creating a web application integrated with a machine learning model to classify and predict the corresponding color based on the HSV values.
• displaying the predicted color name, with suggestions of matching colors, on the interface.
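The first two steps above can be sketched with Python's standard `colorsys` module. The RGB-to-HSV conversion is standard; the threshold rules in `name_color` are a hypothetical stand-in for the paper's machine learning classifier, which the abstract does not specify.

```python
import colorsys


def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s * 100.0, v * 100.0


def name_color(h, s, v):
    """Hypothetical rule-based classifier; the paper uses a trained model instead."""
    if v < 20:
        return "black"
    if s < 15:
        return "white" if v > 85 else "gray"
    if h < 30 or h >= 330:
        return "red"
    if h < 90:
        return "yellow"
    if h < 150:
        return "green"
    if h < 270:
        return "blue"
    return "purple"


# Pixel values below are illustrative, not taken from the paper.
label = name_color(*rgb_to_hsv_degrees(0, 0, 255))
```

Working in HSV separates hue from brightness, which is one plausible reason for the conversion when lighting varies: a darker or brighter shot of the same shirt changes mostly the value channel.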

6.
BMC Med Inform Decis Mak ; 24(1): 228, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39152423

ABSTRACT

PROBLEM: Sepsis, a life-threatening condition, accounts for the deaths of millions of people worldwide. Accurate prediction of sepsis outcomes is crucial for effective treatment and management. Previous studies have utilized machine learning for prognosis, but have limitations in feature sets and model interpretability. AIM: This study aims to develop a machine learning model that enhances prediction accuracy for sepsis outcomes using a reduced set of features, thereby addressing the limitations of previous studies and enhancing model interpretability. METHODS: This study analyzes intensive care patient outcomes using the MIMIC-IV database, focusing on adult sepsis cases. Employing the latest data extraction tools, such as Google BigQuery, and following stringent selection criteria, we selected 38 features in this study. This selection is also informed by a comprehensive literature review and clinical expertise. Data preprocessing included handling missing values, regrouping categorical variables, and using the Synthetic Minority Over-sampling Technique (SMOTE) to balance the data. We evaluated several machine learning models: Decision Trees, Gradient Boosting, XGBoost, LightGBM, Multilayer Perceptrons (MLP), Support Vector Machines (SVM), and Random Forest. The Sequential Halving and Classification (SHAC) algorithm was used for hyperparameter tuning, and both train-test split and cross-validation methodologies were employed for performance and computational efficiency. RESULTS: The Random Forest model was the most effective, achieving an area under the receiver operating characteristic curve (AUROC) of 0.94 with a confidence interval of ±0.01. This significantly outperformed other models and set a new benchmark in the literature. The model also provided detailed insights into the importance of various clinical features, with the Sequential Organ Failure Assessment (SOFA) score and average urine output being highly predictive. 
SHAP (Shapley Additive Explanations) analysis further enhanced the model's interpretability, offering a clearer understanding of feature impacts. CONCLUSION: This study demonstrates significant improvements in predicting sepsis outcomes using a Random Forest model, supported by advanced machine learning techniques and thorough data preprocessing. Our approach provided detailed insights into the key clinical features impacting sepsis mortality, making the model both highly accurate and interpretable. By enhancing the model's practical utility in clinical settings, we offer a valuable tool for healthcare professionals to make data-driven decisions, ultimately aiming to minimize sepsis-induced fatalities.


Subject(s)
Intensive Care Units , Machine Learning , Sepsis , Humans , Sepsis/mortality , Prognosis , Adult , Male , Middle Aged , Female , Aged
7.
Thromb J ; 22(1): 76, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39152448

ABSTRACT

PURPOSE: To identify the key risk factors for venous thromboembolism (VTE) in urological inpatients based on the Caprini scale using an interpretable machine learning method. METHODS: VTE risk data of urological inpatients were obtained based on the Caprini scale in the case hospital. Based on the data, the Boruta method was used to further select the key variables from the 37 variables in the Caprini scale. Furthermore, decision rules corresponding to each risk level were generated using the rough set (RS) method. Finally, random forest (RF), support vector machine (SVM), and backpropagation artificial neural network (BPANN) were used to verify the data accuracy and were compared with the RS method. RESULTS: Following the screening, the key risk factors for VTE in urology were "(C1) Age," "(C2) Minor surgery planned," "(C3) Obesity (BMI > 25)," "(C8) Varicose veins," "(C9) Sepsis (< 1 month)," "(C10) Serious lung disease incl. pneumonia (< 1 month)," "(C11) COPD," "(C16) Other risk," "(C18) Major surgery (> 45 min)," "(C19) Laparoscopic surgery (> 45 min)," "(C20) Patient confined to bed (> 72 h)," "(C21) Malignancy (present or previous)," "(C23) Central venous access," "(C31) History of DVT/PE," "(C32) Other congenital or acquired thrombophilia," and "(C34) Stroke (< 1 month)." According to the decision rules of different risk levels obtained using the RS method, "(C1) Age," "(C18) Major surgery (> 45 min)," and "(C21) Malignancy (present or previous)" were the main factors influencing mid- and high-risk levels, and some suggestions on VTE prevention were indicated based on these three factors. The average accuracies of the RS, RF, SVM, and BPANN models were 79.5%, 87.9%, 92.6%, and 97.2%, respectively. In addition, BPANN had the highest accuracy, recall, F1-score, and precision. CONCLUSIONS: The RS model achieved poorer accuracy than the other three common machine learning models.
However, the RS model provides strong interpretability and allows for the identification of high-risk factors and decision rules influencing high-risk assessments of VTE in urology. This transparency is very important for clinicians in the risk assessment process.

8.
Toxicol Mech Methods ; : 1-9, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39104137

ABSTRACT

Per- and polyfluoroalkyl substances (PFASs), a class of persistent organic pollutants, have immunosuppressive effects. The evaluation of this effect has been the focus of regulatory toxicology. In this investigation, 146 PFASs (immunosuppressive or nonimmunosuppressive) and corresponding concentration gradients were collected from literature, and their structures were characterized by using Dragon descriptors. Feature importance analysis and stepwise feature elimination were used for feature selection. Three machine learning (ML) methods, namely Random Forest (RF), Extreme Gradient Boosting Machine (XGB), and Categorical Boosting Machine (CB), were utilized for model development. The model interpretability was explored by feature importance analysis and correlation analysis. The findings indicated that the three models developed exhibited excellent performance. Among them, the best-performing RF model had an average AUC score of 0.9720 for the testing set. The results of the feature importance analysis demonstrated that concentration, SpPosA_X, IVDE, R2s, and SIC2 were the crucial molecular features. Applicability domain analysis was also performed to determine reliable prediction boundaries for the model. In conclusion, this study is the first application of ML models to investigate the immunosuppressive activity of PFASs. The variables used in the models can help understand the mechanism of the immunosuppressive activity of PFASs, allow researchers to more effectively assess the immunosuppressive potential of a large number of PFASs, and thus better guide environmental and health risk assessment efforts.

9.
J Alzheimers Dis ; 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39121117

ABSTRACT

Background: Mild cognitive impairment (MCI) patients are at a high risk of developing Alzheimer's disease and related dementias (ADRD) at an estimated annual rate above 10%. It is clinically and practically important to accurately predict MCI-to-dementia conversion time. Objective: To accurately predict MCI-to-dementia conversion time by using easily available clinical data. Methods: The dementia diagnosis often falls between two clinical visits, and such a survival outcome is known as interval-censored data. We utilized the semi-parametric model and the random forest model for interval-censored data in conjunction with a variable selection approach to select important measures for predicting the conversion time from MCI to dementia. Two large AD cohort data sets were used to build, validate, and test the predictive model. Results: We found that the semi-parametric model can improve the prediction of the conversion time for patients with MCI-to-dementia conversion, and it also has good predictive performance for all patients. Conclusions: Interval-censored data should be analyzed by using the models that were developed for interval-censored data to improve the model performance.
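To illustrate what interval censoring means for the likelihood, the sketch below scores each patient by the probability that conversion fell between the last visit without dementia and the first visit with a diagnosis, S(left) - S(right). It assumes a simple exponential survival curve purely for illustration; this is not the semi-parametric or random forest model used in the study, and the visit times in the usage note are hypothetical.

```python
import math


def exp_survival(t, rate):
    """Exponential survival function S(t) = exp(-rate * t); S(inf) evaluates to 0.0."""
    return math.exp(-rate * t)


def interval_censored_loglik(intervals, rate):
    """Sum of log[S(left) - S(right)] over (left, right] observation intervals.

    Use right = math.inf for patients who had not converted by their last visit.
    """
    total = 0.0
    for left, right in intervals:
        prob = exp_survival(left, rate) - exp_survival(right, rate)
        total += math.log(prob)
    return total
```

For example, `interval_censored_loglik([(1.0, 2.0), (3.0, math.inf)], rate=0.5)` scores one patient who converted between years 1 and 2 and one who was still dementia-free at year 3. Treating the diagnosis visit as the exact event time would ignore the width of that interval, which is the problem the abstract's conclusion points to.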

10.
BMC Bioinformatics ; 25(1): 253, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39090608

ABSTRACT

BACKGROUND: Conditional logistic regression trees have been proposed as a flexible alternative to the standard method of conditional logistic regression for the analysis of matched case-control studies. While they avoid the strict assumption of linearity and automatically incorporate interactions, conditional logistic regression trees may suffer from relatively high variability. Further machine learning methods for the analysis of matched case-control studies are missing because conventional machine learning methods cannot handle the matched structure of the data. RESULTS: A random forest method for the analysis of matched case-control studies based on conditional logistic regression trees is proposed, which overcomes the issue of high variability. It provides an accurate estimation of exposure effects while being more flexible in the functional form of covariate effects. The efficacy of the method is illustrated in a simulation study and within an application to real-world data from a matched case-control study on the effect of regular participation in cervical cancer screening on the development of cervical cancer. CONCLUSIONS: The proposed random forest method is a promising addition to the toolbox for the analysis of matched case-control studies and addresses the need for machine-learning methods in this field. It provides a more flexible approach than both the standard method of conditional logistic regression and conditional logistic regression trees. It allows for non-linearity and the automatic inclusion of interaction effects and is suitable both for exploratory and explanatory analyses.


Subject(s)
Machine Learning , Random Forest , Female , Humans , Case-Control Studies , Logistic Models , Uterine Cervical Neoplasms
11.
Sci Rep ; 14(1): 18452, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39117728

ABSTRACT

As artificial intelligence (AI) becomes widespread, there is increasing attention on investigating bias in machine learning (ML) models. Previous research concentrated on classification problems, with little emphasis on regression models. This paper presents an easy-to-apply and effective methodology for mitigating bias in bagging and boosting regression models, that is also applicable to any model trained through minimizing a differentiable loss function. Our methodology measures bias rigorously and extends the ML model's loss function with a regularization term to penalize high correlations between model errors and protected attributes. We applied our approach to three popular tree-based ensemble models: a random forest model (RF), a gradient-boosted model (GBT), and an extreme gradient boosting model (XGBoost). We implemented our methodology on a case study for predicting road-level traffic volume, where RF, GBT, and XGBoost models were shown to have high accuracy. Despite high accuracy, the ML models were shown to perform poorly on roads in minority-populated areas. Our bias mitigation approach reduced minority-related bias by over 50%.
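The idea of extending a loss function with a bias penalty can be sketched as follows. The abstract does not give the exact form of the regularization term, so this example assumes mean squared error plus `lam` times the absolute Pearson correlation between prediction errors and a protected attribute; the function names and the weight `lam` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def penalized_loss(y_true, y_pred, protected, lam=1.0):
    """MSE plus a penalty on correlation between errors and the protected attribute."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    return mse + lam * abs(pearson(errors, protected))
```

With `lam=0` this reduces to ordinary mean squared error. As `lam` grows, a model is pushed toward errors that do not line up with the protected attribute, possibly at some cost in raw accuracy. Note that the absolute value is not differentiable at zero, so a gradient-based implementation might prefer a squared correlation.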

12.
J Biophotonics ; : e202400075, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39103198

ABSTRACT

Otitis media (OM), a highly prevalent inflammatory middle-ear disease in children worldwide, is commonly caused by an infection, and can lead to antibiotic-resistant bacterial biofilms in recurrent/chronic OM cases. A biofilm related to OM typically contains one or multiple bacterial species. Optical coherence tomography (OCT) has been used clinically to visualize the presence of bacterial biofilms in the middle ear. This study used OCT to compare microstructural image texture features from bacterial biofilms. The proposed method applied supervised machine-learning-based frameworks (SVM, random forest, and XGBoost) to classify multiple-species bacterial biofilms from in vitro cultures and clinically obtained in vivo images from human subjects. Our findings show that optimized SVM-RBF and XGBoost classifiers achieved an AUC of more than 95% in detecting each biofilm class. These results demonstrate the potential for differentiating OM-causing bacterial biofilms through texture analysis of OCT images and a machine-learning framework, offering valuable insights for real-time in vivo characterization of ear infections.

13.
Sci Rep ; 14(1): 18834, 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138311

ABSTRACT

Momentum plays a crucial role in ball games. Based on the 2023 Wimbledon final data, this paper investigated momentum in tennis. First, we trained a decision tree regression model on reprocessed data for prediction, and established the CBRF model based on CatBoost regression and random forest regression models to obtain prediction data. Second, significant non-zero autocorrelation coefficients were found, confirming the correlation between momentum and success. Third, based on these key factors, we proposed winning strategies for the players and conducted predictive analyses for six specific time intervals of the game. Finally, applying these models to women's matches, championships, and matches on different surfaces demonstrated that the models generalize effectively.

14.
J Hazard Mater ; 478: 135407, 2024 Aug 03.
Article in English | MEDLINE | ID: mdl-39116745

ABSTRACT

The accurate spatial mapping of heavy metal levels in agricultural soils is crucial for environmental management and food security. However, the inherent limitations of traditional interpolation methods and emerging machine-learning techniques restrict their spatial prediction accuracy. This study aimed to refine the spatial prediction of heavy metal distributions in Guangxi, China, by integrating machine learning models and spatial regionalization indices (SRIs). The results demonstrated that random forest (RF) models incorporating SRIs outperformed artificial neural network and support vector regression models, achieving R² values exceeding 0.96 for eight heavy metals on the test data. Hierarchical clustering for feature selection further improved the model performance. The optimized RF models accurately predicted the heavy metal distributions in agricultural soils across the province, revealing higher levels in the central-western regions and lower levels in the north and south. Notably, the models identified that 25.78 % of agricultural soils constitute hotspots with multiple co-occurring heavy metals, and over 6.41 million people are exposed to excessive soil heavy metal levels. Our findings provide valuable insights for the development of targeted strategies for soil pollution control and agricultural soil management to safeguard food security and public health.

15.
Environ Monit Assess ; 196(9): 800, 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39120666

ABSTRACT

Air pollution has a significant global impact on natural resources and public health. Accurate prediction of air pollution is crucial for effective prevention and control measures. However, due to regional variations, different cities may have varying primary pollutants, posing new challenges for accurate prediction. In this paper, we propose a novel method called FP-RF, which integrates clustering algorithms to categorize multiple cities according to their air quality index values. Subsequently, we apply functional principal component analysis to extract the primary components of air pollution within each cluster. Furthermore, an enhanced random forest algorithm is utilized to predict air quality grades for each city. We conduct experimental evaluations using historical data from Anhui Province spanning from 2018 to 2023. The results support the effectiveness of our model, with an average accuracy of 98.6% in forecasting six pollution grades and 96.04% accuracy in predicting 16 prefecture-level cities, surpassing the baseline models.


Subject(s)
Air Pollutants , Air Pollution , Environmental Monitoring , Forecasting , Air Pollution/statistics & numerical data , Environmental Monitoring/methods , Air Pollutants/analysis , Cities , Algorithms , China , Models, Theoretical , Principal Component Analysis
16.
Diabetologia ; 2024 Aug 21.
Article in English | MEDLINE | ID: mdl-39168869

ABSTRACT

AIMS/HYPOTHESIS: Clustering-based subclassification of type 2 diabetes, which reflects pathophysiology and genetic predisposition, is a promising approach for providing personalised and effective therapeutic strategies. Ahlqvist's classification is currently the most vigorously validated method because of its superior ability to predict diabetes complications but it does not have strong consistency over time and requires HOMA2 indices, which are not routinely available in clinical practice and standard cohort studies. We developed a machine learning (ML) model to classify individuals with type 2 diabetes into Ahlqvist's subtypes consistently over time. METHODS: Cohort 1 dataset comprised 619 Japanese individuals with type 2 diabetes who were divided into training and test sets for ML models in a 7:3 ratio. Cohort 2 dataset, comprising 597 individuals with type 2 diabetes, was used for external validation. Participants were pre-labelled (T2Dkmeans) by unsupervised k-means clustering based on Ahlqvist's variables (age at diagnosis, BMI, HbA1c, HOMA2-B and HOMA2-IR) to four subtypes: severe insulin-deficient diabetes (SIDD), severe insulin-resistant diabetes (SIRD), mild obesity-related diabetes (MOD) and mild age-related diabetes (MARD). We adopted 15 variables for a multiclass classification random forest (RF) algorithm to predict type 2 diabetes subtypes (T2DRF15). The proximity matrix computed by RF was visualised using a uniform manifold approximation and projection. Finally, we used a putative subset with missing insulin-related variables to test the predictive performance of the validation cohort, consistency of subtypes over time and prediction ability of diabetes complications. RESULTS: T2DRF15 demonstrated a 94% accuracy for predicting T2Dkmeans type 2 diabetes subtypes (AUCs ≥0.99 and F1 score [an indicator calculated by harmonic mean from precision and recall] ≥0.9) and retained the predictive performance in the external validation cohort (86.3%). 
T2DRF15 showed an accuracy of 82.9% for detecting T2Dkmeans, also in a putative subset with missing insulin-related variables, when used with an imputation algorithm. In Kaplan-Meier analysis, the diabetes clusters of T2DRF15 demonstrated distinct accumulation risks of diabetic retinopathy in SIDD and that of chronic kidney disease in SIRD during a median observation period of 11.6 (4.5-18.3) years, similarly to the subtypes using T2Dkmeans. The predictive accuracy was improved after excluding individuals with low predictive probability, who were categorised as an 'undecidable' cluster. T2DRF15, after excluding undecidable individuals, showed higher consistency (100% for SIDD, 68.6% for SIRD, 94.4% for MOD and 97.9% for MARD) than T2Dkmeans. CONCLUSIONS/INTERPRETATION: The new ML model for predicting Ahlqvist's subtypes of type 2 diabetes has great potential for application in clinical practice and cohort studies because it can classify individuals with missing HOMA2 indices and predict glycaemic control, diabetic complications and treatment outcomes with long-term consistency by using readily available variables. Future studies are needed to assess whether our approach is applicable to research and/or clinical practice in multiethnic populations.

17.
PeerJ Comput Sci ; 10: e2062, 2024.
Article in English | MEDLINE | ID: mdl-39145255

ABSTRACT

COVID-19, the acute respiratory illness caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organization due to its highly infectious nature and the associated public health risks it poses globally. Identifying the critical factors for predicting mortality is essential for improving patient therapy. Unlike other data types, such as computed tomography scans, X-rays, and ultrasounds, basic blood test results are widely accessible and can aid in predicting mortality. The present research advocates the utilization of machine learning (ML) methodologies for predicting the likelihood of mortality from infectious diseases such as COVID-19 by leveraging blood test data. Age, LDH (lactate dehydrogenase), lymphocytes, neutrophils, and hs-CRP (high-sensitivity C-reactive protein) are five highly predictive features that, when combined, can accurately predict mortality in 96% of cases. By combining XGBoost feature importance with neural network classification, the optimal approach can predict mortality from infectious disease with high accuracy, achieving a precision of 90% up to 16 days before the event. The proposed model's excellent predictive performance and practicality were confirmed through testing with three instances that depended on the days to the outcome. By carefully analyzing and identifying patterns in these significant biomarkers, insightful information has been obtained for simple application. This study offers potential remedies that could accelerate decision-making for targeted medical treatments within healthcare systems, utilizing a timely, accurate, and reliable method.

18.
Carcinogenesis ; 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39086220

ABSTRACT

Intrahepatic cholangiocarcinoma (ICC) is a rare disease associated with a poor prognosis, primarily due to early recurrence and metastasis. An important feature of this condition is microvascular invasion (MVI). However, current predictive models based on imaging have limited efficacy in this regard. This study employed a random forest model to construct a predictive model for MVI identification and uncover its biological basis. Single-cell transcriptome sequencing, whole exome sequencing, and proteome sequencing were performed. The area under the curve of the prediction model in the validation set was 0.93. Further analysis indicated that MVI-associated tumor cells exhibited functional changes related to epithelial-mesenchymal transition and lipid metabolism due to alterations in the NF-kappa B and MAPK signaling pathways. Tumor cells were also differentially enriched for the IL-17 signaling pathway. There was less infiltration of SLC30A1+ CD8+ T cells expressing cytotoxic genes in MVI-associated ICC, whereas there was more infiltration of myeloid cells with attenuated expression of the MHC II pathway. Additionally, MVI-associated intercellular communication was closely related to the SPP1-CD44 and ANXA1-FPR1 pathways. These findings resulted in a brilliant predictive model and fresh insights into MVI.

19.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-39101782

ABSTRACT

BACKGROUND: Mobilization typing (MOB) is a classification scheme for plasmid genomes based on their relaxase gene. The host ranges of plasmids of different MOB categories are diverse, and MOB is crucial for investigating plasmid mobilization, especially the transmission of resistance genes and virulence factors. However, MOB typing of plasmid metagenomic data is challenging due to the highly fragmented characteristics of metagenomic contigs. RESULTS: We developed MOBFinder, an 11-class classifier, for categorizing plasmid fragments into 10 MOB types and a nonmobilizable category. We first performed MOB typing to classify complete plasmid genomes according to relaxase information and then constructed an artificial benchmark dataset of plasmid metagenomic fragments (PMFs) from those complete plasmid genomes whose MOB types are well annotated. Next, based on natural language models, we used word vectors to characterize the PMFs. Several random forest classification models were trained and integrated to predict fragments of different lengths. Evaluating the tool using the benchmark dataset, we found that MOBFinder outperforms previous tools such as MOBscan and MOB-suite, with an overall accuracy approximately 59% higher than that of MOB-suite. Moreover, the balanced accuracy, harmonic mean, and F1-score reached up to 99% for some MOB types. When applied to a cohort of patients with type 2 diabetes (T2D), MOBFinder offered insights suggesting that the MOBF type plasmid, which is widely present in Escherichia and Klebsiella, and the MOBQ type plasmid might accelerate antibiotic resistance transmission in patients with T2D. CONCLUSIONS: To the best of our knowledge, MOBFinder is the first tool for MOB typing of PMFs. The tool is freely available at https://github.com/FengTaoSMU/MOBFinder.
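
The idea of characterizing a plasmid fragment with word vectors can be sketched roughly as follows. The abstract does not state how "words" are defined, so this sketch assumes overlapping k-mers and averages one vector per k-mer into a fixed-length fragment vector that a classifier such as a random forest could consume. The function names, the value of `k`, and the toy embedding table are hypothetical and do not describe MOBFinder's actual implementation.

```python
def kmer_words(sequence, k=4):
    """Split a DNA sequence into overlapping k-mer 'words'."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def fragment_vector(sequence, embeddings, k=4):
    """Average the embedding vectors of all known k-mers in a fragment."""
    words = [w for w in kmer_words(sequence, k) if w in embeddings]
    if not words:
        raise ValueError("no k-mer of this fragment has an embedding")
    dim = len(next(iter(embeddings.values())))
    totals = [0.0] * dim
    for word in words:
        for j, value in enumerate(embeddings[word]):
            totals[j] += value
    return [t / len(words) for t in totals]


# Toy 2-dimensional embedding table (assumption: in practice the vectors
# would be learned by a language model trained on plasmid sequences).
toy_embeddings = {"ACG": [1.0, 0.0], "CGT": [0.0, 1.0], "GTA": [1.0, 1.0]}
vec = fragment_vector("ACGTA", toy_embeddings, k=3)
print(kmer_words("ACGTA", k=3), vec)
```

Averaging gives every fragment a vector of the same length regardless of how long the fragment is, which is one simple way to feed variable-length contigs to a fixed-input classifier.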


Subject(s)
Metagenomics , Plasmids , Plasmids/genetics , Metagenomics/methods , Humans , Software , Metagenome
20.
Front Vet Sci ; 11: 1406107, 2024.
Article in English | MEDLINE | ID: mdl-39104548

ABSTRACT

Introduction: Clinical reasoning in veterinary medicine is often based on clinicians' personal experience in combination with information derived from publications describing cohorts of patients. Studies on the use of scientific methods for individual patient decision-making are largely lacking. This also applies to predicting the underlying pathology in individual dogs with seizures. The aim of this study was to apply machine learning to the prediction of the risk of structural epilepsy in dogs with seizures. Materials and methods: Dogs with a history of seizures were included both retrospectively and prospectively. Data on clinical history, neurological examination, diagnostic tests performed, and the final diagnosis were collected. For data analysis, the Bayesian Network and Random Forest algorithms were used. A total of 33 features for Random Forest and 17 for Bayesian Network were available for analysis. Four feature selection methods were applied to select features for further analysis: Permutation Importance, Forward Selection, Random Selection, and Expert Opinion. Both algorithms were trained to predict structural epilepsy using the selected features. Results: A total of 328 dogs of 119 different breeds were identified retrospectively between January 2017 and June 2021, of which 33.2% were diagnosed with structural epilepsy. A total of 89,848 models were trained. The Bayesian Network in combination with Random Selection performed best. It predicted structural epilepsy with an accuracy of 0.969 (sensitivity: 0.857, specificity: 1.000) among all dogs with seizures using the following features: age at first seizure, cluster seizures, seizure in the last 24 h, seizure in the last 6 months, and seizure in the last year. Conclusion: Machine learning algorithms such as Bayesian Networks and Random Forests identify dogs with structural epilepsy with high sensitivity and specificity. This information could provide some guidance to clinicians and pet owners in their clinical decision-making process.
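
As a worked illustration of how accuracy, sensitivity, and specificity relate to one another, the sketch below derives all three from the four cells of a confusion matrix. The abstract does not report the underlying counts; the counts used here are hypothetical and were chosen only because they reproduce the reported values after rounding to three decimals. The function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp/fn: dogs with structural epilepsy predicted correctly/incorrectly.
    tn/fp: dogs without structural epilepsy predicted correctly/incorrectly.
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts, not taken from the study.
metrics = classification_metrics(tp=18, fn=3, tn=76, fp=0)
rounded = {name: round(value, 3) for name, value in metrics.items()}
print(rounded)
```

With these counts the model never raises a false alarm (specificity 1.000) yet still misses 3 of 21 affected dogs, which shows how a high overall accuracy can coexist with a noticeably lower sensitivity.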
