Results 1 - 20 of 38,146
1.
Front Endocrinol (Lausanne) ; 15: 1353023, 2024.
Article in English | MEDLINE | ID: mdl-38590824

ABSTRACT

Background: Central precocious puberty (CPP) is a common endocrine disorder in children, and its diagnosis primarily relies on the gonadotropin-releasing hormone (GnRH) stimulation test, which is expensive and time-consuming. With the widespread application of artificial intelligence in medicine, some studies have used machine learning (ML) models based on clinical, hormonal (laboratory) and imaging data to identify CPP. However, the results of these studies varied widely and were difficult to compare directly, mainly because of the diverse ML methods used. Therefore, the diagnostic value of these ML models for CPP remains unclear. The aim of this study was to investigate the diagnostic value of ML models based on clinical, hormonal (laboratory) and imaging data for CPP through a meta-analysis of existing studies. Methods: We conducted a comprehensive search for relevant English-language articles on clinical, hormonal (laboratory) and imaging data-based ML models for diagnosing CPP, covering the period from database inception to December 2023. Pooled sensitivity, specificity, positive likelihood ratio (LR+), negative likelihood ratio (LR-), summary receiver operating characteristic (SROC) curve, and area under the curve (AUC) were calculated to assess the diagnostic value of these models. The I² statistic was used to evaluate heterogeneity, and the source of heterogeneity was investigated through meta-regression analysis. Publication bias was assessed using the Deeks funnel plot asymmetry test. Results: Six studies met the eligibility criteria. The pooled sensitivity and specificity were 0.82 (95% confidence interval (CI) 0.62-0.93) and 0.85 (95% CI 0.80-0.90), respectively. The LR+ was 6.00, and the LR- was 0.21, indicating that clinical, hormonal (laboratory) and imaging data-based ML models exhibited an excellent ability to confirm or exclude CPP. Additionally, the SROC curve showed that the AUC of these models in the diagnosis of CPP was 0.90 (95% CI 0.87-0.92), demonstrating good diagnostic value for CPP. Conclusion: Based on the outcomes of our meta-analysis, clinical and imaging data-based ML models are excellent diagnostic tools with high sensitivity, specificity, and AUC in the diagnosis of CPP. Although the findings are geographically limited, future research should address this limitation to improve their applicability and reliability and to provide more precise guidance for the differentiation and treatment of CPP.
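
As a rough illustration of how the likelihood ratios in this abstract relate to sensitivity and specificity, the sketch below computes LR+ and LR- from the pooled point estimates (0.82 and 0.85) and converts a pre-test probability into a post-test probability. The function names and the 50% pre-test probability are assumptions made for illustration only. The naive calculation gives an LR+ of about 5.47 rather than the reported 6.00, presumably because pooled likelihood ratios in a meta-analysis are estimated by the statistical model and not derived from the pooled point estimates.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a single sensitivity/specificity pair."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled point estimates reported in the abstract.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.82, specificity=0.85)

# Hypothetical pre-test probability of 50%, chosen only for illustration.
p_after_positive = post_test_probability(0.50, lr_pos)
p_after_negative = post_test_probability(0.50, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"Post-test probability after a positive result: {p_after_positive:.2f}")
print(f"Post-test probability after a negative result: {p_after_negative:.2f}")
```

Under these assumptions, a positive model result raises a 50% pre-test probability to roughly 85%, and a negative result lowers it to roughly 17%.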


Subjects
Precocious Puberty, Child, Humans, Artificial Intelligence, Machine Learning, Precocious Puberty/diagnostic imaging, Reproducibility of Results, Sensitivity and Specificity
2.
Front Immunol ; 15: 1372539, 2024.
Article in English | MEDLINE | ID: mdl-38601145

ABSTRACT

Introduction: The coronavirus disease 2019 (COVID-19) pandemic has affected billions of people worldwide, and the lessons learned need to be summarized to better prepare for the next pandemic. Early identification of high-risk patients is important for appropriate treatment and distribution of medical resources. A generalizable and easy-to-use COVID-19 severity stratification model is vital and may provide a reference for clinicians. Methods: Three COVID-19 cohorts (one discovery cohort and two validation cohorts) were included. Longitudinal peripheral blood mononuclear cells were collected from the discovery cohort (n = 39, mild = 15, critical = 24). The immune characteristics of COVID-19 and critical COVID-19 were analyzed by comparison with those of healthy volunteers (n = 16) and patients with mild COVID-19 using mass cytometry by time of flight (CyTOF). Subsequently, machine learning models were developed based on the immune signatures and laboratory parameters that performed best in distinguishing mild from critical cases. Finally, single-cell RNA sequencing data from a published study (n = 43) and electronic health records from a prospective cohort study (n = 840) were used to verify the role of crucial clinical laboratory and immune signature parameters in the stratification of COVID-19 severity. Results: Patients with COVID-19 showed disturbed glucose and tryptophan metabolism in two major innate immune clusters. Critical patients were further characterized by significant depletion of classical dendritic cells (cDCs), regulatory T cells (Tregs), and CD4+ central memory T cells (Tcm), along with increased systemic interleukin-6 (IL-6), interleukin-12 (IL-12), and lactate dehydrogenase (LDH). The machine learning models based on the levels of cDCs and LDH showed great potential for predicting critical cases. 
The model performances in severity stratification were validated in two cohorts (AUC = 0.77 and 0.88, respectively) infected with different strains in different periods. The reference limits of cDCs and LDH as biomarkers for predicting critical COVID-19 were 1.2% and 270.5 U/L, respectively. Conclusion: Overall, we developed and validated a generalizable and easy-to-use COVID-19 severity stratification model using machine learning algorithms. The level of cDCs and LDH will assist clinicians in making quick decisions during future pandemics.


Subjects
COVID-19, Humans, Pandemics, Prospective Studies, Mononuclear Leukocytes, SARS-CoV-2, L-Lactate Dehydrogenase, Machine Learning
3.
Sci Rep ; 14(1): 8589, 2024 04 13.
Article in English | MEDLINE | ID: mdl-38615137

ABSTRACT

Early identification of high-risk metabolic dysfunction-associated steatohepatitis (MASH) can offer patients access to novel therapeutic options and potentially decrease the risk of progression to cirrhosis. This study aimed to develop an explainable machine learning model for high-risk MASH prediction and compare its performance with well-established biomarkers. Data were derived from the National Health and Nutrition Examination Surveys (NHANES) 2017-March 2020, which included a total of 5281 adults with valid elastography measurements. We used a FAST score ≥ 0.35, calculated using liver stiffness measurement and controlled attenuation parameter values and aspartate aminotransferase levels, to identify individuals with high-risk MASH. We developed an ensemble-based machine learning XGBoost model to detect high-risk MASH and explored the model's interpretability using an explainable artificial intelligence SHAP method. The prevalence of high-risk MASH was 6.9%. Our XGBoost model achieved a high level of sensitivity (0.82), specificity (0.91), accuracy (0.90), and AUC (0.95) for identifying high-risk MASH. Our model demonstrated a superior ability to predict high-risk MASH vs. FIB-4, APRI, BARD, and MASLD fibrosis scores (AUC of 0.95 vs. 0.50, 0.50, 0.49 and 0.50, respectively). To explain the high performance of our model, we found that the top 5 predictors of high-risk MASH were ALT, GGT, platelet count, waist circumference, and age. We used an explainable ML approach to develop a clinically applicable model that outperforms commonly used clinical risk indices and could increase the identification of high-risk MASH patients in resource-limited settings.
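
The reported accuracy can be cross-checked against the reported sensitivity, specificity, and prevalence, and the same numbers give a sense of the positive predictive value, which the abstract does not state. The sketch below is a minimal worked example, assuming that the 6.9% prevalence also applies to the data on which the metrics were computed; the function names are illustrative.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Overall accuracy as the prevalence-weighted mean of the two rates."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive predictions that are true positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Values reported in the abstract; applying the prevalence to the
# evaluation data is an assumption made for this illustration.
sensitivity, specificity, prevalence = 0.82, 0.91, 0.069

accuracy = accuracy_from_rates(sensitivity, specificity, prevalence)
ppv = positive_predictive_value(sensitivity, specificity, prevalence)

print(f"Implied accuracy: {accuracy:.2f}")
print(f"Implied positive predictive value: {ppv:.2f}")
```

The implied accuracy of about 0.90 matches the reported figure. Under the same assumption, only about 40% of individuals flagged as high-risk would actually have high-risk MASH, which is typical for a condition with low prevalence.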


Subjects
Elasticity Imaging Techniques, Non-alcoholic Fatty Liver Disease, Adult, Humans, Non-alcoholic Fatty Liver Disease/diagnosis, Non-alcoholic Fatty Liver Disease/epidemiology, Artificial Intelligence, Nutrition Surveys, Machine Learning
4.
Crit Rev Immunol ; 44(5): 15-25, 2024.
Article in English | MEDLINE | ID: mdl-38618725

ABSTRACT

Chronic kidney disease (CKD) is a common disorder related to inflammatory pathways, and options for its effective management remain limited. This study aimed to use bioinformatics analysis to find diagnostic markers that might be therapeutic targets for CKD. CKD microarray datasets were screened from the GEO database and the differentially expressed genes (DEGs) in CKD dataset GSE98603 were analyzed. Gene set variation analysis (GSVA) was used to explore the activity scores of the inflammatory pathways and samples. Algorithms such as weighted gene co-expression network analysis (WGCNA) and Lasso were used to screen CKD diagnostic markers related to inflammation. Functional enrichment analysis of the inflammation-related DEGs was then performed. ROC curve analysis was used to examine the diagnostic value of the inflammation-related hub genes. Lastly, quantitative real-time PCR was used to verify the bioinformatics predictions. A total of 71 inflammation-related DEGs were obtained, of which 5 were hub genes. Enrichment analysis showed that these genes were significantly enriched in inflammation-related pathways (NF-κB, JAK-STAT, and MAPK signaling pathways). ROC curves showed that the 5 CKD diagnostic markers (TIGD7, ACTA2, ACTG2, MAP4K4, and HOXA11) exhibited good diagnostic value. In addition, TIGD7, ACTA2, ACTG2, and HOXA11 expression was downregulated while MAP4K4 expression was upregulated in LPS-induced HK-2 cells. The present study identified TIGD7, ACTA2, ACTG2, MAP4K4, and HOXA11 as reliable CKD diagnostic markers, thereby providing a basis for further understanding of CKD in clinical treatment.
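
The diagnostic value of a single marker in an ROC analysis is usually summarized by the area under the curve (AUC), which equals the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below computes that quantity directly from pairwise comparisons. It is not the authors' pipeline, and the expression values are invented purely for illustration.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for score_pos in scores_positive:
        for score_neg in scores_negative:
            if score_pos > score_neg:
                wins += 1.0
            elif score_pos == score_neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical marker expression values, not data from the study.
ckd_samples = [2.1, 3.4, 2.9, 1.0]
control_samples = [0.8, 1.5, 1.2, 2.5]

auc = roc_auc(ckd_samples, control_samples)
print(f"AUC = {auc:.2f}")
```

For a marker that is downregulated in disease, the same function applies after negating the scores or swapping the two groups.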


Subjects
Gene Expression Profiling, Chronic Renal Insufficiency, Humans, Machine Learning, NF-kappa B, Inflammation/diagnosis, Chronic Renal Insufficiency/diagnosis, Chronic Renal Insufficiency/genetics, Protein Serine-Threonine Kinases, Intracellular Signaling Peptides and Proteins
5.
Environ Monit Assess ; 196(5): 453, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38619639

ABSTRACT

This study seeks to investigate the impact of COVID-19 lockdown measures on air quality in the city of Mashhad employing two strategies. We initiated our research using basic statistical methods such as paired sample t-tests to compare hourly PM2.5 data in two scenarios: before and during quarantine, and pre- and post-lockdown. This initial analysis provided a broad understanding of potential changes in air quality. Notably, a low reduction of 2.40% in PM2.5 was recorded when compared to air quality prior to the lockdown period. This finding highlights the wide range of factors that impact the levels of particulate matter in urban settings, with the transportation sector often being widely recognized as one of the principal causes of this issue. Nevertheless, throughout the period after the quarantine, a remarkable decrease in air quality was observed characterized by distinct seasonal patterns, in contrast to previous years. This finding demonstrates a significant correlation between changes in human mobility patterns and their influence on the air quality of urban areas. It also emphasizes the need to use air pollution modeling as a fundamental tool to evaluate and understand these linkages to support long-term plans for reducing air pollution. To obtain a more quantitative understanding, we then employed cutting-edge machine learning methods, such as random forest and long short-term memory algorithms, to accurately determine the effect of the lockdown on PM2.5 levels. Our models' results demonstrated remarkable efficacy in assessing the pollutant concentration in Mashhad during lockdown measures. The test set yielded an R-squared value of 0.82 for the long short-term memory network model, whereas the random forest model showed a calculated cross-validation R-squared of 0.78. The required computational cost for training the LSTM and the RF models across all data was 25 min and 3 s, respectively. 
In summary, through the integration of statistical methods and machine learning, this research attempts to provide a comprehensive understanding of the impact of human interventions on air quality dynamics.
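
The R-squared values quoted for the two models measure how much of the variance in observed PM2.5 is explained by the predictions. The sketch below shows the standard calculation on a handful of made-up concentrations. The numbers are hypothetical and unrelated to the Mashhad data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total


# Hypothetical hourly PM2.5 concentrations, for illustration only.
observed_pm25 = [20.0, 25.0, 30.0, 35.0, 40.0]
predicted_pm25 = [22.0, 24.0, 31.0, 33.0, 41.0]

r2 = r_squared(observed_pm25, predicted_pm25)
print(f"R-squared = {r2:.3f}")
```

A test-set R-squared and a cross-validated R-squared, as reported for the LSTM and random forest models respectively, use this same formula but on different held-out data, so the two figures are not strictly comparable.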


Subjects
COVID-19, Humans, COVID-19/epidemiology, Communicable Disease Control, Environmental Monitoring, Machine Learning, Particulate Matter
6.
Genome Biol ; 25(1): 95, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622679

ABSTRACT

BACKGROUND: Aneuploidy, an abnormal number of chromosomes within a cell, is a hallmark of cancer. Patterns of aneuploidy differ across cancers, yet are similar in cancers affecting closely related tissues. The selection pressures underlying aneuploidy patterns are not fully understood, hindering our understanding of cancer development and progression. RESULTS: Here, we apply interpretable machine learning methods to study tissue-selective aneuploidy patterns. We define 20 types of features corresponding to genomic attributes of chromosome-arms, normal tissues, primary tumors, and cancer cell lines (CCLs), and use them to model gains and losses of chromosome arms in 24 cancer types. To reveal the factors that shape the tissue-specific cancer aneuploidy landscapes, we interpret the machine learning models by estimating the relative contribution of each feature to the models. While confirming known drivers of positive selection, our quantitative analysis highlights the importance of negative selection for shaping aneuploidy landscapes. This is exemplified by tumor suppressor gene density being a better predictor of gain patterns than oncogene density, and vice versa for loss patterns. We also identify the importance of tissue-selective features and demonstrate them experimentally, revealing KLF5 as an important driver for chr13q gain in colon cancer. Further supporting an important role for negative selection in shaping the aneuploidy landscapes, we find compensation by paralogs to be among the top predictors of chromosome arm loss prevalence and demonstrate this relationship for one paralog interaction. Similar factors shape aneuploidy patterns in human CCLs, demonstrating their relevance for aneuploidy research. CONCLUSIONS: Our quantitative, interpretable machine learning models improve the understanding of the genomic properties that shape cancer aneuploidy landscapes.


Subjects
Aneuploidy, Neoplasms, Humans, Neoplasms/genetics, Neoplasms/pathology, Chromosome Deletion, Chromosomes, Machine Learning
7.
Front Immunol ; 15: 1368904, 2024.
Article in English | MEDLINE | ID: mdl-38629070

ABSTRACT

Background: Coronary artery disease (CAD) remains a lethal disease worldwide. This study aims to identify clinically relevant diagnostic biomarkers for CAD and explore potential medications for CAD. Methods: GSE42148, GSE180081, and GSE12288 were downloaded as the training and validation cohorts, and candidate genes were identified by weighted gene co-expression network analysis. Functional enrichment analysis was used to determine the functional roles of these genes. Machine learning algorithms determined the candidate biomarkers. Hub genes were then selected and validated using a nomogram and the receiver operating characteristic curve. Using CIBERSORTx, the hub genes were further examined in relation to immune cell infiltration, and molecules associated with immune active families were analyzed by correlation analysis. Drug screening and molecular docking were used to determine medications that target the four genes. Results: The weighted gene co-expression network analysis identified 191 and 230 key genes in two modules, respectively. Functional enrichment analysis identified the pathways enriched among the total of 421 key genes. Candidate immune-related genes were then screened and identified by the random forest model and the eXtreme Gradient Boosting algorithm. Finally, four hub genes, namely, CSF3R, EED, HSPA1B, and IL17RA, were obtained and used to establish the nomogram model. The receiver operating characteristic curve, the area under the curve, and the calibration curve were all used to validate the accuracy and usefulness of the diagnostic model. Immune cell infiltration was examined, and CAD patients were then divided into high- and low-expression groups for further gene set enrichment analysis. By targeting the hub genes, we also found potential drugs for anti-CAD treatment using the molecular docking method. Conclusions: CSF3R, EED, HSPA1B, and IL17RA are potential diagnostic biomarkers for CAD. CAD pathogenesis is greatly influenced by patterns of immune cell infiltration. Promising drugs offer new prospects for the development of CAD therapy.


Subjects
Coronary Artery Disease, Humans, Coronary Artery Disease/diagnosis, Coronary Artery Disease/genetics, Molecular Docking Simulation, Nomograms, Algorithms, Machine Learning
9.
J Med Internet Res ; 26: e48330, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38630522

ABSTRACT

BACKGROUND: Intensive care research has predominantly relied on conventional methods like randomized controlled trials. However, the increasing popularity of open-access, free databases in the past decade has opened new avenues for research, offering fresh insights. Leveraging machine learning (ML) techniques enables the analysis of trends in a vast number of studies. OBJECTIVE: This study aims to conduct a comprehensive bibliometric analysis using ML to compare trends and research topics in traditional intensive care unit (ICU) studies and those done with open-access databases (OADs). METHODS: We used ML to analyze publications in the Web of Science database. Articles were categorized into "OAD" and "traditional intensive care" (TIC) studies. OAD studies were those using the Medical Information Mart for Intensive Care (MIMIC), eICU Collaborative Research Database (eICU-CRD), Amsterdam University Medical Centers Database (AmsterdamUMCdb), High Time Resolution ICU Dataset (HiRID), or Pediatric Intensive Care database. TIC studies included all other intensive care studies. Uniform manifold approximation and projection was used to visualize the corpus distribution. The BERTopic technique was used to generate 30 topic-unique identification numbers and to categorize topics into 22 topic families. RESULTS: A total of 227,893 records were extracted. After exclusions, 145,426 articles were identified as TIC and 1301 articles as OAD studies. TIC studies experienced exponential growth over the last 2 decades, culminating in a peak of 16,378 articles in 2021, while OAD studies demonstrated a consistent upsurge since 2018. Sepsis, ventilation-related research, and pediatric intensive care were the most frequently discussed topics. TIC studies exhibited broader coverage than OAD studies, suggesting a more extensive research scope. CONCLUSIONS: This study analyzed ICU research, providing valuable insights from a large number of publications. 
OAD studies complement TIC studies, focusing on predictive modeling, while TIC studies capture essential qualitative information. Integrating both approaches in a complementary manner is the future direction for ICU research. Additionally, natural language processing techniques offer a transformative alternative for literature review and bibliometric analysis.


Subjects
Critical Care, Intensive Care Units, Child, Humans, Academic Medical Centers, Bibliometrics, Machine Learning
10.
Int J Mol Sci ; 25(7)2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38612509

ABSTRACT

Cancer remains a leading cause of mortality worldwide and calls for novel therapeutic targets. Membrane proteins are key players in various cancer types but present unique challenges compared to soluble proteins. The advent of computational drug discovery tools offers a promising approach to address these challenges, allowing for the prioritization of "wet-lab" experiments. In this review, we explore the applications of computational approaches in membrane protein oncological characterization, particularly focusing on three prominent membrane protein families: receptor tyrosine kinases (RTKs), G protein-coupled receptors (GPCRs), and solute carrier proteins (SLCs). We chose these families due to their varying levels of understanding and research data availability, which leads to distinct challenges and opportunities for computational analysis. We discuss the utilization of multi-omics data, machine learning, and structure-based methods to investigate aberrant protein functionalities associated with cancer progression within each family. Moreover, we highlight the importance of considering the broader cellular context and, in particular, cross-talk between proteins. Despite existing challenges, computational tools hold promise in dissecting membrane protein dysregulation in cancer. With advancing computational capabilities and data resources, these tools are poised to play a pivotal role in identifying and prioritizing membrane proteins as personalized anticancer targets.


Subjects
Membrane Proteins, Neoplasms, Humans, Cross Reactions, Drug Discovery, Machine Learning, Neoplasms/drug therapy
11.
Int J Mol Sci ; 25(7)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38612671

ABSTRACT

This paper offers a thorough investigation of hyperparameter tuning for neural network architectures using datasets encompassing various combinations of Methylene Blue (MB) Reduction by Ascorbic Acid (AA) reactions with different solvents and concentrations. The aim is to predict coefficients of decay plots for MB absorbance, shedding light on the complex dynamics of chemical reactions. Our findings reveal that the optimal model, determined through our investigation, consists of five hidden layers, each with sixteen neurons and employing the Swish activation function. This model yields an NMSE of 0.05, 0.03, and 0.04 for predicting the coefficients A, B, and C, respectively, in the exponential decay equation A + B · e^(-x/C). These findings contribute to the realm of drug design based on machine learning, providing valuable insights into optimizing chemical reaction predictions.
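
The three quantities named in this abstract can be written down in a few lines. The sketch below defines the exponential decay curve with coefficients A, B, and C, the Swish activation, and one common form of the normalized mean squared error (squared error divided by the variance of the true values). The abstract does not say which NMSE normalization the authors used, so that definition is an assumption, and all example values are hypothetical.

```python
import math


def decay(x, a, b, c):
    """Exponential decay curve A + B * exp(-x / C)."""
    return a + b * math.exp(-x / c)


def swish(z):
    """Swish activation: z multiplied by the logistic sigmoid of z."""
    return z / (1.0 + math.exp(-z))


def nmse(true_values, predicted_values):
    """Squared error normalized by the spread of the true values.

    This is one common definition; the normalization used in the paper
    is not stated in the abstract, so treat it as an assumption.
    """
    mean_true = sum(true_values) / len(true_values)
    squared_error = sum(
        (t - p) ** 2 for t, p in zip(true_values, predicted_values)
    )
    spread = sum((t - mean_true) ** 2 for t in true_values)
    return squared_error / spread


# Hypothetical coefficients: absorbance starts at A + B and decays toward A.
start_absorbance = decay(0.0, a=0.2, b=0.8, c=5.0)

# Hypothetical true and predicted values of one coefficient.
example_nmse = nmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])

print(f"Absorbance at x = 0: {start_absorbance:.2f}")
print(f"Swish(1.0) = {swish(1.0):.3f}")
print(f"Example NMSE = {example_nmse:.3f}")
```

With this definition, an NMSE of 0 means perfect prediction and an NMSE of 1 means the model does no better than always predicting the mean.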


Subjects
Ascorbic Acid, Methylene Blue, Drug Design, Machine Learning, Neural Networks (Computer)
12.
Int J Mol Sci ; 25(7)2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38612697

ABSTRACT

Tertiary lymphoid structures (TLSs) are organized aggregates of immune cells in non-lymphoid tissues and are associated with a favorable prognosis in tumors. However, TLS markers remain inconsistent, and the utilization of machine learning techniques for this purpose is limited. To tackle this challenge, we began by identifying TLS markers through bioinformatics analysis and machine learning techniques. Subsequently, we leveraged spatial transcriptomic data from Gene Expression Omnibus (GEO) and built two support vector classifier models for TLS prediction: one without feature selection and the other using the marker genes. The comparable performances of these two models confirm the efficacy of the selected markers. The majority of the markers are immunoglobulin genes, demonstrating their importance in the identification of TLSs. Our research has identified the markers of TLSs using machine learning methods and constructed a model to predict TLS location, contributing to the detection of TLS and holding the promising potential to impact cancer treatment strategies.


Subjects
Tertiary Lymphoid Structures, Humans, Tertiary Lymphoid Structures/genetics, Gene Expression Profiling, Transcriptome, Computational Biology, Machine Learning
13.
Int J Mol Sci ; 25(7)2024 Mar 31.
Article in English | MEDLINE | ID: mdl-38612705

ABSTRACT

The advent of Surface-Enhanced Raman Scattering (SERS) has enabled the exploration and detection of small molecules, particularly in biological fluids such as serum, blood plasma, urine, saliva, and tears. SERS has been proposed as a simple diagnostic technique for various diseases, including cancer. Renal cell carcinoma (RCC) ranks as the sixth most commonly diagnosed cancer in men and is often asymptomatic, with detection occurring incidentally. The onset of symptoms typically aligns with advanced disease, aggressive histology, and unfavorable prognosis, and therefore new methods for early diagnosis are needed. In this study, we investigated the utility of label-free SERS in urine, coupled with two multivariate analysis approaches, Principal Component Analysis combined with Linear Discriminant Analysis (PCA-LDA) and Support Vector Machine (SVM), to discriminate between 50 RCC patients and 44 healthy donors. Employing PCA-LDA, we achieved a discrimination accuracy of 100% using 13 principal components, and an 88% accuracy in discriminating between different RCC stages. The SVM approach yielded a training accuracy of 100%, a validation accuracy of 99% for discriminating between RCC and controls, and an 80% accuracy for discriminating between stages. The comparative analysis of raw and normalized SERS spectral data shows that while raw data disclose relative concentration variations in urine metabolites between the two classes, the normalization of spectral data significantly improves the accuracy of discrimination. Moreover, selecting principal components with markedly distinct scores between the two classes alleviates overfitting risks and reduces the number of components employed for discrimination. We obtained an accuracy of 90% for discriminating between RCC patients and healthy donors using three PCs and a linear discrimination function, and an accuracy of 88% for discriminating between stages using six PCs, which in practice mitigates the risk of overfitting and increases the robustness of our analysis. Our findings underscore the potential of label-free SERS of urine in conjunction with chemometrics for non-invasive and early RCC detection.


Subjects
Body Fluids, Renal Cell Carcinoma, Kidney Neoplasms, Male, Humans, Renal Cell Carcinoma/diagnosis, Multivariate Analysis, Machine Learning, Kidney Neoplasms/diagnosis
14.
Nutrients ; 16(7)2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38613106

ABSTRACT

In industry 4.0, where the automation and digitalization of entities and processes are fundamental, artificial intelligence (AI) is increasingly becoming a pivotal tool offering innovative solutions in various domains. In this context, nutrition, a critical aspect of public health, is no exception to the fields influenced by the integration of AI technology. This study aims to comprehensively investigate the current landscape of AI in nutrition, providing a deep understanding of the potential of AI, machine learning (ML), and deep learning (DL) in nutrition sciences and highlighting eventual challenges and futuristic directions. A hybrid approach from the systematic literature review (SLR) guidelines and the preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines was adopted to systematically analyze the scientific literature from a search of major databases on artificial intelligence in nutrition sciences. A rigorous study selection was conducted using the most appropriate eligibility criteria, followed by a methodological quality assessment ensuring the robustness of the included studies. This review identifies several AI applications in nutrition, spanning smart and personalized nutrition, dietary assessment, food recognition and tracking, predictive modeling for disease prevention, and disease diagnosis and monitoring. The selected studies demonstrated the versatility of machine learning and deep learning techniques in handling complex relationships within nutritional datasets. This study provides a comprehensive overview of the current state of AI applications in nutrition sciences and identifies challenges and opportunities. With the rapid advancement in AI, its integration into nutrition holds significant promise to enhance individual nutritional outcomes and optimize dietary recommendations. 
Researchers, policymakers, and healthcare professionals can utilize this research to design future projects and support evidence-based decision-making in AI for nutrition and dietary guidance.


Subjects
Artificial Intelligence, Deep Learning, Humans, Machine Learning, Nutritional Status, Automation
15.
Cancer Med ; 13(7): e7161, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38613173

ABSTRACT

BACKGROUND: Ovarian clear cell carcinoma (OCCC) represents a subtype of ovarian epithelial carcinoma (OEC) known for its limited responsiveness to chemotherapy, and the onset of distant metastasis significantly impacts patient prognoses. This study aimed to identify potential risk factors contributing to the occurrence of distant metastasis in OCCC. METHODS: Utilizing the Surveillance, Epidemiology, and End Results (SEER) database, we identified patients diagnosed with OCCC between 2004 and 2015. The most influential factors were selected through the application of Gaussian Naive Bayes (GNB) and Adaboost machine learning algorithms, employing a Venn test for further refinement. Subsequently, six machine learning (ML) techniques, namely XGBoost, LightGBM, Random Forest (RF), Adaptive Boosting (Adaboost), Support Vector Machine (SVM), and Multilayer Perceptron (MLP), were employed to construct predictive models for distant metastasis. Shapley Additive Explanations (SHAP) analysis facilitated a visual interpretation for individual patients. Model validity was assessed using accuracy, sensitivity, specificity, positive predictive value, negative predictive value, F1 score, and the area under the receiver operating characteristic curve (AUC). RESULTS: In predicting distant metastasis, the Random Forest (RF) model outperformed the other five machine learning algorithms. The RF model demonstrated accuracy, sensitivity, specificity, positive predictive value, negative predictive value, F1 score, and AUC (95% CI) values of 0.792 (0.762-0.823), 0.904 (0.835-0.973), 0.759 (0.731-0.787), 0.221 (0.186-0.256), 0.974 (0.967-0.982), 0.353 (0.306-0.399), and 0.834 (0.696-0.967), respectively. Additionally, the calibration curve's Brier Score (95% CI) for the RF model reached the minimum value of 0.06256 (0.05753-0.06759). 
SHAP analysis provided independent explanations, reaffirming the critical clinical factors associated with the risk of metastasis in OCCC patients. CONCLUSIONS: This study successfully established a precise predictive model for OCCC patient metastasis using machine learning techniques, offering valuable support to clinicians in making informed clinical decisions.
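
Two of the reported metrics are easy to reproduce by hand. The F1 score is the harmonic mean of the positive predictive value (precision) and the sensitivity (recall), and the Brier score is the mean squared difference between predicted probabilities and observed outcomes. The sketch below applies the F1 formula to the reported RF point estimates and shows the Brier score on three invented predictions. The function names and the example probabilities are assumptions for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2.0 * precision * recall / (precision + recall)


def brier_score(predicted_probabilities, outcomes):
    """Mean squared difference between predicted probability and outcome."""
    squared_errors = [
        (p - y) ** 2 for p, y in zip(predicted_probabilities, outcomes)
    ]
    return sum(squared_errors) / len(squared_errors)


# Point estimates reported for the RF model in the abstract.
f1 = f1_score(precision=0.221, recall=0.904)

# Hypothetical predicted metastasis probabilities and observed outcomes.
example_brier = brier_score([0.1, 0.8, 0.3], [0, 1, 0])

print(f"F1 from reported PPV and sensitivity: {f1:.3f}")
print(f"Example Brier score: {example_brier:.4f}")
```

The value obtained from the point estimates (about 0.355) is close to the reported F1 of 0.353. The small gap may reflect rounding or averaging across resamples, which the abstract does not specify. The low F1 despite high sensitivity follows directly from the low positive predictive value.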


Subjects
Clear Cell Adenocarcinoma, Ovarian Neoplasms, Female, Humans, Bayes Theorem, Algorithms, Ovarian Epithelial Carcinoma, Machine Learning
16.
Urolithiasis ; 52(1): 64, 2024 Apr 13.
Article in English | MEDLINE | ID: mdl-38613668

ABSTRACT

Radiomics and machine learning have been widely applied to urinary stone disease, particularly for predicting treatment outcomes. This study aimed to integrate clinical variables and radiomic features into a machine learning model for predicting the stone-free rate (SFR) after percutaneous nephrolithotomy (PCNL). A total of 212 eligible patients who underwent PCNL at the Second Affiliated Hospital of Nanchang University were included in a retrospective analysis. Preoperative clinical variables and non-contrast-enhanced CT images were collected for all patients, and radiomic features were extracted after delineating the stone region of interest (ROI). Univariate analysis was used to identify clinical variables strongly correlated with the stone-free rate after PCNL, and least absolute shrinkage and selection operator (LASSO) regression was used to screen radiomic features. Four supervised machine learning algorithms were employed: Logistic Regression, Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Gradient Boosting Decision Tree (GBDT). The strongly correlated clinical variables and the screened radiomic features were fed into the four algorithms to construct prediction models, and receiver operating characteristic (ROC) curves were plotted. The area under the ROC curve (AUC), accuracy, specificity, and other metrics were used to evaluate the predictive performance of the four models. The postoperative stone-free rate was 70.3% (n = 149). Among the clinical variables examined, stone number, stone diameter, stone CT value, stone location, and history of stone surgery were significantly associated with the stone-free rate after PCNL.
A total of 121 radiomic features were extracted, and LASSO regression identified the 7 features most closely associated with the stone-free rate after PCNL. The accuracies of the Logistic Regression, RF, XGBoost, and GBDT models in predicting the stone-free rate after PCNL were 78.1%, 76.6%, 75.0%, and 73.4%, respectively. The corresponding AUCs (95% CI) were 0.85 (0.83-0.89), 0.81 (0.76-0.85), 0.82 (0.78-0.85), and 0.77 (0.73-0.81), with logistic regression achieving the highest values. In terms of predictive importance scores, the key factors identified by the logistic regression model were number of stones, zone percentage, stone diameter, and surface area. Similarly, the RF model ranked number of stones, stone CT value, stone diameter, and surface area as the top predictors. Among the four machine learning models, logistic regression showed the highest accuracy and discrimination in predicting the stone-free rate after PCNL. RF also showed higher accuracy than XGBoost and GBDT, along with a certain level of discrimination ability. Based on the performance of all four models, logistic regression is the most likely to aid clinical decision-making by assisting clinicians in evaluating patients for PCNL. This makes it possible to predict residual stones after surgery and ultimately to select patients who are suitable candidates for PCNL.
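The evaluation metrics named in this abstract can be illustrated with a minimal sketch. It assumes each model outputs a predicted probability of being stone-free; the labels and scores below are hypothetical toy values, not data from the study, and the function names are illustrative. AUC is computed here through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from itertools import product


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Figures reported in the abstract: 149 of 212 patients were stone-free.
stone_free_rate = round(149 / 212 * 100, 1)

# Hypothetical toy data: 1 = stone-free, 0 = residual stones.
y_true = [1, 1, 1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.7, 0.2, 0.3, 0.55]

toy_accuracy = accuracy(y_true, y_prob)
toy_auc = auc(y_true, y_prob)
print(stone_free_rate, toy_accuracy, round(toy_auc, 3))
```

Accuracy depends on the chosen threshold (0.5 is assumed here), whereas AUC does not, which is one reason both are usually reported together.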


Subjects
Percutaneous Nephrolithotomy , Urinary Calculi , Humans , Retrospective Studies , Machine Learning
17.
Cereb Cortex ; 34(4)2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38615239

ABSTRACT

Building a high-precision suicide attempt classifier based on the three-dimensional psychological pain model is an important problem in suicide research. The present study explores the importance of pain avoidance and its related neural features in suicide attempt classification models among patients with major depressive disorder. Using recursive feature elimination with cross-validation and support vector machine algorithms, scores from the measurements and task-based EEG signals were selected to build a suicide attempt classification model. In the multimodal suicide attempt classifier, which achieved an accuracy of 83.91% and an area under the curve of 0.90, pain avoidance ranked first in the optimal feature set. Theta (reward positive feedback minus neutral positive feedback) was the shared neural representation that ranked first among event-related potential features in both the pain avoidance and suicide attempt classifiers. In conclusion, the suicide attempt classifier based on pain avoidance and its related affective processing neural features has excellent accuracy among patients with major depressive disorder. Pain avoidance is a stable and strong indicator for identifying suicide risk in both traditional analyses and machine learning approaches. A novel methodology is needed to clarify the relationship between pain avoidance and the cognitive and affective processing evoked by punishment stimuli.


Subjects
Major Depressive Disorder , Humans , Attempted Suicide , Pain , Evoked Potentials , Machine Learning
18.
Sci Rep ; 14(1): 8487, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605059

ABSTRACT

Breast cancer has rapidly increased in prevalence in recent years, making it one of the leading causes of mortality worldwide. Among all cancers, it is by far the most common. Diagnosing this illness manually requires significant time and expertise. Because detection is time-consuming, machine-based prediction can help prevent its further spread. Machine learning and Explainable AI are crucial in classification because they not only provide accurate predictions but also offer insight into how the model arrives at its decisions, which aids understanding of and trust in the classification results. In this study, we evaluate and compare the classification accuracy, precision, recall, and F1 scores of five supervised machine learning methods (decision tree, random forest, logistic regression, naive Bayes, and XGBoost) on a primary dataset of 500 patients from Dhaka Medical College Hospital. Additionally, this study applied SHAP analysis to the XGBoost model to interpret its predictions and understand the impact of each feature on the model's output. We compared the classification accuracy of the algorithms and contrasted our results with other literature in this field. After final evaluation, XGBoost achieved the best accuracy, 97%.
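The four scores compared in this abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch under that assumption; the labels and predictions are hypothetical toy values, not the study's data, and the function name is illustrative.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall are reported alongside accuracy because accuracy alone can look high on an imbalanced dataset even when most positive cases are missed.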


Subjects
Breast Neoplasms , Humans , Female , Breast Neoplasms/diagnosis , Bayes Theorem , Bangladesh/epidemiology , Breast , Machine Learning , Hydrolases
19.
BMC Cancer ; 24(1): 454, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38605303

ABSTRACT

OBJECTIVE: To explore the value of six machine learning models based on PET/CT radiomics combined with EGFR in predicting brain metastases of lung adenocarcinoma. METHODS: We retrospectively enrolled 204 patients with lung adenocarcinoma who underwent PET/CT examination and EGFR gene detection before treatment at the Cancer Hospital Affiliated to Shandong First Medical University in 2020. Univariate analysis and multivariate logistic regression analysis were used to identify independent risk factors for brain metastasis. Based on PET/CT imaging combined with EGFR and PET metabolic indexes, six machine learning models were established to predict brain metastases of lung adenocarcinoma. Finally, ten-fold cross-validation was used to evaluate predictive performance. RESULTS: In univariate analysis, patients with N2-3 disease, EGFR mutation-positive status, LYM% ≤ 20, and elevated tumor markers (P < 0.05) were more likely to develop brain metastases. In multivariate logistic regression analysis, the PET metabolic indices SUVmax, SUVpeak, Volume, and TLG were risk factors for brain metastasis of lung adenocarcinoma (P < 0.05). The SVM model was the most efficient predictor of brain metastasis, with an AUC of 0.82 (PET/CT group), 0.70 (CT group), and 0.76 (PET group). CONCLUSIONS: A machine learning model combining radiomics with EGFR, as a new method, has higher accuracy than EGFR mutation alone. The SVM model is the most effective method for predicting brain metastases of lung adenocarcinoma, and the prediction efficiency of the PET/CT group is better than that of the PET group and the CT group.
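The ten-fold cross-validation mentioned in the methods can be sketched as follows. This is a minimal illustration that only produces the train/test index splits; it assumes a plain shuffled split (the abstract does not say whether the folds were stratified), and the function name and seed are hypothetical. Only the cohort size of 204 comes from the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    # Deal the shuffled indices into k folds, round-robin.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [
            idx
            for fold_no, fold in enumerate(folds)
            if fold_no != i
            for idx in fold
        ]
        yield train_idx, test_idx


# 204 patients, as reported in the abstract; each patient is tested exactly once.
splits = list(k_fold_indices(204, k=10))
fold_sizes = [len(test) for _, test in splits]
print(fold_sizes)
```

In each of the ten rounds a model would be fitted on the training indices and scored on the held-out fold, and the ten scores would then be averaged.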


Subjects
Adenocarcinoma of Lung , Adenocarcinoma , Brain Neoplasms , Lung Neoplasms , Humans , Positron Emission Tomography Computed Tomography , Lung Neoplasms/genetics , Retrospective Studies , Adenocarcinoma/genetics , Adenocarcinoma of Lung/diagnostic imaging , Adenocarcinoma of Lung/pathology , Lung/pathology , ErbB Receptors/genetics , Machine Learning , Brain Neoplasms/diagnostic imaging
20.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38605639

ABSTRACT

The accurate identification of disease-associated genes is crucial for understanding the molecular mechanisms underlying various diseases. Most current methods focus on constructing biological networks and using machine learning, particularly deep learning, to identify disease genes. However, these methods overlook complex relations among entities in biological knowledge graphs. Such information has been successfully applied in other areas of life science research, demonstrating its effectiveness. Knowledge graph embedding methods can learn the semantic information of different relations within knowledge graphs. Nonetheless, the performance of existing representation learning techniques remains suboptimal when they are applied to domain-specific biological data. To solve these problems, we construct a biological knowledge graph centered on diseases and genes, and develop KDGene, an end-to-end knowledge graph completion framework for disease gene prediction based on interactional tensor decomposition. KDGene incorporates an interaction module that bridges entity and relation embeddings within tensor decomposition, aiming to improve the representation of semantically similar concepts in specific domains and to enhance the ability to accurately predict disease genes. Experimental results show that KDGene significantly outperforms state-of-the-art algorithms, including both existing disease gene prediction methods and knowledge graph embedding methods for general domains. Moreover, a comprehensive biological analysis of the predicted results further validates KDGene's capability to accurately identify new candidate genes. This work proposes a scalable knowledge graph completion framework for identifying disease candidate genes, whose results promise to provide valuable references for further wet-lab experiments. Data and source code are available at https://github.com/2020MEAI/KDGene.
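To illustrate what scoring a knowledge graph triple by tensor decomposition means, the sketch below uses a generic DistMult-style trilinear score. This is not KDGene's method: the abstract does not describe its interaction module, so a simpler and well-known scoring function is swapped in. All embeddings, gene names, and function names are hypothetical.

```python
def trilinear_score(head, relation, tail):
    """DistMult-style score: sum over dimensions of head * relation * tail."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


def rank_candidates(disease_vec, relation_vec, gene_vecs):
    """Rank candidate genes by their score for (disease, relation, gene)."""
    scores = {
        gene: trilinear_score(disease_vec, relation_vec, vec)
        for gene, vec in gene_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 3-dimensional embeddings; in practice these would be learned.
disease = [0.5, -0.2, 0.8]
associated_with = [1.0, 0.5, 1.0]
genes = {
    "GENE_A": [0.9, 0.1, 0.7],
    "GENE_B": [-0.4, 0.6, 0.1],
    "GENE_C": [0.2, -0.3, 0.5],
}

ranking = rank_candidates(disease, associated_with, genes)
print(ranking)
```

Knowledge graph completion then amounts to proposing the highest-scoring tail entities for a (disease, relation) query that are not already present in the graph.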


Subjects
Biological Science Disciplines , Automated Pattern Recognition , Algorithms , Machine Learning , Semantics
...