Results 1 - 20 of 5,544
1.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38783704

ABSTRACT

The untranslated region (UTR) of messenger ribonucleic acid (mRNA), including the 5'UTR and 3'UTR, plays a critical role in regulating gene expression and translation. Variants within the UTR can lead to changes associated with human traits and diseases; however, computational prediction of UTR variant effect is challenging. Current noncoding variant prediction mainly focuses on the promoters and enhancers, neglecting the unique sequence of the UTR and thereby limiting their predictive accuracy. In this study, using consolidated datasets of UTR variants from disease databases and large-scale experimental data, we systematically analyzed more than 50 region-specific features of UTR, including functional elements, secondary structure, sequence composition and site conservation. Our analysis reveals that certain features, such as C/G-related sequence composition in 5'UTR and A/T-related sequence composition in 3'UTR, effectively differentiate between nonfunctional and functional variant sets, unveiling potential sequence determinants of functional UTR variants. Leveraging these insights, we developed two classification models to predict functional UTR variants using machine learning, achieving an area under the curve (AUC) value of 0.94 for 5'UTR and 0.85 for 3'UTR, outperforming all existing methods. Our models will be valuable for enhancing clinical interpretation of genetic variants, facilitating the prediction and management of disease risk.
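
The abstract singles out C/G-related composition in the 5'UTR and A/T-related composition in the 3'UTR as discriminating features. As a minimal sketch of what such a composition feature could look like (the function name and the exact feature definition are assumptions for illustration, not the authors' feature set), base fractions can be computed directly from a sequence:

```python
from collections import Counter


def composition_features(seq: str) -> dict:
    """Return G/C and A/T fractions of a nucleotide sequence.

    Hypothetical helper: RNA 'U' is treated as 'T', and characters other
    than A/C/G/T are ignored when computing the fractions.
    """
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq)
    n = sum(counts[b] for b in "ACGT")
    if n == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = (counts["G"] + counts["C"]) / n
    return {"gc_fraction": gc, "at_fraction": 1.0 - gc}


# Toy 8-nt sequence with six G/C bases and two A/U bases.
features = composition_features("GCGCCAUG")
```

Such fractions would be only one of the more than 50 region-specific features the abstract mentions; the others (functional elements, secondary structure, conservation) need external annotations.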


Subjects
3' Untranslated Regions , 5' Untranslated Regions , Humans , Computational Biology/methods , Machine Learning , Genetic Variation , Untranslated Regions
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38600667

ABSTRACT

Human leukocyte antigen (HLA) recognizes foreign threats and triggers immune responses by presenting peptides to T cells. Computationally modeling the binding patterns between peptide and HLA is very important for the development of tumor vaccines. However, it is still a big challenge to accurately predict HLA molecules binding peptides. In this paper, we develop a new model TripHLApan for predicting HLA molecules binding peptides by integrating triple coding matrix, BiGRU + Attention models, and transfer learning strategy. We have found the main interaction site regions between HLA molecules and peptides, as well as the correlation between HLA encoding and binding motifs. Based on the discovery, we make the preprocessing and coding closer to the natural biological process. Besides, due to the input being based on multiple types of features and the attention module focused on the BiGRU hidden layer, TripHLApan has learned more sequence level binding information. The application of transfer learning strategies ensures the accuracy of prediction results under special lengths (peptides in length 8) and model scalability with the data explosion. Compared with the current optimal models, TripHLApan exhibits strong predictive performance in various prediction environments with different positive and negative sample ratios. In addition, we validate the superiority and scalability of TripHLApan's predictive performance using additional latest data sets, ablation experiments and binding reconstitution ability in the samples of a melanoma patient. The results show that TripHLApan is a powerful tool for predicting the binding of HLA-I and HLA-II molecular peptides for the synthesis of tumor vaccines. TripHLApan is publicly available at https://github.com/CSUBioGroup/TripHLApan.git.


Subjects
Cancer Vaccines , Humans , Protein Binding , Peptides/chemistry , HLA Antigens/chemistry , Histocompatibility Antigens Class II/chemistry , Histocompatibility Antigens Class I/chemistry , Machine Learning
3.
Brief Bioinform ; 25(6)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39397424

ABSTRACT

Ischemic stroke (IS) is a leading cause of adult disability that can severely compromise the quality of life for patients. Accurately predicting the IS functional outcome is crucial for precise risk stratification and effective therapeutic interventions. We developed a predictive model integrating genetic, environmental, and clinical factors using data from 7819 IS patients in the Third China National Stroke Registry. Employing an 80:20 split, we randomly divided the dataset into development and internal validation cohorts. The discrimination and calibration performance of models were evaluated using the area under the receiver operating characteristic curves (AUC) for discrimination and Brier score with calibration curve in the internal validation cohort. We conducted genome-wide association studies (GWAS) in the development cohort, identifying rs11109607 (ANKS1B) as the most significant variant associated with IS functional outcome. We employed principal component analysis to reduce dimensionality on the top 100 significant variants identified by the GWAS, incorporating them as genetic factors in the predictive model. We employed a machine learning algorithm capable of identifying nonlinear relationships to establish predictive models for IS patient functional outcome. The optimal model was the XGBoost model, which outperformed the logistic regression model (AUC 0.818 versus 0.756, P < .05) and significantly improved reclassification efficiency. Our study innovatively incorporated genetic, environmental, and clinical factors for predicting the IS functional outcome in East Asian populations, thereby offering novel insights into IS functional outcome.
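
The evaluation described here rests on two generic steps: a random 80:20 split and the AUC as the discrimination measure. The sketch below illustrates both with the standard library only. The function names, the seed, and the toy labels and scores are assumptions for illustration and are not taken from the study.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle items reproducibly and return (development, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


development, validation = split_80_20(range(10))
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise definition is quadratic in the number of cases, which is fine for a sketch; for cohorts of several thousand patients a rank-based implementation would normally be used.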


Subjects
Genome-Wide Association Study , Ischemic Stroke , Machine Learning , Humans , Ischemic Stroke/genetics , Female , Male , Middle Aged , Prospective Studies , Aged , China , Polymorphism, Single Nucleotide , Prognosis , ROC Curve
4.
J Biol Chem ; 300(4): 107140, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38447795

ABSTRACT

RNA modification, a posttranscriptional regulatory mechanism, significantly influences RNA biogenesis and function. The accurate identification of modification sites is paramount for investigating their biological implications. Methods for encoding RNA sequence into numerical data play a crucial role in developing robust models for predicting modification sites. However, existing techniques suffer from limitations, including inadequate information representation, challenges in effectively integrating positional and sequential information, and the generation of irrelevant or redundant features when combining multiple approaches. These deficiencies hinder the effectiveness of machine learning models in addressing the performance challenges associated with predicting RNA modification sites. Here, we introduce a novel RNA sequence feature representation method, named BiPSTP, which utilizes bidirectional trinucleotide position-specific propensities. We employ the parameter ξ to denote the interval between the current nucleotide and its adjacent forward or backward dinucleotide, enabling the extraction of positional and sequential information from RNA sequences. Leveraging the BiPSTP method, we have developed the prediction model mRNAPred using a support vector machine classifier to identify multiple types of RNA modification sites. We evaluate the performance of our BiPSTP method and mRNAPred model across 12 distinct RNA modification types. Our experimental results demonstrate the superiority of the mRNAPred model compared to state-of-the-art models in the domain of RNA modification site identification. Importantly, our BiPSTP method enhances the robustness and generalization performance of prediction models. Notably, it can be applied to feature extraction from DNA sequences to predict other biological modification sites.


Subjects
RNA Processing, Post-Transcriptional , RNA , Support Vector Machine , Computational Biology/methods , RNA/chemistry , RNA/genetics , RNA/metabolism , Sequence Analysis, RNA/methods , Nucleotides/chemistry , Nucleotides/metabolism
5.
Brief Bioinform ; 24(3)2023 May 19.
Article in English | MEDLINE | ID: mdl-37099694

ABSTRACT

Studies have found that the human microbiome is associated with and predictive of human health and diseases. Many statistical methods developed for microbiome data focus on different distance metrics that can capture various information in microbiomes. Prediction models were also developed for microbiome data, including deep learning methods with convolutional neural networks that consider both taxa abundance profiles and taxonomic relationships among microbial taxa from a phylogenetic tree. Studies have also suggested that a health outcome could associate with multiple forms of microbiome profiles. In addition to the abundance of some taxa that are associated with a health outcome, the presence/absence of some taxa is also associated with and predictive of the same health outcome. Moreover, associated taxa may be close to each other on a phylogenetic tree or spread apart on a phylogenetic tree. No prediction models currently exist that use multiple forms of microbiome-outcome associations. To address this, we propose a multi-kernel machine regression (MKMR) method that is able to capture various types of microbiome signals when making predictions. MKMR utilizes multiple forms of microbiome signals through multiple kernels transformed from multiple distance metrics for microbiomes and learns an optimal conic combination of these kernels, with kernel weights helping us understand contributions of individual microbiome signal types. Simulation studies suggest a much-improved prediction performance over competing methods with a mixture of microbiome signals. Real data applications predicting multiple health outcomes using throat and gut microbiome data also suggest better prediction by MKMR than by competing methods.
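
A conic combination of kernels is a weighted sum with non-negative weights, K = w_1 K_1 + ... + w_m K_m with every w_i >= 0. The sketch below only illustrates that combination step. In MKMR the weights are learned from data, whereas here they are fixed by hand, and the kernel matrices, names, and weights are made-up values for illustration.

```python
def conic_combination(kernels, weights):
    """Return sum_i weights[i] * kernels[i] for square kernel matrices.

    A conic combination requires every weight to be non-negative;
    the weights need not sum to one.
    """
    if len(kernels) != len(weights):
        raise ValueError("one weight per kernel is required")
    if any(w < 0 for w in weights):
        raise ValueError("conic combination needs non-negative weights")
    n = len(kernels[0])
    combined = [[0.0] * n for _ in range(n)]
    for kernel, w in zip(kernels, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * kernel[i][j]
    return combined


# Hypothetical kernels for two samples, e.g. one derived from an
# abundance-based distance and one from a presence/absence distance.
k_abundance = [[1.0, 0.5], [0.5, 1.0]]
k_presence = [[1.0, 0.25], [0.25, 1.0]]
k_combined = conic_combination([k_abundance, k_presence], [0.25, 0.75])
```

Because each input kernel is positive semidefinite and the weights are non-negative, the combined matrix is again a valid kernel, and a larger weight indicates a larger contribution from that signal type.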


Subjects
Microbiota , Humans , Phylogeny , Computer Simulation , Neural Networks, Computer , Outcome Assessment, Health Care
6.
Brief Bioinform ; 24(5)2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37649385

ABSTRACT

Protein crystallization is crucial for biology, but the steps involved are complex and demanding in terms of external factors and internal structure. To save on experimental costs and time, the tendency of proteins to crystallize can be initially determined and screened by modeling. As a result, this study created a new pipeline aimed at using protein sequence to predict protein crystallization propensity in the protein material production stage, purification stage and production of crystal stage. The newly created pipeline proposed a new feature selection method, which involves combining Chi-square ($\chi^{2}$) and recursive feature elimination together with the 12 selected features, followed by a linear discriminant analysis for dimensionality reduction. Finally, a support vector machine algorithm with hyperparameter tuning and 10-fold cross-validation is used to train the model and test the results. This new pipeline has been tested on three different datasets, and the accuracy rates are higher than those of the existing pipelines. In conclusion, our model provides a new solution to predict multistage protein crystallization propensity, which is a big challenge in computational biology.


Subjects
Algorithms , Machine Learning , Crystallization , Amino Acid Sequence , Computational Biology
7.
Brief Bioinform ; 24(3)2023 May 19.
Article in English | MEDLINE | ID: mdl-37114659

ABSTRACT

Cyclic AMP receptor proteins (CRPs) are important transcription regulators in many species. The prediction of CRP-binding sites was mainly based on position-weighted matrixes (PWMs). Traditional prediction methods only considered known binding motifs, and their ability to discover inflexible binding patterns was limited. Thus, a novel CRP-binding site prediction model called CRPBSFinder was developed in this research, which combined the hidden Markov model, knowledge-based PWMs and structure-based binding affinity matrixes. We trained this model using validated CRP-binding data from Escherichia coli and evaluated it with computational and experimental methods. The result shows that the model not only can provide higher prediction performance than a classic method but also quantitatively indicates the binding affinity of transcription factor binding sites by prediction scores. The prediction result included not only most known regulated genes but also 1089 novel CRP-regulated genes. The major regulatory roles of CRPs were divided into four classes: carbohydrate metabolism, organic acid metabolism, nitrogen compound metabolism and cellular transport. Several novel functions were also discovered, including heterocycle metabolism and response to stimulus. Based on the functional similarity of homologous CRPs, we applied the model to 35 other species. The prediction tool and the prediction results are online and are available at: https://awi.cuhk.edu.cn/~CRPBSFinder.
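
For readers unfamiliar with the PWM baseline mentioned above, the following is a minimal sketch of generic log-odds PWM scanning. It is not CRPBSFinder, which additionally combines a hidden Markov model and structure-based affinity matrixes. The count matrix, the uniform background, the pseudocount, and the function names are all assumptions chosen for illustration.

```python
import math


def build_log_odds(count_columns, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into a log2-odds weight matrix."""
    pwm = []
    for column in count_columns:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def best_site(seq, pwm):
    """Return (start, score) of the highest-scoring window, or None if the
    sequence is shorter than the matrix. Assumes seq contains only A/C/G/T."""
    width = len(pwm)
    best = None
    for start in range(len(seq) - width + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Toy 3-column motix "TGA" observed 8 times (hypothetical counts).
toy_counts = [{"T": 8}, {"G": 8}, {"A": 8}]
pwm = build_log_odds(toy_counts)
start, score = best_site("CCTGACC", pwm)
```

The score is additive over positions, which is why a PWM alone treats positions as independent; the abstract's point is that combining it with other models can capture binding patterns this assumption misses.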


Subjects
Cyclic AMP Receptor Protein , Escherichia coli Proteins , Cyclic AMP Receptor Protein/genetics , Cyclic AMP Receptor Protein/chemistry , Cyclic AMP Receptor Protein/metabolism , Escherichia coli Proteins/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Binding Sites/genetics , Protein Binding/genetics
8.
J Pathol ; 264(3): 284-292, 2024 Nov.
Article in English | MEDLINE | ID: mdl-39329449

ABSTRACT

T-lymphoblastic lymphoma (T-LBL) and thymoma are two rare primary tumors of the thymus deriving either from T-cell precursors or from thymic epithelial cells, respectively. Some thymoma subtypes (AB, B1, and B2) display numerous reactive terminal deoxynucleotidyl transferase-positive (TdT+) T-cell precursors masking epithelial tumor cells. Therefore, the differential diagnosis between T-LBL and TdT+ T-lymphocyte-rich thymoma could be challenging, especially in the case of needle biopsy. To distinguish between T-LBL and thymoma-associated lymphoid proliferations, we analyzed the global DNA methylation using two different technologies, namely MeDIP array and EPIC array, in independent samples series [17 T-LBLs compared with one TdT+ lymphocyte-rich thymoma (B1 subtype) and three normal thymi, and seven lymphocyte-rich thymomas compared with 24 T-LBLs, respectively]. In unsupervised principal component analysis (PCA), T-LBL and thymoma samples clustered separately. We identified differentially methylated regions (DMRs) using MeDIP-array and EPIC-array datasets and nine overlapping genes between the two datasets considering the top 100 DMRs including ZIC1, TSHZ2, CDC42BPB, RBM24, C10orf53, and MACROD2. In order to explore the DNA methylation profiles in larger series, we defined a classifier based on these six differentially methylated gene promoters, developed an MS-MLPA assay, and demonstrated a significant differential methylation between thymomas (hypomethylated; n = 48) and T-LBLs (hypermethylated; n = 54) (methylation ratio median 0.03 versus 0.66, respectively; p < 0.0001), with MACROD2 methylation status the most discriminating. Using a machine learning strategy, we built a prediction model trained with the EPIC-array dataset and defined a cumulative score taking into account the weight of each feature. A score above or equal to 0.4 was predictive of T-LBL and conversely. 
Applied to the MS-MLPA dataset, this prediction model accurately predicted diagnoses of T-LBL and thymoma. © 2024 The Author(s). The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
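
The decision rule stated in the abstract is a weighted cumulative score with a cut-off of 0.4 (at or above predicts T-LBL, below predicts thymoma). The sketch below shows that rule only. The abstract does not give the model's features, weights, or exact scoring formula, so the weighted-sum form, the gene subset, the weights, and the sample values are hypothetical.

```python
THRESHOLD = 0.4  # cut-off reported in the abstract


def cumulative_score(methylation, weights):
    """Weighted sum of per-feature methylation values (hypothetical form)."""
    missing = set(weights) - set(methylation)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    return sum(weights[name] * methylation[name] for name in weights)


def classify(score, threshold=THRESHOLD):
    """Scores at or above the threshold are called T-LBL, others thymoma."""
    return "T-LBL" if score >= threshold else "thymoma"


# Hypothetical weights and methylation ratios, for illustration only.
weights = {"MACROD2": 0.5, "ZIC1": 0.25, "TSHZ2": 0.25}
sample = {"MACROD2": 0.8, "ZIC1": 0.4, "TSHZ2": 0.0}
label = classify(cumulative_score(sample, weights))
```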


Subjects
DNA Methylation , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma , Thymoma , Thymus Neoplasms , Humans , Thymoma/genetics , Thymoma/diagnosis , Thymoma/pathology , Thymus Neoplasms/genetics , Thymus Neoplasms/pathology , Thymus Neoplasms/diagnosis , Diagnosis, Differential , Male , Middle Aged , Adult , Female , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/diagnosis , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/pathology , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/immunology , Aged , Young Adult , Biomarkers, Tumor/genetics , Adolescent , Child
9.
Cereb Cortex ; 34(9)2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39329357

ABSTRACT

Arithmetic, a high-order cognitive ability, shows marked individual differences over development. Although recent advancements in neuroimaging techniques have enabled the identification of brain markers for individual differences in high-order cognitive abilities, the brain markers for arithmetic remain largely unknown. This study used a data-driven connectome-based prediction model to identify brain markers of arithmetic skills from arithmetic-state functional connectivity and individualized structural similarity (SS) in 132 children aged 8 to 15 years. We found that both subtraction-state functional connectivity and individualized SS successfully predicted subtraction and multiplication skills, but multiplication-state functional connectivity failed to predict either skill. Among the four successful prediction models, most predictive connections were located in frontal-parietal, default-mode, and secondary visual networks. Further computational lesion analyses revealed the essential structural role of the frontal-parietal network in predicting subtraction and the essential functional roles of secondary visual, language, and ventral multimodal networks in predicting multiplication. Finally, a few shared nodes but largely nonoverlapping functional and structural connections were found to predict subtraction and multiplication skills. Altogether, our findings provide new insights into the brain markers of arithmetic skills in children and highlight the importance of studying different connectivity modalities and different arithmetic domains to advance our understanding of children's arithmetic skills.


Subjects
Brain , Connectome , Magnetic Resonance Imaging , Humans , Child , Male , Female , Adolescent , Brain/diagnostic imaging , Brain/physiology , Brain/growth & development , Magnetic Resonance Imaging/methods , Connectome/methods , Nerve Net/diagnostic imaging , Nerve Net/physiology , Mathematical Concepts , Mathematics , Neural Pathways/physiology , Neural Pathways/diagnostic imaging , Cognition/physiology
10.
Eur Heart J ; 45(1): 45-53, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-37769352

ABSTRACT

BACKGROUND AND AIMS: Patients with unprovoked venous thromboembolism (VTE) have a high recurrence risk, and guidelines suggest extended-phase anticoagulation. Many patients never experience recurrence but are exposed to bleeding. The aim of this study was to assess the performance of the Vienna Prediction Model (VPM) and to evaluate if the VPM accurately identifies these patients. METHODS: In patients with unprovoked VTE, the VPM was performed 3 weeks after anticoagulation withdrawal. Those with a predicted 1-year recurrence risk of ≤5.5% were prospectively followed. Study endpoint was recurrent VTE over 2 years. RESULTS: A total of 818 patients received anticoagulation for a median of 3.9 months. 520 patients (65%) had a predicted annual recurrence risk of ≤5.5%. During a median time of 23.9 months, 52 patients had non-fatal recurrence. The recurrence risk was 5.2% [95% confidence interval (CI) 3.2-7.2] at 1 year and 11.2% (95% CI 8.3-14) at 2 years. Model calibration was adequate after 1 year. The VPM underestimated the recurrence risk of patients with a 2-year recurrence rate of >5%. In a post-hoc analysis, the VPM's baseline hazard was recalibrated. Bootstrap validation confirmed an ideal ratio of observed and expected recurrence events. The recurrence risk was highest in men with proximal deep-vein thrombosis or pulmonary embolism and lower in women regardless of the site of incident VTE. CONCLUSIONS: In this prospective evaluation of the performance of the VPM, the 1-year rate of recurrence in patients with unprovoked VTE was 5.2%. Recalibration improved identification of patients at low recurrence risk and stratification into distinct low-risk categories.


Subjects
Pulmonary Embolism , Venous Thromboembolism , Male , Humans , Female , Venous Thromboembolism/epidemiology , Prospective Studies , Anticoagulants/therapeutic use , Recurrence , Risk Factors
11.
Eur Heart J ; 45(16): 1430-1439, 2024 Apr 21.
Article in English | MEDLINE | ID: mdl-38282532

ABSTRACT

BACKGROUND AND AIMS: There are no established clinical tools to predict left ventricular (LV) recovery in women with peripartum cardiomyopathy (PPCM). Using data from women enrolled in the ESC EORP PPCM Registry, the aim was to derive a prognostic model to predict LV recovery at 6 months and develop the 'ESC EORP PPCM Recovery Score'-a tool for clinicians to estimate the probability of LV recovery. METHODS: From 2012 to 2018, 752 women from 51 countries were enrolled. Eligibility included (i) a peripartum state, (ii) signs or symptoms of heart failure, (iii) LV ejection fraction (LVEF) ≤ 45%, and (iv) exclusion of alternative causes of heart failure. The model was derived using data from participants in the Registry and internally validated using bootstrap methods. The outcome was LV recovery (LVEF ≥50%) at six months. An integer score was created. RESULTS: Overall, 465 women had a 6-month echocardiogram. LV recovery occurred in 216 (46.5%). The final model included baseline LVEF, baseline LV end diastolic diameter, human development index (a summary measure of a country's social and economic development), duration of symptoms, QRS duration and pre-eclampsia. The model was well-calibrated and had good discriminatory ability (C-statistic 0.79, 95% confidence interval [CI] 0.74-0.83). The model was internally validated (optimism-corrected C-statistic 0.78, 95% CI 0.73-0.82). CONCLUSIONS: A model which accurately predicts LV recovery at 6 months in women with PPCM was derived. The corresponding ESC EORP PPCM Recovery Score can be easily applied in clinical practice to predict the probability of LV recovery for an individual in order to guide tailored counselling and treatment.


Subjects
Cardiomyopathies , Heart Failure , Pregnancy Complications, Cardiovascular , Puerperal Disorders , Pregnancy , Female , Humans , Peripartum Period , Ventricular Function, Left , Stroke Volume , Cardiomyopathies/diagnosis
12.
J Infect Dis ; 229(3): 813-823, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38262629

ABSTRACT

BACKGROUND: Tuberculosis (TB) treatment-related adverse drug reactions (TB-ADRs) can negatively affect adherence and treatment success rates. METHODS: We developed prediction models for TB-ADRs, considering participants with drug-susceptible pulmonary TB who initiated standard TB therapy. TB-ADRs were determined by the physician attending the participant, assessing causality to TB drugs, the affected organ system, and grade. Potential baseline predictors of TB-ADR included concomitant medication (CM) use, human immunodeficiency virus (HIV) status, glycated hemoglobin (HbA1c), age, body mass index (BMI), sex, substance use, and TB drug metabolism variables (NAT2 acetylator profiles). The models were developed through bootstrapped backward selection. Cox regression was used to evaluate TB-ADR risk. RESULTS: There were 156 TB-ADRs among 102 of the 945 (11%) participants included. Most TB-ADRs were hepatic (n = 82 [53%]), of moderate severity (grade 2; n = 121 [78%]), and occurred in NAT2 slow acetylators (n = 62 [61%]). The main prediction model included CM use, HbA1c, alcohol use, HIV seropositivity, BMI, and age, with robust performance (c-statistic = 0.79 [95% confidence interval {CI}, .74-.83]) and fit (optimism-corrected slope and intercept of -0.09 and 0.94, respectively). An alternative model replacing BMI with NAT2 had similar performance. HIV seropositivity (hazard ratio [HR], 2.68 [95% CI, 1.75-4.09]) and CM use (HR, 5.26 [95% CI, 2.63-10.52]) increased TB-ADR risk. CONCLUSIONS: The models, with clinical variables and with NAT2, were highly predictive of TB-ADRs.


Subjects
Arylamine N-Acetyltransferase , Drug-Related Side Effects and Adverse Reactions , HIV Seropositivity , Tuberculosis, Pulmonary , Humans , Antitubercular Agents/adverse effects , Brazil/epidemiology , Glycated Hemoglobin , HIV Seropositivity/drug therapy , Tuberculosis, Pulmonary/drug therapy , Arylamine N-Acetyltransferase/metabolism
13.
J Infect Dis ; 230(3): 606-613, 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-38420871

ABSTRACT

BACKGROUND: Early risk assessment is needed to stratify Staphylococcus aureus infective endocarditis (SA-IE) risk among patients with S. aureus bacteremia (SAB) to guide clinical management. The objective of the current study was to develop a novel risk score that is independent of subjective clinical judgment and can be used early, at the time of blood culture positivity. METHODS: We conducted a retrospective big data analysis from territory-wide electronic data and included hospitalized patients with SAB between 2009 and 2019. We applied a random forest risk scoring model to select variables from an array of parameters, according to the statistical importance in predicting SA-IE outcome. The data were divided into derivation and validation cohorts. The areas under the curve of the receiver operating characteristic (AUCROCs) were determined. RESULTS: We identified 15 741 SAB patients, among them 658 (4.18%) had SA-IE. The AUCROC was 0.74 (95%CI 0.70-0.76), with a negative predictive value of 0.980 (95%CI 0.977-0.983). The four most discriminatory features were age, history of infective endocarditis, valvular heart disease, and community onset. CONCLUSIONS: We developed a novel risk score with performance comparable with existing scores, which can be used at the time of SAB and prior to subjective clinical judgment.
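
The reported proportion of SA-IE cases can be checked from the two counts in the abstract, and the negative predictive value (NPV) follows the usual definition TN / (TN + FN). In the sketch below the prevalence uses the abstract's counts, while the confusion-matrix counts passed to the NPV helper are hypothetical, since the abstract does not report them.

```python
def percentage(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


def negative_predictive_value(true_negatives, false_negatives):
    """NPV = TN / (TN + FN): the share of negative calls that are correct."""
    total = true_negatives + false_negatives
    if total == 0:
        raise ValueError("no negative predictions")
    return true_negatives / total


# Counts from the abstract: 658 SA-IE cases among 15 741 SAB patients.
prevalence_pct = percentage(658, 15741)

# Hypothetical counts, chosen only to show the calculation.
example_npv = negative_predictive_value(true_negatives=980, false_negatives=20)
```

With a prevalence this low, a high NPV is expected even from a moderately discriminating score, which is why the abstract reports it alongside the AUCROC.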


Subjects
Bacteremia , Endocarditis, Bacterial , Machine Learning , Staphylococcal Infections , Staphylococcus aureus , Humans , Male , Staphylococcal Infections/microbiology , Staphylococcal Infections/diagnosis , Female , Bacteremia/microbiology , Bacteremia/diagnosis , Retrospective Studies , Middle Aged , Staphylococcus aureus/isolation & purification , Aged , Risk Assessment/methods , Endocarditis, Bacterial/microbiology , Endocarditis, Bacterial/diagnosis , Risk Factors , ROC Curve , Adult
14.
Semin Cancer Biol ; 95: 52-74, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37473825

ABSTRACT

Head and neck tumors (HNTs) constitute a multifaceted ensemble of pathologies that primarily involve regions such as the oral cavity, pharynx, and nasal cavity. The intricate anatomical structure of these regions poses considerable challenges to efficacious treatment strategies. Despite the availability of myriad treatment modalities, the overall therapeutic efficacy for HNTs continues to remain subdued. In recent years, the deployment of artificial intelligence (AI) in healthcare practices has garnered noteworthy attention. AI modalities, inclusive of machine learning (ML), neural networks (NNs), and deep learning (DL), when amalgamated into the holistic management of HNTs, promise to augment the precision, safety, and efficacy of treatment regimens. The integration of AI within HNT management is intricately intertwined with domains such as medical imaging, bioinformatics, and medical robotics. This article intends to scrutinize the cutting-edge advancements and prospective applications of AI in the realm of HNTs, elucidating AI's indispensable role in prevention, diagnosis, treatment, prognostication, research, and inter-sectoral integration. The overarching objective is to stimulate scholarly discourse and invigorate insights among medical practitioners and researchers to propel further exploration, thereby facilitating superior therapeutic alternatives for patients.


Subjects
Artificial Intelligence , Head and Neck Neoplasms , Humans , Machine Learning , Neural Networks, Computer , Head and Neck Neoplasms/diagnosis , Head and Neck Neoplasms/therapy , Diagnostic Imaging/methods
15.
BMC Bioinformatics ; 25(1): 324, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39379821

ABSTRACT

BACKGROUND: Safe drug treatment requires an understanding of the potential side effects. Identifying the frequency of drug side effects can reduce the risks associated with drug use. However, existing computational methods for predicting drug side effect frequencies heavily depend on known drug side effect frequency information. Consequently, these methods face challenges when predicting the side effect frequencies of new drugs. Although a few methods can predict the side effect frequencies of new drugs, they exhibit unreliable performance owing to the exclusion of drug-side effect relationships. RESULTS: This study proposed CrossFeat, a model based on convolutional neural network-transformer architecture with cross-feature learning that can predict the occurrence and frequency of drug side effects for new drugs, even in the absence of information regarding drug-side effect relationships. CrossFeat facilitates the concurrent learning of drugs and side effect information within its transformer architecture. This simultaneous exchange of information enables drugs to learn about their associated side effects, while side effects concurrently acquire information about the respective drugs. Such bidirectional learning allows for the comprehensive integration of drug and side effect knowledge. Our five-fold cross-validation experiments demonstrated that CrossFeat outperforms existing studies in predicting side effect frequencies for new drugs without prior knowledge. CONCLUSIONS: Our model offers a promising approach for predicting the drug side effect frequencies, particularly for new drugs where prior information is limited. CrossFeat's superior performance in cross-validation experiments, along with evidence from case studies and ablation experiments, highlights its effectiveness.


Subjects
Drug-Related Side Effects and Adverse Reactions , Neural Networks, Computer , Humans , Computational Biology/methods , Machine Learning
16.
BMC Bioinformatics ; 25(1): 56, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38308205

ABSTRACT

BACKGROUND: Genome-wide association studies have successfully identified genetic variants associated with human disease. Various statistical approaches based on penalized and machine learning methods have recently been proposed for disease prediction. In this study, we evaluated the performance of several such methods for predicting asthma using the Korean Chip (KORV1.1) from the Korean Genome and Epidemiology Study (KoGES). RESULTS: First, single-nucleotide polymorphisms were selected via single-variant tests using logistic regression with the adjustment of several epidemiological factors. Next, we evaluated the following methods for disease prediction: ridge, least absolute shrinkage and selection operator, elastic net, smoothly clipped absolute deviation, support vector machine, random forest, boosting, bagging, naïve Bayes, and k-nearest neighbor. Finally, we compared their predictive performance based on the area under the curve of the receiver operating characteristic curves, precision, recall, F1-score, Cohen's Kappa, balanced accuracy, error rate, Matthews correlation coefficient, and area under the precision-recall curve. Additionally, three oversampling algorithms were used to deal with imbalance problems. CONCLUSIONS: Our results show that penalized methods exhibit better predictive performance for asthma than that achieved via machine learning methods. On the other hand, in the oversampling study, random forest and boosting methods overall showed better prediction performance than penalized methods.
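
Several of the metrics listed above are simple functions of the binary confusion matrix. The sketch below computes precision, recall, F1-score, balanced accuracy, error rate, and the Matthews correlation coefficient from the four cell counts, using their standard definitions. The function name and the example counts are made up for illustration and do not come from the study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics; undefined ratios fall back to 0.0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "balanced_accuracy": (recall + specificity) / 2,
        "error_rate": (fp + fn) / total if total else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for a 100-sample validation set.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Balanced accuracy and the Matthews correlation coefficient are less sensitive to class imbalance than plain accuracy, which is relevant when cases are much rarer than controls, as in the oversampling comparison the abstract describes.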


Subjects
Algorithms , Genome-Wide Association Study , Humans , Bayes Theorem , Machine Learning , Republic of Korea/epidemiology
17.
J Proteome Res ; 23(10): 4648-4657, 2024 Oct 04.
Article in English | MEDLINE | ID: mdl-39253780

ABSTRACT

Platinum resistance in ovarian cancer poses a significant challenge, substantially impacting patient outcomes. Developing an accurate predictive model is crucial for improving clinical decision-making and guiding treatment strategies. Proteomic data from 217 high-grade serous ovarian cancer (HGSOC) biospecimens obtained from JHU, PNNL, and PTRC were used to construct a prediction model for identifying individuals who are resistant to platinum-based chemotherapy. A total of 6437 common proteins were detected across all data sets, with 26 proteins overlapping between the development cohorts JHU and PNNL. Using LASSO and logistic regression analysis, a six-protein model (P31323_PRKAR2B, Q13309_SKP2, Q14997_PSME4, Q6ZRP7_QSOX2, Q7LGA3_HS2ST1, and Q7Z2Z2_EFL1) was developed, which accurately predicted platinum resistance, with an AUC of 0.964 (95% CI, 0.929-0.999). Internal validation by resampling resulted in a C-index of 0.972 (95% CI 0.894-0.988). External validation performed on the PTRC cohort achieved an AUC of 0.855 (95% CI 0.748-0.963). Calibration curves showed good consistency, and DCA indicated superior clinical utility. The model also performed well in predicting PFS and OS at various time points. Based on these proteins, our predictive model can precisely predict platinum response and survival outcomes in HGSOC patients, which can assist clinicians in promptly identifying potentially platinum-resistant individuals.
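
The AUC values quoted in this abstract can be read as the probability that a randomly chosen platinum-resistant sample receives a higher model score than a randomly chosen sensitive one. Below is a minimal sketch of that pairwise definition; the function name and the toy scores are assumptions for illustration and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win,
    which matches the Mann-Whitney U formulation of the AUC.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = resistant, 0 = sensitive.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)  # 3 of 4 pairs are ordered correctly
```

This quadratic-time version is only meant to make the definition concrete; confidence intervals such as those reported in the abstract would need an additional step (for example, bootstrap resampling), which is not shown here.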


Subjects
Drug Resistance, Neoplasm , Ovarian Neoplasms , Proteomics , Humans , Female , Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/metabolism , Ovarian Neoplasms/pathology , Proteomics/methods , Middle Aged , Aged , Platinum/therapeutic use , Antineoplastic Agents/therapeutic use
18.
Diabetologia ; 67(5): 885-894, 2024 May.
Article in English | MEDLINE | ID: mdl-38374450

ABSTRACT

AIMS/HYPOTHESIS: People with type 2 diabetes are heterogeneous in their disease trajectory, with some progressing more quickly to insulin initiation than others. Although classical biomarkers such as age, HbA1c and diabetes duration are associated with glycaemic progression, it is unclear how well such variables predict insulin initiation or requirement and whether newly identified markers have added predictive value. METHODS: In two prospective cohort studies as part of IMI-RHAPSODY, we investigated whether clinical variables and three types of molecular markers (metabolites, lipids, proteins) can predict time to insulin requirement using different machine learning approaches (lasso, ridge, GRridge, random forest). Clinical variables included age, sex, HbA1c, HDL-cholesterol and C-peptide. Models were run with unpenalised clinical variables (i.e. always included in the model without weights) or penalised clinical variables, or without clinical variables. Model development was performed in one cohort and the model was applied in a second cohort. Model performance was evaluated using Harrell's C statistic. RESULTS: Of the 585 individuals from the Hoorn Diabetes Care System (DCS) cohort, 69 required insulin during follow-up (1.0-11.4 years); of the 571 individuals in the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) cohort, 175 required insulin during follow-up (0.3-11.8 years). Overall, the clinical variables and proteins were selected in the different models most often, followed by the metabolites. The most frequently selected clinical variables were HbA1c (18 of the 36 models, 50%), age (15 models, 41.2%) and C-peptide (15 models, 41.2%). Base models (age, sex, BMI, HbA1c) including only clinical variables performed moderately in both the DCS discovery cohort (C statistic 0.71 [95% CI 0.64, 0.79]) and the GoDARTS replication cohort (C 0.71 [95% CI 0.69, 0.75]). A more extensive model including HDL-cholesterol and C-peptide performed better in both cohorts (DCS, C 0.74 [95% CI 0.67, 0.81]; GoDARTS, C 0.73 [95% CI 0.69, 0.77]). Two proteins, lactadherin and proto-oncogene tyrosine-protein kinase receptor, were most consistently selected and slightly improved model performance. CONCLUSIONS/INTERPRETATION: Using machine learning approaches, we show that insulin requirement risk can be modestly well predicted by predominantly clinical variables. Inclusion of molecular markers improves the prognostic performance beyond that of clinical variables by up to 5%. Such prognostic models could be useful for identifying people with diabetes at high risk of progressing quickly to treatment intensification. DATA AVAILABILITY: Summary statistics of lipidomic, proteomic and metabolomic data are available from a Shiny dashboard at https://rhapdata-app.vital-it.ch.
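
The C statistic used to evaluate these time-to-insulin models measures how often the model assigns the higher risk to the person who reaches the event first, among pairs whose ordering is known despite censoring. The sketch below is a simplified illustration under stated assumptions (pairs with tied event times are skipped, and tied risk scores count as half); the function name and toy data are hypothetical and not drawn from the study.

```python
def harrell_c(times, events, risk_scores):
    """Simplified Harrell's C for right-censored data.

    times       : follow-up time per individual
    events      : 1 if the event (e.g. insulin requirement) was observed, 0 if censored
    risk_scores : model output, higher meaning earlier expected event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored individual cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable only if i had the event strictly before j's time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Made-up follow-up data: the third individual is censored at 6 years.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.3, 0.5, 0.1]
c_stat = harrell_c(times, events, risk)  # 4 of 5 comparable pairs are concordant
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported values of roughly 0.71 to 0.74 sit between the two.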


Subjects
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/metabolism , Prospective Studies , C-Peptide , Proteomics , Insulin/therapeutic use , Biomarkers , Machine Learning , Cholesterol
19.
Int J Cancer ; 154(10): 1760-1771, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38296842

ABSTRACT

Predicting who will benefit from treatment with immune checkpoint inhibition (ICI) in patients with advanced melanoma is challenging. We developed a multivariable prediction model for response to ICI, using routinely available clinical data including primary melanoma characteristics. We used a population-based cohort of 3525 patients with advanced cutaneous melanoma treated with anti-PD-1-based therapy. Our model for predicting response within 6 months after ICI initiation was internally validated with bootstrap resampling. Performance evaluation included calibration, discrimination and internal-external cross-validation. Included patients received anti-PD-1 monotherapy (n = 2366) or ipilimumab plus nivolumab (n = 1159) in any treatment line. The model included serum lactate dehydrogenase, World Health Organization performance score, type and line of ICI, disease stage and time to first distant recurrence (all at start of ICI), as well as location and type of primary melanoma, the presence of satellites and/or in-transit metastases at primary diagnosis, and sex. The over-optimism adjusted area under the receiver operating characteristic curve was 0.66 (95% CI: 0.64-0.66). The range of predicted response probabilities was 7%-81%. Based on these probabilities, patients were categorized into quartiles. Compared to the lowest response quartile, patients in the highest quartile had a significantly longer median progression-free survival (20.0 vs 2.8 months; P < .001) and median overall survival (62.0 vs 8.0 months; P < .001). Our prediction model, based on routinely available clinical variables and primary melanoma characteristics, predicts response to ICI in patients with advanced melanoma and discriminates well between treated patients with a very good and very poor prognosis.


Subjects
Melanoma , Skin Neoplasms , Humans , Melanoma/pathology , Immune Checkpoint Inhibitors/therapeutic use , Skin Neoplasms/pathology , Ipilimumab/therapeutic use , Nivolumab/therapeutic use , Retrospective Studies
20.
Cancer Sci ; 115(6): 1820-1833, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38571294

ABSTRACT

Radiotherapy, one of the most fundamental cancer treatments, is confronted with the dilemma of treatment failure due to radioresistance. To predict radiosensitivity and improve tumor treatment efficiency in pan-cancer, we developed a model called Radiation Intrinsic Sensitivity Evaluation (RISE). The RISE model was built using cell line-based mRNA sequencing data from five tumor types with varying radiation sensitivity. Through four cell-derived datasets, two public tissue-derived cohorts, and one local cohort of 42 nasopharyngeal carcinoma patients, we demonstrated that RISE could effectively predict the level of radiation sensitivity (area under the ROC curve [AUC] from 0.666 to 1 across different datasets). After verification by colony formation assay and flow cytometric analysis of apoptosis, RT-qPCR experiments confirmed higher RISE values in our four well-established radioresistant cell models. We also explored the prognostic value of RISE in five independent TCGA cohorts consisting of 1137 patients who received radiation therapy and found that RISE was an independent adverse prognostic factor (pooled multivariate Cox regression hazard ratio [HR]: 1.84, 95% CI 1.39-2.42; p < 0.01). RISE showed a promising ability to evaluate the radiotherapy benefit while predicting the prognosis of cancer patients, enabling clinicians to devise individualized radiotherapy strategies in the future and improve the success rate of radiotherapy.


Subjects
Neoplasms , Radiation Tolerance , Humans , Radiation Tolerance/genetics , Prognosis , Neoplasms/radiotherapy , Neoplasms/genetics , Neoplasms/pathology , Cell Line, Tumor , Female , Male , Apoptosis/radiation effects , Middle Aged , ROC Curve , Nasopharyngeal Carcinoma/radiotherapy , Nasopharyngeal Carcinoma/genetics , Nasopharyngeal Carcinoma/pathology