1.
Mol Biol Evol ; 41(7), 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38934805

ABSTRACT

Most algorithms that are used to predict the effects of variants rely on evolutionary conservation. However, most such techniques compute evolutionary conservation solely from the alignment of multiple sequences, overlooking the evolutionary context of substitution events. In a previous study, we introduced PHACT, a scoring-based pathogenicity predictor for missense mutations that leverages phylogenetic trees. Building on this foundation, we now propose PHACTboost, a gradient boosting tree-based classifier that combines PHACT scores with information from multiple sequence alignments, phylogenetic trees, and ancestral reconstruction. By learning from data, PHACTboost outperforms PHACT. Furthermore, comprehensive experiments on carefully constructed sets of variants demonstrated that PHACTboost can outperform 40 prevalent pathogenicity predictors reported in dbNSFP, including conventional tools, metapredictors, and deep learning-based approaches, as well as more recent tools such as AlphaMissense, EVE, and CPT-1. The superiority of PHACTboost over these methods was particularly evident in the case of hard variants, for which different pathogenicity predictors offered conflicting results. We provide predictions for 215 million amino acid alterations across 20,191 proteins. PHACTboost is available at https://github.com/CompGenomeLab/PHACTboost. PHACTboost can improve our understanding of genetic diseases and facilitate more accurate diagnoses.


Subject(s)
Mutation, Missense , Phylogeny , Humans , Software , Computational Biology/methods , Algorithms , Sequence Alignment
2.
Hum Genet ; 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39110250

ABSTRACT

This paper presents an evaluation of predictions submitted for the "HMBS" challenge, a component of the sixth round of the Critical Assessment of Genome Interpretation (CAGI), held in 2021. The challenge required participants to predict the effects of missense variants of the human HMBS gene on yeast growth. The HMBS enzyme, critical for the biosynthesis of heme in eukaryotic cells, is highly conserved among eukaryotes. Despite the application of a variety of algorithms and methods, the performance of predictors was relatively similar, with Kendall's tau correlation coefficients between predictions and experimental scores around 0.3 for a majority of submissions. Notably, the median correlation (≥ 0.34) observed among these predictors, especially the top predictions from different groups, was greater than the correlation observed between their predictions and the actual experimental results. Most predictors were moderately successful in distinguishing between deleterious and benign variants, as evidenced by an area under the receiver operating characteristic (ROC) curve (AUC) of approximately 0.7. Compared with the two most recent rounds of CAGI competitions, more predictors outperformed the baseline predictor, which is based solely on amino acid frequencies. Nevertheless, the overall accuracy of predictions still falls far short of the positive control, which is derived from experimental scores, indicating the need for considerable improvements in the field. The most inaccurately predicted variants in this round were associated with the insertion loop, which is absent in many orthologs, suggesting that the predictors still rely heavily on information from multiple sequence alignments.
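The two evaluation measures named in this abstract, Kendall's tau and the area under the ROC curve, can be illustrated with a minimal pure-Python sketch. The function names and the toy numbers below are hypothetical and are not taken from the challenge data; the tau variant shown is tau-a, which ignores tie corrections, and the abstract does not say which variant the assessors used.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: five variants, made up for illustration only.
predicted_fitness = [0.9, 0.4, 0.7, 0.2, 0.6]
experimental_fitness = [0.8, 0.5, 0.3, 0.1, 0.9]
tau = kendall_tau_a(predicted_fitness, experimental_fitness)

# Hypothetical deleteriousness scores and labels (1 = deleterious, 0 = benign).
deleteriousness = [0.1, 0.6, 0.3, 0.8, 0.4]
is_deleterious = [0, 0, 1, 1, 0]
auc = roc_auc(deleteriousness, is_deleterious)

print(f"tau-a = {tau:.2f}, AUC = {auc:.2f}")
```

Both measures depend only on rank order, which is one reason they suit comparisons between predictors whose raw scores live on different scales.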

3.
Mol Biol Evol ; 39(6), 2022 06 02.
Article in English | MEDLINE | ID: mdl-35639618

ABSTRACT

Evolutionary conservation is a fundamental resource for predicting the substitutability of amino acids and the loss of function in proteins. Using a multiple sequence alignment alone, without considering the evolutionary relationships among sequences, results in the redundant counting of evolutionarily related alteration events as if they were independent. Here, we propose a new method, PHACT, that predicts the pathogenicity of missense mutations directly from the phylogenetic tree of proteins. PHACT traverses the nodes of the phylogenetic tree and evaluates the deleteriousness of a substitution based on the probability differences of ancestral amino acids between neighboring nodes in the tree. Moreover, PHACT assigns a weight to each node in the tree based on its distance to the query organism. For each potential amino acid substitution, the algorithm generates a score that is used to calculate the effect of the substitution on protein function. To analyze the predictive performance of PHACT, we performed various experiments over subsets of two datasets that include 3,023 proteins and 61,662 variants in total. The experiments demonstrated that our method outperformed widely used pathogenicity prediction tools (i.e., SIFT and PolyPhen-2) and achieved better predictive performance than other conventional statistical approaches presented in dbNSFP. The PHACT source code is available at https://github.com/CompGenomeLab/PHACT.
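The redundant-counting problem described in this abstract can be made concrete with a small sketch. It does not implement PHACT; it uses Fitch's small-parsimony algorithm, a different and much simpler technique, only to contrast "how many sequences carry a residue" with "how many substitution events the tree requires". The species names, tree topology, and alignment column are hypothetical.

```python
def fitch_changes(tree, residues):
    """Fitch small parsimony on a binary tree given as nested 2-tuples.

    Returns (candidate_states, minimum_number_of_substitution_events)
    for a single alignment column. Leaves are species names (str).
    """
    if isinstance(tree, str):
        return {residues[tree]}, 0
    left, right = tree
    left_states, left_cost = fitch_changes(left, residues)
    right_states, right_cost = fitch_changes(right, residues)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra event needed here.
        return shared, left_cost + right_cost
    # Children disagree: one substitution event is required on this split.
    return left_states | right_states, left_cost + right_cost + 1


# Hypothetical alignment column: three closely related rodents share V.
column = {
    "human": "L",
    "chimp": "L",
    "mouse": "V",
    "rat": "V",
    "hamster": "V",
    "zebrafish": "L",
}
# Hypothetical topology, written as nested pairs.
tree = ((("human", "chimp"), (("mouse", "rat"), "hamster")), "zebrafish")

sequences_with_v = sum(1 for aa in column.values() if aa == "V")
_, events = fitch_changes(tree, column)

print(f"sequences carrying V: {sequences_with_v}")
print(f"substitution events implied by the tree: {events}")
```

In this toy column, an alignment-only count sees three observations of V, while the tree explains them with a single event in the rodent ancestor. That gap is the kind of double counting the abstract attributes to alignment-only conservation measures.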


Subject(s)
Mutation, Missense , Software , Amino Acids , Phylogeny , Proteins/chemistry , Proteins/genetics , Sequence Alignment
4.
Bioinformatics ; 35(24): 5137-5145, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31147687

ABSTRACT

MOTIVATION: Survival analysis methods that integrate pathways/gene sets into their learning model could identify molecular mechanisms that determine survival characteristics of patients. Rather than first picking the predictive pathways/gene sets from a given collection and then training a predictive model on the subset of genomic features mapped to these selected pathways/gene sets, we developed a novel machine learning algorithm (Path2Surv) that conjointly performs these two steps using multiple kernel learning. RESULTS: We extensively tested our Path2Surv algorithm on 7655 patients from 20 cancer types using cancer-specific pathway/gene set collections and gene expression profiles of these patients. Path2Surv statistically significantly outperformed survival random forest (RF) on 12 out of 20 datasets and obtained comparable predictive performance against survival support vector machine (SVM) using significantly fewer gene expression features (i.e. less than 10% of what survival RF and survival SVM used). AVAILABILITY AND IMPLEMENTATION: Our implementations of survival SVM and Path2Surv algorithms in R are available at https://github.com/mehmetgonen/path2surv together with the scripts that replicate the reported experiments. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
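The core idea behind multiple kernel learning over pathways can be sketched as follows: build one kernel per pathway from the genes mapped to it, then combine the kernels with non-negative weights, where a weight of zero effectively drops that pathway. This is only an illustration under stated assumptions, not the Path2Surv implementation (which is in R at the URL above). The linear kernel, the pathway names, the gene indices, and the fixed weights are all hypothetical; in a real multiple kernel learning method the weights are learned jointly with the predictor.

```python
def linear_kernel(samples, gene_indices):
    """Linear kernel between samples, restricted to one pathway's genes."""
    sub = [[row[g] for g in gene_indices] for row in samples]
    return [[sum(a * b for a, b in zip(u, v)) for v in sub] for u in sub]


def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices: K = sum_m weight_m * K_m."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for k, w in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical expression matrix: 2 patients x 3 genes.
expression = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
]
# Hypothetical pathway -> gene index mapping.
pathways = {
    "pathway_a": [0, 1],
    "pathway_b": [2],
    "pathway_c": [1, 2],
}
# Hypothetical weights, fixed by hand here; zero means "pathway not selected".
weights = {"pathway_a": 0.5, "pathway_b": 0.5, "pathway_c": 0.0}

names = list(pathways)
kernels = [linear_kernel(expression, pathways[name]) for name in names]
combined = combine_kernels(kernels, [weights[name] for name in names])

selected = [name for name in names if weights[name] > 0]
print("selected pathways:", selected)
print("combined kernel:", combined)
```

Because pathway selection shows up as sparsity in the kernel weights, selecting pathways and training the model can happen in one optimization rather than two separate steps, which is the contrast the abstract draws.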


Subject(s)
Neoplasms , Humans , Machine Learning , Software , Support Vector Machine , Survival Analysis
5.
Sci Rep ; 13(1): 21596, 2023 12 07.
Article in English | MEDLINE | ID: mdl-38062059

ABSTRACT

Major Depressive Disorder (MDD) is a commonly observed psychiatric disorder that affects more than 2% of the world population with a rising trend. However, disease-associated pathways and biomarkers are yet to be fully comprehended. In this study, we analyzed previously generated RNA-seq data across seven different brain regions from three distinct studies to identify differentially and co-expressed genes for patients with MDD. Differential gene expression (DGE) analysis revealed that NPAS4 is the only gene downregulated in three different brain regions. Furthermore, co-expressing gene modules responsible for glutamatergic signaling are negatively enriched in these regions. We used the results of both DGE and co-expression analyses to construct a novel MDD-associated pathway. In our model, we propose that disruption in glutamatergic signaling-related pathways might be associated with the downregulation of NPAS4 and many other immediate-early genes (IEGs) that control synaptic plasticity. In addition to DGE analysis, we identified the relative importance of KEGG pathways in discriminating MDD phenotype using a machine learning-based approach. We anticipate that our study will open doors to developing better therapeutic approaches targeting glutamatergic receptors in the treatment of MDD.


Subject(s)
Depressive Disorder, Major , Humans , Brain/metabolism , Depressive Disorder, Major/genetics , Depressive Disorder, Major/metabolism , Gene Regulatory Networks , Genes, Immediate-Early , Signal Transduction