1.
Nucleic Acids Res ; 52(D1): D1143-D1154, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38183205

ABSTRACT

Machine Learning-based scoring and classification of genetic variants aids the assessment of clinical findings and is employed to prioritize variants in diverse genetic studies and analyses. Combined Annotation-Dependent Depletion (CADD) is one of the first methods for the genome-wide prioritization of variants across different molecular functions and has been continuously developed and improved since its original publication. Here, we present our most recent release, CADD v1.7. We explored and integrated new annotation features, among them state-of-the-art protein language model scores (Meta ESM-1v), regulatory variant effect predictions (from sequence-based convolutional neural networks) and sequence conservation scores (Zoonomia). We evaluated the new version on data sets derived from ClinVar, ExAC/gnomAD and 1000 Genomes variants. For coding effects, we tested CADD on 31 Deep Mutational Scanning (DMS) data sets from ProteinGym and, for regulatory effect prediction, we used saturation mutagenesis reporter assay data of promoter and enhancer sequences. The inclusion of new features further improved the overall performance of CADD. As with previous releases, all data sets, genome-wide CADD v1.7 scores, scripts for on-site scoring and an easy-to-use webserver are readily provided via https://cadd.bihealth.org/ or https://cadd.gs.washington.edu/ to the community.


Subjects
Genetic Variation , Genome, Human , Machine Learning , Software , Nucleotides , Humans
2.
Bioinformatics ; 39(5), 2023 May 04.
Article in English | MEDLINE | ID: mdl-37084271

ABSTRACT

MOTIVATION: Missense variants are a frequent class of variation within the coding genome, and some of them cause Mendelian diseases. Despite advances in computational prediction, classifying missense variants into pathogenic or benign remains a major challenge in the context of personalized medicine. Recently, the structure of the human proteome was derived with unprecedented accuracy using the artificial intelligence system AlphaFold2. This raises the question of whether AlphaFold2 wild-type structures can improve the accuracy of computational pathogenicity prediction for missense variants. RESULTS: To address this, we first engineered a set of features for each amino acid from these structures. We then trained a random forest to distinguish between relatively common (proxy-benign) and singleton (proxy-pathogenic) missense variants from gnomAD v3.1. This yielded a novel AlphaFold2-based pathogenicity prediction score, termed AlphScore. Important feature classes used by AlphScore are solvent accessibility, amino acid network related features, features describing the physicochemical environment, and AlphaFold2's quality parameter (predicted local distance difference test). AlphScore alone showed lower performance than existing in silico scores used for missense prediction, such as CADD or REVEL. However, when AlphScore was added to those scores, the performance increased, as measured by the approximation of deep mutational scan data, as well as the prediction of expert-curated missense variants from the ClinVar database. Overall, our data indicate that the integration of AlphaFold2-predicted structures can improve pathogenicity prediction of missense variants. AVAILABILITY AND IMPLEMENTATION: AlphScore, combinations of AlphScore with existing scores, as well as variants used for training and testing are publicly available.


Subjects
Artificial Intelligence , Computational Biology , Humans , Virulence , Mutation, Missense , Mutation
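
The proxy labelling described in the abstract of entry 2 (singletons as proxy-pathogenic, relatively common variants as proxy-benign) can be illustrated with a minimal sketch. The function name `proxy_label` and the frequency cutoff `common_af` are hypothetical; the abstract does not state the threshold the authors used, and this is not their implementation.

```python
def proxy_label(allele_count, allele_number, common_af=1e-3):
    """Assign a proxy training label from population allele counts.

    common_af is an assumed cutoff for "relatively common";
    the abstract does not give the actual value.
    """
    if allele_count == 1:
        # Observed exactly once in the cohort: singleton.
        return "proxy-pathogenic"
    if allele_number > 0 and allele_count / allele_number >= common_af:
        return "proxy-benign"
    # Variants in between are left unlabelled in this sketch.
    return None


# Toy variants (hypothetical counts): (allele_count, allele_number)
variants = [(1, 150000), (300, 150000), (5, 150000)]
labels = [proxy_label(ac, an) for ac, an in variants]
```

Labels of this kind could then serve as the target for a classifier such as the random forest mentioned in the abstract; unlabelled variants would simply be excluded from training.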
3.
NAR Genom Bioinform ; 5(4): lqad102, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38025047

ABSTRACT

Analyses of cell-free DNA (cfDNA) are increasingly being employed for various diagnostic and research applications. Many technologies aim to increase resolution, e.g. for detecting early-stage cancer or minimal residual disease. However, these efforts may be confounded by inherent base composition biases of cfDNA, specifically the over- and underrepresentation of guanine (G) and cytosine (C) sequences. Currently, there is no universally applicable tool to correct these effects on sequencing read-level data. Here, we present GCparagon, a two-stage algorithm for computing and correcting GC biases in cfDNA samples. In the initial step, length and GC base count parameters are determined. Here, our algorithm minimizes the inclusion of known problematic genomic regions, such as low-mappability regions, in its calculations. In the second step, GCparagon computes weights counterbalancing the distortion of cfDNA attributes (correction matrix). These fragment weights are added to a binary alignment map (BAM) file as alignment tags for individual reads. The GC correction matrix or the tagged BAM file can be used for downstream analyses. Parallel computing allows GC bias to be estimated in under 1 min. We demonstrate that GCparagon vastly improves the analysis of regulatory regions, which frequently show specific GC composition patterns, and will contribute to standardized cfDNA applications.
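
The idea of a correction matrix indexed by fragment length and GC base count can be sketched as follows. This is a simplified illustration under stated assumptions, not GCparagon's actual algorithm: the weight for each bin is taken as the expected share of fragments divided by the observed share, the expected fragments are assumed to be supplied by the caller, and both inputs are assumed non-empty. All function names are hypothetical.

```python
from collections import Counter


def gc_count(seq):
    """Number of G and C bases in a fragment sequence."""
    return sum(1 for base in seq.upper() if base in "GC")


def correction_matrix(observed, expected):
    """Weight per (length, GC count) bin: expected share / observed share.

    Both arguments are iterables of fragment sequences. Bins that never
    occur in the observed sample receive no entry.
    """
    obs = Counter((len(s), gc_count(s)) for s in observed)
    exp = Counter((len(s), gc_count(s)) for s in expected)
    n_obs, n_exp = sum(obs.values()), sum(exp.values())
    return {
        bin_: (exp.get(bin_, 0) / n_exp) / (count / n_obs)
        for bin_, count in obs.items()
    }


def fragment_weight(seq, matrix, default=1.0):
    """Look up the weight for one fragment; unseen bins stay unweighted."""
    return matrix.get((len(seq), gc_count(seq)), default)


# Toy data (hypothetical): GC-rich fragments are overrepresented.
observed = ["GGCC", "GGCC", "GGCC", "ATAT"]
expected = ["GGCC", "GGCC", "ATAT", "ATAT"]
weights = correction_matrix(observed, expected)
```

In this toy case the overrepresented GC-rich bin is down-weighted and the AT-rich bin is up-weighted, so the weighted fragment total equals the original count. Writing each weight into a BAM file as an alignment tag, as the abstract describes, is left out of the sketch.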

4.
Sci Rep ; 5: 13935, 2015 Sep 11.
Article in English | MEDLINE | ID: mdl-26358883

ABSTRACT

Olfactory perception is mediated by a multitude of olfactory receptors, whose expression in the sensory surface, the olfactory epithelium, is spatially regulated. A common theme is the segregation of different olfactory receptors in different expression domains, which in turn leads to corresponding segregation in the neuronal responses to different odor groups. The amphibian olfactory receptor gene family of trace amine associated receptors, in short TAARs, is exceedingly small and allows a comprehensive analysis of spatial expression patterns, as well as a comparison with neuronal responses to the expected ligands for this receptor family, amines. Here we report that TAAR4b exhibits a spatial expression pattern characteristically different in two dimensions from that of TAAR4a, its close homolog. Together, these two genes result in a bimodal distribution resembling that of amine responses as visualized by calcium imaging. A stringent quantitative analysis suggests the involvement of additional olfactory receptors in amphibian responses to amine odors.


Subjects
Amines , Gene Expression Regulation , Odorants , Protein Interaction Domains and Motifs/genetics , Receptors, G-Protein-Coupled/genetics , Amphibians , Animals , Olfactory Receptor Neurons/metabolism , Organ Specificity/genetics , Phylogeny , Receptors, G-Protein-Coupled/chemistry , Receptors, G-Protein-Coupled/classification , Receptors, Odorant/genetics , Xenopus