Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 3 de 3
Filtrar
Mais filtros

Base de dados
Tipo de documento
Assunto da revista
Intervalo de ano de publicação
1.
Nat Genet ; 55(9): 1512-1522, 2023 09.
Artigo em Inglês | MEDLINE | ID: mdl-37563329

RESUMO

Predicting the effects of coding variants is a major challenge. While recent deep-learning models have improved variant effect prediction accuracy, they cannot analyze all coding variants due to dependency on close homologs or software limitations. Here we developed a workflow using ESM1b, a 650-million-parameter protein language model, to predict all ~450 million possible missense variant effects in the human genome, and made all predictions available on a web portal. ESM1b outperformed existing methods in classifying ~150,000 ClinVar/HGMD missense variants as pathogenic or benign and predicting measurements across 28 deep mutational scan datasets. We further annotated ~2 million variants as damaging only in specific protein isoforms, demonstrating the importance of considering all isoforms when predicting variant effects. Our approach also generalizes to more complex coding variants such as in-frame indels and stop-gains. Together, these results establish protein language models as an effective, accurate and general approach to predicting variant effects.


Assuntos
Biologia Computacional , Software , Humanos , Biologia Computacional/métodos , Mutação de Sentido Incorreto/genética , Proteínas/genética , Genoma Humano/genética
2.
Nat Genet ; 54(4): 437-449, 2022 04.
Artigo em Inglês | MEDLINE | ID: mdl-35361970

RESUMO

We conduct a genome-wide association study (GWAS) of educational attainment (EA) in a sample of ~3 million individuals and identify 3,952 approximately uncorrelated genome-wide-significant single-nucleotide polymorphisms (SNPs). A genome-wide polygenic predictor, or polygenic index (PGI), explains 12-16% of EA variance and contributes to risk prediction for ten diseases. Direct effects (i.e., controlling for parental PGIs) explain roughly half the PGI's magnitude of association with EA and other phenotypes. The correlation between mate-pair PGIs is far too large to be consistent with phenotypic assortment alone, implying additional assortment on PGI-associated factors. In an additional GWAS of dominance deviations from the additive model, we identify no genome-wide-significant SNPs, and a separate X-chromosome additive GWAS identifies 57.


Assuntos
Estudo de Associação Genômica Ampla , Herança Multifatorial , Humanos , Herança Multifatorial/genética , Polimorfismo de Nucleotídeo Único/genética
3.
Nat Hum Behav ; 5(12): 1744-1758, 2021 12.
Artigo em Inglês | MEDLINE | ID: mdl-34140656

RESUMO

Polygenic indexes (PGIs) are DNA-based predictors. Their value for research in many scientific disciplines is growing rapidly. As a resource for researchers, we used a consistent methodology to construct PGIs for 47 phenotypes in 11 datasets. To maximize the PGIs' prediction accuracies, we constructed them using genome-wide association studies-some not previously published-from multiple data sources, including 23andMe and UK Biobank. We present a theoretical framework to help interpret analyses involving PGIs. A key insight is that a PGI can be understood as an unbiased but noisy measure of a latent variable we call the 'additive SNP factor'. Regressions in which the true regressor is this factor but the PGI is used as its proxy therefore suffer from errors-in-variables bias. We derive an estimator that corrects for the bias, illustrate the correction, and make a Python tool for implementing it publicly available.


Assuntos
Bases de Dados Genéticas , Herança Multifatorial , Polimorfismo de Nucleotídeo Único , Análise de Dados , Estudo de Associação Genômica Ampla , Humanos
SELEÇÃO DE REFERÊNCIAS
Detalhe da pesquisa