Search | BVS Colombia Search Portal

Improving GWAS discovery and genomic prediction accuracy in biobank data.

Orliac, Etienne J; Trejo Banos, Daniel; Ojavee, Sven E; Läll, Kristi; Mägi, Reedik; Visscher, Peter M; Robinson, Matthew R.

Proc Natl Acad Sci U S A ; 119(31): e2121279119, 2022 08 02.

Article in English | MEDLINE | ID: mdl-35905320

ABSTRACT

Genetically informed, deep-phenotyped biobanks are an important research resource and it is imperative that the most powerful, versatile, and efficient analysis approaches are used. Here, we apply our recently developed Bayesian grouped mixture of regressions model (GMRM) in the UK and Estonian Biobanks and obtain the highest genomic prediction accuracy reported to date across 21 heritable traits. When compared to other approaches, GMRM accuracy was greater than annotation prediction models run in the LDAK or LDPred-funct software by 15% (SE 7%) and 14% (SE 2%), respectively, and was 18% (SE 3%) greater than a baseline BayesR model without single-nucleotide polymorphism (SNP) markers grouped into minor allele frequency-linkage disequilibrium (MAF-LD) annotation categories. For height, the prediction accuracy R² was 47% in a UK Biobank holdout sample, which was 76% of the estimated [Formula: see text]. We then extend our GMRM prediction model to provide mixed-linear model association (MLMA) SNP marker estimates for genome-wide association (GWAS) discovery, which increased the independent loci detected to 16,162 in unrelated UK Biobank individuals, compared to 10,550 from BoltLMM and 10,095 from Regenie, a 62 and 65% increase, respectively. The average [Formula: see text] value of the leading markers increased by 15.24 (SE 0.41) for every 1% increase in prediction accuracy gained over a baseline BayesR model across the traits. Thus, we show that modeling genetic associations accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for both genomic prediction and discovery in large-scale individual-level studies.

Subject(s)

Genetic Databases, Genome-Wide Association Study, Precision Medicine, Heritable Quantitative Trait, Bayes Theorem, England, Estonia, Genomics, Genotype, Humans, Phenotype, Single Nucleotide Polymorphism

Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits.

Patxot, Marion; Banos, Daniel Trejo; Kousathanas, Athanasios; Orliac, Etienne J; Ojavee, Sven E; Moser, Gerhard; Holloway, Alexander; Sidorenko, Julia; Kutalik, Zoltan; Mägi, Reedik; Visscher, Peter M; Rönnegård, Lars; Robinson, Matthew R.

Nat Commun ; 12(1): 6972, 2021 11 30.

Article in English | MEDLINE | ID: mdl-34848700

ABSTRACT

We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10kb upstream of genes, while 12-25% is attributed to coding regions, 32-44% to introns, and 22-28% to distal 10-500kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.

Subject(s)

Genome-Wide Association Study, Genomics, Multifactorial Inheritance/genetics, Bayes Theorem, Body Height, Body Mass Index, Cardiovascular Diseases, Type 2 Diabetes Mellitus, Genetic Techniques, Genetic Variation, Genotype, Humans, Introns, Statistical Models, Open Reading Frames, Phenotype, Software

Genomic architecture and prediction of censored time-to-event phenotypes with a Bayesian genome-wide analysis.

Ojavee, Sven E; Kousathanas, Athanasios; Trejo Banos, Daniel; Orliac, Etienne J; Patxot, Marion; Läll, Kristi; Mägi, Reedik; Fischer, Krista; Kutalik, Zoltan; Robinson, Matthew R.

Nat Commun ; 12(1): 2337, 2021 04 20.

Article in English | MEDLINE | ID: mdl-33879782

ABSTRACT

While recent advancements in computation and modelling have improved the analysis of complex traits, our understanding of the genetic basis of the time at symptom onset remains limited. Here, we develop a Bayesian approach (BayesW) that provides probabilistic inference of the genetic architecture of age-at-onset phenotypes in a sampling scheme that facilitates biobank-scale time-to-event analyses. We show in extensive simulation work the benefits BayesW provides in terms of number of discoveries, model performance and genomic prediction. In the UK Biobank, we find many thousands of common genomic regions underlying the age-at-onset of high blood pressure (HBP), cardiac disease (CAD), and type-2 diabetes (T2D), and for the genetic basis of onset reflecting the underlying genetic liability to disease. Age-at-menopause and age-at-menarche are also highly polygenic, but with higher variance contributed by low frequency variants. Genomic prediction into the Estonian Biobank data shows that BayesW gives higher prediction accuracy than other approaches.

Subject(s)

Age of Onset, Human Genome, Genetic Models, Multifactorial Inheritance, Age Factors, Algorithms, Bayes Theorem, Cardiovascular Diseases/genetics, Computer Simulation, Genetic Databases, Type 2 Diabetes Mellitus/genetics, Estonia, Female, Genetic Association Studies, Genome-Wide Association Study, Genomics, Humans, Hypertension/genetics, Menarche/genetics, Menopause/genetics, Phenotype, Single Nucleotide Polymorphism, United Kingdom
