Results 1 - 16 of 16
1.
Am J Hum Genet ; 110(6): 927-939, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37224807

ABSTRACT

Genome-wide association studies (GWASs) have identified thousands of variants for disease risk. These studies have predominantly been conducted in individuals of European ancestries, which raises questions about their transferability to individuals of other ancestries. Of particular interest are admixed populations, usually defined as populations with recent ancestry from two or more continental sources. Admixed genomes contain segments of distinct ancestries that vary in composition across individuals in the population, allowing for the same allele to induce risk for disease on different ancestral backgrounds. This mosaicism raises unique challenges for GWASs in admixed populations, such as the need to correctly adjust for population stratification. In this work we quantify the impact of differences in estimated allelic effect sizes for risk variants between ancestry backgrounds on association statistics. Specifically, while the possibility of estimated allelic effect-size heterogeneity by ancestry (HetLanc) can be modeled when performing a GWAS in admixed populations, the extent of HetLanc needed to overcome the penalty from an additional degree of freedom in the association statistic has not been thoroughly quantified. Using extensive simulations of admixed genotypes and phenotypes, we find that controlling for and conditioning effect sizes on local ancestry can reduce statistical power by up to 72%. This finding is especially pronounced in the presence of allele frequency differentiation. We replicate simulation results using 4,327 African-European admixed genomes from the UK Biobank for 12 traits to find that for most significant SNPs, HetLanc is not large enough for GWASs to benefit from modeling heterogeneity in this way.
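The degree-of-freedom penalty discussed in this abstract can be illustrated with a small calculation. The following is a minimal sketch, not the authors' simulation: it assumes that, when effect sizes do not differ by local ancestry, a 1-df test and a 2-df test share the same noncentrality parameter `ncp`, and it compares their analytical power at significance level `alpha`. The function names and example values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def power_1df(ncp: float, alpha: float) -> float:
    """Power of a 1-df chi-square test with noncentrality parameter ncp."""
    nd = NormalDist()
    z_crit = -nd.inv_cdf(alpha / 2)  # two-sided normal critical value
    shift = math.sqrt(ncp)
    # chi2_1(ncp) is the square of a Normal(sqrt(ncp), 1) variable
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


def power_2df(ncp: float, alpha: float, terms: int = 400) -> float:
    """Power of a 2-df chi-square test with noncentrality parameter ncp.

    Uses the Poisson-mixture form of the noncentral chi-square:
    chi2_2(ncp) = sum_j Poisson(j; ncp/2) * chi2_{2+2j},
    where P(chi2_{2+2j} > x) = exp(-x/2) * sum_{i<=j} (x/2)^i / i!.
    """
    x_half = -math.log(alpha)  # half the critical value, since P(chi2_2 > x) = exp(-x/2)
    mu = ncp / 2
    poisson_weight = math.exp(-mu)  # Poisson probability of j = 0
    inner_term = 1.0                # (x/2)^0 / 0!
    inner_sum = 1.0
    total = 0.0
    for j in range(terms):
        total += poisson_weight * alpha * inner_sum
        poisson_weight *= mu / (j + 1)
        inner_term *= x_half / (j + 1)
        inner_sum += inner_term
    return min(total, 1.0)


if __name__ == "__main__":
    # Hypothetical example: ncp = 30 at the conventional threshold 5e-8.
    print(power_1df(30.0, 5e-8), power_2df(30.0, 5e-8))
```

Under this assumption the 2-df test is always the less powerful of the two for the same `ncp`; modeling heterogeneity only pays off when the heterogeneity adds enough signal to offset that gap.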


Subject(s)
Genetics, Population , Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , Gene Frequency/genetics , Genotype , Phenotype , Polymorphism, Single Nucleotide/genetics
2.
Am J Hum Genet ; 109(1): 24-32, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34861179

ABSTRACT

Genetic correlation is an important parameter in efforts to understand the relationships among complex traits. Current methods that analyze individual genotype data for estimating genetic correlation are challenging to scale to large datasets. Methods that analyze summary data, while being computationally efficient, tend to yield estimates of genetic correlation with reduced precision. We propose SCORE (scalable genetic correlation estimator), a randomized method of moments estimator of genetic correlation that is both scalable and accurate. SCORE obtains more precise estimates of genetic correlations relative to summary-statistic methods that can be applied at scale; it achieves a 44% reduction in standard error relative to LD-score regression (LDSC) and a 20% reduction relative to high-definition likelihood (HDL) (averaged over all simulations). The efficiency of SCORE enables computation of genetic correlations on the UK Biobank dataset, consisting of ≈300 K individuals and ≈500 K SNPs, in a few hours (orders of magnitude faster than methods that analyze individual data, such as GCTA). Across 780 pairs of traits in 291,273 unrelated white British individuals in the UK Biobank, SCORE identifies significant genetic correlation between 200 additional pairs of traits over LDSC (beyond the 245 pairs identified by both).
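The abstract does not spell out how the randomized method of moments works. As a hedged illustration of the general idea behind such estimators, the sketch below shows randomized trace estimation: the quantity tr(K²) for a symmetric matrix K is approximated by averaging ||Kz||² over random sign vectors z, so that only matrix-vector products are needed. Whether SCORE uses exactly this construction is an assumption; all names here are hypothetical.

```python
import random
from typing import Iterable, List, Sequence

Vector = List[float]


def matvec(k: Sequence[Sequence[float]], v: Sequence[float]) -> Vector:
    """Dense matrix-vector product K v."""
    return [sum(k_ij * v_j for k_ij, v_j in zip(row, v)) for row in k]


def rademacher_probes(n: int, count: int, seed: int = 0) -> List[Vector]:
    """Draw `count` random vectors of length n with entries +1 or -1."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(count)]


def trace_of_square(k: Sequence[Sequence[float]],
                    probes: Iterable[Sequence[float]]) -> float:
    """Estimate tr(K^2) for symmetric K as the mean of ||K z||^2 over probes z.

    For sign vectors z, E[z z^T] is the identity, so E[||K z||^2] = tr(K^T K).
    """
    total = 0.0
    count = 0
    for z in probes:
        kz = matvec(k, z)
        total += sum(x * x for x in kz)
        count += 1
    return total / count


if __name__ == "__main__":
    # Toy symmetric matrix standing in for a genetic relatedness matrix.
    k = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 1.0]]
    print(trace_of_square(k, rademacher_probes(3, 50)))
```

In a biobank-scale setting the matrix K would never be formed explicitly; the product Kz would instead be computed from the genotype matrix, which is what makes this kind of estimator scale.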


Subject(s)
Biological Specimen Banks , Genetic Association Studies , Genetic Background , Models, Genetic , Phenotype , Algorithms , Genetic Variation , Humans , Multifactorial Inheritance , Reproducibility of Results , United Kingdom
3.
Am J Hum Genet ; 109(4): 692-709, 2022 04 07.
Article in English | MEDLINE | ID: mdl-35271803

ABSTRACT

Recent works have shown that SNP heritability, which is dominated by low-effect common variants, may not be the most relevant quantity for localizing high-effect/critical disease genes. Here, we introduce methods to estimate the proportion of phenotypic variance explained by a given assignment of SNPs to a single gene ("gene-level heritability"). We partition gene-level heritability by minor allele frequency (MAF) to find genes whose gene-level heritability is explained exclusively by "low-frequency/rare" variants (0.5% ≤ MAF < 1%). Applying our method to ∼16K protein-coding genes and 25 quantitative traits in the UK Biobank (N = 290K "White British"), we find that, on average across traits, ∼2.5% of nonzero-heritability genes have a rare-variant component and only ∼0.8% (327 gene-trait pairs) have heritability exclusively from rare variants. Of these 327 gene-trait pairs, 114 (35%) were not detected by existing gene-level association testing methods. The additional genes we identify are significantly enriched for known disease genes, and we find several examples of genes that have been previously implicated in phenotypically related Mendelian disorders. Notably, the rare-variant component of gene-level heritability exhibits trends different from those of common-variant gene-level heritability. For example, while total gene-level heritability increases with gene length, the rare-variant component is significantly larger among shorter genes; the cumulative distributions of gene-level heritability also vary across traits and reveal differences in the relative contributions of rare/common variants to overall gene-level polygenicity. While nonzero gene-level heritability does not imply causality, if interpreted in the correct context, gene-level heritability can reveal useful insights into complex-trait genetic architecture.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Gene Frequency/genetics , Genome-Wide Association Study/methods , Humans , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics
4.
Am J Hum Genet ; 108(5): 799-808, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33811807

ABSTRACT

The proportion of variation in complex traits that can be attributed to non-additive genetic effects has been a topic of intense debate. The availability of biobank-scale datasets of genotype and trait data from unrelated individuals opens up the possibility of obtaining precise estimates of the contribution of non-additive genetic effects. We present an efficient method to estimate the variation in a complex trait that can be attributed to additive (additive heritability) and dominance deviation (dominance heritability) effects across all genotyped SNPs in a large collection of unrelated individuals. Over a wide range of genetic architectures, our method yields unbiased estimates of additive and dominance heritability. We applied our method, in turn, to array genotypes as well as imputed genotypes (at common SNPs with minor allele frequency [MAF] > 1%) and 50 quantitative traits measured in 291,273 unrelated white British individuals in the UK Biobank. Averaged across these 50 traits, we find that additive heritability on array SNPs is 21.86% while dominance heritability is 0.13% (about 0.48% of the additive heritability) with qualitatively similar results for imputed genotypes. We find no statistically significant evidence for dominance heritability (p<0.05/50 accounting for the number of traits tested) and estimate that dominance heritability is unlikely to exceed 1% for the traits analyzed. Our analyses indicate a limited contribution of dominance heritability to complex trait variation.
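To make the distinction between additive and dominance-deviation effects concrete, here is a minimal sketch of one commonly used orthogonal genotype coding. It is an assumption that this matches the coding used in the paper; the abstract does not specify it. Under Hardy-Weinberg equilibrium with allele frequency `p`, the dominance column has mean 0, variance 1, and zero covariance with the additive column, which is what allows the two variance components to be separated.

```python
import math
from typing import Tuple


def encode(genotype: int, p: float) -> Tuple[float, float]:
    """Return (additive, dominance) codes for a genotype in {0, 1, 2}.

    p is the frequency of the counted allele (assumed 0 < p < 1).
    """
    q = 1.0 - p
    additive = (genotype - 2.0 * p) / math.sqrt(2.0 * p * q)
    raw_dominance = (0.0, 2.0 * p, 4.0 * p - 2.0)[genotype]
    dominance = (raw_dominance - 2.0 * p * p) / (2.0 * p * q)
    return additive, dominance


def hwe_moments(p: float) -> Tuple[float, float, float]:
    """Mean and variance of the dominance code and its covariance with the
    additive code, taking expectations over Hardy-Weinberg genotype frequencies."""
    q = 1.0 - p
    freqs = (q * q, 2.0 * p * q, p * p)
    codes = [encode(g, p) for g in (0, 1, 2)]
    mean_a = sum(f * a for f, (a, _) in zip(freqs, codes))
    mean_d = sum(f * d for f, (_, d) in zip(freqs, codes))
    var_d = sum(f * (d - mean_d) ** 2 for f, (_, d) in zip(freqs, codes))
    cov_ad = sum(f * (a - mean_a) * (d - mean_d) for f, (a, d) in zip(freqs, codes))
    return mean_d, var_d, cov_ad


if __name__ == "__main__":
    print(hwe_moments(0.3))
```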


Subject(s)
Biological Specimen Banks , Datasets as Topic , Genes, Dominant/genetics , Genetic Variation , Multifactorial Inheritance/genetics , Female , Humans , Male , Models, Genetic , Polymorphism, Single Nucleotide/genetics
5.
Am J Hum Genet ; 106(6): 805-817, 2020 06 04.
Article in English | MEDLINE | ID: mdl-32442408

ABSTRACT

Despite strong transethnic genetic correlations reported in the literature for many complex traits, the non-transferability of polygenic risk scores across populations suggests the presence of population-specific components of genetic architecture. We propose an approach that models GWAS summary data for one trait in two populations to estimate genome-wide proportions of population-specific/shared causal SNPs. In simulations across various genetic architectures, we show that our approach yields approximately unbiased estimates with in-sample LD and slight upward-bias with out-of-sample LD. We analyze nine complex traits in individuals of East Asian and European ancestry, restricting to common SNPs (MAF > 5%), and find that most common causal SNPs are shared by both populations. Using the genome-wide estimates as priors in an empirical Bayes framework, we perform fine-mapping and observe that high-posterior SNPs (for both the population-specific and shared causal configurations) have highly correlated effects in East Asians and Europeans. In population-specific GWAS risk regions, we observe a 2.8× enrichment of shared high-posterior SNPs, suggesting that population-specific GWAS risk regions harbor shared causal SNPs that are undetected in the other GWASs due to differences in LD, allele frequencies, and/or sample size. Finally, we report enrichments of shared high-posterior SNPs in 53 tissue-specific functional categories and find evidence that SNP-heritability enrichments are driven largely by many low-effect common SNPs.


Subject(s)
Ethnicity/genetics , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Bayes Theorem , Europe/ethnology , Asia, Eastern/ethnology , Gene Frequency , Humans , Linkage Disequilibrium , Organ Specificity/genetics
6.
Bioinformatics ; 36(24): 5640-5648, 2021 Apr 05.
Article in English | MEDLINE | ID: mdl-33453114

ABSTRACT

MOTIVATION: While gene-environment (GxE) interactions contribute importantly to many different phenotypes, detecting such interactions requires well-powered studies and has proven difficult. To address this, we combine two approaches to improve GxE power: simultaneously evaluating multiple phenotypes and using a two-step analysis approach. Previous work shows that the power to identify a main genetic effect can be improved by simultaneously analyzing multiple related phenotypes. For a univariate phenotype, two-step methods produce higher power for detecting a GxE interaction compared to single-step analysis. Therefore, we propose a two-step approach to test for an overall GxE effect for multiple phenotypes. RESULTS: Using simulations we demonstrate that, when more than one phenotype has a GxE effect (i.e. GxE pleiotropy), our approach offers a substantial gain in power (18-43%) to detect an aggregate-level GxE effect for a multivariate phenotype compared to an analogous two-step method to identify a GxE effect for a univariate phenotype. We applied the proposed approach to simultaneously analyze three lipids (LDL, HDL, and triglycerides) with the frequency of alcohol consumption as the environmental factor in the UK Biobank. The method identified two loci with an overall GxE effect on the vector of lipids, one of which was missed by the competing approaches. AVAILABILITY AND IMPLEMENTATION: We provide an R package MPGE implementing the proposed approach which is available from CRAN: https://cran.r-project.org/web/packages/MPGE/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

7.
PLoS Comput Biol ; 17(10): e1009483, 2021 10.
Article in English | MEDLINE | ID: mdl-34673766

ABSTRACT

The number of variants that have a non-zero effect on a trait (i.e. polygenicity) is a fundamental parameter in the study of the genetic architecture of a complex trait. Although many previous studies have investigated polygenicity at a genome-wide scale, a detailed understanding of how polygenicity varies across genomic regions is currently lacking. In this work, we propose an accurate and scalable statistical framework to estimate regional polygenicity for a complex trait. We show that our approach yields approximately unbiased estimates of regional polygenicity in simulations across a wide range of genetic architectures. We then partition the polygenicity of anthropometric and blood pressure traits across 6-Mb genomic regions (N = 290K, UK Biobank) and observe that all analyzed traits are highly polygenic: over one-third of regions harbor at least one causal variant for each of the traits analyzed. Additionally, we observe wide variation in regional polygenicity: on average across all traits, 48.9% of regions contain at least 5 causal SNPs, and 5.44% of regions contain at least 50 causal SNPs. Finally, we find that heritability is proportional to polygenicity at the regional level, which is consistent with the hypothesis that heritability enrichments are largely driven by the variation in the number of causal SNPs.


Subject(s)
Genome, Human/genetics , Genome-Wide Association Study/methods , Genomics/methods , Multifactorial Inheritance/genetics , Algorithms , Blood Pressure/genetics , Humans , Polymorphism, Single Nucleotide/genetics
8.
Am J Hum Genet ; 103(4): 535-552, 2018 10 04.
Article in English | MEDLINE | ID: mdl-30290150

ABSTRACT

Although recent studies provide evidence for a common genetic basis between complex traits and Mendelian disorders, a thorough quantification of their overlap in a phenotype-specific manner remains elusive. Here, we have quantified the overlap of genes identified through large-scale genome-wide association studies (GWASs) for 62 complex traits and diseases with genes containing mutations known to cause 20 broad categories of Mendelian disorders. We identified a significant enrichment of genes linked to phenotypically matched Mendelian disorders in GWAS gene sets; of the total 1,240 comparisons, a higher proportion of phenotypically matched or related pairs (n = 50 of 92 [54%]) than phenotypically unmatched pairs (n = 27 of 1,148 [2%]) demonstrated significant overlap, confirming a phenotype-specific enrichment pattern. Further, we observed elevated GWAS effect sizes near genes linked to phenotypically matched Mendelian disorders. Finally, we report examples of GWAS variants localized at the transcription start site or physically interacting with the promoters of genes linked to phenotypically matched Mendelian disorders. Our results are consistent with the hypothesis that genes that are disrupted in Mendelian disorders are dysregulated by non-coding variants in complex traits and demonstrate how leveraging findings from related Mendelian disorders and functional genomic datasets can prioritize genes that are putatively dysregulated by local and distal non-coding GWAS variants.
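The contrast between matched and unmatched pairs reported in this abstract can be checked with simple arithmetic. The sketch below recomputes the two proportions from the stated counts and, as an illustration only, applies a one-sided Fisher exact test to the resulting 2×2 table. The choice of Fisher's test is an assumption made here for demonstration; the abstract does not say which test the authors used.

```python
from math import comb


def fisher_upper_tail(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value P(X >= a) for the table [[a, b], [c, d]],
    computed from the hypergeometric distribution with fixed margins."""
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k)
                    for k in range(a, upper + 1))
    return numerator / comb(n, row1)


# Counts taken from the abstract.
matched_sig, matched_total = 50, 92
unmatched_sig, unmatched_total = 27, 1148

matched_rate = matched_sig / matched_total        # about 54%
unmatched_rate = unmatched_sig / unmatched_total  # about 2%

matched_nonsig = matched_total - matched_sig
unmatched_nonsig = unmatched_total - unmatched_sig
odds_ratio = (matched_sig * unmatched_nonsig) / (matched_nonsig * unmatched_sig)
p_value = fisher_upper_tail(matched_sig, matched_nonsig,
                            unmatched_sig, unmatched_nonsig)

if __name__ == "__main__":
    print(f"matched: {matched_rate:.1%}, unmatched: {unmatched_rate:.1%}")
    print(f"odds ratio: {odds_ratio:.1f}, one-sided p-value: {p_value:.3g}")
```

The two groups add up to the 1,240 comparisons stated in the abstract (92 + 1,148), which is a useful consistency check on the reported figures.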


Subject(s)
Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Female , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Humans , Male , Phenotype , Promoter Regions, Genetic/genetics , Transcription Initiation Site/physiology
9.
Genet Epidemiol ; 43(6): 629-645, 2019 09.
Article in English | MEDLINE | ID: mdl-31087417

ABSTRACT

Dupuytren's disease is a common inherited tissue-specific fibrotic disorder, characterized by progressive and irreversible fibroblastic proliferation affecting the palmar fascia of the hand. Although genome-wide association studies (GWASs) have identified 24 genomic regions associated with Dupuytren's risk, the biological mechanisms driving signal at these regions remain elusive. We identify potential biological mechanisms for Dupuytren's disease by integrating the most recent, largest GWAS (3,871 cases and 4,686 controls) with eQTLs (47 tissue panels from five consortia, total n = 3,975) to perform a transcriptome-wide association study. We identify 43 tissue-specific gene associations with Dupuytren's risk, including one in a novel risk region. We also estimate the genome-wide genetic correlation between Dupuytren's disease and 45 complex traits and find significant genetic correlations between Dupuytren's disease and body mass index (BMI), type II diabetes, triglycerides, and high-density lipoprotein (HDL), suggesting a shared genetic etiology between these traits. We further examine local genetic correlation to identify 8 and 3 novel regions significantly correlated with BMI and HDL, respectively. Our results are consistent with previous epidemiological findings showing that lower BMI increases risk for Dupuytren's disease. These 12 novel risk regions provide new insight into the biological mechanisms of Dupuytren's disease and serve as a starting point for functional validation.


Subject(s)
Body Mass Index , Diabetes Mellitus, Type 2/genetics , Dupuytren Contracture/etiology , Genetic Markers , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Case-Control Studies , Chromosomes, Human, Pair 17/genetics , Dupuytren Contracture/pathology , Humans , Risk Factors
10.
bioRxiv ; 2023 Jan 24.
Article in English | MEDLINE | ID: mdl-36747759

ABSTRACT

Genome-wide association studies (GWAS) have identified thousands of variants for disease risk. These studies have predominantly been conducted in individuals of European ancestries, which raises questions about their transferability to individuals of other ancestries. Of particular interest are admixed populations, usually defined as populations with recent ancestry from two or more continental sources. Admixed genomes contain segments of distinct ancestries that vary in composition across individuals in the population, allowing for the same allele to induce risk for disease on different ancestral backgrounds. This mosaicism raises unique challenges for GWAS in admixed populations, such as the need to correctly adjust for population stratification to balance type I error with statistical power. In this work we quantify the impact of differences in estimated allelic effect sizes for risk variants between ancestry backgrounds on association statistics. Specifically, while the possibility of estimated allelic effect-size heterogeneity by ancestry (HetLanc) can be modeled when performing GWAS in admixed populations, the extent of HetLanc needed to overcome the penalty from an additional degree of freedom in the association statistic has not been thoroughly quantified. Using extensive simulations of admixed genotypes and phenotypes we find that modeling HetLanc in its absence reduces statistical power by up to 72%. This finding is especially pronounced in the presence of allele frequency differentiation. We replicate simulation results using 4,327 African-European admixed genomes from the UK Biobank for 12 traits to find that for most significant SNPs HetLanc is not large enough for GWAS to benefit from modeling heterogeneity.

11.
Nat Genet ; 54(1): 30-39, 2022 01.
Article in English | MEDLINE | ID: mdl-34931067

ABSTRACT

Although the cohort-level accuracy of polygenic risk scores (PRSs), which are estimates of genetic value at the individual level, has been widely assessed, uncertainty in PRSs remains underexplored. In the present study, we show that Bayesian PRS methods can estimate the variance of an individual's PRS and can yield well-calibrated credible intervals via posterior sampling. For 13 real traits in the UK Biobank (n = 291,273 unrelated 'white British'), we observe large variances in individual PRS estimates which impact interpretation of PRS-based stratification; averaging across traits, only 0.8% (s.d. = 1.6%) of individuals with PRS point estimates in the top decile have corresponding 95% credible intervals fully contained in the top decile. We provide an analytical estimator for the expectation of individual PRS variance as a function of SNP heritability, number of causal SNPs and sample size. Our results showcase the importance of incorporating uncertainty in individual PRS estimates into subsequent analyses.


Subject(s)
Genetic Predisposition to Disease , Multifactorial Inheritance , Risk Assessment , Uncertainty , Genetic Association Studies , Genome-Wide Association Study , Humans , Models, Genetic , Models, Statistical
12.
Genome Med ; 14(1): 104, 2022 Sep 09.
Article in English | MEDLINE | ID: mdl-36085083

ABSTRACT

BACKGROUND: Large medical centers in urban areas, like Los Angeles, care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative, an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N=36,736). METHODS: We quantify the extensive continental and subcontinental genetic diversity within the ATLAS data through principal component analysis, identity-by-descent, and genetic admixture. We assess the relationship between genetically inferred ancestry (GIA) and >1500 EHR-derived phenotypes (phecodes). Finally, we demonstrate the utility of genetic data linked with EHR to perform ancestry-specific and multi-ancestry genome and phenome-wide scans across a broad set of disease phenotypes. RESULTS: We identify 5 continental-scale GIA clusters including European American (EA), African American (AA), Hispanic Latino American (HL), South Asian American (SAA) and East Asian American (EAA) individuals and 7 subcontinental GIA clusters within the EAA GIA corresponding to Chinese American, Vietnamese American, and Japanese American individuals. Although we broadly find that self-identified race/ethnicity (SIRE) is highly correlated with GIA, we still observe marked differences between the two, emphasizing that the populations defined by these two criteria are not analogous. We find a total of 259 significant associations between continental GIA and phecodes even after accounting for individuals' SIRE, demonstrating that for some phenotypes, GIA provides information not already captured by SIRE. GWAS identifies significant associations for liver disease in the 22q13.31 locus across the HL and EAA GIA groups (HL p-value=2.32×10⁻¹⁶, EAA p-value=6.73×10⁻¹¹). A subsequent PheWAS at the top SNP reveals significant associations with neurologic and neoplastic phenotypes specifically within the HL GIA group. CONCLUSIONS: Overall, our results explore the interplay between SIRE and GIA within a disease context and underscore the utility of studying the genomes of diverse individuals through biobank-scale genotyping linked with EHR-based phenotyping.


Subject(s)
Electronic Health Records , Public Health , Asian People , Biological Specimen Banks , Genomics , Humans
13.
Nat Commun ; 11(1): 4020, 2020 08 11.
Article in English | MEDLINE | ID: mdl-32782262

ABSTRACT

While variance components analysis has emerged as a powerful tool in complex trait genetics, existing methods for fitting variance components do not scale well to large-scale datasets of genetic variation. Here, we present a method for variance components analysis that is accurate and efficient: capable of estimating one hundred variance components on a million individuals genotyped at a million SNPs in a few hours. We illustrate the utility of our method in estimating and partitioning variation in a trait explained by genotyped SNPs (SNP-heritability). Analyzing 22 traits with genotypes from 300,000 individuals across about 8 million common and low-frequency SNPs, we observe that per-allele squared effect size increases with decreasing minor allele frequency (MAF) and linkage disequilibrium (LD), consistent with the action of negative selection. Partitioning heritability across 28 functional annotations, we observe enrichment of heritability in FANTOM5 enhancers in asthma, eczema, thyroid and autoimmune disorders.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Models, Genetic , Alleles , Gene Frequency , Genotype , Humans , Linkage Disequilibrium , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable
14.
Nat Genet ; 51(8): 1244-1251, 2019 08.
Article in English | MEDLINE | ID: mdl-31358995

ABSTRACT

SNP-heritability is a fundamental quantity in the study of complex traits. Recent studies have shown that existing methods to estimate genome-wide SNP-heritability can yield biases when their assumptions are violated. While various approaches have been proposed to account for frequency- and linkage disequilibrium (LD)-dependent genetic architectures, it remains unclear which estimates reported in the literature are reliable. Here we show that genome-wide SNP-heritability can be accurately estimated from biobank-scale data irrespective of genetic architecture, without specifying a heritability model or partitioning SNPs by allele frequency and/or LD. We show analytically and through extensive simulations starting from real genotypes (UK Biobank, N = 337 K) that, unlike existing methods, our closed-form estimator is robust across a wide range of architectures. We provide estimates of SNP-heritability for 22 complex traits in the UK Biobank and show that, consistent with our results in simulations, existing biobank-scale methods yield estimates up to 30% different from our theoretically-justified approach.


Subject(s)
Biological Specimen Banks/statistics & numerical data , Genome, Human , Linkage Disequilibrium , Models, Theoretical , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable , Genome-Wide Association Study , Humans , Phenotype