Results 1 - 14 of 14
1.
Nat Genet ; 55(9): 1494-1502, 2023 09.
Article in English | MEDLINE | ID: mdl-37640881

ABSTRACT

Linkage disequilibrium (LD) is the correlation among nearby genetic variants. In genetic association studies, LD is often modeled using large correlation matrices, but this approach is inefficient, especially in ancestrally diverse studies. In the present study, we introduce LD graphical models (LDGMs), which are an extremely sparse and efficient representation of LD. LDGMs are derived from genome-wide genealogies; statistical relationships among alleles in the LDGM correspond to genealogical relationships among haplotypes. We published LDGMs and ancestry-specific LDGM precision matrices for 18 million common variants (minor allele frequency >1%) in five ancestry groups, validated their accuracy and demonstrated order-of-magnitude improvements in runtime for commonly used LD matrix computations. We implemented an extremely fast multiancestry polygenic prediction method, BLUPx-ldgm, which performs better than a similar method based on the reference LD correlation matrix. LDGMs will enable sophisticated methods that scale to ancestrally diverse genetic association data across millions of variants and individuals.
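
The efficiency argument in this abstract rests on a general fact: a dense correlation matrix can have a very sparse inverse (precision matrix). The sketch below is a toy illustration only and is not the LDGM software. It assumes a hypothetical correlation structure, `rho ** |i - j|`, chosen because its inverse is known to be tridiagonal. The function names are made up for this example.

```python
# Toy illustration (not LDGM): a dense correlation matrix with a sparse inverse.

def ar1_correlation(n, rho):
    """Dense n x n correlation matrix with entries rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def ar1_precision_times(vec, rho):
    """Multiply the tridiagonal inverse of that matrix by vec in O(n).

    Assumes len(vec) >= 2 and |rho| < 1.
    """
    n = len(vec)
    scale = 1.0 / (1.0 - rho * rho)
    out = []
    for i in range(n):
        # Diagonal entries: 1 at both ends, 1 + rho^2 in the interior.
        diag = 1.0 if i in (0, n - 1) else 1.0 + rho * rho
        total = diag * vec[i]
        # Off-diagonal entries: -rho for immediate neighbours only.
        if i > 0:
            total -= rho * vec[i - 1]
        if i < n - 1:
            total -= rho * vec[i + 1]
        out.append(scale * total)
    return out

def dense_times(matrix, vec):
    """Ordinary dense matrix-vector product, O(n^2)."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

rho = 0.8
b = [1.0, -0.5, 0.25, 2.0, 0.0, 1.5]
R = ar1_correlation(len(b), rho)
x = ar1_precision_times(b, rho)  # x = R^{-1} b without ever forming R^{-1}
residual = max(abs(u - v) for u, v in zip(dense_times(R, x), b))
```

Here `residual` measures how far `R x` is from `b`; it should be at floating-point noise level, which confirms that the sparse product solved the dense system.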


Subjects
Linkage Disequilibrium , Humans , Alleles , Gene Frequency/genetics , Genetic Association Studies , Haplotypes/genetics
2.
Bioinformatics ; 39(9), 2023 09 02.
Article in English | MEDLINE | ID: mdl-37647640

ABSTRACT

MOTIVATION: Existing methods for simulating synthetic genotype and phenotype datasets have limited scalability, constraining their usability for large-scale analyses. Moreover, a systematic approach for evaluating synthetic data quality and a benchmark synthetic dataset for developing and evaluating methods for polygenic risk scores are lacking. RESULTS: We present HAPNEST, a novel approach for efficiently generating diverse individual-level genotypic and phenotypic data. In comparison to alternative methods, HAPNEST shows faster computational speed and a lower degree of relatedness with reference panels, while generating datasets that preserve key statistical properties of real data. These desirable synthetic data properties enabled us to generate 6.8 million common variants and nine phenotypes with varying degrees of heritability and polygenicity across 1 million individuals. We demonstrate how HAPNEST can facilitate biobank-scale analyses through the comparison of seven methods to generate polygenic risk scoring across multiple ancestry groups and different genetic architectures. AVAILABILITY AND IMPLEMENTATION: A synthetic dataset of 1 008 000 individuals and nine traits for 6.8 million common variants is available at https://www.ebi.ac.uk/biostudies/studies/S-BSST936. The HAPNEST software for generating synthetic datasets is available as Docker/Singularity containers and open source Julia and C code at https://github.com/intervene-EU-H2020/synthetic_data.


Subjects
Benchmarking , Data Accuracy , Humans , Genotype , Phenotype , Multifactorial Inheritance
3.
Nature ; 614(7948): 492-499, 2023 02.
Article in English | MEDLINE | ID: mdl-36755099

ABSTRACT

Both common and rare genetic variants influence complex traits and common diseases. Genome-wide association studies have identified thousands of common-variant associations, and more recently, large-scale exome sequencing studies have identified rare-variant associations in hundreds of genes [1-3]. However, rare-variant genetic architecture is not well characterized, and the relationship between common-variant and rare-variant architecture is unclear [4]. Here we quantify the heritability explained by the gene-wise burden of rare coding variants across 22 common traits and diseases in 394,783 UK Biobank exomes [5]. Rare coding variants (allele frequency < 1 × 10⁻³) explain 1.3% (s.e. = 0.03%) of phenotypic variance on average, much less than common variants, and most burden heritability is explained by ultrarare loss-of-function variants (allele frequency < 1 × 10⁻⁵). Common and rare variants implicate the same cell types, with similar enrichments, and they have pleiotropic effects on the same pairs of traits, with similar genetic correlations. They partially colocalize at individual genes and loci, but not to the same extent: burden heritability is strongly concentrated in significant genes, while common-variant heritability is more polygenic, and burden heritability is also more strongly concentrated in constrained genes. Finally, we find that burden heritability for schizophrenia and bipolar disorder [6,7] is approximately 2%. Our results indicate that rare coding variants will implicate a tractable number of large-effect genes, that common and rare associations are mechanistically convergent, and that rare coding variants will contribute only modestly to missing heritability and population risk stratification.
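
For readers unfamiliar with the term, a gene-wise burden is commonly computed by adding up each individual's rare alleles within a gene. The following is a minimal sketch of that idea, not the study's pipeline. The function name, the input layout, the gene labels and the toy data are all assumptions made for illustration; only the allele-frequency cutoff of 1 × 10⁻³ comes from the abstract.

```python
def gene_burden(genotypes, allele_freqs, variant_gene, max_af=1e-3):
    """Return {gene: [burden per individual]}.

    genotypes[i][j] is the minor-allele count (0, 1 or 2) of individual i
    at variant j; allele_freqs[j] and variant_gene[j] describe variant j.
    """
    n_individuals = len(genotypes)
    burden = {}
    for j, (af, gene) in enumerate(zip(allele_freqs, variant_gene)):
        if af >= max_af:
            continue  # keep only variants below the rarity cutoff
        scores = burden.setdefault(gene, [0] * n_individuals)
        for i in range(n_individuals):
            scores[i] += genotypes[i][j]
    return burden

# Hypothetical toy data: 3 individuals, 4 variants, 2 genes.
genotypes = [
    [0, 1, 0, 2],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
allele_freqs = [5e-4, 2e-4, 8e-6, 0.12]
variant_gene = ["GENE_A", "GENE_A", "GENE_B", "GENE_B"]
burden = gene_burden(genotypes, allele_freqs, variant_gene)
```

The fourth variant is common (frequency 0.12), so it is excluded from the `GENE_B` burden even though two individuals carry it.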


Subjects
Exome , Gene Frequency , Genetic Variation , Multifactorial Inheritance , Humans , Exome/genetics , Genetic Variation/genetics , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Risk Factors , United Kingdom , Genetic Loci/genetics , Schizophrenia/genetics , Bipolar Disorder/genetics
4.
Nat Genet ; 54(11): 1630-1639, 2022 11.
Article in English | MEDLINE | ID: mdl-36280734

ABSTRACT

The canonical paradigm for converting genetic association to mechanism involves iteratively mapping individual associations to the proximal genes through which they act. In contrast, in the present study we demonstrate the feasibility of extracting biological insights from a very large region of the genome and leverage this strategy to study the genetic influences on autism. Using a new statistical approach, we identified the 33-Mb p-arm of chromosome 16 (16p) as harboring the greatest excess of autism's common polygenic influences. The region also includes the mechanistically cryptic and autism-associated 16p11.2 copy number variant. Analysis of RNA-sequencing data revealed that both the common polygenic influences within 16p and the 16p11.2 deletion were associated with decreased average gene expression across 16p. The transcriptional effects of the rare deletion and diffuse common variation were correlated at the level of individual genes and analysis of Hi-C data revealed patterns of chromatin contact that may explain this transcriptional convergence. These results reflect a new approach for extracting biological insight from genetic association data and suggest convergence of common and rare genetic influences on autism at 16p.


Subjects
Autistic Disorder , Humans , Autistic Disorder/genetics , DNA Copy Number Variations , Chromosomes , Chromosome Deletion , Chromosomes, Human, Pair 16/genetics
5.
Nat Genet ; 54(6): 827-836, 2022 06.
Article in English | MEDLINE | ID: mdl-35668300

ABSTRACT

Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.
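
Precision and recall are the two figures used above to summarize cS2G. The abstract's values come from a heritability-based framework; the sketch below instead uses the plain set-based definitions as a simplification, so it illustrates what the two numbers mean rather than how the authors estimated them. The SNP and gene identifiers are hypothetical.

```python
def precision_recall(predicted_links, true_links):
    """Set-based precision and recall for SNP-gene links."""
    predicted, truth = set(predicted_links), set(true_links)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical links: 4 predictions, 9 true links, 3 in common.
predicted = [("rs1", "GENE_A"), ("rs2", "GENE_B"),
             ("rs3", "GENE_C"), ("rs4", "GENE_D")]
truth = [("rs1", "GENE_A"), ("rs2", "GENE_B"), ("rs3", "GENE_C"),
         ("rs5", "GENE_E"), ("rs6", "GENE_F"), ("rs7", "GENE_G"),
         ("rs8", "GENE_H"), ("rs9", "GENE_I"), ("rs10", "GENE_J")]
precision, recall = precision_recall(predicted, truth)
```

With three of four predictions correct and three of nine true links recovered, the toy data give a precision of 0.75 and a recall of one third, the same order as the values reported for cS2G.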


Subjects
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Phenotype , Polymorphism, Single Nucleotide/genetics
6.
Am J Hum Genet ; 109(3): 405-416, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35143757

ABSTRACT

Unknown SNP-to-gene regulatory architecture complicates efforts to link noncoding GWAS associations with genes implicated by sequencing or functional studies. eQTLs are often used to link SNPs to genes, but expression in bulk tissue explains a small fraction of disease heritability. A simple but successful approach has been to link SNPs with nearby genes via base pair windows, but genes may often be regulated by SNPs outside their window. We propose the abstract mediation model (AMM) to estimate (1) the fraction of heritability mediated by the closest or kth-closest gene to each SNP and (2) the mediated heritability enrichment of a gene set (e.g., genes with rare-variant associations). AMM jointly estimates these quantities by matching the decay in SNP enrichment with distance from genes in the gene set. Across 47 complex traits and diseases, we estimate that the closest gene to each SNP mediates 27% (SE: 6%) of heritability and that a substantial fraction is mediated by genes outside the ten closest. Mendelian disease genes are strongly enriched for common-variant heritability; for example, just 21 dyslipidemia genes mediate 25% of LDL heritability (211× enrichment, p = 0.01). Among brain-related traits, genes involved in neurodevelopmental disorders are only about 4× enriched, but gene expression patterns are highly informative, as they have detectable differences in per-gene heritability even among weakly brain-expressed genes.


Subjects
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Gene Expression Regulation/genetics , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics
7.
Nat Genet ; 53(8): 1243-1249, 2021 08.
Article in English | MEDLINE | ID: mdl-34326547

ABSTRACT

The genetic effect-size distribution of a disease describes the number of risk variants, the range of their effect sizes and sample sizes that will be required to discover them. Accurate estimation has been a challenge. Here I propose Fourier Mixture Regression (FMR), validating that it accurately estimates real and simulated effect-size distributions. Applied to summary statistics for ten diseases (average [Formula: see text]), FMR estimates that 100,000-1,000,000 cases will be required for genome-wide significant SNPs to explain 50% of SNP heritability. In such large studies, genome-wide significance becomes increasingly conservative, and less stringent thresholds achieve high true positive rates if confounding is controlled. Across traits, polygenicity varies, but the range of their effect sizes is similar. Compared with effect sizes in the top 10% of heritability, including most discovered thus far, those in the bottom 10-50% are orders of magnitude smaller and more numerous, spanning a large fraction of the genome.


Subjects
Genome-Wide Association Study , Models, Genetic , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide , Biological Specimen Banks , Fourier Analysis , Genetic Predisposition to Disease , Humans , Linkage Disequilibrium , Regression Analysis , United Kingdom
8.
Nat Genet ; 52(6): 626-633, 2020 06.
Article in English | MEDLINE | ID: mdl-32424349

ABSTRACT

Disease variants identified by genome-wide association studies (GWAS) tend to overlap with expression quantitative trait loci (eQTLs), but it remains unclear whether this overlap is driven by gene expression levels 'mediating' genetic effects on disease. Here, we introduce a new method, mediated expression score regression (MESC), to estimate disease heritability mediated by the cis genetic component of gene expression levels. We applied MESC to GWAS summary statistics for 42 traits (average N = 323,000) and cis-eQTL summary statistics for 48 tissues from the Genotype-Tissue Expression (GTEx) consortium. Averaging across traits, only 11 ± 2% of heritability was mediated by assayed gene expression levels. Expression-mediated heritability was enriched in genes with evidence of selective constraint and genes with disease-appropriate annotations. Our results demonstrate that assayed bulk tissue eQTLs, although disease relevant, cannot explain the majority of disease heritability.


Subjects
Gene Expression , Genetic Predisposition to Disease , Genome-Wide Association Study/statistics & numerical data , Quantitative Trait Loci , Calibration , Genome-Wide Association Study/methods , Humans , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Regression Analysis
9.
J Allergy Clin Immunol ; 145(2): 537-549, 2020 02.
Article in English | MEDLINE | ID: mdl-31669095

ABSTRACT

BACKGROUND: Clinical and epidemiologic studies have shown that obesity is associated with asthma and that these associations differ by asthma subtype. Little is known about the shared genetic components between obesity and asthma. OBJECTIVE: We sought to identify shared genetic associations between obesity-related traits and asthma subtypes in adults. METHODS: A cross-trait genome-wide association study (GWAS) was performed using 457,822 subjects of European ancestry from the UK Biobank. Experimental evidence to support the role of genes significantly associated with both obesity-related traits and asthma through a GWAS was sought by using results from obese versus lean mouse RNA sequencing and RT-PCR experiments. RESULTS: We found a substantial positive genetic correlation between body mass index and later-onset asthma defined by asthma age of onset at 16 years or greater (Rg = 0.25, P = 9.56 × 10⁻²²). Mendelian randomization analysis provided strong evidence in support of body mass index causally increasing asthma risk. Cross-trait meta-analysis identified 34 shared loci among 3 obesity-related traits and 2 asthma subtypes. GWAS functional analyses identified potential causal relationships between the shared loci and Genotype-Tissue Expression (GTEx) quantitative trait loci and shared immune- and cell differentiation-related pathways between obesity and asthma. Finally, RNA sequencing data from lungs of obese versus control mice found that 2 genes (acyl-coenzyme A oxidase-like [ACOXL] and myosin light chain 6 [MYL6]) from the cross-trait meta-analysis were differentially expressed, and these findings were validated by using RT-PCR in an independent set of mice. CONCLUSIONS: Our work identified shared genetic components between obesity-related traits and specific asthma subtypes, reinforcing the hypothesis that obesity causally increases the risk of asthma and identifying molecular pathways that might underlie both obesity and asthma.
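
The abstract does not say which Mendelian randomization estimator was used. As one common possibility, the sketch below implements the fixed-effect inverse-variance weighted (IVW) estimator from per-SNP summary statistics. The function name and the numbers are hypothetical and are not taken from the study.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each SNP contributes the ratio beta_outcome / beta_exposure,
    weighted by beta_exposure^2 / se_outcome^2.
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)

# Hypothetical summary statistics for three instruments
# (SNP effects on the exposure, on the outcome, and outcome standard errors).
beta_bmi = [0.10, 0.20, 0.15]
beta_asthma = [0.05, 0.10, 0.075]
se_asthma = [0.01, 0.02, 0.015]
estimate, std_error = ivw_estimate(beta_bmi, beta_asthma, se_asthma)
```

Every toy instrument has the same outcome-to-exposure ratio, so the pooled estimate equals that shared ratio. Real analyses also need checks for pleiotropy, which this sketch omits.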


Subjects
Asthma/genetics , Genetic Predisposition to Disease/genetics , Obesity/genetics , Adult , Animals , Biological Specimen Banks , Body Mass Index , Female , Genome-Wide Association Study , Humans , Male , Mice , United Kingdom
10.
Nat Genet ; 51(10): 1459-1474, 2019 10.
Article in English | MEDLINE | ID: mdl-31578528

ABSTRACT

Elevated serum urate levels cause gout and correlate with cardiometabolic diseases via poorly understood mechanisms. We performed a trans-ancestry genome-wide association study of serum urate in 457,690 individuals, identifying 183 loci (147 previously unknown) that improve the prediction of gout in an independent cohort of 334,880 individuals. Serum urate showed significant genetic correlations with many cardiometabolic traits, with genetic causality analyses supporting a substantial role for pleiotropy. Enrichment analysis, fine-mapping of urate-associated loci and colocalization with gene expression in 47 tissues implicated the kidney and liver as the main target organs and prioritized potentially causal genes and variants, including the transcriptional master regulators in the liver and kidney, HNF1A and HNF4A. Experimental validation showed that HNF4A transactivated the promoter of ABCG2, encoding a major urate transporter, in kidney cells, and that HNF4A p.Thr139Ile is a functional variant. Transcriptional coregulation within and across organs may be a general mechanism underlying the observed pleiotropy between urate and cardiometabolic traits.


Subjects
Cardiovascular Diseases/blood , Genetic Markers , Gout/blood , Metabolic Diseases/blood , Polymorphism, Single Nucleotide , Signal Transduction , Uric Acid/blood , ATP Binding Cassette Transporter, Subfamily G, Member 2/genetics , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics , Cohort Studies , Genetic Loci , Genetic Predisposition to Disease , Genome-Wide Association Study , Gout/epidemiology , Gout/genetics , Hepatocyte Nuclear Factor 1-alpha/genetics , Hepatocyte Nuclear Factor 4/genetics , Humans , Kidney/metabolism , Kidney/pathology , Liver/metabolism , Liver/pathology , Metabolic Diseases/epidemiology , Metabolic Diseases/genetics , Neoplasm Proteins/genetics , Organ Specificity
11.
Am J Hum Genet ; 105(3): 456-476, 2019 09 05.
Article in English | MEDLINE | ID: mdl-31402091

ABSTRACT

Complex traits and common diseases are extremely polygenic, their heritability spread across thousands of loci. One possible explanation is that thousands of genes and loci have similarly important biological effects when mutated. However, we hypothesize that for most complex traits, relatively few genes and loci are critical, and negative selection, by purging large-effect mutations in these regions, leaves behind common-variant associations in thousands of less critical regions instead. We refer to this phenomenon as flattening. To quantify its effects, we introduce a mathematical definition of polygenicity, the effective number of independently associated SNPs (M_e), which describes how evenly the heritability of a trait is spread across the genome. We developed a method, stratified LD fourth moments regression (S-LD4M), to estimate M_e, validating that it produces robust estimates in simulations. Analyzing 33 complex traits (average N = 361k), we determined that heritability is spread ∼4× more evenly among common SNPs than among low-frequency SNPs. This difference, together with evolutionary modeling of new mutations, suggests that complex traits would be orders of magnitude less polygenic if not for the influence of negative selection. We also determined that heritability is spread more evenly within functionally important regions in proportion to their heritability enrichment; functionally important regions do not harbor common SNPs with greatly increased causal effect sizes, due to selective constraint. Our results suggest that for most complex traits, the genes and loci with the most critical biological effects often differ from those with the strongest common-variant associations.
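
The abstract does not give the formula behind the effective number of independently associated SNPs. To make the idea of "how evenly heritability is spread" concrete, the sketch below uses a generic effective-number statistic, (Σβ²)² / Σβ⁴. This is an assumption chosen for illustration and should not be read as the definition used by S-LD4M, which also has to account for LD. The effect sizes are made up.

```python
def effective_number(effects):
    """Generic evenness statistic: (sum of squares)^2 / (sum of fourth powers).

    Equals k when exactly k SNPs share the signal equally, and approaches 1
    when a single SNP dominates.
    """
    sum_sq = sum(b * b for b in effects)
    sum_fourth = sum(b ** 4 for b in effects)
    if sum_fourth == 0:
        return 0.0  # no signal at all
    return sum_sq * sum_sq / sum_fourth

# Hypothetical per-SNP effect sizes.
even = effective_number([1.0, 1.0, 1.0, 1.0, 0.0, 0.0])
concentrated = effective_number([2.0, 0.1, 0.1, 0.1, 0.0, 0.0])
```

Both toy traits have four nonzero effects, but the statistic is 4 for the evenly spread trait and close to 1 for the concentrated one, which is the kind of contrast the abstract calls flattening.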


Subjects
Multifactorial Inheritance , Selection, Genetic , Humans , Linkage Disequilibrium , Polymorphism, Single Nucleotide
12.
Nat Commun ; 10(1): 790, 2019 02 15.
Article in English | MEDLINE | ID: mdl-30770844

ABSTRACT

Understanding the role of rare variants is important in elucidating the genetic basis of human disease. Negative selection can cause rare variants to have larger per-allele effect sizes than common variants. Here, we develop a method to estimate the minor allele frequency (MAF) dependence of SNP effect sizes. We use a model in which per-allele effect sizes have variance proportional to [p(1 − p)]^α, where p is the MAF and negative values of α imply larger effect sizes for rare variants. We estimate α for 25 UK Biobank diseases and complex traits. All traits produce negative α estimates, with best-fit mean of −0.38 (s.e. 0.02) across traits. Despite larger rare variant effect sizes, rare variants (MAF < 1%) explain less than 10% of total SNP-heritability for most traits analyzed. Using evolutionary modeling and forward simulations, we validate the α model of MAF-dependent trait effects and assess plausible values of relevant evolutionary parameters.
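
The α model explains why rare variants can have larger effects yet explain little heritability. The sketch below evaluates the model's per-allele effect variance, [p(1 − p)]^α, and multiplies it by the genotype variance 2p(1 − p) to get a per-SNP contribution. The genotype-variance term assumes Hardy-Weinberg equilibrium, and the two example frequencies are arbitrary; neither is stated in the abstract. Only α = −0.38 is taken from it.

```python
def per_allele_effect_variance(maf, alpha, scale=1.0):
    """Effect-size variance under the alpha model, up to a constant `scale`."""
    return scale * (maf * (1.0 - maf)) ** alpha

def per_snp_variance_explained(maf, alpha, scale=1.0):
    """Expected phenotypic variance explained by one SNP.

    Assumes Hardy-Weinberg equilibrium, so genotype variance is 2p(1 - p).
    """
    return 2.0 * maf * (1.0 - maf) * per_allele_effect_variance(maf, alpha, scale)

alpha = -0.38               # best-fit mean reported in the abstract
rare, common = 0.005, 0.30  # arbitrary example frequencies

effect_ratio = (per_allele_effect_variance(rare, alpha)
                / per_allele_effect_variance(common, alpha))
h2_ratio = (per_snp_variance_explained(rare, alpha)
            / per_snp_variance_explained(common, alpha))
```

With a negative α between −1 and 0, `effect_ratio` is above 1 (the rare variant has the larger per-allele effect) while `h2_ratio` is below 1 (it still explains less variance per SNP), matching the qualitative conclusion of the abstract.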


Subjects
Biological Specimen Banks , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable , Selection, Genetic , Algorithms , Alleles , Gene Frequency , Genotype , Humans , Models, Genetic , United Kingdom
13.
Nat Genet ; 50(12): 1753, 2018 12.
Article in English | MEDLINE | ID: mdl-30401984

ABSTRACT

In the version of this article originally published, there were errors in equations. In the HTML and PDF, the initial term of equation 10 was estimated GCP but should have been estimated standard error, while a 'hat' was missing from the first alpha in the second term of the expression at the end of the paragraph following equation (6) in the Methods. In addition, in the abstract in the PDF, a subscript 1 was used instead of a subscript 2 for the final term of the first fourth-moment expression. These errors have been corrected in the HTML, PDF and print versions of the paper.

14.
Nat Genet ; 50(12): 1728-1734, 2018 12.
Article in English | MEDLINE | ID: mdl-30374074

ABSTRACT

Mendelian randomization, a method to infer causal relationships, is confounded by genetic correlations reflecting shared etiology. We developed a model in which a latent causal variable mediates the genetic correlation; trait 1 is partially genetically causal for trait 2 if it is strongly genetically correlated with the latent causal variable, quantified using the genetic causality proportion. We fit this model using mixed fourth moments [Formula: see text] and [Formula: see text] of marginal effect sizes for each trait; if trait 1 is causal for trait 2, then SNPs affecting trait 1 (large [Formula: see text]) will have correlated effects on trait 2 (large α₁α₂), but not vice versa. In simulations, our method avoided false positives due to genetic correlations, unlike Mendelian randomization. Across 52 traits (average n = 331,000), we identified 30 causal relationships with high genetic causality proportion estimates. Novel findings included a causal effect of low-density lipoprotein on bone mineral density, consistent with clinical trials of statins in osteoporosis.


Subjects
Causality , Disease/etiology , Genetic Predisposition to Disease , Multifactorial Inheritance , Autistic Disorder/epidemiology , Autistic Disorder/genetics , Bone Density/genetics , Computer Simulation , Disease/genetics , Genotype , Humans , Hypothyroidism/epidemiology , Hypothyroidism/genetics , Linkage Disequilibrium , Models, Theoretical , Multifactorial Inheritance/genetics , Myocardial Infarction/epidemiology , Myocardial Infarction/genetics , Osteoporosis/epidemiology , Osteoporosis/genetics , Phenotype , Polymorphism, Single Nucleotide