Results 1 - 20 of 57
1.
Nat Commun ; 10(1): 4719, 2019 Oct 17.
Article in English | MEDLINE | ID: mdl-31624269

ABSTRACT

Mosaic loss of chromosome Y (mLOY) is frequently observed in the leukocytes of ageing men. However, the genetic architecture and biological mechanisms underlying mLOY are not fully understood. In a cohort of 95,380 Japanese men, we identify 50 independent genetic markers in 46 loci associated with mLOY at a genome-wide significant level, 35 of which are unreported. Lead markers overlap enhancer marks in hematopoietic stem cells (HSCs, P ≤ 1.0 × 10^-6). mLOY genome-wide association study signals exhibit polygenic architecture and demonstrate strong heritability enrichment in regions surrounding genes specifically expressed in multipotent progenitor (MPP) cells and HSCs (P ≤ 3.5 × 10^-6). ChIP-seq data demonstrate that binding sites of FLI1, a fate-determining factor promoting HSC differentiation into platelets rather than red blood cells (RBCs), show a strong heritability enrichment (P = 1.5 × 10^-6). Consistent with these findings, platelet and RBC counts are positively and negatively associated with mLOY, respectively. Collectively, our observations improve our understanding of the mechanisms underlying mLOY.

2.
Hum Mol Genet ; 2019 Oct 09.
Article in English | MEDLINE | ID: mdl-31595288

ABSTRACT

Regulatory variation plays a major role in complex disease, and cell-type-specific binding of transcription factors (TFs) is critical to gene regulation. However, assessing the contribution of genetic variation in TF binding sites to disease heritability is challenging, as binding is often cell-type-specific and annotations from directly measured TF binding are not currently available for most cell-type-TF pairs. We investigate approaches to annotate TF binding, including directly measured chromatin data and sequence-based predictions. We find that TF binding annotations constructed by intersecting sequence-based TF binding predictions with cell-type-specific chromatin data explain a large fraction of heritability across a broad set of diseases and corresponding cell types; this strategy of constructing annotations addresses both the limitation that identical sequences may be bound or unbound depending on surrounding chromatin context, and the limitation that sequence-based predictions are generally not cell-type-specific. We partitioned the heritability of 49 diseases and complex traits using stratified LD score regression with the baseline-LD model (which is not cell-type-specific) plus the new annotations. We determined that 100 bp windows around MotifMap sequence-based TF binding predictions intersected with a union of six cell-type-specific chromatin marks (imputed using ChromImpute) performed best, with a 58% increase in heritability enrichment compared to the chromatin marks alone (11.6x vs 7.3x; P = 9 × 10^-14 for difference) and a 20% increase in cell-type-specific signal conditional on annotations from the baseline-LD model (P = 8 × 10^-11 for difference). Our results show that TF binding annotations explain substantial disease heritability and can help refine genome-wide association signals.

3.
Nat Genet ; 51(8): 1295, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31273336

ABSTRACT

In the version of the paper initially published, information on competing interests for author Benjamin M. Neale was missing. The 'Competing interests' statement should have included the sentence 'B.M.N. is on the Scientific Advisory Board of Deep Genomics'.

4.
Am J Hum Genet ; 104(5): 896-913, 2019 May 02.
Article in English | MEDLINE | ID: mdl-31051114

ABSTRACT

Recent studies have highlighted the role of gene networks in disease biology. To formally assess this, we constructed a broad set of pathway, network, and pathway+network annotations and applied stratified LD score regression to 42 diseases and complex traits (average N = 323K) to identify enriched annotations. First, we analyzed 18,119 biological pathways. We identified 156 pathway-trait pairs whose disease enrichment was statistically significant (FDR < 5%) after conditioning on all genes and 75 known functional annotations (from the baseline-LD model), a stringent step that greatly reduced the number of pathways detected; most significant pathway-trait pairs were previously unreported. Next, for each of four published gene networks, we constructed probabilistic annotations based on network connectivity. For each gene network, the network connectivity annotation was strongly significantly enriched. Surprisingly, the enrichments were fully explained by excess overlap between network annotations and regulatory annotations from the baseline-LD model, validating the informativeness of the baseline-LD model and emphasizing the importance of accounting for regulatory annotations in gene network analyses. Finally, for each of the 156 enriched pathway-trait pairs, for each of the four gene networks, we constructed pathway+network annotations by annotating genes with high network connectivity to the input pathway. For each gene network, these pathway+network annotations were strongly significantly enriched for the corresponding traits. Once again, the enrichments were largely explained by the baseline-LD model. In conclusion, gene network connectivity is highly informative for disease architectures, but the information in gene networks may be subsumed by regulatory annotations, emphasizing the importance of accounting for known annotations.

5.
Nat Commun ; 10(1): 790, 2019 Feb 15.
Article in English | MEDLINE | ID: mdl-30770844

ABSTRACT

Understanding the role of rare variants is important in elucidating the genetic basis of human disease. Negative selection can cause rare variants to have larger per-allele effect sizes than common variants. Here, we develop a method to estimate the minor allele frequency (MAF) dependence of SNP effect sizes. We use a model in which per-allele effect sizes have variance proportional to [p(1 - p)]^α, where p is the MAF and negative values of α imply larger effect sizes for rare variants. We estimate α for 25 UK Biobank diseases and complex traits. All traits produce negative α estimates, with best-fit mean of -0.38 (s.e. 0.02) across traits. Despite larger rare variant effect sizes, rare variants (MAF < 1%) explain less than 10% of total SNP-heritability for most traits analyzed. Using evolutionary modeling and forward simulations, we validate the α model of MAF-dependent trait effects and assess plausible values of relevant evolutionary parameters.


Subjects
Biological Specimen Banks; Genome-Wide Association Study/methods; Polymorphism, Single Nucleotide; Quantitative Trait, Heritable; Selection, Genetic; Algorithms; Alleles; Gene Frequency; Genotype; Humans; Models, Genetic; United Kingdom
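The α model in the abstract above can be illustrated with a short sketch. This is not the authors' estimation method; it only evaluates the stated proportionality, variance ∝ [p(1 - p)]^α, at two example frequencies. The function names, the `scale` constant, the example MAF values, and the Hardy-Weinberg genotype variance 2p(1 - p) used to convert per-allele variance into a per-SNP heritability contribution are all assumptions made for illustration.

```python
def per_allele_effect_variance(p, alpha, scale=1.0):
    """Per-allele effect-size variance under the alpha model: scale * [p(1-p)]^alpha.

    `scale` is a hypothetical proportionality constant (not given in the abstract).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("MAF p must lie strictly between 0 and 1")
    return scale * (p * (1.0 - p)) ** alpha


def per_snp_heritability(p, alpha, scale=1.0):
    """Per-SNP variance contribution, assuming Hardy-Weinberg genotype variance 2p(1-p)."""
    return 2.0 * p * (1.0 - p) * per_allele_effect_variance(p, alpha, scale)


ALPHA = -0.38                 # best-fit mean across traits reported in the abstract
RARE, COMMON = 0.005, 0.25    # illustrative MAFs chosen for this sketch

# Ratio > 1: rare variants have larger per-allele effect variance when alpha < 0.
ratio_effect = per_allele_effect_variance(RARE, ALPHA) / per_allele_effect_variance(COMMON, ALPHA)

# Ratio < 1: each rare SNP still contributes less variance, since 1 + alpha > 0.
ratio_h2 = per_snp_heritability(RARE, ALPHA) / per_snp_heritability(COMMON, ALPHA)

print(f"per-allele variance ratio (rare/common): {ratio_effect:.2f}")
print(f"per-SNP heritability ratio (rare/common): {ratio_h2:.3f}")
```

Under these assumptions the two ratios move in opposite directions, which is consistent with the abstract's statement that rare variants have larger effects yet explain a small share of SNP-heritability.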
6.
Nat Commun ; 10(1): 569, 2019 Feb 04.
Article in English | MEDLINE | ID: mdl-30718517

ABSTRACT

We introduce cross-trait penalized regression (CTPR), a powerful and practical approach for multi-trait polygenic risk prediction in large cohorts. Specifically, we propose a novel cross-trait penalty function with the Lasso and the minimax concave penalty (MCP) to incorporate the shared genetic effects across multiple traits for large-sample GWAS data. Our approach extracts information from the secondary traits that is beneficial for predicting the primary trait based on individual-level genotypes and/or summary statistics. Our novel implementation of a parallel computing algorithm makes it feasible to apply our method to biobank-scale GWAS data. We illustrate our method using large-scale GWAS data (~1M SNPs) from the UK Biobank (N = 456,837). We show that our multi-trait method outperforms the recently proposed multi-trait analysis of GWAS (MTAG) for predictive performance. The prediction accuracy for height with the aid of BMI improves from R^2 = 35.8% (MTAG) to 42.5% (MCP + CTPR) or 42.8% (Lasso + CTPR) with UK Biobank data.


Subjects
Genome-Wide Association Study/methods; Models, Genetic; Algorithms; Genotype; Humans; Phenotype; Quantitative Trait Loci/genetics
7.
Am J Hum Genet ; 104(1): 65-75, 2019 Jan 03.
Article in English | MEDLINE | ID: mdl-30595370

ABSTRACT

Functional genomics data has the potential to increase GWAS power by identifying SNPs that have a higher prior probability of association. Here, we introduce a method that leverages polygenic functional enrichment to incorporate coding, conserved, regulatory, and LD-related genomic annotations into association analyses. We show via simulations with real genotypes that the method, functionally informed novel discovery of risk loci (FINDOR), correctly controls the false-positive rate at null loci and attains a 9%-38% increase in the number of independent associations detected at causal loci, depending on trait polygenicity and sample size. We applied FINDOR to 27 independent complex traits and diseases from the interim UK Biobank release (average N = 130K). Averaged across traits, we attained a 13% increase in genome-wide significant loci detected (including a 20% increase for disease traits) compared to unweighted raw p values that do not use functional data. We replicated the additional loci in independent UK Biobank and non-UK Biobank data, yielding a highly statistically significant replication slope (0.66-0.69) in each case. Finally, we applied FINDOR to the full UK Biobank release (average N = 416K), attaining smaller relative improvements (consistent with simulations) but larger absolute improvements, detecting an additional 583 GWAS loci. In conclusion, leveraging functional enrichment using our method robustly increases GWAS power.


Subjects
Genome-Wide Association Study; Multifactorial Inheritance/genetics; Polymorphism, Single Nucleotide/genetics; Calibration; Databases, Genetic; Datasets as Topic; False Positive Reactions; Humans; Probability; Reproducibility of Results; United Kingdom
8.
Nat Genet ; 50(12): 1753, 2018 Dec.
Article in English | MEDLINE | ID: mdl-30390058

ABSTRACT

In the version of this article originally published, there were two errors in the text of the second paragraph of the Methods section. In the sentence "To identify genetic variants that contribute to doctor-diagnosed asthma and allergic diseases (detailed phenotype information described in the Supplementary Note) and link them with other conditions, we performed GWASs using phenotype measures in UK Biobank participants (N = 487,409)" the number of participants should have been 150,509. In the sentence "Thus, a total of 110,361 European descendants with high-quality genotyping and complete phenotype/covariate data were used for these analyses, including 25,685 allergic diseases subjects (hay fever/allergic rhinitis or eczema, without doctor-diagnosed asthma), 14,085 asthma subjects and 76,768 controls for the analysis" the phrase "without doctor-diagnosed asthma" should have read "some with doctor-diagnosed asthma." The errors have been corrected in the HTML and PDF versions of the article.

9.
Genet Epidemiol ; 2018 Nov 25.
Article in English | MEDLINE | ID: mdl-30474154

ABSTRACT

Recent studies have examined the genetic correlations of single-nucleotide polymorphism (SNP) effect sizes across pairs of populations to better understand the genetic architectures of complex traits. These studies have estimated ρ_g, the cross-population correlation of joint-fit effect sizes at genotyped SNPs. However, the value of ρ_g depends both on the cross-population correlation of true causal effect sizes (ρ_b) and on the similarity in linkage disequilibrium (LD) patterns in the two populations, which drive tagging effects. Here, we derive the value of the ratio ρ_g/ρ_b as a function of LD in each population. By applying existing methods to obtain estimates of ρ_g, we can use this ratio to estimate ρ_b. Our estimates of ρ_b were equal to 0.55 (SE = 0.14) between Europeans and East Asians averaged across nine traits in the Genetic Epidemiology Research on Adult Health and Aging data set, 0.54 (SE = 0.18) between Europeans and South Asians averaged across 13 traits in the UK Biobank data set, and 0.48 (SE = 0.06) and 0.65 (SE = 0.09) between Europeans and East Asians in summary statistic data sets for type 2 diabetes and rheumatoid arthritis, respectively. These results implicate substantially different causal genetic architectures across continental populations.

10.
Nat Genet ; 50(11): 1600-1607, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30297966

ABSTRACT

Common variant heritability has been widely reported to be concentrated in variants within cell-type-specific non-coding functional annotations, but little is known about low-frequency variant functional architectures. We partitioned the heritability of both low-frequency (0.5% ≤ minor allele frequency < 5%) and common (minor allele frequency ≥ 5%) variants in 40 UK Biobank traits across a broad set of functional annotations. We determined that non-synonymous coding variants explain 17 ± 1% of low-frequency variant heritability (h^2_lf) versus 2.1 ± 0.2% of common variant heritability (h^2_c). Cell-type-specific non-coding annotations that were significantly enriched for h^2_c of corresponding traits were similarly enriched for h^2_lf for most traits, but more enriched for brain-related annotations and traits. For example, H3K4me3 marks in brain dorsolateral prefrontal cortex explain 57 ± 12% of h^2_lf versus 12 ± 2% of h^2_c for neuroticism. Forward simulations confirmed that low-frequency variant enrichment depends on the mean selection coefficient of causal variants in the annotation, and can be used to predict effect size variance of causal rare variants (minor allele frequency < 0.5%).

11.
Nat Genet ; 50(10): 1483-1493, 2018 Oct.
Article in English | MEDLINE | ID: mdl-30177862

ABSTRACT

Biological interpretation of genome-wide association study data frequently involves assessing whether SNPs linked to a biological process, for example, binding of a transcription factor, show unsigned enrichment for disease signal. However, signed annotations quantifying whether each SNP allele promotes or hinders the biological process can enable stronger statements about disease mechanism. We introduce a method, signed linkage disequilibrium profile regression, for detecting genome-wide directional effects of signed functional annotations on disease risk. We validate the method via simulations and application to molecular quantitative trait loci in blood, recovering known transcriptional regulators. We apply the method to expression quantitative trait loci in 48 Genotype-Tissue Expression tissues, identifying 651 transcription factor-tissue associations including 30 with robust evidence of tissue specificity. We apply the method to 46 diseases and complex traits (average n = 290 K), identifying 77 annotation-trait associations representing 12 independent transcription factor-trait associations, and characterize the underlying transcriptional programs using gene-set enrichment analyses. Our results implicate new causal disease genes and new disease mechanisms.

12.
Nature ; 559(7714): 350-355, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29995854

ABSTRACT

The selective pressures that shape clonal evolution in healthy individuals are largely unknown. Here we investigate 8,342 mosaic chromosomal alterations, from 50 kb to 249 Mb long, that we uncovered in blood-derived DNA from 151,202 UK Biobank participants using phase-based computational techniques (estimated false discovery rate, 6-9%). We found six loci at which inherited variants associated strongly with the acquisition of deletions or loss of heterozygosity in cis. At three such loci (MPL, TM2D3-TARSL2, and FRA10B), we identified a likely causal variant that acted with high penetrance (5-50%). Inherited alleles at one locus appeared to affect the probability of somatic mutation, and at three other loci to be objects of positive or negative clonal selection. Several specific mosaic chromosomal alterations were strongly associated with future haematological malignancies. Our results reveal a multitude of paths towards clonal expansions with a wide range of effects on human health.

13.
Nat Genet ; 50(7): 1041-1047, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29942083

ABSTRACT

There is increasing evidence that many risk loci found using genome-wide association studies are molecular quantitative trait loci (QTLs). Here we introduce a new set of functional annotations based on causal posterior probabilities of fine-mapped molecular cis-QTLs, using data from the Genotype-Tissue Expression (GTEx) and BLUEPRINT consortia. We show that these annotations are more strongly enriched for heritability (5.84× for eQTLs; P = 1.19 × 10^-31) across 41 diseases and complex traits than annotations containing all significant molecular QTLs (1.80× for expression (e)QTLs). eQTL annotations obtained by meta-analyzing all GTEx tissues generally performed best, whereas tissue-specific eQTL annotations produced stronger enrichments for blood- and brain-related diseases and traits. eQTL annotations restricted to loss-of-function intolerant genes were even more enriched for heritability (17.06×; P = 1.20 × 10^-35). All molecular QTLs except splicing QTLs remained significantly enriched in joint analysis, indicating that each of these annotations is uniquely informative for disease and complex trait architectures.

14.
15.
Nat Genet ; 50(6): 857-864, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29785011

ABSTRACT

Clinical and epidemiological data suggest that asthma and allergic diseases are associated and may share a common genetic etiology. We analyzed genome-wide SNP data for asthma and allergic diseases in 33,593 cases and 76,768 controls of European ancestry from UK Biobank. Two publicly available independent genome-wide association studies were used for replication. We have found a strong genome-wide genetic correlation between asthma and allergic diseases (r_g = 0.75, P = 6.84 × 10^-62). Cross-trait analysis identified 38 genome-wide significant loci, including 7 novel shared loci. Computational analysis showed that shared genetic loci are enriched in immune/inflammatory systems and tissues with epithelium cells. Our work identifies common genetic architectures shared between asthma and allergy and will help to advance understanding of the molecular mechanisms underlying co-morbid asthma and allergic diseases.

16.
Nat Genet ; 50(4): 621-629, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29632380

ABSTRACT

We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.

17.
Circ Cardiovasc Genet ; 10(6), 2017 Dec.
Article in English | MEDLINE | ID: mdl-29237688

ABSTRACT

BACKGROUND: Previous reports have implicated multiple genetic loci associated with atrial fibrillation (AF), but the contributions of genome-wide variation to AF susceptibility have not been quantified. METHODS AND RESULTS: We assessed the contribution of genome-wide single-nucleotide polymorphism variation to AF risk (single-nucleotide polymorphism heritability, h^2_g) using data from 120 286 unrelated individuals of European ancestry (2987 with AF) in the population-based UK Biobank. We ascertained AF based on self-report, medical record billing codes, procedure codes, and death records. We estimated h^2_g using a variance components method with variants having a minor allele frequency ≥1%. We evaluated h^2_g in age, sex, and genomic strata of interest. The h^2_g for AF was 22.1% (95% confidence interval, 15.6%-28.5%) and was similar for early- versus older-onset AF (≤65 versus >65 years of age), as well as for men and women. The proportion of AF variance explained by genetic variation was mainly accounted for by common (minor allele frequency, ≥5%) variants (20.4%; 95% confidence interval, 15.1%-25.6%). Only 6.4% (95% confidence interval, 5.1%-7.7%) of AF variance was attributed to variation within known AF susceptibility, cardiac arrhythmia, and cardiomyopathy gene regions. CONCLUSIONS: Genetic variation contributes substantially to AF risk. The risk for AF conferred by genomic variation is similar to that observed for several other cardiovascular diseases. Established AF loci only explain a moderate proportion of disease risk, suggesting that further genetic discovery, with an emphasis on common variation, is warranted to understand the causal genetic basis of AF.


Subjects
Atrial Fibrillation/genetics; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Aged; European Continental Ancestry Group/genetics; Female; Gene Frequency; Genetic Predisposition to Disease/ethnology; Humans; Linkage Disequilibrium; Male; Middle Aged; Risk Factors; United Kingdom
18.
Genet Epidemiol ; 41(8): 811-823, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29110330

ABSTRACT

Methods for genetic risk prediction have been widely investigated in recent years. However, most available training data involves European samples, and it is currently unclear how to accurately predict disease risk in other populations. Previous studies have used either training data from European samples in large sample size or training data from the target population in small sample size, but not both. Here, we introduce a multiethnic polygenic risk score that combines training data from European samples and training data from the target population. We applied this approach to predict type 2 diabetes (T2D) in a Latino cohort using both publicly available European summary statistics in large sample size (N_eff = 40k) and Latino training data in small sample size (N_eff = 8k). Here, we attained a >70% relative improvement in prediction accuracy (from R^2 = 0.027 to 0.047) compared to methods that use only one source of training data, consistent with large relative improvements in simulations. We observed a systematically lower load of T2D risk alleles in Latino individuals with more European ancestry, which could be explained by polygenic selection in ancestral European and/or Native American populations. Application to predict T2D in a South Asian UK Biobank cohort using European (N_eff = 40k) and South Asian (N_eff = 16k) training data attained a >70% relative improvement in prediction accuracy, and application to predict height in an African UK Biobank cohort using European (N = 113k) and African (N = 2k) training data attained a 30% relative improvement. Our work reduces the gap in polygenic risk prediction accuracy between European and non-European target populations.


Subjects
Diabetes Mellitus, Type 2/genetics; Models, Genetic; Alleles; Cohort Studies; Diabetes Mellitus, Type 2/pathology; Ethnic Groups/genetics; Genome-Wide Association Study; Genotype; Hispanic Americans/genetics; Humans; Multifactorial Inheritance; Phenotype; Polymorphism, Single Nucleotide; Risk Factors
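The ">70% relative improvement" quoted in the abstract above follows directly from the two reported prediction accuracies for the Latino cohort. The snippet below is only a worked arithmetic check; the helper name `relative_improvement` is an illustrative choice and not part of any published software.

```python
def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Prediction R^2 for T2D in the Latino cohort, as reported in the abstract:
# 0.027 with a single training source, 0.047 with the multiethnic score.
rel = relative_improvement(0.027, 0.047)
print(f"relative improvement: {rel:.1%}")
```

The result is about 74%, which agrees with the ">70%" stated in the abstract.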
19.
Nat Genet ; 49(10): 1421-1427, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28892061

ABSTRACT

Recent work has hinted at the linkage disequilibrium (LD)-dependent architecture of human complex traits, where SNPs with low levels of LD (LLD) have larger per-SNP heritability. Here we analyzed summary statistics from 56 complex traits (average N = 101,401) by extending stratified LD score regression to continuous annotations. We determined that SNPs with low LLD have significantly larger per-SNP heritability and that roughly half of this effect can be explained by functional annotations negatively correlated with LLD, such as DNase I hypersensitivity sites (DHSs). The remaining signal is largely driven by our finding that more recent common variants tend to have lower LLD and to explain more heritability (P = 2.38 × 10^-104); the youngest 20% of common SNPs explain 3.9 times more heritability than the oldest 20%, consistent with the action of negative selection. We also inferred jointly significant effects of other LD-related annotations and confirmed via forward simulations that they jointly predict deleterious effects.


Subjects
Genetic Variation/genetics; Linkage Disequilibrium; Multifactorial Inheritance/genetics; Polymorphism, Single Nucleotide; Selection, Genetic; Alleles; Chi-Square Distribution; Datasets as Topic; Genetic Fitness; Humans; Models, Genetic; Molecular Sequence Annotation
20.
Nat Genet ; 49(6): 834-841, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28436984

ABSTRACT

The timing of puberty is a highly polygenic childhood trait that is epidemiologically associated with various adult diseases. Using 1000 Genomes Project-imputed genotype data in up to ∼370,000 women, we identify 389 independent signals (P < 5 × 10^-8) for age at menarche, a milestone in female pubertal development. In Icelandic data, these signals explain ∼7.4% of the population variance in age at menarche, corresponding to ∼25% of the estimated heritability. We implicate ∼250 genes via coding variation or associated expression, demonstrating significant enrichment in neural tissues. Rare variants near the imprinted genes MKRN3 and DLK1 were identified, exhibiting large effects when paternally inherited. Mendelian randomization analyses suggest causal inverse associations, independent of body mass index (BMI), between puberty timing and risks for breast and endometrial cancers in women and prostate cancer in men. In aggregate, our findings highlight the complexity of the genetic regulation of puberty timing and support causal links with cancer susceptibility.


Subjects
Intercellular Signaling Peptides and Proteins/genetics; Membrane Proteins/genetics; Menarche/genetics; Neoplasms/genetics; Puberty/genetics; Ribonucleoproteins/genetics; Adolescent; Age Factors; Body Mass Index; Databases, Genetic; Female; Genetic Predisposition to Disease; Genome-Wide Association Study; Genomic Imprinting; Humans; Male; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Risk Factors
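The two approximate percentages in the abstract above (∼7.4% of population variance, ∼25% of estimated heritability) together imply a total heritability estimate for age at menarche. The sketch below only performs that division; the implied figure is a back-of-envelope reading of rounded numbers, not a value stated in the abstract, and the variable names are illustrative.

```python
# Approximate values quoted in the abstract (Icelandic data).
variance_explained = 0.074         # share of population variance explained by the 389 signals
fraction_of_heritability = 0.25    # the same quantity as a share of estimated heritability

# If the signals explain 7.4% of variance and that is 25% of heritability,
# the implied heritability estimate is their ratio.
implied_heritability = variance_explained / fraction_of_heritability
print(f"implied heritability: {implied_heritability:.1%}")
```

Because both inputs are rounded, the implied heritability of roughly 30% should be read as approximate.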