ABSTRACT
Overall adiposity and body fat distribution are heritable traits associated with altered risk of cardiometabolic disease and mortality. Performing rare variant (minor allele frequency<1%) association testing using exome-sequencing data from 402,375 participants in the UK Biobank (UKB) for nine overall and tissue-specific fat distribution traits, we identified 19 genes where putatively damaging rare variation associated with at least one trait (Bonferroni-adjusted P<1.58×10⁻⁷) and 52 additional genes at FDR≤1% (P≤4.37×10⁻⁵). These 71 genes exhibited higher (P=3.58×10⁻¹⁸) common variant prioritisation scores than genes not significantly enriched for rare putatively damaging variation, with evidence of monotonic allelic series (dose-response relationships) among ultra-rare variants (minor allele count≤10) in 22 genes. Five of the 71 genes have cognate protein UKB Olink data available; all five associated (P<3.80×10⁻⁶) with three or more analysed traits. Combining rare and common variation evidence, allelic series and proteomics, we selected 17 genes for CRISPR knockout in human white adipose tissue cell lines. In three previously uncharacterised target genes, knockout increased (two-sided t-test P<0.05) lipid accumulation, a cellular phenotype relevant for fat mass traits, compared to Cas9-empty negative controls: COL5A3 (fold change [FC]=1.72, P=0.0028), EXOC7 (FC=1.35, P=0.0096), and TRIP10 (FC=1.39, P=0.0157); furthermore, knockout of SLTM resulted in reduced lipid accumulation (FC=0.51, P=1.91×10⁻⁴). Integrating across population-based genetic and in vitro functional evidence, we highlight therapeutic avenues for altering obesity and body fat distribution by modulating lipid accumulation.
ABSTRACT
Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 24.5 million primary-care health records in over 740,000 individuals in the UK Biobank, Million Veteran Program USA, and Estonian Biobank, to discover and validate the genetic architecture of adiposity trajectories. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI by 14%. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (APOE missense variant). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology of quantitative traits in adulthood.
Subjects
Adiposity, Body Mass Index, Electronic Health Records, Genome-Wide Association Study, Obesity, Single Nucleotide Polymorphism, Humans, Adiposity/genetics, Male, Female, Obesity/genetics, Middle Aged, Adult, Aged, United Kingdom, Phenotype, Estonia, United States, Genetic Predisposition to Disease
ABSTRACT
Data within biobanks capture broad yet detailed indices of human variation, but biobank-wide insights can be difficult to extract due to complexity and scale. Here, using large-scale factor analysis, we distill hundreds of variables (diagnoses, assessments and survey items) into 35 latent constructs, using data from unrelated individuals with predominantly estimated European genetic ancestry in UK Biobank. These factors recapitulate known disease classifications, disentangle elements of socioeconomic status, highlight the relevance of psychiatric constructs to health and improve measurement of pro-health behaviours. We go on to demonstrate the power of this approach to clarify genetic signal, enhance discovery and identify associations between underlying phenotypic structure and health outcomes. In building a deeper understanding of ways in which constructs such as socioeconomic status, trauma, or physical activity are structured in the dataset, we emphasize the importance of considering the interwoven nature of the human phenome when evaluating public health patterns.
Subjects
Biological Specimen Banks, Phenotype, Humans, United Kingdom, Male, Female, Social Class, Middle Aged, UK Biobank
ABSTRACT
The phenotypic impact of compound heterozygous (CH) variation has not been investigated at the population scale. We phased rare variants (MAF ~0.001%) in the UK Biobank (UKBB) exome-sequencing data to characterize recessive effects in 175,587 individuals across 311 common diseases. A total of 6.5% of individuals carry putatively damaging CH variants, 90% of which are only identifiable upon phasing rare variants (MAF < 0.38%). We identify six recessive gene-trait associations (p < 1.68 × 10⁻⁷) after accounting for relatedness, polygenicity, nearby common variants, and rare variant burden. Of these, just one is discovered when considering homozygosity alone. Using longitudinal health records, we additionally identify and replicate a novel association between bi-allelic variation in ATP2C2 and an earlier age at onset of chronic obstructive pulmonary disease (COPD) (p < 3.58 × 10⁻⁸). Genetic phase contributes to disease risk for gene-trait pairs: ATP2C2-COPD (p = 0.000238), FLG-asthma (p = 0.00205), and USH2A-visual impairment (p = 0.0084). We demonstrate the power of phasing large-scale genetic cohorts to discover phenome-wide consequences of compound heterozygosity.
Subjects
Biological Specimen Banks, Exome, Heterozygote, Phenotype, Humans, United Kingdom/epidemiology, Exome/genetics, Genetic Predisposition to Disease, Chronic Obstructive Pulmonary Disease/genetics, Female, Male, Filaggrin Proteins, Genome-Wide Association Study, Asthma/genetics, UK Biobank
ABSTRACT
Genome-wide association studies (GWASs) may help inform treatments for infertility, whose causes remain unknown in many cases. Here we present GWAS meta-analyses across six cohorts for male and female infertility in up to 41,200 cases and 687,005 controls. We identified 21 genetic risk loci for infertility (P≤5E-08), of which 12 have not been reported for any reproductive condition. We found positive genetic correlations between endometriosis and all-cause female infertility (rg=0.585, P=8.98E-14), and between polycystic ovary syndrome and anovulatory infertility (rg=0.403, P=2.16E-03). The evolutionary persistence of female infertility-risk alleles in EBAG9 may be explained by recent directional selection. We additionally identified up to 269 genetic loci associated with follicle-stimulating hormone (FSH), luteinising hormone, oestradiol, and testosterone through sex-specific GWAS meta-analyses (N=6,095-246,862). While hormone-associated variants near FSHB and ARL14EP colocalised with signals for anovulatory infertility, we found no rg between female infertility and reproductive hormones (P>0.05). Exome sequencing analyses in the UK Biobank (N=197,340) revealed that women carrying testosterone-lowering rare variants in GPC2 were at higher risk of infertility (OR=2.63, P=1.25E-03). Taken together, our results suggest that while individual genes associated with hormone regulation may be relevant for fertility, there is limited genetic evidence for correlation between reproductive hormones and infertility at the population level. We provide the first comprehensive view of the genetic architecture of infertility across multiple diagnostic criteria in men and women, and characterise its relationship to other health conditions.
ABSTRACT
Exome-sequencing association studies have successfully linked rare protein-coding variation to risk of thousands of diseases. However, the relationship between rare deleterious compound heterozygous (CH) variation and their phenotypic impact has not been fully investigated. Here, we leverage advances in statistical phasing to accurately phase rare variants (MAF ~ 0.001%) in exome sequencing data from 175,587 UK Biobank (UKBB) participants, which we then systematically annotate to identify putatively deleterious CH coding variation. We show that 6.5% of individuals carry such damaging variants in the CH state, with 90% of variants occurring at MAF < 0.34%. Using a logistic mixed model framework, systematically accounting for relatedness, polygenic risk, nearby common variants, and rare variant burden, we investigate recessive effects in common complex diseases. We find six exome-wide significant (P<1.68×10⁻⁷) and 17 nominally significant (P<5.25×10⁻⁵) gene-trait associations. Among these, only four would have been identified without accounting for CH variation in the gene. We further incorporate age-at-diagnosis information from primary care electronic health records, to show that genetic phase influences lifetime risk of disease across 20 gene-trait combinations (FDR < 5%). Using a permutation approach, we find evidence for genetic phase contributing to disease susceptibility for a collection of gene-trait pairs, including FLG-asthma (P=0.00205) and USH2A-visual impairment (P=0.0084). Taken together, we demonstrate the utility of phasing large-scale genetic sequencing cohorts for robust identification of the phenome-wide consequences of compound heterozygosity.
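To illustrate why phase matters for the compound heterozygous state described above, the following is a minimal sketch, not the authors' pipeline. It assumes each damaging variant in a gene is already phased and encoded as a pair `(haplotype_1, haplotype_2)` with `1` marking the damaging allele; the function name `classify_gene` and the category labels are hypothetical.

```python
from typing import List, Tuple

# Hypothetical encoding: one phased genotype per damaging variant in a gene,
# given as (allele on haplotype 1, allele on haplotype 2); 1 = damaging allele.
Genotype = Tuple[int, int]


def classify_gene(genotypes: List[Genotype]) -> str:
    """Classify one individual's damaging variants within a single gene."""
    # Any variant carried on both haplotypes is a homozygous hit.
    if any(a == 1 and b == 1 for a, b in genotypes):
        return "homozygous"

    # Heterozygous hits, split by the haplotype that carries them.
    hap1_hit = any(a == 1 and b == 0 for a, b in genotypes)
    hap2_hit = any(a == 0 and b == 1 for a, b in genotypes)

    # Damaging alleles on opposite haplotypes (in trans): both gene copies hit.
    if hap1_hit and hap2_hit:
        return "compound_het"

    n_het = sum(1 for a, b in genotypes if a + b == 1)
    if n_het >= 2:
        # Several heterozygous variants on the same haplotype (in cis):
        # one copy of the gene is still intact.
        return "cis_het"
    if n_het == 1:
        return "single_het"
    return "none"
```

Without phase, the `compound_het` and `cis_het` cases are indistinguishable, since both show two heterozygous variants in the same gene.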
ABSTRACT
Classical statistical genetics theory defines dominance as any deviation from a purely additive, or dosage, effect of a genotype on a trait, which is known as the dominance deviation. Dominance is well documented in plant and animal breeding. Outside of rare monogenic traits, however, evidence in humans is limited. We systematically examined common genetic variation across 1060 traits in a large population cohort (UK Biobank, N = 361,194 samples analyzed) for evidence of dominance effects. We then developed a computationally efficient method to rapidly assess the aggregate contribution of dominance deviations to heritability. Lastly, observing that dominance associations are inherently less correlated between sites at a genomic locus than their additive counterparts, we explored whether they may be leveraged to identify causal variants more confidently.
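The distinction between an additive (dosage) effect and a dominance deviation can be sketched with one common orthogonal genotype encoding. This parameterisation is an assumption made here for illustration and is not stated in the abstract: `g` is the count of the reference-coded allele (0, 1 or 2), `p` is that allele's frequency, and Hardy-Weinberg proportions are assumed. The function names are hypothetical.

```python
def additive_code(g: int, p: float) -> float:
    """Mean-centred allele dosage: the purely additive encoding."""
    return g - 2.0 * p


def dominance_code(g: int, p: float) -> float:
    """Dominance-deviation encoding, orthogonal to dosage under HWE."""
    q = 1.0 - p
    return {0: -2.0 * p * p, 1: 2.0 * p * q, 2: -2.0 * q * q}[g]


def hwe_frequencies(p: float) -> dict:
    """Genotype frequencies for 0, 1 and 2 copies under Hardy-Weinberg."""
    q = 1.0 - p
    return {0: q * q, 1: 2.0 * p * q, 2: p * p}


def hwe_moments(p: float) -> tuple:
    """Return (mean of dominance code, covariance of additive and dominance codes)."""
    freqs = hwe_frequencies(p)
    mean_d = sum(f * dominance_code(g, p) for g, f in freqs.items())
    # Both encodings have mean zero under HWE, so E[a * d] is their covariance.
    cov_ad = sum(
        f * additive_code(g, p) * dominance_code(g, p) for g, f in freqs.items()
    )
    return mean_d, cov_ad
```

Because the two encodings are uncorrelated under these assumptions, a regression on both columns separates the dosage effect from any departure from it.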
Subjects
Biological Specimen Banks, Dominant Genes, Genetic Variation, Multifactorial Inheritance, Animals, Humans, Breeding, Genotype, Genetic Models, Phenotype, Single Nucleotide Polymorphism, United Kingdom
ABSTRACT
Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 1.5 million primary-care health records in over 177,000 individuals in UK Biobank to study the genetic architecture of weight-change. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (a missense variant in APOE). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI, and higher in women than in men. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology driving quantitative trait values in adulthood.
ABSTRACT
We report results from the Bipolar Exome (BipEx) collaboration analysis of whole-exome sequencing of 13,933 patients with bipolar disorder (BD) matched with 14,422 controls. We find an excess of ultra-rare protein-truncating variants (PTVs) in patients with BD among genes under strong evolutionary constraint in both major BD subtypes. We find enrichment of ultra-rare PTVs within genes implicated from a recent schizophrenia exome meta-analysis (SCHEMA; 24,248 cases and 97,322 controls) and among binding targets of CHD8. Genes implicated from genome-wide association studies (GWASs) of BD, however, are not significantly enriched for ultra-rare PTVs. Combining gene-level results with SCHEMA, AKAP11 emerges as a definitive risk gene (odds ratio (OR) = 7.06, P = 2.83 × 10⁻⁹). At the protein level, AKAP-11 interacts with GSK3B, the hypothesized target of lithium, a primary treatment for BD. Our results lend support to BD's polygenicity, demonstrating a role for rare coding variation as a significant risk factor in BD etiology.
Subjects
Bipolar Disorder, Schizophrenia, A Kinase Anchor Proteins/genetics, Bipolar Disorder/genetics, Exome/genetics, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Schizophrenia/genetics, Exome Sequencing
ABSTRACT
Autism spectrum disorder (ASD) is diagnosed three to four times more frequently in males than in females. Genetic studies of rare variants support a female protective effect (FPE) against ASD. However, sex differences in common inherited genetic risk for ASD are less studied, particularly within families. Leveraging the Danish iPSYCH resource, we found siblings of female ASD cases (n = 1,707) had higher rates of ASD than siblings of male ASD cases (n = 6,270; p < 1.0 × 10⁻¹⁰). In the Simons Simplex and SPARK collections, mothers of ASD cases (n = 7,436) carried more polygenic risk for ASD than fathers of ASD cases (n = 5,926; 0.08 polygenic risk score [PRS] SD; p = 7.0 × 10⁻⁷). Further, male unaffected siblings under-inherited polygenic risk (n = 1,519; p = 0.03). Using both epidemiologic and genetic approaches, our findings strongly support an FPE against ASD's common inherited influences.
ABSTRACT
Genome-wide association studies have successfully discovered thousands of common variants associated with human diseases and traits, but the landscape of rare variations in human disease has not been explored at scale. Exome-sequencing studies of population biobanks provide an opportunity to systematically evaluate the impact of rare coding variations across a wide range of phenotypes to discover genes and allelic series relevant to human health and disease. Here, we present results from systematic association analyses of 4,529 phenotypes using single-variant and gene tests of 394,841 individuals in the UK Biobank with exome-sequence data. We find that the discovery of genetic associations is tightly linked to frequency and is correlated with metrics of deleteriousness and natural selection. We highlight biological findings elucidated by these data and release the dataset as a public resource alongside the Genebass browser for rapidly exploring rare-variant association results.
ABSTRACT
The use of external controls in genome-wide association study (GWAS) can significantly increase the size and diversity of the control sample, enabling high-resolution ancestry matching and enhancing the power to detect association signals. However, the aggregation of controls from multiple sources is challenging due to batch effects, difficulty in identifying genotyping errors and the use of different genotyping platforms. These obstacles have impeded the use of external controls in GWAS and can lead to spurious results if not carefully addressed. We propose a unified data harmonization pipeline that includes an iterative approach to quality control and imputation, implemented before and after merging cohorts and arrays. We apply this harmonization pipeline to aggregate 27 517 European control samples from 16 collections within dbGaP. We leverage these harmonized controls to conduct a GWAS of Crohn's disease. We demonstrate a boost in power over using the cohort samples alone, and that our procedure results in summary statistics free of any significant batch effects. This harmonization pipeline for aggregating genotype data from multiple sources can also serve other applications where individual level genotypes, rather than summary statistics, are required.
Subjects
Genome-Wide Association Study, Single Nucleotide Polymorphism, Cohort Studies, Genotype, Humans, Single Nucleotide Polymorphism/genetics, Quality Control
ABSTRACT
Genome-wide association studies (GWAS) are not fully comprehensive, as current strategies typically test only the additive model, exclude the X chromosome, and use only one reference panel for genotype imputation. We implement an extensive GWAS strategy, GUIDANCE, which improves genotype imputation by using multiple reference panels and includes the analysis of the X chromosome and non-additive models to test for association. We apply this methodology to 62,281 subjects across 22 age-related diseases and identify 94 genome-wide associated loci, including 26 previously unreported. Moreover, we observe that 27.7% of the 94 loci are missed if we use standard imputation strategies with a single reference panel, such as HRC, and only test the additive model. Among the new findings, we identify three novel low-frequency recessive variants with odds ratios larger than 4, which need at least a three-fold larger sample size to be detected under the additive model. This study highlights the benefits of applying innovative strategies to better uncover the genetic architecture of complex diseases.
Subjects
Aging, Disease/genetics, Genetic Predisposition to Disease/genetics, Human Genome/genetics, Genome-Wide Association Study/methods, Age Factors, Gene Frequency, Genome-Wide Association Study/statistics & numerical data, Genotype, Haplotypes, Humans, Phenotype, Single Nucleotide Polymorphism
ABSTRACT
Malfunctions of voltage-gated sodium and calcium channels (encoded by SCNxA and CACNA1x family genes, respectively) have been associated with severe neurologic, psychiatric, cardiac, and other diseases. Altered channel activity is frequently grouped into gain or loss of ion channel function (GOF or LOF, respectively) that often corresponds not only to clinical disease manifestations but also to differences in drug response. Experimental studies of channel function are therefore important, but laborious and usually focus only on a few variants at a time. On the basis of known gene-disease mechanisms of 19 different diseases, we inferred LOF (n = 518) and GOF (n = 309) likely pathogenic variants from the disease phenotypes of variant carriers. By training a machine learning model on sequence- and structure-based features, we predicted LOF or GOF effects [area under the receiver operating characteristics curve (ROC) = 0.85] of likely pathogenic missense variants. Our LOF versus GOF prediction corresponded to molecular LOF versus GOF effects for 87 functionally tested variants in SCN1/2/8A and CACNA1I (ROC = 0.73) and was validated in exome-wide data from 21,703 cases and 128,957 controls. We showed respective regional clustering of inferred LOF and GOF nucleotide variants across the alignment of the entire gene family, suggesting shared pathomechanisms in the SCNxA/CACNA1x family genes.
Subjects
Calcium Channels, Pharmaceutical Preparations, Missense Mutation/genetics, Phenotype, Sodium
ABSTRACT
The exome sequences of approximately 8,000 children with autism spectrum disorder (ASD) and/or attention deficit hyperactivity disorder (ADHD) and 5,000 controls were analyzed, finding that individuals with ASD and individuals with ADHD had a similar burden of rare protein-truncating variants in evolutionarily constrained genes, both significantly higher than controls. This motivated a combined analysis across ASD and ADHD, identifying microtubule-associated protein 1A (MAP1A) as a new exome-wide significant gene conferring risk for childhood psychiatric disorders.
Subjects
Attention Deficit Disorder with Hyperactivity/genetics, Autism Spectrum Disorder/genetics, Genetic Predisposition to Disease/genetics, Genetic Variation/genetics, Microtubule-Associated Proteins/genetics, Attention Deficit Disorder with Hyperactivity/complications, Autism Spectrum Disorder/complications, Case-Control Studies, Exome/genetics, Female, Humans, Male
ABSTRACT
Differences among hosts, resulting from genetic variation in the immune system or heterogeneity in drug treatment, can impact within-host pathogen evolution. Genetic association studies can potentially identify such interactions. However, extensive and correlated genetic population structure in hosts and pathogens presents a substantial risk of confounding analyses. Moreover, the multiple testing burden of interaction scanning can potentially limit power. We present a Bayesian approach for detecting host influences on pathogen evolution that exploits vast existing data sets of pathogen diversity to improve power and control for stratification. The approach models key processes, including recombination and selection, and identifies regions of the pathogen genome affected by host factors. Our simulations and empirical analysis of drug-induced selection on the HIV-1 genome show that the method recovers known associations and has superior precision-recall characteristics compared to other approaches. We build a high-resolution map of HLA-induced selection in the HIV-1 genome, identifying novel epitope-allele combinations.
Subjects
Molecular Evolution, HIV-1/genetics, HLA Antigens/immunology, Host-Pathogen Interactions/genetics, Genetic Models, Anti-HIV Agents/pharmacology, Anti-HIV Agents/therapeutic use, Bayes Theorem, Datasets as Topic, Epitopes/drug effects, Epitopes/genetics, Epitopes/immunology, Viral Genome/drug effects, HIV Infections/drug therapy, HIV Infections/immunology, HIV Infections/virology, HIV-1/drug effects, HIV-1/immunology, Host-Pathogen Interactions/immunology, Humans, Genetic Recombination/drug effects, Genetic Recombination/immunology, Genetic Selection/drug effects, Genetic Selection/immunology
ABSTRACT
Autism spectrum disorder (ASD) is a highly heritable and heterogeneous group of neurodevelopmental phenotypes diagnosed in more than 1% of children. Common genetic variants contribute substantially to ASD susceptibility, but to date no individual variants have been robustly associated with ASD. With a marked sample-size increase from a unique Danish population resource, we report a genome-wide association meta-analysis of 18,381 individuals with ASD and 27,969 controls that identified five genome-wide-significant loci. Leveraging GWAS results from three phenotypes with significantly overlapping genetic architectures (schizophrenia, major depression, and educational attainment), we identified seven additional loci shared with other traits at equally strict significance levels. Dissecting the polygenic architecture, we found both quantitative and qualitative polygenic heterogeneity across ASD subtypes. These results highlight biological insights, particularly relating to neuronal function and corticogenesis, and establish that GWAS performed at scale will be much more productive in the near term in ASD.
Subjects
Autism Spectrum Disorder/genetics, Genetic Predisposition to Disease/genetics, Single Nucleotide Polymorphism/genetics, Adolescent, Case-Control Studies, Child, Preschool Child, Denmark, Female, Genome-Wide Association Study/methods, Humans, Male, Multifactorial Inheritance/genetics, Phenotype, Risk Factors
ABSTRACT
Attention deficit/hyperactivity disorder (ADHD) is a highly heritable childhood behavioral disorder affecting 5% of children and 2.5% of adults. Common genetic variants contribute substantially to ADHD susceptibility, but no variants have been robustly associated with ADHD. We report a genome-wide association meta-analysis of 20,183 individuals diagnosed with ADHD and 35,191 controls that identifies variants surpassing genome-wide significance in 12 independent loci, finding important new information about the underlying biology of ADHD. Associations are enriched in evolutionarily constrained genomic regions and loss-of-function intolerant genes and around brain-expressed regulatory marks. Analyses of three replication studies: a cohort of individuals diagnosed with ADHD, a self-reported ADHD sample and a meta-analysis of quantitative measures of ADHD symptoms in the population, support these findings while highlighting study-specific differences on genetic overlap with educational attainment. Strong concordance with GWAS of quantitative population measures of ADHD symptoms supports that clinical diagnosis of ADHD is an extreme expression of continuous heritable traits.
Subjects
Attention Deficit Disorder with Hyperactivity/genetics, Genetic Loci/genetics, Genetic Predisposition to Disease/genetics, Single Nucleotide Polymorphism/genetics, Adolescent, Brain/physiology, Child, Preschool Child, Cohort Studies, Female, Gene Expression Regulation/genetics, Genome-Wide Association Study/methods, Humans, Male, Risk
ABSTRACT
Current antigenic targets for influenza vaccine development are either highly immunogenic epitopes of high variability or conserved epitopes of low immunogenicity. This requires continuous update of the variable epitopes in the vaccine formulation or boosting of immunity to invariant epitopes of low natural efficacy. Here we identify a highly immunogenic epitope of limited variability in the head domain of the H1 haemagglutinin protein. We show that a cohort of young children exhibit natural immunity to a set of historical influenza strains which they could not have previously encountered and that this is partially mediated through the epitope. Furthermore, vaccinating mice with these epitope conformations can induce immunity to human H1N1 influenza strains that have circulated since 1918. The identification of epitopes of limited variability offers a mechanism by which a universal influenza vaccine can be created; these vaccines would also have the potential to protect against newly emerging influenza strains.