ABSTRACT
Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes [1]. Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel [2]) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries.
Subject(s)
Body Height , Chromosome Mapping , Polymorphism, Single Nucleotide , Humans , Body Height/genetics , Gene Frequency/genetics , Genome, Human/genetics , Genome-Wide Association Study , Haplotypes/genetics , Linkage Disequilibrium/genetics , Polymorphism, Single Nucleotide/genetics , Europe/ethnology , Sample Size , Phenotype
ABSTRACT
Increased blood lipid levels are heritable risk factors of cardiovascular disease with varied prevalence worldwide owing to different dietary patterns and medication use [1]. Despite advances in prevention and treatment, in particular through reducing low-density lipoprotein cholesterol levels [2], heart disease remains the leading cause of death worldwide [3]. Genome-wide association studies (GWAS) of blood lipid levels have led to important biological and clinical insights, as well as new drug targets, for cardiovascular disease. However, most previous GWAS [4-23] have been conducted in European ancestry populations and may have missed genetic variants that contribute to lipid-level variation in other ancestry groups. These include differences in allele frequencies, effect sizes and linkage-disequilibrium patterns [24]. Here we conduct a multi-ancestry, genome-wide genetic discovery meta-analysis of lipid levels in approximately 1.65 million individuals, including 350,000 of non-European ancestries. We quantify the gain in studying non-European ancestries and provide evidence to support the expansion of recruitment of additional ancestries, even with relatively small sample sizes. We find that increasing diversity rather than studying additional individuals of European ancestry results in substantial improvements in fine-mapping functional variants and portability of polygenic prediction (evaluated in approximately 295,000 individuals from 7 ancestry groupings). Modest gains in the number of discovered loci and ancestry-specific variants were also achieved. As GWAS expand emphasis beyond the identification of genes and fundamental biology towards the use of genetic variants for preventive and precision medicine [25], we anticipate that increased diversity of participants will lead to more accurate and equitable [26] application of polygenic scores in clinical practice.
Subject(s)
Cardiovascular Diseases , Genome-Wide Association Study , Cardiovascular Diseases/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Humans , Linkage Disequilibrium , Multifactorial Inheritance , Polymorphism, Single Nucleotide/genetics , Population Groups
ABSTRACT
Findings from genome-wide association studies have facilitated the generation of genetic predictors for many common human phenotypes. Stratifying individuals misaligned to a genetic predictor based on common variants may be important for follow-up studies that aim to identify alternative causal factors. Using genome-wide imputed genetic data, we aimed to classify 158,951 unrelated individuals from the UK Biobank as either concordant or deviating from two well-measured phenotypes. We first applied our methods to standing height: our primary analysis classified 244 individuals (0.15%) as misaligned to their genetically predicted height. We show that these individuals are enriched for self-reporting being shorter or taller than average at age 10, diagnosed congenital malformations, and rare loss-of-function variants in genes previously catalogued as causal for growth disorders. Secondly, we apply our methods to LDL cholesterol (LDL-C). We classified 156 (0.12%) individuals as misaligned to their genetically predicted LDL-C and show that these individuals were enriched for both clinically actionable cardiovascular risk factors and rare genetic variants in genes previously shown to be involved in metabolic processes. Individuals whose LDL-C was higher than expected based on the genetic predictor were also at higher risk of developing coronary artery disease and type-two diabetes, even after adjustment for measured LDL-C, BMI and age, suggesting upward deviation from genetically predicted LDL-C is indicative of generally poor health. Our results remained broadly consistent when performing sensitivity analysis based on a variety of parametric and non-parametric methods to define individuals deviating from polygenic expectation. Our analyses demonstrate the potential importance of quantitatively identifying individuals for further follow-up based on deviation from genetic predictions.
Subject(s)
Coronary Artery Disease , Genome-Wide Association Study , Humans , Child , Cholesterol, LDL/genetics , Phenotype , Coronary Artery Disease/genetics , Follow-Up Studies , Mendelian Randomization Analysis , Risk Factors , Polymorphism, Single Nucleotide
ABSTRACT
A major challenge of genome-wide association studies (GWASs) is to translate phenotypic associations into biological insights. Here, we integrate a large GWAS on blood lipids involving 1.6 million individuals from five ancestries with a wide array of functional genomic datasets to discover regulatory mechanisms underlying lipid associations. We first prioritize lipid-associated genes with expression quantitative trait locus (eQTL) colocalizations and then add chromatin interaction data to narrow the search for functional genes. Polygenic enrichment analysis across 697 annotations from a host of tissues and cell types confirms the central role of the liver in lipid levels and highlights the selective enrichment of adipose-specific chromatin marks in high-density lipoprotein cholesterol and triglycerides. Overlapping transcription factor (TF) binding sites with lipid-associated loci identifies TFs relevant in lipid biology. In addition, we present an integrative framework to prioritize causal variants at GWAS loci, producing a comprehensive list of candidate causal genes and variants with multiple layers of functional evidence. We highlight two of the prioritized genes, CREBRF and RRBP1, which show convergent evidence across functional datasets supporting their roles in lipid biology.
Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Chromatin/genetics , Genomics , Humans , Lipids/genetics , Polymorphism, Single Nucleotide/genetics
ABSTRACT
Height is a highly heritable, classic polygenic trait with approximately 700 common associated variants identified through genome-wide association studies so far. Here, we report 83 height-associated coding variants with lower minor-allele frequencies (in the range of 0.1-4.8%) and effects of up to 2 centimetres per allele (such as those in IHH, STC2, AR and CRISPLD2), greater than ten times the average effect of common variants. In functional follow-up studies, rare height-increasing alleles of STC2 (giving an increase of 1-2 centimetres per allele) compromised proteolytic inhibition of PAPP-A and increased cleavage of IGFBP-4 in vitro, resulting in higher bioavailability of insulin-like growth factors. These 83 height-associated variants overlap genes that are mutated in monogenic growth disorders and highlight new biological candidates (such as ADAMTS3, IL11RA and NOX4) and pathways (such as proteoglycan and glycosaminoglycan synthesis) involved in growth. Our results demonstrate that sufficiently large sample sizes can uncover rare and low-frequency variants of moderate-to-large effect associated with polygenic human phenotypes, and that these variants implicate relevant genes and pathways.
Subject(s)
Body Height/genetics , Gene Frequency/genetics , Genetic Variation/genetics , ADAMTS Proteins/genetics , Adult , Alleles , Cell Adhesion Molecules/genetics , Female , Genome, Human/genetics , Glycoproteins/genetics , Glycoproteins/metabolism , Glycosaminoglycans/biosynthesis , Hedgehog Proteins/genetics , Humans , Intercellular Signaling Peptides and Proteins/genetics , Intercellular Signaling Peptides and Proteins/metabolism , Interferon Regulatory Factors/genetics , Interleukin-11 Receptor alpha Subunit/genetics , Male , Multifactorial Inheritance/genetics , NADPH Oxidase 4 , NADPH Oxidases/genetics , Phenotype , Pregnancy-Associated Plasma Protein-A/metabolism , Procollagen N-Endopeptidase/genetics , Proteoglycans/biosynthesis , Proteolysis , Receptors, Androgen/genetics , Somatomedins/metabolism
ABSTRACT
The growth hormone and insulin-like growth factor (IGF) system is integral to human growth. Genome-wide association studies (GWAS) have identified variants associated with height and located near the genes in this pathway. However, mechanisms underlying these genetic associations are not understood. To investigate the regulation of the genes in this pathway and mechanisms by which regulation could affect growth, we performed GWAS of measured serum protein levels of IGF-I, IGF binding protein-3 (IGFBP-3), pregnancy-associated plasma protein A2 (PAPP-A2), IGF-II and IGFBP-5 in 838 children (3-18 years) from the Cincinnati Genomic Control Cohort. We identified variants associated with protein levels near IGFBP3 and IGFBP5 genes, which contain multiple signals of association with height and other skeletal growth phenotypes. Surprisingly, variants that associate with protein levels at these two loci do not colocalize with height associations, confirmed through conditional analysis. Rather, the IGFBP3 signal (associated with total IGFBP-3 and IGF-II levels) colocalizes with an association with sitting height ratio (SHR); the IGFBP5 signal (associated with IGFBP-5 levels) colocalizes with birth weight. Indeed, height-associated single nucleotide polymorphisms near genes encoding other proteins in this pathway are not associated with serum levels, possibly excluding PAPP-A2. Mendelian randomization supports a stronger causal relationship of measured serum levels with SHR (for IGFBP-3) and birth weight (for IGFBP-5) than with height. In conclusion, we begin to characterize the genetic regulation of serum levels of IGF-related proteins in childhood. Furthermore, our data strongly suggest the existence of growth-regulating mechanisms acting through IGF-related genes in ways that are not reflected in measured serum levels of the corresponding proteins.
Subject(s)
Body Height/genetics , Growth Hormone/genetics , Insulin-Like Growth Factor Binding Protein 3/genetics , Insulin-Like Growth Factor Binding Protein 5/genetics , Insulin-Like Growth Factor I/genetics , Adolescent , Birth Weight/genetics , Child , Child, Preschool , Female , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Insulin-Like Growth Factor Binding Protein 3/blood , Insulin-Like Growth Factor Binding Protein 5/blood , Insulin-Like Growth Factor II/genetics , Male , Mendelian Randomization Analysis , Pregnancy-Associated Plasma Protein-A/genetics , Sitting Position
ABSTRACT
Body fat distribution is a heritable trait and a well-established predictor of adverse metabolic outcomes, independent of overall adiposity. To increase our understanding of the genetic basis of body fat distribution and its molecular links to cardiometabolic traits, here we conduct genome-wide association meta-analyses of traits related to waist and hip circumferences in up to 224,459 individuals. We identify 49 loci (33 new) associated with waist-to-hip ratio adjusted for body mass index (BMI), and an additional 19 loci newly associated with related waist and hip circumference measures (P < 5 × 10(-8)). In total, 20 of the 49 waist-to-hip ratio adjusted for BMI loci show significant sexual dimorphism, 19 of which display a stronger effect in women. The identified loci were enriched for genes expressed in adipose tissue and for putative regulatory elements in adipocytes. Pathway analyses implicated adipogenesis, angiogenesis, transcriptional regulation and insulin resistance as processes affecting fat distribution, providing insight into potential pathophysiological mechanisms.
Subject(s)
Adipose Tissue/metabolism , Body Fat Distribution , Genome-Wide Association Study , Insulin/metabolism , Quantitative Trait Loci/genetics , Adipocytes/metabolism , Adipogenesis/genetics , Age Factors , Body Mass Index , Epigenesis, Genetic , Europe/ethnology , Female , Genome, Human/genetics , Humans , Insulin Resistance/genetics , Male , Models, Biological , Neovascularization, Physiologic/genetics , Obesity/genetics , Polymorphism, Single Nucleotide/genetics , Racial Groups/genetics , Sex Characteristics , Transcription, Genetic/genetics , Waist-Hip Ratio
ABSTRACT
Obesity is heritable and predisposes to many diseases. To understand the genetic basis of obesity better, here we conduct a genome-wide association study and Metabochip meta-analysis of body mass index (BMI), a measure commonly used to define obesity and assess adiposity, in up to 339,224 individuals. This analysis identifies 97 BMI-associated loci (P < 5 × 10(-8)), 56 of which are novel. Five loci demonstrate clear evidence of several independent association signals, and many loci have significant effects on other metabolic phenotypes. The 97 loci account for ~2.7% of BMI variation, and genome-wide estimates suggest that common variation accounts for >20% of BMI variation. Pathway analyses provide strong support for a role of the central nervous system in obesity susceptibility and implicate new genes and pathways, including those related to synaptic function, glutamate signalling, insulin secretion/action, energy metabolism, lipid biology and adipogenesis.
Subject(s)
Body Mass Index , Genome-Wide Association Study , Obesity/genetics , Obesity/metabolism , Adipogenesis/genetics , Adiposity/genetics , Age Factors , Energy Metabolism/genetics , Europe/ethnology , Female , Genetic Predisposition to Disease/genetics , Glutamic Acid/metabolism , Humans , Insulin/metabolism , Insulin Secretion , Male , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Racial Groups/genetics , Synapses/metabolism
ABSTRACT
BACKGROUND: Obesity and its associated diseases are major health problems characterized by extensive metabolic disturbances. Understanding the causal connections between these phenotypes and variation in metabolite levels can uncover relevant biology and inform novel intervention strategies. Recent studies have combined metabolite profiling with genetic instrumental variable (IV) analysis (Mendelian randomization) to infer the direction of causality between metabolites and obesity, but often omitted a large portion of untargeted profiling data consisting of unknown, unidentified metabolite signals. METHODS: We expanded upon previous research by identifying body mass index (BMI)-associated metabolites in multiple untargeted metabolomics datasets, and then performing bidirectional IV analysis to classify metabolites based on their inferred causal relationships with BMI. Meta-analysis and pathway analysis of both known and unknown metabolites across datasets were enabled by our recently developed bioinformatics suite, PAIRUP-MS. RESULTS: We identified ten known metabolites that are more likely to be causes (e.g., alpha-hydroxybutyrate) or effects (e.g., valine) of BMI, or may have more complex bidirectional cause-effect relationships with BMI (e.g., glycine). Importantly, we also identified about five times more unknown than known metabolites in each of these three categories. Pathway analysis incorporating both known and unknown metabolites prioritized 40 enriched (p < 0.05) metabolite sets for the cause versus effect groups, providing further support that these two metabolite groups are linked to obesity via distinct biological mechanisms. CONCLUSIONS: These findings demonstrate the potential utility of our approach to uncover causal connections with obesity from untargeted metabolomics datasets. 
Combining genetically informed causal inference with the ability to map unknown metabolites across datasets provides a path to jointly analyze many untargeted datasets with obesity or other phenotypes. This approach, applied to larger datasets with genotype and untargeted metabolite data, should generate sufficient power for robust discovery and replication of causal biological connections between metabolites and various human diseases.
Subject(s)
Metabolome , Obesity/metabolism , Body Mass Index , Causality , Computational Biology , Humans , Metabolomics , Obesity/genetics
ABSTRACT
Genome-wide association studies (GWAS) have identified >300 loci associated with measures of adiposity including body mass index (BMI) and waist-to-hip ratio (adjusted for BMI, WHRadjBMI), but few have been identified through screening of the African ancestry genomes. We performed large scale meta-analyses and replications in up to 52,895 individuals for BMI and up to 23,095 individuals for WHRadjBMI from the African Ancestry Anthropometry Genetics Consortium (AAAGC) using 1000 Genomes phase 1 imputed GWAS to improve coverage of both common and low frequency variants in the low linkage disequilibrium African ancestry genomes. In the sex-combined analyses, we identified one novel locus (TCF7L2/HABP2) for WHRadjBMI and eight previously established loci at P < 5×10(-8): seven for BMI, and one for WHRadjBMI in African ancestry individuals. An additional novel locus (SPRYD7/DLEU2) was identified for WHRadjBMI when combined with European GWAS. In the sex-stratified analyses, we identified three novel loci for BMI (INTS10/LPL and MLC1 in men, IRX4/IRX2 in women) and four for WHRadjBMI (SSX2IP, CASC8, PDE3B and ZDHHC1/HSD11B2 in women) in individuals of African ancestry or both African and European ancestry. For four of the novel variants, the minor allele frequency was low (<5%). In the trans-ethnic fine mapping of 47 BMI loci and 27 WHRadjBMI loci that were locus-wide significant (P < 0.05 adjusted for effective number of variants per locus) from the African ancestry sex-combined and sex-stratified analyses, 26 BMI loci and 17 WHRadjBMI loci contained ≤ 20 variants in the credible sets that jointly account for 99% posterior probability of driving the associations. The lead variants in 13 of these loci had a high probability of being causal.
As compared to our previous HapMap imputed GWAS for BMI and WHRadjBMI including up to 71,412 and 27,350 African ancestry individuals, respectively, our results suggest that 1000 Genomes imputation showed modest improvement in identifying GWAS loci including low frequency variants. Trans-ethnic meta-analyses further improved fine mapping of putative causal variants in loci shared between the African and European ancestry populations.
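The 99% credible sets mentioned above can be illustrated with a minimal sketch. It assumes that a posterior probability of driving the association is already available for each variant at a locus; the abstract does not say how those posteriors were computed, and the function name and variant labels below are hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Return the smallest list of variants whose posteriors sum to >= coverage.

    `posteriors` maps a variant label to its posterior probability of being
    the variant driving the association at one locus (assumed input).
    """
    total = sum(posteriors.values())  # renormalise in case values do not sum to 1
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical locus with four variants (illustrative numbers only).
example = {"variant_1": 0.90, "variant_2": 0.08, "variant_3": 0.015, "variant_4": 0.005}
print(credible_set(example))
```

A locus is better resolved when this list is short, which is why the abstract reports how many loci have credible sets of 20 variants or fewer.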
Subject(s)
Adiposity/genetics , Obesity/genetics , Serine Endopeptidases/genetics , Transcription Factor 7-Like 2 Protein/genetics , Anthropometry , Black People/genetics , Body Mass Index , Chromosome Mapping , Female , Gene Frequency , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Linkage Disequilibrium , Male , Obesity/pathology , Polymorphism, Single Nucleotide , Waist-Hip Ratio , White People/genetics
ABSTRACT
Human height is a composite measurement, reflecting the sum of leg, spine, and head lengths. Many common variants influence total height, but the effects of these or other variants on the components of height (body proportion) remain largely unknown. We studied sitting height ratio (SHR), the ratio of sitting height to total height, to identify such effects in 3,545 African Americans and 21,590 individuals of European ancestry. We found that SHR is heritable: 26% and 39% of the total variance of SHR can be explained by common variants in European and African Americans, respectively, and global European admixture is negatively correlated with SHR in African Americans (r(2) ≈ 0.03). Six regions reached genome-wide significance (p < 5 × 10(-8)) for association with SHR and overlapped biological candidate genes, including TBX2 and IGFBP3. We found that 130 of 670 height-associated variants are nominally associated (p < 0.05) with SHR, more than expected by chance (p = 5 × 10(-40)). At these 130 loci, the height-increasing alleles are associated with either a decrease (71 loci) or increase (59 loci) in SHR, suggesting that different height loci disproportionally affect either leg length or spine/head length. Pathway analyses via DEPICT revealed that height loci affecting SHR, and especially those affecting leg length, show enrichment of different biological pathways (e.g., bone/cartilage/growth plate pathways) than do loci with no effect on SHR (e.g., embryonic development). These results highlight the value of using a pair of related but orthogonal phenotypes, in this case SHR with height, as a prism to dissect the biology underlying genetic associations in polygenic traits and diseases.
Subject(s)
Body Height/genetics , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Adult , Black or African American/genetics , Chromosome Mapping , Female , Humans , Leg Bones/growth & development , Male , Middle Aged , Phenotype , Polymorphism, Single Nucleotide , White People/genetics
ABSTRACT
BACKGROUND: A fundamental precept of the carbohydrate-insulin model of obesity is that insulin secretion drives weight gain. However, fasting hyperinsulinemia can also be driven by obesity-induced insulin resistance. We used genetic variation to isolate and estimate the potentially causal effect of insulin secretion on body weight. METHODS: Genetic instruments of variation of insulin secretion [assessed as insulin concentration 30 min after oral glucose (insulin-30)] were used to estimate the causal relationship between increased insulin secretion and body mass index (BMI), using bidirectional Mendelian randomization analysis of genome-wide association studies. Data sources included summary results from the largest published meta-analyses of predominantly European ancestry for insulin secretion (n = 26037) and BMI (n = 322154), as well as individual-level data from the UK Biobank (n = 138541). Data from the Cardiology and Metabolic Patient Cohort study at Massachusetts General Hospital (n = 1675) were used to validate genetic associations with insulin secretion and to test the observational association of insulin secretion and BMI. RESULTS: Higher genetically determined insulin-30 was strongly associated with higher BMI (β = 0.098, P = 2.2 × 10(-21)), consistent with a causal role in obesity. Similar positive associations were noted in sensitivity analyses using other genetic variants as instrumental variables. By contrast, higher genetically determined BMI was not associated with insulin-30. CONCLUSIONS: Mendelian randomization analyses provide evidence for a causal relationship of glucose-stimulated insulin secretion on body weight, consistent with the carbohydrate-insulin model of obesity.
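As an illustration of how summary-level Mendelian randomization combines per-variant effects, the sketch below computes an inverse-variance weighted (IVW) estimate. IVW is a common estimator, but the abstract does not state which estimator was used, so this is an assumption; the function name and the input numbers are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted Mendelian randomization estimate.

    Each variant i gives a Wald ratio beta_outcome[i] / beta_exposure[i],
    weighted by beta_exposure[i]**2 / se_outcome[i]**2 (first-order weights).
    Returns (causal effect estimate, standard error).
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    return estimate, std_error


# Hypothetical per-variant effects on the exposure (e.g. insulin-30) and outcome (e.g. BMI).
effect, se = ivw_estimate([0.1, 0.2], [0.01, 0.02], [0.01, 0.01])
print(effect, se)
```

In a bidirectional analysis the same calculation would be repeated with the roles of exposure and outcome swapped, using instruments for the other trait.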
Subject(s)
Dietary Carbohydrates/administration & dosage , Insulin Secretion/genetics , Mendelian Randomization Analysis , Obesity/genetics , Obesity/metabolism , Body Mass Index , Cohort Studies , Fasting , Genome-Wide Association Study , Glucose/administration & dosage , Humans , Insulin Resistance , Models, Biological , Polymorphism, Single Nucleotide , Reproducibility of Results
ABSTRACT
Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits, but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes. Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation). 
Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.
Subject(s)
Body Height/genetics , Genetic Loci/genetics , Genome, Human/genetics , Metabolic Networks and Pathways/genetics , Polymorphism, Single Nucleotide/genetics , Chromosomes, Human, Pair 3/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Multifactorial Inheritance/genetics , Phenotype
ABSTRACT
Given the anthropometric differences between men and women and previous evidence of sex-difference in genetic effects, we conducted a genome-wide search for sexually dimorphic associations with height, weight, body mass index, waist circumference, hip circumference, and waist-to-hip-ratio (133,723 individuals) and took forward 348 SNPs into follow-up (additional 137,052 individuals) in a total of 94 studies. Seven loci displayed significant sex-difference (FDR<5%), including four previously established (near GRB14/COBLL1, LYPLAL1/SLC30A10, VEGFA, ADAMTS9) and three novel anthropometric trait loci (near MAP3K1, HSD17B4, PPARG), all of which were genome-wide significant in women (P<5×10(-8)), but not in men. Sex-differences were apparent only for waist phenotypes, not for height, weight, BMI, or hip circumference. Moreover, we found no evidence for genetic effects with opposite directions in men versus women. The PPARG locus is of specific interest due to its role in diabetes genetics and therapy. Our results demonstrate the value of sex-specific GWAS to unravel the sexually dimorphic genetic underpinning of complex traits.
Subject(s)
Anthropometry/methods , Body Weights and Measures , Genome-Wide Association Study , Sex Characteristics , Body Height/genetics , Body Mass Index , Body Weight/genetics , Female , Genetic Loci , Genome, Human , Humans , Male , Polymorphism, Single Nucleotide , Waist Circumference/genetics , Waist-Hip Ratio
ABSTRACT
Findings from genome-wide association studies have facilitated the generation of genetic predictors for many common human phenotypes. Stratifying individuals misaligned to a genetic predictor based on common variants may be important for follow-up studies that aim to identify alternative causal factors. Using genome-wide imputed genetic data, we aimed to classify 158,951 unrelated individuals from the UK Biobank as either concordant or deviating from two well-measured phenotypes. We first applied our methods to standing height: our primary analysis classified 244 individuals (0.15%) as misaligned to their genetically predicted height. We show that these individuals are enriched for self-reporting being shorter or taller than average at age 10, diagnosed congenital malformations, and rare loss-of-function variants in genes previously catalogued as causal for growth disorders. Secondly, we apply our methods to LDL cholesterol. We classified 156 (0.12%) individuals as misaligned to their genetically predicted LDL cholesterol and show that these individuals were enriched for both clinically actionable cardiovascular risk factors and rare genetic variants in genes previously shown to be involved in metabolic processes. Individuals whose LDL-C was higher than expected based on the genetic predictor were also at higher risk of developing coronary artery disease and type-two diabetes, even after adjustment for measured LDL-C, BMI and age, suggesting upward deviation from genetically predicted LDL-C is indicative of generally poor health. Our results remained broadly consistent when performing sensitivity analysis based on a variety of parametric and non-parametric methods to define individuals deviating from polygenic expectation. Our analyses demonstrate the potential importance of quantitatively identifying individuals for further follow-up based on deviation from genetic predictions. 
Author Summary: Human genetics is becoming increasingly useful to help predict human traits across a population owing to findings from large-scale genetic association studies and advances in the power of genetic predictors. This provides an opportunity to potentially identify individuals who deviate from genetic predictions for a common phenotype under investigation. For example, an individual may be genetically predicted to be tall, but be shorter than expected. It is potentially important to identify individuals who deviate from genetic predictions as this can facilitate further follow-up to assess likely causes. Using 158,951 unrelated individuals from the UK Biobank, with height and LDL cholesterol as exemplar traits, we demonstrate that approximately 0.15% and 0.12% of individuals, respectively, deviate from their genetically predicted phenotypes. We observed these individuals to be enriched for a range of rare clinical diagnoses, as well as rare genetic factors that may be causal. Our analyses also demonstrate several methods for detecting individuals who deviate from genetic predictions that can be applied to a range of continuous human phenotypes.
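One simple way to flag individuals who deviate from a genetic prediction is sketched below. It regresses the measured phenotype on a polygenic score and flags standardized residuals beyond a cut-off. This is only an assumed illustration: the abstract refers to several parametric and non-parametric methods without detailing them, and the function name, threshold and toy data here are hypothetical.

```python
from statistics import fmean, pstdev


def flag_deviators(scores, phenotypes, z_threshold=3.0):
    """Return indices of individuals whose phenotype deviates from the value
    predicted by a simple least-squares fit on the polygenic score.
    """
    mean_s, mean_p = fmean(scores), fmean(phenotypes)
    sxx = sum((s - mean_s) ** 2 for s in scores)
    sxy = sum((s - mean_s) * (p - mean_p) for s, p in zip(scores, phenotypes))
    slope = sxy / sxx
    intercept = mean_p - slope * mean_s
    # Residual = measured phenotype minus genetically predicted phenotype.
    residuals = [p - (intercept + slope * s) for s, p in zip(scores, phenotypes)]
    sd = pstdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r / sd) > z_threshold]


# Toy data: height in cm follows the score exactly, except one individual
# who is 10 cm taller than the score predicts.
toy_scores = [float(i) for i in range(21)]
toy_heights = [170.0 + 0.5 * s for s in toy_scores]
toy_heights[10] += 10.0
print(flag_deviators(toy_scores, toy_heights))
```

A real analysis would also adjust for covariates such as age, sex and ancestry before computing residuals; the sketch omits this for brevity.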
ABSTRACT
Alterations in the growth and maturation of chondrocytes can lead to variation in human height, including monogenic disorders of skeletal growth. We aimed to identify genes and pathways relevant to human growth by pairing human height genome-wide association studies (GWASs) with genome-wide knockout (KO) screens of growth-plate chondrocyte proliferation and maturation in vitro. We identified 145 genes that alter chondrocyte proliferation and maturation at early and/or late time points in culture, with 90% of genes validating in secondary screening. These genes are enriched in monogenic growth disorder genes and in KEGG pathways critical for skeletal growth and endochondral ossification. Further, common variants near these genes capture height heritability independent of genes computationally prioritized from GWASs. Our study emphasizes the value of functional studies in biologically relevant tissues as orthogonal datasets to refine likely causal genes from GWASs and implicates new genetic regulators of chondrocyte proliferation and maturation.
ABSTRACT
Human height can be divided into sitting height and leg length, reflecting growth of different parts of the skeleton whose relative proportions are captured by the ratio of sitting to total height (as sitting height ratio, SHR). Height is a highly heritable trait, and its genetic basis has been well-studied. However, the genetic determinants of skeletal proportion are much less well-characterized. Expanding substantially on past work, we performed a genome-wide association study (GWAS) of SHR in ~450,000 individuals with European ancestry and ~100,000 individuals with East Asian ancestry from the UK and China Kadoorie Biobanks. We identified 565 loci independently associated with SHR, including all genomic regions implicated in prior GWAS in these ancestries. While SHR loci largely overlap height-associated loci (P < 0.001), the fine-mapped SHR signals were often distinct from height. We additionally used fine-mapped signals to identify 36 credible sets with heterogeneous effects across ancestries. Lastly, we used SHR, sitting height, and leg length to identify genetic variation acting on specific body regions rather than on overall human height.
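The three measures named above can be written out as a small sketch. The SHR definition (sitting height divided by total height) follows the abstract; deriving leg length as total height minus sitting height is an assumption made here for illustration, and the function names are hypothetical.

```python
def sitting_height_ratio(sitting_cm, total_cm):
    """SHR = sitting height / total height, as defined in the abstract."""
    if not 0 < sitting_cm < total_cm:
        raise ValueError("expected 0 < sitting height < total height")
    return sitting_cm / total_cm


def leg_length(sitting_cm, total_cm):
    """Leg length as total minus sitting height (an assumed derivation)."""
    if not 0 < sitting_cm < total_cm:
        raise ValueError("expected 0 < sitting height < total height")
    return total_cm - sitting_cm


# Two people of equal height can differ in proportion:
print(sitting_height_ratio(90.0, 180.0), leg_length(90.0, 180.0))
print(sitting_height_ratio(95.0, 180.0), leg_length(95.0, 180.0))
```

Because SHR is a ratio, it separates body proportion from overall size, which is what allows variants acting on specific body regions to be distinguished from those acting on total height.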
ABSTRACT
BACKGROUND: We aimed to identify novel genetic variants affecting asthma risk, since these might provide novel insights into molecular mechanisms underlying the disease. METHODS: We did a genome-wide association study (GWAS) in 2669 physician-diagnosed asthmatics and 4528 controls from Australia. Seven loci were prioritised for replication after combining our results with those from the GABRIEL consortium (n=26,475), and these were tested in an additional 25,358 independent samples from four in-silico cohorts. Quantitative multi-marker scores of genetic load were constructed on the basis of results from the GABRIEL study and tested for association with asthma in our Australian GWAS dataset. FINDINGS: Two loci were confirmed to associate with asthma risk in the replication cohorts and reached genome-wide significance in the combined analysis of all available studies (n=57,800): rs4129267 (OR 1·09, combined p=2·4×10⁻⁸) in the interleukin-6 receptor (IL6R) gene and rs7130588 (OR 1·09, p=1·8×10⁻⁸) on chromosome 11q13.5 near the leucine-rich repeat containing 32 gene (LRRC32, also known as GARP). The 11q13.5 locus was significantly associated with atopic status among asthmatics (OR 1·33, p=7×10⁻⁴), suggesting that it is a risk factor for allergic but not non-allergic asthma. Multi-marker association results are consistent with a highly polygenic contribution to asthma risk, including loci with weak effects that might be shared with other immune-related diseases, such as NDFIP1, HLA-B, LPP, and BACH2. INTERPRETATION: The IL6R association further supports the hypothesis that cytokine signalling dysregulation affects asthma risk, and raises the possibility that an IL6R antagonist (tocilizumab) may be effective to treat the disease, perhaps in a genotype-dependent manner. Results for the 11q13.5 locus suggest that it directly increases the risk of allergic sensitisation which, in turn, increases the risk of subsequent development of asthma.
Larger or more functionally focused studies are needed to characterise the many loci with modest effects that remain to be identified for asthma. FUNDING: National Health and Medical Research Council of Australia. A full list of funding sources is provided in the webappendix.
Subject(s)
Asthma/genetics , Chromosomes, Human, Pair 11/genetics , Genetic Loci/genetics , Membrane Proteins/genetics , Polymorphism, Single Nucleotide , Receptors, Interleukin-6/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Asthma/immunology , Child , Child, Preschool , Female , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Hypersensitivity, Immediate/genetics , Linkage Disequilibrium , Male , Middle Aged , Young Adult
ABSTRACT
BACKGROUND: Genetic variants within nearly 1000 loci are known to contribute to modulation of blood lipid levels. However, the biological pathways underlying these associations are frequently unknown, limiting understanding of these findings and hindering downstream translational efforts such as drug target discovery. RESULTS: To expand our understanding of the underlying biological pathways and mechanisms controlling blood lipid levels, we leverage a large multi-ancestry meta-analysis (N = 1,654,960) of blood lipids to prioritize putative causal genes for 2286 lipid associations using six gene prediction approaches. Using phenome-wide association (PheWAS) scans, we identify relationships of genetically predicted lipid levels to other diseases and conditions. We confirm known pleiotropic associations with cardiovascular phenotypes and determine novel associations, notably with cholelithiasis risk. We perform sex-stratified GWAS meta-analysis of lipid levels and show that 3-5% of autosomal lipid-associated loci demonstrate sex-biased effects. Finally, we report 21 novel lipid loci identified on the X chromosome. Many of the sex-biased autosomal and X chromosome lipid loci show pleiotropic associations with sex hormones, emphasizing the role of hormone regulation in lipid metabolism. CONCLUSIONS: Taken together, our findings provide insights into the biological mechanisms through which associated variants lead to altered lipid levels and potentially cardiovascular disease risk.