Results 1 - 20 of 368
1.
Am J Hum Genet ; 111(10): 2139-2149, 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39366334

ABSTRACT

Gene-based burden tests are a popular and powerful approach for analysis of exome-wide association studies. These approaches combine sets of variants within a gene into a single burden score that is then tested for association. Typically, a range of burden scores are calculated and tested across a range of annotation classes and frequency bins. Correlation between these tests can complicate the multiple testing correction and hamper interpretation of the results. We introduce a method called the sparse burden association test (SBAT) that tests the joint set of burden scores under the assumption that causal burden scores act in the same effect direction. The method simultaneously assesses the significance of the model fit and selects the set of burden scores that best explains the association. Using simulated data, we show that the method is well calibrated and highlight scenarios where the test outperforms existing gene-based tests. We apply the method to 73 quantitative traits from the UK Biobank, showing that SBAT is a valuable additional gene-based test when combined with other existing approaches. This test is implemented in the REGENIE software.


Subject(s)
Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , Least-Squares Analysis , Software , Models, Genetic , Exome/genetics , Genetic Variation , Computer Simulation
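
The basic idea behind the burden scores described in the abstract above can be illustrated with a minimal sketch. This is not SBAT and not the REGENIE implementation: it shows only a single, plain burden test on simulated data, with equal variant weights, a toy allele frequency, and hypothetical function names (`burden_scores`, `linear_association`) chosen for illustration.

```python
import math
import random


def burden_scores(genotypes, weights):
    """Collapse per-variant allele counts (0/1/2) into one score per individual."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def linear_association(x, y):
    """Least-squares regression of y on x; returns (slope, t statistic)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se


# Toy data (all values are assumptions): 2000 individuals, 10 rare variants in one gene.
rng = random.Random(0)
n_ind, n_var, maf = 2000, 10, 0.01
genotypes = [
    [int(rng.random() < maf) + int(rng.random() < maf) for _ in range(n_var)]
    for _ in range(n_ind)
]
weights = [1.0] * n_var  # equal weights; real analyses often weight by annotation or frequency
scores = burden_scores(genotypes, weights)

# Simulated quantitative trait in which every rare allele shifts the trait by 0.5.
phenotype = [0.5 * s + rng.gauss(0, 1) for s in scores]
slope, t_stat = linear_association(scores, phenotype)
print(f"burden effect estimate: {slope:.3f}, t = {t_stat:.2f}")
```

In practice one such score would be built per annotation class and frequency bin, which is what produces the correlated tests the abstract refers to.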
2.
Genet Epidemiol ; 2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39385445

ABSTRACT

Persistent opioid use after surgery is a common morbidity outcome associated with subsequent opioid use disorder, overdose, and death. While phenotypic associations have been described, genetic associations remain unidentified. Here, we conducted the largest genetic study of persistent opioid use after surgery, comprising ~40,000 non-Hispanic, European-ancestry Michigan Genomics Initiative participants (3198 cases and 36,321 surgically exposed controls). Our study primarily focused on the reproducibility and reliability of 72 genetic studies of opioid use disorder phenotypes. Nominal associations (p < 0.05) occurred at 12 of 80 unique (r² < 0.8) signals from these studies. Six occurred in OPRM1 (most significant: rs79704991-T, OR = 1.17, p = 8.7 × 10⁻⁵), with two surviving multiple testing correction. Other associations were rs640561-LRRIQ3 (p = 0.015), rs4680-COMT (p = 0.016), rs9478495 (p = 0.017, intergenic), rs10886472-GRK5 (p = 0.028), rs9291211-SLC30A9/BEND4 (p = 0.043), and rs112068658-KCNN1 (p = 0.048). Two highly referenced genes, OPRD1 and DRD2/ANKK1, had no signals in MGI. Associations at previously identified OPRM1 variants suggest common biology between persistent opioid use and opioid use disorder, further demonstrating connections between opioid dependence and addiction phenotypes. Lack of significant associations at other variants challenges previous studies' reliability.

3.
medRxiv ; 2024 Aug 26.
Article in English | MEDLINE | ID: mdl-39228737

ABSTRACT

Clonal hematopoiesis (CH) is defined by the expansion of a lineage of genetically identical cells in blood. Genetic lesions that confer a fitness advantage, such as point mutations or mosaic chromosomal alterations (mCAs) in genes associated with hematologic malignancy, are frequent mediators of CH. However, recent analyses of both single cell-derived colonies of hematopoietic cells and population sequencing cohorts have revealed CH frequently occurs in the absence of known driver genetic lesions. To characterize CH without known driver genetic lesions, we used 51,399 deeply sequenced whole genomes from the NHLBI TOPMed sequencing initiative to perform simultaneous germline and somatic mutation analyses among individuals without leukemogenic point mutations (LPM), which we term CH-LPMneg. We quantified CH by estimating the total mutation burden. Because estimating somatic mutation burden without a paired-tissue sample is challenging, we developed a novel statistical method, the Genomic and Epigenomic informed Mutation (GEM) rate, that uses external genomic and epigenomic data sources to distinguish artifactual signals from true somatic mutations. We performed a genome-wide association study of GEM to discover the germline determinants of CH-LPMneg. After fine-mapping and variant-to-gene analyses, we identified seven genes associated with CH-LPMneg (TCL1A, TERT, SMC4, NRIP1, PRDM16, MSRA, SCARB1), and one locus associated with a sex-associated mutation pathway (SRGAP2C). We performed a secondary analysis excluding individuals with mCAs, finding that the genetic architecture was largely unaffected by their inclusion. Functional analyses of SMC4 and NRIP1 implicated altered HSC self-renewal and proliferation as the primary mediator of mutation burden in blood. We then performed comprehensive multi-tissue transcriptomic analyses, finding that the expression levels of 404 genes are associated with GEM. 
Finally, we performed phenotypic association meta-analyses across four cohorts, finding that GEM is associated with increased white blood cell count and increased risk for incident peripheral artery disease, but is not significantly associated with incident stroke or coronary disease events. Overall, we develop GEM for quantifying mutation burden from WGS without a paired-tissue sample and use GEM to discover the genetic, genomic, and phenotypic correlates of CH-LPMneg.

4.
Mitochondrion ; 79: 101954, 2024 Sep 07.
Article in English | MEDLINE | ID: mdl-39245194

ABSTRACT

We rigorously assessed a comprehensive association testing framework for heteroplasmy, employing both simulated and real-world data. This framework employed a variant allele fraction (VAF) threshold and harnessed multiple gene-based tests for robust identification and association testing of heteroplasmy. Our simulation studies demonstrated that gene-based tests maintained an appropriate type I error rate at α = 0.001. Notably, when 5% or more heteroplasmic variants within a target region were linked to an outcome, burden-extension tests (including the adaptive burden test, variable threshold burden test, and z-score weighting burden test) outperformed the sequence kernel association test (SKAT) and the original burden test. Applying this framework, we conducted association analyses on whole-blood derived heteroplasmy in 17,507 individuals of African and European ancestries (31% of African Ancestry, mean age of 62, with 58% women) with whole genome sequencing data. We performed both cohort- and ancestry-specific association analyses, followed by meta-analysis on both pooled samples and within each ancestry group. Our results suggest that mtDNA-encoded genes/regions are likely to exhibit varying rates in somatic aging, with notably strong associations observed between heteroplasmy in the RNR1 and RNR2 genes (p < 0.001) and advancing age by the Original Burden test. In contrast, SKAT identified significant associations (p < 0.001) between diabetes and the aggregated effects of heteroplasmy in several protein-coding genes. Further research is warranted to validate these findings. In summary, our proposed statistical framework represents a valuable tool for facilitating association testing of heteroplasmy with disease traits in large human populations.

5.
Nat Genet ; 2024 Sep 25.
Article in English | MEDLINE | ID: mdl-39322778

ABSTRACT

Whole-genome sequencing (WGS), whole-exome sequencing (WES) and array genotyping with imputation (IMP) are common strategies for assessing genetic variation and its association with medically relevant phenotypes. To date, there has been no systematic empirical assessment of the yield of these approaches when applied to hundreds of thousands of samples to enable the discovery of complex trait genetic signals. Using data for 100 complex traits from 149,195 individuals in the UK Biobank, we systematically compare the relative yield of these strategies in genetic association studies. We find that WGS and WES combined with arrays and imputation (WES + IMP) have the largest association yield. Although WGS results in an approximately fivefold increase in the total number of assayed variants over WES + IMP, the number of detected signals differed by only 1% for both single-variant and gene-based association analyses. Given that WES + IMP typically results in savings of lab and computational time and resources expended per sample, we evaluate the potential benefits of applying WES + IMP to larger samples. When we extend our WES + IMP analyses to 468,169 UK Biobank individuals, we observe an approximately fourfold increase in association signals with the threefold increase in sample size. We conclude that prioritizing WES + IMP and large sample sizes rather than contemporary short-read WGS alternatives will maximize the number of discoveries in genetic association studies.

6.
Nature ; 631(8021): 583-592, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38768635

ABSTRACT

Rare coding variants that substantially affect function provide insights into the biology of a gene [1-3]. However, ascertaining the frequency of such variants requires large sample sizes [4-8]. Here we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. In total, 23% of the Regeneron Genetics Center Million Exome (RGC-ME) data come from individuals of African, East Asian, Indigenous American, Middle Eastern and South Asian ancestry. The catalogue includes more than 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss of function (LOF), we identify 3,988 LOF-intolerant genes, including 86 that were previously assessed as tolerant and 1,153 that lack established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions that are depleted of missense variants despite being tolerant of pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this resource of coding variation from the RGC-ME dataset publicly accessible through a variant allele frequency browser.


Subject(s)
Exome , Genetic Variation , Proteins , Humans , Alleles , Exome/genetics , Exome Sequencing , Gene Frequency , Genetic Variation/genetics , Heterozygote , Loss of Function Mutation/genetics , Mutation, Missense/genetics , Open Reading Frames/genetics , Proteins/genetics , RNA Splice Sites/genetics , Precision Medicine
8.
medRxiv ; 2024 Jan 13.
Article in English | MEDLINE | ID: mdl-38260412

ABSTRACT

We rigorously assessed a comprehensive association testing framework for heteroplasmy, employing both simulated and real-world data. This framework employed a variant allele fraction (VAF) threshold and harnessed multiple gene-based tests for robust identification and association testing of heteroplasmy. Our simulation studies demonstrated that gene-based tests maintained an appropriate type I error rate at α=0.001. Notably, when 5% or more heteroplasmic variants within a target region were linked to an outcome, burden-extension tests (including the adaptive burden test, variable threshold burden test, and z-score weighting burden test) outperformed the sequence kernel association test (SKAT) and the original burden test. Applying this framework, we conducted association analyses on whole-blood derived heteroplasmy in 17,507 individuals of African and European ancestries (31% of African Ancestry, mean age of 62, with 58% women) with whole genome sequencing data. We performed both cohort- and ancestry-specific association analyses, followed by meta-analysis on both pooled samples and within each ancestry group. Our results suggest that mtDNA-encoded genes/regions are likely to exhibit varying rates in somatic aging, with notably strong associations observed between heteroplasmy in the RNR1 and RNR2 genes (p<0.001) and advancing age by the Original Burden test. In contrast, SKAT identified significant associations (p<0.001) between diabetes and the aggregated effects of heteroplasmy in several protein-coding genes. Further research is warranted to validate these findings. In summary, our proposed statistical framework represents a valuable tool for facilitating association testing of heteroplasmy with disease traits in large human populations.

9.
Hum Mol Genet ; 33(4): 374-385, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-37934784

ABSTRACT

Genome-wide association studies have contributed extensively to the discovery of disease-associated common variants. However, the genetic contribution to complex traits is still largely difficult to interpret. We report a genome-wide association study of 2394 cases and 2393 controls for age-related macular degeneration (AMD) via whole-genome sequencing, with 46.9 million genetic variants. Our study reveals significant single-variant association signals at four loci and independent gene-based signals in CFH, C2, C3, and NRTN. Using data from the Exome Aggregation Consortium (ExAC) for a gene-based test, we demonstrate an enrichment of predicted rare loss-of-function variants in CFH, CFI, and an as-yet unreported gene in AMD, ORMDL2. Our method of using a large variant list without individual-level genotypes as an external reference provides a flexible and convenient approach to leverage the publicly available variant datasets to augment the search for rare variant associations, which can explain additional disease risk in AMD.


Subject(s)
Genome-Wide Association Study , Macular Degeneration , Humans , Genome-Wide Association Study/methods , Macular Degeneration/genetics , Genotype , Genetic Testing , Whole Genome Sequencing , Polymorphism, Single Nucleotide/genetics , Genetic Predisposition to Disease , Complement Factor H/genetics
10.
Nature ; 622(7984): 784-793, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37821707

ABSTRACT

The Mexico City Prospective Study is a prospective cohort of more than 150,000 adults recruited two decades ago from the urban districts of Coyoacán and Iztapalapa in Mexico City [1]. Here we generated genotype and exome-sequencing data for all individuals and whole-genome sequencing data for 9,950 selected individuals. We describe high levels of relatedness and substantial heterogeneity in ancestry composition across individuals. Most sequenced individuals had admixed Indigenous American, European and African ancestry, with extensive admixture from Indigenous populations in central, southern and southeastern Mexico. Indigenous Mexican segments of the genome had lower levels of coding variation but an excess of homozygous loss-of-function variants compared with segments of African and European origin. We estimated ancestry-specific allele frequencies at 142 million genomic variants, with an effective sample size of 91,856 for Indigenous Mexican ancestry at exome variants, all available through a public browser. Using whole-genome sequencing, we developed an imputation reference panel that outperforms existing panels at common variants in individuals with high proportions of central, southern and southeastern Indigenous Mexican ancestry. Our work illustrates the value of genetic studies in diverse populations and provides foundational imputation and allele frequency resources for future genetic studies in Mexico and in the United States, where the Hispanic/Latino population is predominantly of Mexican descent.


Subject(s)
Exome Sequencing , Genome, Human , Genotype , Hispanic or Latino , Adult , Humans , Africa/ethnology , Americas/ethnology , Europe/ethnology , Gene Frequency/genetics , Genetics, Population , Genome, Human/genetics , Genotyping Techniques , Hispanic or Latino/genetics , Homozygote , Loss of Function Mutation/genetics , Mexico , Prospective Studies
11.
J Am Heart Assoc ; 12(20): e029090, 2023 Oct 17.
Article in English | MEDLINE | ID: mdl-37804200

ABSTRACT

Background: The relationship between mitochondrial DNA copy number (mtDNA CN) and cardiovascular disease remains elusive. Methods and Results: We performed cross-sectional and prospective association analyses of blood-derived mtDNA CN and cardiovascular disease outcomes in 27 316 participants in 8 cohorts of multiple racial and ethnic groups with whole-genome sequencing. We also performed Mendelian randomization to explore causal relationships of mtDNA CN with coronary heart disease (CHD) and cardiometabolic risk factors (obesity, diabetes, hypertension, and hyperlipidemia). P<0.01 was used for significance. We validated most of the previously reported associations between mtDNA CN and cardiovascular disease outcomes. For example, 1-SD unit lower level of mtDNA CN was associated with 1.08 (95% CI, 1.04-1.12; P<0.001) times the hazard for developing incident CHD, adjusting for covariates. Mendelian randomization analyses showed no causal effect from a lower level of mtDNA CN to a higher CHD risk (β=0.091; P=0.11) or in the reverse direction (β=-0.012; P=0.076). Additional bidirectional Mendelian randomization analyses revealed that low-density lipoprotein cholesterol had a causal effect on mtDNA CN (β=-0.084; P<0.001), but the reverse direction was not significant (P=0.059). No causal associations were observed between mtDNA CN and obesity, diabetes, and hypertension, in either direction. Multivariable Mendelian randomization analyses showed no causal effect of CHD on mtDNA CN, controlling for low-density lipoprotein cholesterol level (P=0.52), whereas there was a strong direct causal effect of higher low-density lipoprotein cholesterol on lower mtDNA CN, adjusting for CHD status (β=-0.092; P<0.001). Conclusions: Our findings indicate that high low-density lipoprotein cholesterol may underlie the complex relationships between mtDNA CN and vascular atherosclerosis.


Subject(s)
Cardiovascular Diseases , Coronary Disease , Diabetes Mellitus , Hypertension , Humans , DNA, Mitochondrial/genetics , Risk Factors , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics , Cholesterol, LDL , DNA Copy Number Variations , Cross-Sectional Studies , Coronary Disease/genetics , Cholesterol, HDL , Hypertension/epidemiology , Hypertension/genetics , Obesity
12.
Nat Genet ; 55(8): 1277-1287, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37558884

ABSTRACT

In this study, we leveraged the combined evidence of rare coding variants and common alleles to identify therapeutic targets for osteoporosis. We undertook a large-scale multiancestry exome-wide association study for estimated bone mineral density, which showed that the burden of rare coding alleles in 19 genes was associated with estimated bone mineral density (P < 3.6 × 10⁻⁷). These genes were highly enriched for a set of known causal genes for osteoporosis (65-fold; P = 2.5 × 10⁻⁵). Exome-wide significant genes had 96-fold increased odds of being the top ranked effector gene at a given GWAS locus (P = 1.8 × 10⁻¹⁰). By integrating proteomics Mendelian randomization evidence, we prioritized CD109 (cluster of differentiation 109) as a gene for which heterozygous loss of function is associated with higher bone density. CRISPR-Cas9 editing of CD109 in SaOS-2 osteoblast-like cell lines showed that partial CD109 knockdown led to increased mineralization. This study demonstrates that the convergence of common and rare variants, proteomics and CRISPR can highlight new bone biology to guide therapeutic development.


Subject(s)
Genetic Predisposition to Disease , Osteoporosis , Humans , Exome Sequencing , Osteoporosis/genetics , Bone Density/genetics , Alleles , Transcription Factors/genetics , Genome-Wide Association Study
13.
Nat Genet ; 55(7): 1138-1148, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37308787

ABSTRACT

Human genetic studies of smoking behavior have been thus far largely limited to common variants. Studying rare coding variants has the potential to identify drug targets. We performed an exome-wide association study of smoking phenotypes in up to 749,459 individuals and discovered a protective association in CHRNB2, encoding the β2 subunit of the α4β2 nicotinic acetylcholine receptor. Rare predicted loss-of-function and likely deleterious missense variants in CHRNB2 in aggregate were associated with a 35% decreased odds for smoking heavily (odds ratio (OR) = 0.65, confidence interval (CI) = 0.56-0.76, P = 1.9 × 10⁻⁸). An independent common variant association in the protective direction (rs2072659; OR = 0.96; CI = 0.94-0.98; P = 5.3 × 10⁻⁶) was also evident, suggesting an allelic series. Our findings in humans align with decades-old experimental observations in mice that β2 loss abolishes nicotine-mediated neuronal responses and attenuates nicotine self-administration. Our genetic discovery will inspire future drug designs targeting CHRNB2 in the brain for the treatment of nicotine addiction.


Subject(s)
Nicotine , Tobacco Use Disorder , Humans , Animals , Mice , Smoking/genetics , Tobacco Use Disorder/genetics , Phenotype , Odds Ratio
14.
bioRxiv ; 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37214792

ABSTRACT

Coding variants that have significant impact on function can provide insights into the biology of a gene but are typically rare in the population. Identifying and ascertaining the frequency of such rare variants requires very large sample sizes. Here, we present the largest catalog of human protein-coding variation to date, derived from exome sequencing of 985,830 individuals of diverse ancestry to serve as a rich resource for studying rare coding variants. Individuals of African, Admixed American, East Asian, Middle Eastern, and South Asian ancestry account for 20% of this Exome dataset. Our catalog of variants includes approximately 10.5 million missense (54% novel) and 1.1 million predicted loss-of-function (pLOF) variants (65% novel, 53% observed only once). We identified individuals with rare homozygous pLOF variants in 4,874 genes, and for 1,838 of these this work is the first to document at least one pLOF homozygote. Additional insights from the RGC-ME dataset include 1) improved estimates of selection against heterozygous loss-of-function and identification of 3,459 genes intolerant to loss-of-function, 83 of which were previously assessed as tolerant to loss-of-function and 1,241 that lack disease annotations; 2) identification of regions depleted of missense variation in 457 genes that are tolerant to loss-of-function; 3) functional interpretation for 10,708 variants of unknown or conflicting significance reported in ClinVar as cryptic splice sites using splicing score thresholds based on empirical variant deleteriousness scores derived from RGC-ME; and 4) an observation that approximately 3% of sequenced individuals carry a clinically actionable genetic variant in the ACMG SF 3.1 list of genes. We make this important resource of coding variation available to the public through a variant allele frequency browser. 
We anticipate that this report and the RGC-ME dataset will serve as a valuable reference for understanding rare coding variation and help advance precision medicine efforts.

15.
Cell Metab ; 35(4): 695-710.e6, 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-36963395

ABSTRACT

Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results.


Subject(s)
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Access to Information , Prospective Studies , Genomics/methods , Phenotype
16.
Genet Epidemiol ; 47(3): 231-248, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36739617

ABSTRACT

Linkage analysis, a class of methods for detecting co-segregation of genomic segments and traits in families, was used to map disease-causing genes for decades before genotyping arrays and dense SNP genotyping enabled genome-wide association studies in population samples. Population samples often contain related individuals, but the segregation of alleles within families is rarely used because traditional linkage methods are computationally inefficient for larger datasets. Here, we describe Population Linkage, a novel application of Haseman-Elston regression as a method of moments estimator of variance components and their standard errors. We achieve additional computational efficiency by using modern methods for detection of IBD segments and variance component estimation, efficient preprocessing of input data, and minimizing redundant numerical calculations. We also refined variance component models to account for the biases in population-scale methods for IBD segment detection. We ran Population Linkage on four blood lipid traits in over 70,000 individuals from the HUNT and SardiNIA studies, successfully detecting 25 known genetic signals. One notable linkage signal that appeared in both was for low-density lipoprotein (LDL) cholesterol levels in the region near the gene APOE (LOD = 29.3, variance explained = 4.1%). This is the region containing the missense variants rs7412 and rs429358, which together make up the ε2, ε3, and ε4 alleles and account for 2.4% and 0.8%, respectively, of the variation in circulating LDL cholesterol. Our results show the potential for linkage analysis and other large-scale applications of method of moments variance components estimation.


Subject(s)
Genome-Wide Association Study , Models, Genetic , Humans , Phenotype , Cholesterol, LDL/genetics , Genetic Linkage , Apolipoproteins E/genetics
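
The Haseman-Elston regression mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' Population Linkage software: it shows only the textbook cross-product form, in which the product of two relatives' standardized trait values is regressed on their IBD sharing at a locus and the slope serves as a method-of-moments estimate of the variance explained. The simulation design, the sib-pair sharing values, and the function names (`he_regression`, `simulate_pairs`) are assumptions made for illustration, and no standard errors are computed.

```python
import math
import random


def he_regression(pairs):
    """pairs: list of (ibd_share, z_i, z_j) with standardized traits z.

    Returns the least-squares slope of z_i * z_j on ibd_share.
    """
    n = len(pairs)
    x = [p[0] for p in pairs]
    y = [p[1] * p[2] for p in pairs]
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def simulate_pairs(n_pairs, var_explained, seed=1):
    """Simulate sib pairs whose trait covariance equals ibd_share * var_explained."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        # Sib-pair IBD proportion at one locus: 0, 1/2, 1 with probabilities 1/4, 1/2, 1/4.
        share = rng.choice([0.0, 0.5, 0.5, 1.0])
        common = rng.gauss(0, 1)  # locus component shared within the pair
        z = []
        for _ in range(2):
            locus = math.sqrt(share) * common + math.sqrt(1 - share) * rng.gauss(0, 1)
            z.append(
                math.sqrt(var_explained) * locus
                + math.sqrt(1 - var_explained) * rng.gauss(0, 1)
            )
        pairs.append((share, z[0], z[1]))
    return pairs


pairs = simulate_pairs(20000, var_explained=0.4)
estimate = he_regression(pairs)
print(f"estimated variance explained by the locus: {estimate:.3f}")
```

Because the estimator needs only pairwise products and sums, it scales to very large numbers of relative pairs, which is the property the abstract relies on.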
18.
Cell Genom ; 3(2): 100257, 2023 Feb 08.
Article in English | MEDLINE | ID: mdl-36819667

ABSTRACT

Biobanks of linked clinical patient histories and biological samples are an efficient strategy to generate large cohorts for modern genetics research. Biobank recruitment varies by factors such as geographic catchment and sampling strategy, which affect biobank demographics and research utility. Here, we describe the Michigan Genomics Initiative (MGI), a single-health-system biobank currently consisting of >91,000 participants recruited primarily during surgical encounters at Michigan Medicine. The surgical enrollment results in a biobank enriched for many diseases and ideally suited for a disease genetics cohort. Compared with the much larger population-based UK Biobank, MGI has higher prevalence for nearly all diagnosis-code-based phenotypes and larger absolute case counts for many phenotypes. Genome-wide association study (GWAS) results replicate known findings, thereby validating the genetic and clinical data. Our results illustrate that opportunistic biobank sampling within single health systems provides a unique and complementary resource for exploring the genetics of complex diseases.

19.
Nat Commun ; 13(1): 7592, 2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36481753

ABSTRACT

Genome-wide association studies have identified thousands of single nucleotide variants and small indels that contribute to variation in hematologic traits. While structural variants are known to cause rare blood or hematopoietic disorders, the genome-wide contribution of structural variants to quantitative blood cell trait variation is unknown. Here we utilized whole genome sequencing data in ancestrally diverse participants of the NHLBI Trans Omics for Precision Medicine program (N = 50,675) to detect structural variants associated with hematologic traits. Using single variant tests, we assessed the association of common and rare structural variants with red cell-, white cell-, and platelet-related quantitative traits and observed 21 independent signals (12 common and 9 rare) reaching genome-wide significance. The majority of these associations (N = 18) replicated in independent datasets. In genome-editing experiments, we provide evidence that a deletion associated with lower monocyte counts leads to disruption of an S1PR3 monocyte enhancer and decreased S1PR3 expression.


Subject(s)
Blood Cells , Genome-Wide Association Study , Humans , Whole Genome Sequencing
20.
Nature ; 612(7939): 301-309, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36450978

ABSTRACT

Clonal haematopoiesis involves the expansion of certain blood cell lineages and has been associated with ageing and adverse health outcomes [1-5]. Here we use exome sequence data on 628,388 individuals to identify 40,208 carriers of clonal haematopoiesis of indeterminate potential (CHIP). Using genome-wide and exome-wide association analyses, we identify 24 loci (21 of which are novel) where germline genetic variation influences predisposition to CHIP, including missense variants in the lymphocytic antigen coding gene LY75, which are associated with reduced incidence of CHIP. We also identify novel rare variant associations with clonal haematopoiesis and telomere length. Analysis of 5,041 health traits from the UK Biobank (UKB) found relationships between CHIP and severe COVID-19 outcomes, cardiovascular disease, haematologic traits, malignancy, smoking, obesity, infection and all-cause mortality. Longitudinal and Mendelian randomization analyses revealed that CHIP is associated with solid cancers, including non-melanoma skin cancer and lung cancer, and that CHIP linked to DNMT3A is associated with the subsequent development of myeloid but not lymphoid leukaemias. Additionally, contrary to previous findings from the initial 50,000 UKB exomes [6], our results in the full sample do not support a role for IL-6 inhibition in reducing the risk of cardiovascular disease among CHIP carriers. Our findings demonstrate that CHIP represents a complex set of heterogeneous phenotypes with shared and unique germline genetic causes and varied clinical implications.


Subject(s)
COVID-19 , Cardiovascular Diseases , Humans , Clonal Hematopoiesis/genetics , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics