Results 1 - 20 of 676
1.
Am J Hum Genet ; 111(3): 445-455, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38320554

ABSTRACT

Regulation of transcription and translation are mechanisms through which genetic variants affect complex traits. Expression quantitative trait locus (eQTL) studies have been more successful at identifying cis-eQTL (within 1 Mb of the transcription start site) than trans-eQTL. Here, we tested the cis component of gene expression for association with observed plasma protein levels to identify cis- and trans-acting genes that regulate protein levels. We used transcriptome prediction models from 49 Genotype-Tissue Expression (GTEx) Project tissues to predict the cis component of gene expression and tested the predicted expression of every gene in every tissue for association with the observed abundance of 3,622 plasma proteins measured in 3,301 individuals from the INTERVAL study. We tested significant results for replication in 971 individuals from the Trans-omics for Precision Medicine (TOPMed) Multi-Ethnic Study of Atherosclerosis (MESA). We found 1,168 and 1,210 cis- and trans-acting associations that replicated in TOPMed (FDR < 0.05) with a median expected true positive rate (π1) across tissues of 0.806 and 0.390, respectively. The target proteins of trans-acting genes were enriched for transcription factor binding sites and autoimmune diseases in the GWAS catalog. Furthermore, we found a higher correlation between predicted expression and protein levels of the same underlying gene (R = 0.17) than observed expression (R = 0.10, p = 7.50 × 10⁻¹¹). This indicates the cis-acting genetically regulated (heritable) component of gene expression is more consistent across tissues than total observed expression (genetics + environment) and is useful in uncovering the function of SNPs associated with complex traits.


Subject(s)
Proteome , Transcriptome , Humans , Transcriptome/genetics , Proteome/genetics , Multifactorial Inheritance , Quantitative Trait Loci/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics
2.
Am J Hum Genet ; 111(1): 133-149, 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38181730

ABSTRACT

Bulk-tissue molecular quantitative trait loci (QTLs) have been the starting point for interpreting disease-associated variants, and context-specific QTLs show particular relevance for disease. Here, we present the results of mapping interaction QTLs (iQTLs) for cell type, age, and other phenotypic variables in multi-omic, longitudinal data from the blood of individuals of diverse ancestries. By modeling the interaction between genotype and estimated cell-type proportions, we demonstrate that cell-type iQTLs could be considered as proxies for cell-type-specific QTL effects, particularly for the most abundant cell type in the tissue. The interpretation of age iQTLs, however, warrants caution because the moderation effect of age on the genotype and molecular phenotype association could be mediated by changes in cell-type composition. Finally, we show that cell-type iQTLs contribute to cell-type-specific enrichment of diseases that, in combination with additional functional data, could guide future functional studies. Overall, this study highlights the use of iQTLs to gain insights into the context specificity of regulatory effects.


Subject(s)
Gene Expression Regulation , Quantitative Trait Loci , Humans , Quantitative Trait Loci/genetics , Genotype , Phenotype
3.
Am J Hum Genet ; 111(5): 990-995, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38636510

ABSTRACT

Since genotype imputation was introduced, researchers have been relying on the estimated imputation quality from imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted as Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration, which leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for predominantly European and African ancestral individuals based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest MagicalRsq-X outperforms Rsq in almost every setting, with 7.3%-14.4% improvement in squared Pearson correlation with true R², corresponding to 85-218 K variant gains. We further developed a metric to quantify the genetic distances of a target cohort relative to a reference cohort and showed that such metric largely explained the performance of MagicalRsq-X models. Finally, we found MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would be missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well and poorly imputed lower-frequency variants.


Subject(s)
Gene Frequency , Genotype , Polymorphism, Single Nucleotide , Software , Humans , Cohort Studies , Linkage Disequilibrium , Genome-Wide Association Study/methods , Genome, Human , Quality Control , Machine Learning , Whole Genome Sequencing/standards , Whole Genome Sequencing/methods
4.
Hum Mol Genet ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38747556

ABSTRACT

Inflammation biomarkers can provide valuable insight into the role of inflammatory processes in many diseases and conditions. Sequencing based analyses of such biomarkers can also serve as an exemplar of the genetic architecture of quantitative traits. To evaluate the biological insight, which can be provided by a multi-ancestry, whole-genome based association study, we performed a comprehensive analysis of 21 inflammation biomarkers from up to 38 465 individuals with whole-genome sequencing from the Trans-Omics for Precision Medicine (TOPMed) program (with varying sample size by trait, where the minimum sample size was n = 737 for MMP-1). We identified 22 distinct single-variant associations across 6 traits (E-selectin, intercellular adhesion molecule 1, interleukin-6, lipoprotein-associated phospholipase A2 activity and mass, and P-selectin) that remained significant after conditioning on previously identified associations for these inflammatory biomarkers. We further expanded upon known biomarker associations by pairing the single-variant analysis with a rare variant set-based analysis that further identified 19 significant rare variant set-based associations with 5 traits. These signals were distinct from both significant single variant association signals within TOPMed and genetic signals observed in prior studies, demonstrating the complementary value of performing both single and rare variant analyses when analyzing quantitative traits. We also confirm several previously reported signals from semi-quantitative proteomics platforms. Many of these signals demonstrate the extensive allelic heterogeneity and ancestry-differentiated variant-trait associations common for inflammation biomarkers, a characteristic we hypothesize will be increasingly observed with well-powered, large-scale analyses of complex traits.

5.
Nature ; 581(7809): 444-451, 2020 05.
Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA and can have profound consequences in evolution and human disease. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) have become integral in the interpretation of single-nucleotide variants (SNVs). However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings. This SV resource is freely distributed via the gnomAD browser and will have broad utility in population genetics, disease-association studies, and diagnostic screening.


Subject(s)
Disease/genetics , Genetic Variation , Genetics, Medical/standards , Genetics, Population/standards , Genome, Human/genetics , Female , Genetic Testing , Genotyping Techniques , Humans , Male , Middle Aged , Mutation , Polymorphism, Single Nucleotide/genetics , Racial Groups/genetics , Reference Standards , Selection, Genetic , Whole Genome Sequencing
6.
PLoS Genet ; 19(5): e1010517, 2023 05.
Article in English | MEDLINE | ID: mdl-37216410

ABSTRACT

Integrative approaches that simultaneously model multi-omics data have gained increasing popularity because they provide holistic system biology views of multiple or all components in a biological system of interest. Canonical correlation analysis (CCA) is a correlation-based integrative method designed to extract latent features shared between multiple assays by finding the linear combinations of features, referred to as canonical variables (CVs), within each assay that achieve maximal across-assay correlation. Although widely acknowledged as a powerful approach for multi-omics data, CCA has not been systematically applied to multi-omics data in large cohort studies, which have only recently become available. Here, we adapted sparse multiple CCA (SMCCA), a widely-used derivative of CCA, to proteomics and methylomics data from the Multi-Ethnic Study of Atherosclerosis (MESA) and Jackson Heart Study (JHS). To tackle challenges encountered when applying SMCCA to MESA and JHS, our adaptations include the incorporation of the Gram-Schmidt (GS) algorithm with SMCCA to improve orthogonality among CVs, and the development of Sparse Supervised Multiple CCA (SSMCCA) to allow supervised integration analysis for more than two assays. Effective application of SMCCA to the two real datasets reveals important findings. Applying our SMCCA-GS to MESA and JHS, we identified strong associations between blood cell counts and protein abundance, suggesting that adjustment of blood cell composition should be considered in protein-based association studies. Importantly, CVs obtained from two independent cohorts also demonstrate transferability across the cohorts. For example, proteomic CVs learned from JHS, when transferred to MESA, explain similar amounts of blood cell count phenotypic variance (39.0%-50.0% in JHS and 38.9%-49.1% in MESA). Similar transferability was observed for other omics-CV-trait pairs. This suggests that biologically meaningful and cohort-agnostic variation is captured by CVs. We anticipate that applying our SMCCA-GS and SSMCCA on various cohorts would help identify cohort-agnostic biologically meaningful relationships between multi-omics data and phenotypic traits.


Subject(s)
Canonical Correlation Analysis , Proteomics , Humans , Proteomics/methods , Multiomics , Cohort Studies
7.
Hum Mol Genet ; 32(6): 1048-1060, 2023 03 06.
Article in English | MEDLINE | ID: mdl-36444934

ABSTRACT

Diabetic kidney disease (DKD) is recognized as an important public health challenge. However, its genomic mechanisms are poorly understood. To identify rare variants for DKD, we conducted a whole-exome sequencing (WES) study leveraging large cohorts well-phenotyped for chronic kidney disease and diabetes. Our two-stage WES study included 4372 European and African ancestry participants from the Chronic Renal Insufficiency Cohort and Atherosclerosis Risk in Communities studies (stage 1) and 11 487 multi-ancestry Trans-Omics for Precision Medicine participants (stage 2). Generalized linear mixed models, which accounted for genetic relatedness and adjusted for age, sex and ancestry, were used to test associations between single variants and DKD. Gene-based aggregate rare variant analyses were conducted using an optimized sequence kernel association test implemented within our mixed model framework. We identified four novel exome-wide significant DKD-related loci through initiating diabetes. In single-variant analyses, participants carrying a rare, in-frame insertion in the DIS3L2 gene (rs141560952) exhibited a 193-fold increased odds [95% confidence interval (CI): 33.6, 1105] of DKD compared with noncarriers (P = 3.59 × 10⁻⁹). Likewise, each copy of a low-frequency KRT6B splice-site variant (rs425827) conferred a 5.31-fold higher odds (95% CI: 3.06, 9.21) of DKD (P = 2.72 × 10⁻⁹). Aggregate gene-based analyses further identified ERAP2 (P = 4.03 × 10⁻⁸) and NPEPPS (P = 1.51 × 10⁻⁷), which are both expressed in the kidney and implicated in renin-angiotensin-aldosterone system modulated immune response. In the largest WES study of DKD, we identified novel rare variant loci attaining exome-wide significance. These findings provide new insights into the molecular mechanisms underlying DKD.
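
Reader's note: the odds ratios and confidence intervals quoted in this abstract can be sanity-checked with a short calculation. The sketch below assumes the intervals are Wald-type (symmetric on the log-odds scale, with z = 1.96); the abstract does not state this, and the function names are illustrative rather than taken from the study. Under that assumption, the geometric mean of the interval bounds should land close to the reported odds ratio, and the implied standard error gives an approximate two-sided P value.

```python
import math

def wald_summary(lower, upper, z=1.96):
    """Back out the odds ratio and log-scale SE implied by a 95% CI.

    Assumes a Wald interval: exp(log_or - z*se) to exp(log_or + z*se).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    implied_or = math.exp((log_lower + log_upper) / 2)  # geometric mean of the bounds
    se_log_or = (log_upper - log_lower) / (2 * z)
    return implied_or, se_log_or

def two_sided_p(odds_ratio, se_log_or):
    """Two-sided normal P value for the test of log(OR) = 0."""
    z = abs(math.log(odds_ratio)) / se_log_or
    return math.erfc(z / math.sqrt(2))

# Figures copied from the abstract: (label, reported OR, CI lower, CI upper).
reported = [
    ("DIS3L2 rs141560952", 193.0, 33.6, 1105.0),
    ("KRT6B rs425827", 5.31, 3.06, 9.21),
]

for label, reported_or, lower, upper in reported:
    implied_or, se = wald_summary(lower, upper)
    p = two_sided_p(implied_or, se)
    print(f"{label}: reported OR {reported_or}, implied OR {implied_or:.2f}, "
          f"SE(log OR) {se:.3f}, approximate P {p:.2e}")
```

The implied odds ratios come out close to the reported 193 and 5.31, and the approximate P values are of the same order as those in the abstract. Small differences are expected, since the study used mixed models whose test statistics need not match a simple Wald test.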


Subject(s)
Diabetes Mellitus , Diabetic Nephropathies , Renal Insufficiency, Chronic , Humans , Aminopeptidases , Diabetic Nephropathies/genetics , Exome Sequencing , Kidney , Renal Insufficiency, Chronic/genetics
8.
Am J Hum Genet ; 109(6): 1175-1181, 2022 06 02.
Article in English | MEDLINE | ID: mdl-35504290

ABSTRACT

Current publicly available tools that allow rapid exploration of linkage disequilibrium (LD) between markers (e.g., HaploReg and LDlink) are based on whole-genome sequence (WGS) data from 2,504 individuals in the 1000 Genomes Project. Here, we present TOP-LD, an online tool to explore LD inferred with high-coverage (∼30×) WGS data from 15,578 individuals in the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. TOP-LD provides a significant upgrade compared to current LD tools, as the TOPMed WGS data provide a more comprehensive representation of genetic variation than the 1000 Genomes data, particularly for rare variants and in the specific populations that we analyzed. For example, TOP-LD encompasses LD information for 150.3, 62.2, and 36.7 million variants for European, African, and East Asian ancestral samples, respectively, offering 2.6- to 9.1-fold increase in variant coverage compared to HaploReg 4.0 or LDlink. In addition, TOP-LD includes tens of thousands of structural variants (SVs). We demonstrate the value of TOP-LD in fine-mapping at the GGT1 locus associated with gamma glutamyltransferase in the African ancestry participants in UK Biobank. Beyond fine-mapping, TOP-LD can facilitate a wide range of applications that are based on summary statistics and estimates of LD. TOP-LD is freely available online.


Subject(s)
Genome-Wide Association Study , Precision Medicine , Asian People , Humans , Linkage Disequilibrium/genetics , Polymorphism, Single Nucleotide/genetics , Whole Genome Sequencing
9.
Am J Hum Genet ; 109(7): 1286-1297, 2022 07 07.
Article in English | MEDLINE | ID: mdl-35716666

ABSTRACT

Despite the growing number of genome-wide association studies (GWASs), it remains unclear to what extent gene-by-gene and gene-by-environment interactions influence complex traits in humans. The magnitude of genetic interactions in complex traits has been difficult to quantify because GWASs are generally underpowered to detect individual interactions of small effect. Here, we develop a method to test for genetic interactions that aggregates information across all trait-associated loci. Specifically, we test whether SNPs in regions of European ancestry shared between European American and admixed African American individuals have the same causal effect sizes. We hypothesize that in African Americans, the presence of genetic interactions will drive the causal effect sizes of SNPs in regions of European ancestry to be more similar to those of SNPs in regions of African ancestry. We apply our method to two traits: gene expression in 296 African Americans and 482 European Americans in the Multi-Ethnic Study of Atherosclerosis (MESA) and low-density lipoprotein cholesterol (LDL-C) in 74K African Americans and 296K European Americans in the Million Veteran Program (MVP). We find significant evidence for genetic interactions in our analysis of gene expression; for LDL-C, we observe a similar point estimate, although this is not significant, most likely due to lower statistical power. These results suggest that gene-by-gene or gene-by-environment interactions modify the effect sizes of causal variants in human complex traits.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Cholesterol, LDL , Gene Expression , Humans , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , White People/genetics
10.
Am J Hum Genet ; 109(5): 857-870, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35385699

ABSTRACT

While polygenic risk scores (PRSs) enable early identification of genetic risk for chronic obstructive pulmonary disease (COPD), predictive performance is limited when the discovery and target populations are not well matched. Hypothesizing that the biological mechanisms of disease are shared across ancestry groups, we introduce a PrediXcan-derived polygenic transcriptome risk score (PTRS) to improve cross-ethnic portability of risk prediction. We constructed the PTRS using summary statistics from application of PrediXcan on large-scale GWASs of lung function (forced expiratory volume in 1 s [FEV1] and its ratio to forced vital capacity [FEV1/FVC]) in the UK Biobank. We examined prediction performance and cross-ethnic portability of PTRS through smoking-stratified analyses both on 29,381 multi-ethnic participants from TOPMed population/family-based cohorts and on 11,771 multi-ethnic participants from TOPMed COPD-enriched studies. Analyses were carried out for two dichotomous COPD traits (moderate-to-severe and severe COPD) and two quantitative lung function traits (FEV1 and FEV1/FVC). While the proposed PTRS showed weaker associations with disease than PRS for European ancestry, the PTRS showed stronger association with COPD than PRS for African Americans (e.g., odds ratio [OR] = 1.24 [95% confidence interval [CI]: 1.08-1.43] for PTRS versus 1.10 [0.96-1.26] for PRS among heavy smokers with ≥ 40 pack-years of smoking) for moderate-to-severe COPD. Cross-ethnic portability of the PTRS was significantly higher than the PRS (paired t test p < 2.2 × 10⁻¹⁶ with portability gains ranging from 5% to 28%) for both dichotomous COPD traits and across all smoking strata. Our study demonstrates the value of PTRS for improved cross-ethnic portability compared to PRS in predicting COPD risk.


Subject(s)
Pulmonary Disease, Chronic Obstructive , Transcriptome , Humans , Lung , National Heart, Lung, and Blood Institute (U.S.) , Pulmonary Disease, Chronic Obstructive/genetics , Risk Factors , United States/epidemiology
11.
Hepatology ; 2024 May 22.
Article in English | MEDLINE | ID: mdl-38776184

ABSTRACT

BACKGROUND AND AIMS: The common genetic variant rs641738 C>T is a risk factor for metabolic dysfunction-associated steatotic liver disease (MASLD) and metabolic dysfunction-associated steatohepatitis (MASH), including liver fibrosis, and is associated with decreased expression of the phospholipid-remodeling enzyme MBOAT7 (LPIAT1). However, whether restoring MBOAT7 expression in established MASLD dampens the progression to liver fibrosis and, importantly, the mechanism through which decreased MBOAT7 expression exacerbates MASH fibrosis remain unclear. APPROACH AND RESULTS: We first showed that hepatocyte MBOAT7 restoration in mice with diet-induced steatohepatitis slows the progression to liver fibrosis. Conversely, when hepatocyte-MBOAT7 was silenced in mice with established hepatosteatosis, liver fibrosis but not hepatosteatosis was exacerbated. Mechanistic studies revealed that hepatocyte-MBOAT7 restoration in MASH mice lowered hepatocyte-TAZ (WWTR1), which is known to promote MASH fibrosis. Conversely, hepatocyte-MBOAT7 silencing enhanced TAZ upregulation in MASH. Finally, we discovered that changes in hepatocyte phospholipids due to MBOAT7 loss-of-function promote a cholesterol trafficking pathway that upregulates TAZ and the TAZ-induced profibrotic factor Indian hedgehog (IHH). As evidence for relevance in humans, we found that the livers of individuals with MASH carrying the rs641738-T allele had higher hepatocyte nuclear TAZ, indicating higher TAZ activity, and increased IHH mRNA. CONCLUSIONS: This study provides evidence for a novel mechanism linking MBOAT7-LoF to MASH fibrosis; adds new insight into an established genetic locus for MASH; and, given the druggability of hepatocyte TAZ for MASH fibrosis, suggests a personalized medicine approach for subjects at increased risk for MASH fibrosis due to inheritance of variants that lower MBOAT7.

12.
PLoS Genet ; 18(9): e1010294, 2022 09.
Article in English | MEDLINE | ID: mdl-36048760

ABSTRACT

For Alzheimer's disease, a leading cause of dementia and global morbidity, improved identification of presymptomatic high-risk individuals and identification of new circulating biomarkers are key public health needs. Here, we tested the hypothesis that a polygenic predictor of risk for Alzheimer's disease would identify a subset of the population with increased risk of clinically diagnosed dementia, subclinical neurocognitive dysfunction, and a differing circulating proteomic profile. Using summary association statistics from a recent genome-wide association study, we first developed a polygenic predictor of Alzheimer's disease comprised of 7.1 million common DNA variants. We noted a 7.3-fold (95% CI 4.8 to 11.0; p < 0.001) gradient in risk across deciles of the score among 288,289 middle-aged participants of the UK Biobank study. In cross-sectional analyses stratified by age, minimal differences in risk of Alzheimer's disease and performance on a digit recall test were present according to polygenic score decile at age 50 years, but significant gradients emerged by age 65. Similarly, among 30,541 participants of the Mass General Brigham Biobank, we again noted no significant differences in Alzheimer's disease diagnosis at younger ages across deciles of the score, but for those over 65 years we noted an odds ratio of 2.0 (95% CI 1.3 to 3.2; p = 0.002) in the top versus bottom decile of the polygenic score. To understand the proteomic signature of inherited risk, we performed aptamer-based profiling in 636 blood donors (mean age 43 years) with very high or low polygenic scores. In addition to the well-known apolipoprotein E biomarker, this analysis identified 27 additional proteins, several of which have known roles related to disease pathogenesis. Differences in protein concentrations were consistent even among the youngest subset of blood donors (mean age 33 years). Of these 28 proteins, 7 of the 8 proteins with concentrations available were similarly associated with the polygenic score in participants of the Multi-Ethnic Study of Atherosclerosis. These data highlight the potential for a DNA-based score to identify high-risk individuals during the prolonged presymptomatic phase of Alzheimer's disease and to enable biomarker discovery based on profiling of young individuals in the extremes of the score distribution.


Subject(s)
Alzheimer Disease , Adult , Aged , Alzheimer Disease/pathology , Biomarkers , Cross-Sectional Studies , Genome-Wide Association Study , Humans , Middle Aged , Proteomics
13.
PLoS Genet ; 18(12): e1010557, 2022 12.
Article in English | MEDLINE | ID: mdl-36574455

ABSTRACT

Genetic association studies of many heritable traits resulting from physiological testing often have modest sample sizes due to the cost and burden of the required phenotyping. This reduces statistical power and limits discovery of multiple genetic associations. We present a strategy to leverage pleiotropy between traits to both discover new loci and to provide mechanistic hypotheses of the underlying pathophysiology. Specifically, we combine a colocalization test with a locus-level test of pleiotropy. In simulations, we show that this approach is highly selective for identifying true pleiotropy driven by the same causative variant, and thereby improves the chance of replicating the associations in underpowered validation cohorts and leads to higher interpretability. Here, as an exemplar, we use Obstructive Sleep Apnea (OSA), a common disorder diagnosed using overnight multi-channel physiological testing. We leverage pleiotropy with relevant cellular and cardio-metabolic phenotypes and gene expression traits to map new risk loci in an underpowered OSA GWAS. We identify several pleiotropic loci harboring suggestive associations to OSA and genome-wide significant associations to other traits, and show that their OSA association replicates in independent cohorts of diverse ancestries. By investigating pleiotropic loci, our strategy allows proposing new hypotheses about OSA pathobiology across many physiological layers. For example, we identify and replicate the pleiotropy across the plateletcrit, OSA and an eQTL of DNA primase subunit 1 (PRIM1) in immune cells. We find suggestive links between OSA, a measure of lung function (FEV1/FVC), and an eQTL of matrix metallopeptidase 15 (MMP15) in lung tissue. We also link a previously known genome-wide significant peak for OSA in the hexokinase 1 (HK1) locus to hematocrit and other red blood cell related traits. Thus, the analysis of pleiotropic associations has the potential to assemble diverse phenotypes into a chain of mechanistic hypotheses that provide insight into the pathogenesis of complex human diseases.


Subject(s)
Genome-Wide Association Study , Sleep Apnea, Obstructive , Humans , Genome-Wide Association Study/methods , Phenotype , Genetic Association Studies , Sleep , Genetic Pleiotropy , Polymorphism, Single Nucleotide , DNA Primase
14.
PLoS Genet ; 18(9): e1010356, 2022 09.
Article in English | MEDLINE | ID: mdl-36137075

ABSTRACT

Rare variants in ten genes have been reported to cause Mendelian sleep conditions characterised by extreme sleep duration or timing. These include familial natural short sleep (ADRB1, DEC2/BHLHE41, GRM1 and NPSR1), advanced sleep phase (PER2, PER3, CRY2, CSNK1D and TIMELESS) and delayed sleep phase (CRY1). The associations of variants in these genes with extreme sleep conditions were usually based on clinically ascertained families, and their effects when identified in the population are unknown. We aimed to determine the effects of these variants on sleep traits in large population-based cohorts. We performed genetic association analysis of variants previously reported to be causal for Mendelian sleep and circadian conditions. Analyses were performed using 191,929 individuals with data on sleep and whole-exome or genome-sequence data from 4 population-based studies: UK Biobank, FINRISK, Health-2000-2001, and the Multi-Ethnic Study of Atherosclerosis (MESA). We identified sleep disorders from self-report, hospital and primary care data. We estimated sleep duration and timing measures from self-report and accelerometry data. We identified carriers for 10 out of 12 previously reported pathogenic variants for 8 of the 10 genes. They ranged in frequency from 1 individual with the variant in CSNK1D to 1,574 individuals with a reported variant in the PER3 gene in the UK Biobank. No carriers for variants reported in NPSR1 or PER2 were identified. We found no association between variants analyzed and extreme sleep or circadian phenotypes. Using sleep timing as a proxy measure for sleep phase, only PER3 and CRY1 variants demonstrated association with earlier and later sleep timing, respectively; however, the magnitude of effect was smaller than previously reported (sleep midpoint ~7 mins earlier and ~5 mins later, respectively). We also performed burden tests of protein truncating (PTVs) or rare missense variants for the 10 genes. Only PTVs in PER2 and PER3 were associated with a relevant trait (for example, 64 individuals with a PTV in PER2 had an odds ratio of 4.4 for being "definitely a morning person", P = 4 × 10⁻⁸; and had a 57-minute earlier midpoint sleep, P = 5 × 10⁻⁷). Our results indicate that previously reported variants for Mendelian sleep and circadian conditions are often not highly penetrant when ascertained incidentally from the general population.


Subject(s)
Circadian Rhythm , Sleep Wake Disorders , Circadian Rhythm/genetics , Humans , Phenotype , Receptors, G-Protein-Coupled/genetics , Sleep/genetics , Sleep Wake Disorders/genetics
15.
Hum Mol Genet ; 31(22): 3873-3885, 2022 11 10.
Article in English | MEDLINE | ID: mdl-35766891

ABSTRACT

RATIONALE: Genetic variation has a substantial contribution to chronic obstructive pulmonary disease (COPD) and lung function measurements. Heritability estimates using genome-wide genotyping data can be biased if analyses do not appropriately account for the nonuniform distribution of genetic effects across the allele frequency and linkage disequilibrium (LD) spectrum. In addition, the contribution of rare variants has been unclear. OBJECTIVES: We sought to assess the heritability of COPD and lung function using whole-genome sequence data from the Trans-Omics for Precision Medicine program. METHODS: Using the genome-based restricted maximum likelihood method, we partitioned the genome into bins based on minor allele frequency and LD scores and estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio in 11 051 European ancestry and 5853 African-American participants. MEASUREMENTS AND MAIN RESULTS: In European ancestry participants, the estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio were 35.5%, 55.6% and 32.5%, of which 18.8%, 19.7%, 17.8% were from common variants, and 16.6%, 35.8%, and 14.6% were from rare variants. These estimates had wide confidence intervals, with common variants and some sets of rare variants showing a statistically significant contribution (P-value < 0.05). In African-Americans, common variant heritability was similar to European ancestry participants, but lower sample size precluded calculation of rare variant heritability. CONCLUSIONS: Our study provides updated and unbiased estimates of heritability for COPD and lung function, and suggests an important contribution of rare variants. Larger studies of more diverse ancestry will improve accuracy of these estimates.
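
Reader's note: the heritability partition reported for European ancestry participants can be checked with simple arithmetic. The sketch below copies the percentages from the abstract and assumes that total heritability is the sum of the common-variant and rare-variant components; the helper name `partition_check` is illustrative only. Any small gap between the reported total and the recomposed sum is presumably due to rounding in the published figures.

```python
# Reported estimates (percent), copied from the abstract:
# trait -> (total heritability, common-variant part, rare-variant part).
reported = {
    "COPD": (35.5, 18.8, 16.6),
    "FEV1 % predicted": (55.6, 19.7, 35.8),
    "FEV1/FVC": (32.5, 17.8, 14.6),
}

def partition_check(total, common, rare):
    """Return the recomposed total, its gap from the reported total,
    and the fraction of the recomposed total attributed to rare variants."""
    recomposed = round(common + rare, 1)
    gap = round(total - recomposed, 1)
    rare_share = rare / (common + rare)
    return recomposed, gap, rare_share

for trait, (total, common, rare) in reported.items():
    recomposed, gap, rare_share = partition_check(total, common, rare)
    print(f"{trait}: common + rare = {recomposed}% (reported {total}%, gap {gap}), "
          f"rare-variant share = {rare_share:.0%}")
```

Each recomposed total falls within 0.1 percentage points of the reported value, and the rare-variant share is largest for FEV1 % predicted, which matches the abstract's emphasis on the contribution of rare variants.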


Subject(s)
Genetic Predisposition to Disease , Pulmonary Disease, Chronic Obstructive , Humans , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/genetics , Genome-Wide Association Study , Phenotype
16.
Hum Mol Genet ; 31(20): 3566-3579, 2022 10 10.
Article in English | MEDLINE | ID: mdl-35234888

ABSTRACT

Progressive dilation of the infrarenal aortic diameter is a consequence of the ageing process and is considered the main determinant of abdominal aortic aneurysm (AAA). We aimed to investigate the genetic and clinical determinants of abdominal aortic diameter (AAD). We conducted a meta-analysis of genome-wide association studies in 10 cohorts (n = 13 542) imputed to the 1000 Genomes Project reference panel including 12 815 subjects in the discovery phase and 727 subjects [Partners Biobank cohort 1 (PBIO)] as replication. Maximum anterior-posterior diameter of the infrarenal aorta was used as AAD. We also included exome array data (n = 14 480) from seven epidemiologic studies. Single-variant and gene-based association analyses were performed using the SeqMeta package. A Mendelian randomization analysis was applied to investigate the causal effect of a number of clinical risk factors on AAD. In the genome-wide association study (GWAS) of AAD, rs74448815 in the intronic region of LDLRAD4 reached genome-wide significance (beta = -0.02, SE = 0.004, P-value = 2.10 × 10⁻⁸). The association replicated in the PBIO1 cohort (P-value = 8.19 × 10⁻⁴). In exome-array single-variant analysis (P-value threshold = 9 × 10⁻⁷), the lowest P-value was found for rs239259 located in SLC22A20 (beta = 0.007, P-value = 1.2 × 10⁻⁵). In the gene-based analysis (P-value threshold = 1.85 × 10⁻⁶), PCSK5 showed an association with AAD (P-value = 8.03 × 10⁻⁷). Furthermore, in Mendelian randomization analyses, we found evidence for genetic association of pulse pressure (beta = -0.003, P-value = 0.02), triglycerides (beta = -0.16, P-value = 0.008) and height (beta = 0.03, P-value < 0.0001), known risk factors for AAA, consistent with a causal association with AAD. Our findings point to new biology as well as highlighting gene regions in mechanisms that have previously been implicated in the genetics of other vascular diseases.


Subject(s)
Genome-Wide Association Study , Mendelian Randomization Analysis , Exome/genetics , Humans , Polymorphism, Single Nucleotide/genetics , Triglycerides
17.
Am J Hum Genet ; 108(5): 874-893, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33887194

ABSTRACT

Whole-genome sequencing (WGS), a powerful tool for detecting novel coding and non-coding disease-causing variants, has largely been applied to clinical diagnosis of inherited disorders. Here we leveraged WGS data in up to 62,653 ethnically diverse participants from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program and assessed statistical association of variants with seven red blood cell (RBC) quantitative traits. We discovered 14 single variant-RBC trait associations at 12 genomic loci, which have not been reported previously. Several of the RBC trait-variant associations (RPN1, ELL2, MIDN, HBB, HBA1, PIEZO1, and G6PD) were replicated in independent GWAS datasets imputed to the TOPMed reference panel. Most of these discovered variants are rare/low frequency, and several are observed disproportionately among non-European Ancestry (African, Hispanic/Latino, or East Asian) populations. We identified a 3 bp indel p.Lys2169del (g.88717175_88717177TCT[4]) (common only in the Ashkenazi Jewish population) of PIEZO1, a gene responsible for the Mendelian red cell disorder hereditary xerocytosis (MIM: 194380), associated with higher mean corpuscular hemoglobin concentration (MCHC). In stepwise conditional analysis and in gene-based rare variant aggregated association analysis, we identified several of the variants in HBB, HBA1, TMPRSS6, and G6PD that represent the carrier state for known coding, promoter, or splice site loss-of-function variants that cause inherited RBC disorders. Finally, we applied base and nuclease editing to demonstrate that the sentinel variant rs112097551 (nearest gene RPN1) acts through a cis-regulatory element that exerts long-range control of the gene RUVBL1 which is essential for hematopoiesis. 
Together, these results demonstrate the utility of WGS in ethnically diverse population-based samples and gene editing for expanding knowledge of the genetic architecture of quantitative hematologic traits and suggest a continuum between complex trait and Mendelian red cell disorders.


Subject(s)
Erythrocytes/metabolism , Erythrocytes/pathology , Genome-Wide Association Study , National Heart, Lung, and Blood Institute (U.S.)/organization & administration , Phenotype , Adult , Aged , Chromosomes, Human, Pair 16/genetics , Datasets as Topic , Female , Gene Editing , Genetic Variation/genetics , HEK293 Cells , Humans , Male , Middle Aged , Quality Control , Reproducibility of Results , United States
18.
Blood ; 139(3): 357-368, 2022 01 20.
Article in English | MEDLINE | ID: mdl-34855941

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is associated with age and smoking, but other determinants of the disease are incompletely understood. Clonal hematopoiesis of indeterminate potential (CHIP) is a common, age-related state in which somatic mutations in clonal blood populations induce aberrant inflammatory responses. Patients with CHIP have an elevated risk for cardiovascular disease, but the association of CHIP with COPD remains unclear. We analyzed whole-genome sequencing and whole-exome sequencing data to detect CHIP in 48 835 patients, of whom 8444 had moderate to very severe COPD, from four separate cohorts with COPD phenotyping and smoking history. We measured emphysema in murine models in which Tet2 was deleted in hematopoietic cells. In the COPDGene cohort, individuals with CHIP had risks of moderate-to-severe, severe, or very severe COPD that were 1.6 (adjusted 95% confidence interval [CI], 1.1-2.2) and 2.2 (adjusted 95% CI, 1.5-3.2) times greater than those for noncarriers. These findings were consistently observed in three additional cohorts and meta-analyses of all patients. CHIP was also associated with decreased FEV1% predicted in the COPDGene cohort (mean between-group differences, -5.7%; adjusted 95% CI, -8.8% to -2.6%), a finding replicated in additional cohorts. Smoke exposure was associated with a small but significant increased risk of having CHIP (odds ratio, 1.03 per 10 pack-years; 95% CI, 1.01-1.05 per 10 pack-years) in the meta-analysis of all patients. Inactivation of Tet2 in mouse hematopoietic cells exacerbated the development of emphysema and inflammation in models of cigarette smoke exposure. Somatic mutations in blood cells are associated with the development and severity of COPD, independent of age and cumulative smoke exposure.


Subject(s)
Clonal Hematopoiesis , Pulmonary Disease, Chronic Obstructive/genetics , Animals , Female , Humans , Male , Mice , Middle Aged , Odds Ratio , Pulmonary Disease, Chronic Obstructive/etiology , Risk Factors , Smoking/adverse effects , Exome Sequencing
19.
Circ Res ; 131(7): 601-615, 2022 09 16.
Article in English | MEDLINE | ID: mdl-36052690

ABSTRACT

BACKGROUND: Racial differences in metabolomic profiles may reflect underlying differences in social determinants of health by self-reported race and may be related to racial disparities in coronary heart disease (CHD) among women in the United States. However, the magnitude of differences in metabolomic profiles between Black and White women in the United States has not been well-described. It also remains unknown whether such differences are related to differences in CHD risk. METHODS: Plasma metabolomic profiles were analyzed using liquid chromatography-tandem mass spectrometry in the WHI-OS (Women's Health Initiative-Observational Study; 138 Black and 696 White women), WHI-HT trials (WHI-Hormone Therapy; 156 Black and 1138 White women), MESA (Multi-Ethnic Study of Atherosclerosis; 114 Black and 219 White women), JHS (Jackson Heart Study; 1465 Black women with 107 incident CHD cases), and NHS (Nurses' Health Study; 2506 White women with 136 incident CHD cases). First, linear regression models were used to estimate associations between self-reported race and 472 metabolites in WHI-OS (discovery); findings were replicated in WHI-HT and validated in MESA. Second, we used elastic net regression to construct a racial difference metabolomic pattern (RDMP) representing differences in the metabolomic patterns between Black and White women in the WHI-OS; the RDMP was validated in the WHI-HT and MESA. Third, using conditional logistic regressions in the WHI (717 CHD cases and 719 matched controls), we examined associations of metabolites with large differences in levels by race and the RDMP with risk of CHD, and the results were replicated in Black women from the JHS and White women from the NHS. 
RESULTS: Of the 472 tested metabolites, levels of 259 (54.9%) metabolites, mostly lipid metabolites and amino acids, significantly differed between Black and White women in both WHI-OS and WHI-HT after adjusting for baseline characteristics, socioeconomic status, lifestyle factors, baseline health conditions, and medication use (false discovery rate <0.05); similar trends were observed in MESA. The RDMP, composed of 152 metabolites, was identified in the WHI-OS and showed significantly different distributions between Black and White women in the WHI-HT and MESA. Higher RDMP quartiles were associated with an increased risk of incident CHD (odds ratio=1.51 [0.97-2.37] for the highest quartile compared with the lowest; P-trend=0.02), independent of self-reported race and known CHD risk factors. In race-stratified analyses, the RDMP-CHD associations were more pronounced in White women. Similar patterns were observed in Black women from the JHS and White women from the NHS. CONCLUSIONS: Metabolomic profiles significantly and substantially differ between Black and White women and may be associated with CHD risk and racial disparities in US women.


Subject(s)
Coronary Disease , Amino Acids , Coronary Disease/diagnosis , Coronary Disease/epidemiology , Female , Hormones , Humans , Lipids , Risk Factors , United States/epidemiology
20.
Circ Res ; 131(2): e51-e69, 2022 07 08.
Article in English | MEDLINE | ID: mdl-35658476

ABSTRACT

BACKGROUND: Epigenetic dysregulation has been proposed as a key mechanism for arsenic-related cardiovascular disease (CVD). We evaluated differentially methylated positions (DMPs) as potential mediators of the association between arsenic and CVD. METHODS: Blood DNA methylation was measured in 2321 participants (mean age 56.2, 58.6% women) of the Strong Heart Study, a prospective cohort of American Indians. Urinary arsenic species were measured using high-performance liquid chromatography coupled to inductively coupled plasma mass spectrometry. We identified DMPs that are potential mediators between arsenic and CVD. In a cross-species analysis, we compared those DMPs with differential liver DNA methylation following early-life arsenic exposure in the apoE knockout (apoE⁻/⁻) mouse model of atherosclerosis. RESULTS: A total of 20 and 13 DMPs were potential mediators for CVD incidence and mortality, respectively, several of them annotated to genes related to diabetes. Eleven of these DMPs were similarly associated with incident CVD in 3 diverse prospective cohorts (Framingham Heart Study, Women's Health Initiative, and Multi-Ethnic Study of Atherosclerosis). In the mouse model, differentially methylated regions in 20 of those genes and DMPs in 10 genes were associated with arsenic. CONCLUSIONS: Differential DNA methylation might be part of the biological link between arsenic and CVD. The gene functions suggest that diabetes might represent a relevant mechanism for arsenic-related cardiovascular risk in populations with a high burden of diabetes.


Subject(s)
Arsenic , Atherosclerosis , Cardiovascular Diseases , Animals , Apolipoproteins E , Arsenic/toxicity , Atherosclerosis/chemically induced , Atherosclerosis/genetics , Cardiovascular Diseases/chemically induced , Cardiovascular Diseases/genetics , DNA Methylation , Female , Humans , Male , Mice , Middle Aged , Prospective Studies