ABSTRACT
Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions in lipid metabolism. Large-scale whole-genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess more associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with measurement of blood lipids and lipoproteins (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare-variant aggregate association tests using the STAAR (variant-set test for association using annotation information) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare-coding variants in nearby protein-coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500-kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variation and rare protein-coding variation at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data. Our results expand the genetic architecture of blood lipids to rare variants in lncRNAs.
Subjects
RNA, Long Noncoding , Humans , RNA, Long Noncoding/genetics , Genome-Wide Association Study , Precision Medicine , Whole Genome Sequencing/methods , Lipids/genetics , Polymorphism, Single Nucleotide/genetics
ABSTRACT
Diabetic kidney disease (DKD) is recognized as an important public health challenge. However, its genomic mechanisms are poorly understood. To identify rare variants for DKD, we conducted a whole-exome sequencing (WES) study leveraging large cohorts well-phenotyped for chronic kidney disease and diabetes. Our two-stage WES study included 4372 European and African ancestry participants from the Chronic Renal Insufficiency Cohort and Atherosclerosis Risk in Communities studies (stage 1) and 11 487 multi-ancestry Trans-Omics for Precision Medicine participants (stage 2). Generalized linear mixed models, which accounted for genetic relatedness and adjusted for age, sex and ancestry, were used to test associations between single variants and DKD. Gene-based aggregate rare variant analyses were conducted using an optimized sequence kernel association test implemented within our mixed model framework. We identified four novel exome-wide significant DKD-related loci through initiating diabetes. In single-variant analyses, participants carrying a rare, in-frame insertion in the DIS3L2 gene (rs141560952) exhibited a 193-fold increased odds [95% confidence interval (CI): 33.6, 1105] of DKD compared with noncarriers (P = 3.59 × 10⁻⁹). Likewise, each copy of a low-frequency KRT6B splice-site variant (rs425827) conferred a 5.31-fold higher odds (95% CI: 3.06, 9.21) of DKD (P = 2.72 × 10⁻⁹). Aggregate gene-based analyses further identified ERAP2 (P = 4.03 × 10⁻⁸) and NPEPPS (P = 1.51 × 10⁻⁷), which are both expressed in the kidney and implicated in renin-angiotensin-aldosterone system modulated immune response. In the largest WES study of DKD, we identified novel rare variant loci attaining exome-wide significance. These findings provide new insights into the molecular mechanisms underlying DKD.
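The odds ratios and confidence intervals reported above can be cross-checked with a short Wald-type calculation. The sketch below is an illustrative assumption, not the study's method: the abstract describes generalized linear mixed models, whose test statistic may differ. It recovers the standard error of the log odds ratio from the 95% CI and converts it to an approximate two-sided P value. The function name `wald_from_ci` is hypothetical.

```python
import math

def wald_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate Wald statistics from an odds ratio and its 95% CI.

    Assumes the CI is symmetric on the log scale, i.e.
    log(OR) +/- z_crit * SE. This is an assumption for illustration.
    """
    # The CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values taken from the abstract above.
for label, or_, lo, hi in [
    ("DIS3L2 rs141560952", 193.0, 33.6, 1105.0),
    ("KRT6B rs425827", 5.31, 3.06, 9.21),
]:
    se, z, p = wald_from_ci(or_, lo, hi)
    print(f"{label}: SE(log OR)={se:.3f}, z={z:.2f}, approx P={p:.2e}")
```

Under these assumptions, both variants give approximate P values of the same order of magnitude as those reported, and each odds ratio sits close to the geometric mean of its CI bounds, as expected for an interval built on the log scale.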
Subjects
Diabetes Mellitus , Diabetic Nephropathies , Renal Insufficiency, Chronic , Humans , Aminopeptidases , Diabetic Nephropathies/genetics , Exome Sequencing , Kidney , Renal Insufficiency, Chronic/genetics
ABSTRACT
While polygenic risk scores (PRSs) enable early identification of genetic risk for chronic obstructive pulmonary disease (COPD), predictive performance is limited when the discovery and target populations are not well matched. Hypothesizing that the biological mechanisms of disease are shared across ancestry groups, we introduce a PrediXcan-derived polygenic transcriptome risk score (PTRS) to improve cross-ethnic portability of risk prediction. We constructed the PTRS using summary statistics from application of PrediXcan on large-scale GWASs of lung function (forced expiratory volume in 1 s [FEV1] and its ratio to forced vital capacity [FEV1/FVC]) in the UK Biobank. We examined prediction performance and cross-ethnic portability of PTRS through smoking-stratified analyses both on 29,381 multi-ethnic participants from TOPMed population/family-based cohorts and on 11,771 multi-ethnic participants from TOPMed COPD-enriched studies. Analyses were carried out for two dichotomous COPD traits (moderate-to-severe and severe COPD) and two quantitative lung function traits (FEV1 and FEV1/FVC). While the proposed PTRS showed weaker associations with disease than PRS for European ancestry, the PTRS showed stronger association with COPD than PRS for African Americans (e.g., odds ratio [OR] = 1.24 [95% confidence interval [CI]: 1.08-1.43] for PTRS versus 1.10 [0.96-1.26] for PRS among heavy smokers with ≥ 40 pack-years of smoking) for moderate-to-severe COPD. Cross-ethnic portability of the PTRS was significantly higher than the PRS (paired t test p < 2.2 × 10⁻¹⁶ with portability gains ranging from 5% to 28%) for both dichotomous COPD traits and across all smoking strata. Our study demonstrates the value of PTRS for improved cross-ethnic portability compared to PRS in predicting COPD risk.
Subjects
Pulmonary Disease, Chronic Obstructive , Transcriptome , Humans , Lung , National Heart, Lung, and Blood Institute (U.S.) , Pulmonary Disease, Chronic Obstructive/genetics , Risk Factors , United States/epidemiology
ABSTRACT
Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.
Subjects
Genome-Wide Association Study , Genome , Humans , Genome-Wide Association Study/methods , Whole Genome Sequencing/methods , Phenotype , Genetic Variation
ABSTRACT
RATIONALE: Genetic variation has a substantial contribution to chronic obstructive pulmonary disease (COPD) and lung function measurements. Heritability estimates using genome-wide genotyping data can be biased if analyses do not appropriately account for the nonuniform distribution of genetic effects across the allele frequency and linkage disequilibrium (LD) spectrum. In addition, the contribution of rare variants has been unclear. OBJECTIVES: We sought to assess the heritability of COPD and lung function using whole-genome sequence data from the Trans-Omics for Precision Medicine program. METHODS: Using the genome-based restricted maximum likelihood method, we partitioned the genome into bins based on minor allele frequency and LD scores and estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio in 11 051 European ancestry and 5853 African-American participants. MEASUREMENTS AND MAIN RESULTS: In European ancestry participants, the estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio were 35.5%, 55.6% and 32.5%, of which 18.8%, 19.7%, 17.8% were from common variants, and 16.6%, 35.8%, and 14.6% were from rare variants. These estimates had wide confidence intervals, with common variants and some sets of rare variants showing a statistically significant contribution (P-value < 0.05). In African-Americans, common variant heritability was similar to European ancestry participants, but lower sample size precluded calculation of rare variant heritability. CONCLUSIONS: Our study provides updated and unbiased estimates of heritability for COPD and lung function, and suggests an important contribution of rare variants. Larger studies of more diverse ancestry will improve accuracy of these estimates.
Subjects
Genetic Predisposition to Disease , Pulmonary Disease, Chronic Obstructive , Humans , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/genetics , Genome-Wide Association Study , Phenotype
ABSTRACT
Plasma levels of fibrinogen, coagulation factors VII and VIII and von Willebrand factor (vWF) are four intermediate phenotypes that are heritable and have been associated with the risk of clinical thrombotic events. To identify rare and low-frequency variants associated with these hemostatic factors, we conducted whole-exome sequencing in 10 860 individuals of European ancestry (EA) and 3529 African Americans (AAs) from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium and the National Heart, Lung and Blood Institute's Exome Sequencing Project. Gene-based tests demonstrated significant associations with rare variation (minor allele frequency < 5%) in fibrinogen gamma chain (FGG) (with fibrinogen, P = 9.1 × 10⁻¹³), coagulation factor VII (F7) (with factor VII, P = 1.3 × 10⁻⁷²; seven novel variants) and VWF (with factor VIII and vWF; P = 3.2 × 10⁻¹⁴; one novel variant). These eight novel rare variant associations were independent of the known common variants at these loci and tended to have much larger effect sizes. In addition, one of the rare novel variants in F7 was significantly associated with an increased risk of venous thromboembolism in AAs (Ile200Ser; rs141219108; P = 4.2 × 10⁻⁵). After restricting gene-based analyses to only loss-of-function variants, a novel significant association was detected and replicated between factor VIII levels and a stop-gain mutation exclusive to AAs (rs3211938) in CD36 molecule (CD36). This variant has previously been linked to dyslipidemia but not with the levels of a hemostatic factor. These efforts represent the largest integration of whole-exome sequence data from two national projects to identify genetic variation associated with plasma hemostatic factors.
Subjects
Factor VIII , Hemostatics , Factor VII/genetics , Factor VIII/genetics , Fibrinogen/genetics , Humans , Polymorphism, Single Nucleotide/genetics , Exome Sequencing , von Willebrand Factor/analysis , von Willebrand Factor/genetics
ABSTRACT
Platelets play a key role in thrombosis and hemostasis. Platelet count (PLT) and mean platelet volume (MPV) are highly heritable quantitative traits, with hundreds of genetic signals previously identified, mostly in European ancestry populations. We here utilize whole genome sequencing (WGS) from NHLBI's Trans-Omics for Precision Medicine initiative (TOPMed) in a large multi-ethnic sample to further explore common and rare variation contributing to PLT (n = 61 200) and MPV (n = 23 485). We identified and replicated secondary signals at MPL (rs532784633) and PECAM1 (rs73345162), both more common in African ancestry populations. We also observed rare variation in Mendelian platelet-related disorder genes influencing variation in platelet traits in TOPMed cohorts (not enriched for blood disorders). For example, association of GP9 with lower PLT and higher MPV was partly driven by a pathogenic Bernard-Soulier syndrome variant (rs5030764, p.Asn61Ser), and the signals at TUBB1 and CD36 were partly driven by loss of function variants not annotated as pathogenic in ClinVar (rs199948010 and rs571975065). However, residual signal remained for these gene-based signals after adjusting for lead variants, suggesting that additional variants in Mendelian genes with impacts in general population cohorts remain to be identified. Gene-based signals were also identified at several genome-wide association study identified loci for genes not annotated for Mendelian platelet disorders (PTPRH, TET2, CHEK2), with somatic variation driving the result at TET2. These results highlight the value of WGS in populations of diverse genetic ancestry to identify novel regulatory and coding signals, even for well-studied traits like platelet traits.
Subjects
Genome-Wide Association Study , Precision Medicine , Blood Platelets , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Polymorphism, Single Nucleotide , Precision Medicine/methods , United States
ABSTRACT
There is a known genetic susceptibility to anthracycline-induced cardiac dysfunction in childhood cancer survivors, but this has not been adequately shown in adolescent and young adult (AYA) patients. Our aim was to determine if the previously identified variants associated with cardiac dysfunction in childhood cancer patients affect AYA cancer patients similarly. Forty-five variants were selected for analysis in 253 AYAs previously treated with anthracyclines. We identified four variants that were associated with cardiac dysfunction: SLC10A2:rs7319981 (p = 0.017), SLC22A17:rs4982753 (p = 0.019), HAS3:rs2232228 (p = 0.023), and RARG:rs2229774 (p = 0.050). HAS3:rs2232228 and SLC10A2:rs7319981 displayed significant effects in our AYA cancer survivor population that were opposite in direction to those reported in childhood cancer survivors. Genetic variants in the host genes were further analyzed for additional associations with cardiotoxicity in AYA cancer survivors. The host genes were then evaluated in a panel of induced pluripotent stem cell-derived cardiomyocytes to assess changes in levels of expression when treated with doxorubicin. Significant upregulation of HAS3 and SLC22A17 expression was observed (p < 0.05), with non-significant anthracycline-responsivity observed for RARG. Our study demonstrates that there is a genetic influence on cardiac dysfunction in AYA cancer patients, but there may be a difference in the role of genetics between childhood and AYA cancer survivors.
Subjects
Anthracyclines , Cancer Survivors , Cardiotoxicity , Genetic Predisposition to Disease , Humans , Adolescent , Anthracyclines/adverse effects , Young Adult , Male , Female , Cardiotoxicity/genetics , Adult , Myocytes, Cardiac/drug effects , Myocytes, Cardiac/metabolism , Polymorphism, Single Nucleotide/genetics , Neoplasms/drug therapy , Neoplasms/genetics , Heart Diseases/chemically induced , Heart Diseases/genetics , Antibiotics, Antineoplastic/adverse effects , Risk Factors
ABSTRACT
BACKGROUND: Antithrombin, PC (protein C), and PS (protein S) are circulating natural anticoagulant proteins that regulate hemostasis and of which partial deficiencies are causes of venous thromboembolism. Previous genetic association studies involving antithrombin, PC, and PS were limited by modest sample sizes or by being restricted to candidate genes. In the setting of the Cohorts for Heart and Aging Research in Genomic Epidemiology consortium, we meta-analyzed across ancestries the results from 10 genome-wide association studies of plasma levels of antithrombin, PC, PS free, and PS total. METHODS: Study participants were of European and African ancestries, and genotype data were imputed to TOPMed, a dense multiancestry reference panel. Each of the 10 studies conducted a genome-wide association study for each phenotype, and summary results were meta-analyzed, stratified by ancestry. Analysis of antithrombin included 25 243 European ancestry and 2688 African ancestry participants, PC analysis included 16 597 European ancestry and 2688 African ancestry participants, PSF and PST analysis included 4113 and 6409 European ancestry participants. We also conducted transcriptome-wide association analyses and multiphenotype analysis to discover additional associations. Novel genome-wide association study and transcriptome-wide association analysis findings were validated by in vitro functional experiments. Mendelian randomization was performed to assess the causal relationship between these proteins and cardiovascular outcomes. RESULTS: Genome-wide association study meta-analyses identified 4 newly associated loci: 3 with antithrombin levels (GCKR, BAZ1B, and HP-TXNL4B) and 1 with PS levels (ORM1-ORM2). Transcriptome-wide association analyses identified 3 newly associated genes: 1 with antithrombin level (FCGRT), 1 with PC (GOLM2), and 1 with PS (MYL7). In addition, we replicated 7 independent loci reported in previous studies. Functional experiments provided evidence for the involvement of GCKR, SNX17, and HP genes in antithrombin regulation. CONCLUSIONS: The use of larger sample sizes, diverse populations, and a denser imputation reference panel allowed the detection of 7 novel genomic loci associated with plasma antithrombin, PC, and PS levels.
Subjects
Protein C , Protein S , Protein C/genetics , Protein S/genetics , Genome-Wide Association Study , Antithrombins , Transcriptome , Anticoagulants , Antithrombin III/genetics , Polymorphism, Single Nucleotide
ABSTRACT
AIMS/HYPOTHESIS: Genetic predisposition to type 2 diabetes is well-established, and genetic risk scores (GRS) have been developed that capture heritable liabilities for type 2 diabetes phenotypes. However, the proteins through which these genetic variants influence risk have not been thoroughly investigated. This study aimed to identify proteins and pathways through which type 2 diabetes risk variants may influence pathophysiology. METHODS: Using a proteomics data-driven approach in a discovery sample of 7241 White participants in the Atherosclerosis Risk in Communities Study (ARIC) cohort and a replication sample of 1674 Black ARIC participants, we interrogated plasma levels of 4870 proteins and four GRS of specific type 2 diabetes phenotypes related to beta cell function, insulin resistance, lipodystrophy, BMI/blood lipid abnormalities and a composite score of all variants combined. RESULTS: Twenty-two plasma proteins were identified in White participants after Bonferroni correction. Of the 22 protein-GRS associations that were statistically significant, 10 were replicated in Black participants and all but one were directionally consistent. In a secondary analysis, 18 of the 22 proteins were found to be associated with prevalent type 2 diabetes and ten proteins were associated with incident type 2 diabetes. Two-sample Mendelian randomisation indicated that complement C2 may be causally related to greater type 2 diabetes risk (inverse variance weighted estimate: OR 1.65 per SD; p=7.0 × 10⁻³), while neuropilin-2 was inversely associated (OR 0.44 per SD; p=8.0 × 10⁻³). CONCLUSIONS/INTERPRETATION: Identified proteins may represent viable intervention or pharmacological targets to prevent, reverse or slow type 2 diabetes progression, and further research is needed to pursue these targets.
Subjects
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Complement C2 , Proteomics , Risk Factors
ABSTRACT
Hematological measures are important intermediate clinical phenotypes for many acute and chronic diseases and are highly heritable. Although genome-wide association studies (GWAS) have identified thousands of loci containing trait-associated variants, the causal genes underlying these associations are often uncertain. To better understand the underlying genetic regulatory mechanisms, we performed a transcriptome-wide association study (TWAS) to systematically investigate the association between genetically predicted gene expression and hematological measures in 54,542 Europeans from the Genetic Epidemiology Research on Aging cohort. We found 239 significant gene-trait associations with hematological measures; we replicated 71 associations at p < 0.05 in a TWAS meta-analysis consisting of up to 35,900 Europeans from the Women's Health Initiative, Atherosclerosis Risk in Communities Study, and BioMe Biobank. Additionally, we attempted to refine this list of candidate genes by performing conditional analyses, adjusting for individual variants previously associated with hematological measures, and performed further fine-mapping of TWAS loci. To facilitate interpretation of our findings, we designed an R Shiny application to interactively visualize our TWAS results by integrating them with additional genetic data sources (GWAS, TWAS from multiple reference panels, conditional analyses, known GWAS variants, etc.). Our results and application highlight frequently overlooked TWAS challenges and illustrate the complexity of TWAS fine-mapping.
Subjects
Genome-Wide Association Study , Transcriptome , Blood Cells , Female , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci
ABSTRACT
BACKGROUND: Understanding the effect of lifestyle and genetic risk on the lifetime risk of coronary heart disease (CHD) is important to improving public health initiatives. Our objective was to quantify remaining lifetime risk and years free of CHD according to polygenic risk and the American Heart Association's Life's Simple 7 (LS7) guidelines in a population-based cohort study. METHODS: Our analysis included data from participants of the ARIC (Atherosclerosis Risk in Communities) study: 8372 White and 2314 Black participants; 45 years of age and older; and free of CHD at baseline examination. A polygenic risk score (PRS) comprising more than 6 million genetic variants was categorized into low (<20th percentile), intermediate, and high (>80th percentile). An overall LS7 score was calculated at baseline and categorized into "poor," "intermediate," and "ideal" cardiovascular health. Lifetime risk and CHD-free years were computed according to polygenic risk and LS7 categories. RESULTS: The overall remaining lifetime risk was 27%, ranging from 16.6% in individuals with an ideal LS7 score to 43.1% for individuals with a poor LS7 score. The association of PRS with lifetime risk differed according to ancestry. In White participants, remaining lifetime risk ranged from 19.8% to 39.3% according to increasing PRS categories. Individuals with a high PRS and poor LS7 had a remaining lifetime risk of 67.1% and 15.9 fewer CHD-free years than did those with intermediate polygenic risk and LS7 scores. In the high-PRS group, ideal LS7 was associated with 20.2 more CHD-free years compared with poor LS7. In Black participants, remaining lifetime risk ranged from 19.1% to 28.6% according to increasing PRS category. Similar lifetime risk estimates were observed for individuals of poor LS7 regardless of PRS category. In the high-PRS group, an ideal LS7 score was associated with only 4.5 more CHD-free years compared with a poor LS7 score. CONCLUSIONS: Ideal adherence to LS7 recommendations was associated with lower lifetime risk of CHD for all individuals, especially in those with high genetic susceptibility. In Black participants, adherence to LS7 guidelines contributed to lifetime risk of CHD more so than current PRSs. Improved PRSs are needed to properly evaluate genetic susceptibility for CHD in diverse populations.
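The percentile-based PRS categories described in the methods above (low below the 20th percentile, high above the 80th, intermediate otherwise) can be illustrated with a minimal sketch. The function name `categorize_prs` and the use of `statistics.quantiles` with its default interpolation are assumptions for illustration. The abstract does not say how the percentile cut points were computed or whether they were derived within ancestry groups.

```python
import statistics

def categorize_prs(scores):
    """Label each score as 'low', 'intermediate', or 'high'.

    Cut points are the 20th and 80th percentiles of the supplied
    scores (an assumed, sample-based definition).
    """
    # n=5 yields four cut points: 20th, 40th, 60th, 80th percentiles.
    cuts = statistics.quantiles(scores, n=5)
    low_cut, high_cut = cuts[0], cuts[-1]
    labels = []
    for s in scores:
        if s < low_cut:
            labels.append("low")
        elif s > high_cut:
            labels.append("high")
        else:
            labels.append("intermediate")
    return labels

# Toy scores standing in for real polygenic risk scores.
toy_scores = list(range(1, 101))
toy_labels = categorize_prs(toy_scores)
print({k: toy_labels.count(k) for k in ("low", "intermediate", "high")})
```

With evenly spaced toy scores, roughly one fifth of participants fall in each tail and the remainder in the intermediate group, mirroring the 20/60/20 split implied by the cut points.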
Subjects
Cardiovascular Diseases , Coronary Disease , American Heart Association , Cardiovascular Diseases/diagnosis , Cohort Studies , Coronary Disease/diagnosis , Coronary Disease/epidemiology , Coronary Disease/genetics , Genetic Predisposition to Disease , Humans , Life Style , Risk Factors , United States/epidemiology
ABSTRACT
BACKGROUND: Rare sequence variation in genes underlying cardiac repolarization and common polygenic variation influence QT interval duration. However, current clinical genetic testing of individuals with unexplained QT prolongation is restricted to examination of monogenic rare variants. The recent emergence of large-scale biorepositories with sequence data enables examination of the joint contribution of rare and common variations to the QT interval in the population. METHODS: We performed a genome-wide association study of the QTc in 84 630 UK Biobank participants and created a polygenic risk score (PRS). Among 26 976 participants with whole-genome sequencing and ECG data in the TOPMed (Trans-Omics for Precision Medicine) program, we identified 160 carriers of putative pathogenic rare variants in 10 genes known to be associated with the QT interval. We examined QTc associations with the PRS and with rare variants in TOPMed. RESULTS: Fifty-four independent loci were identified by genome-wide association study in the UK Biobank. Twenty-one loci were novel, of which 12 were replicated in TOPMed. The PRS composed of 1 110 494 common variants was significantly associated with the QTc in TOPMed (ΔQTc/decile of PRS=1.4 ms [95% CI, 1.3 to 1.5]; P=1.1×10⁻¹⁹⁶). Carriers of putative pathogenic rare variants had longer QTc than noncarriers (ΔQTc=10.9 ms [95% CI, 7.4 to 14.4]). Of individuals with QTc>480 ms, 23.7% carried either a monogenic rare variant or had a PRS in the top decile (3.4% monogenic, 21% top decile of PRS). CONCLUSIONS: QTc duration in the population is influenced by both rare variants in genes underlying cardiac repolarization and polygenic risk, with a sizeable contribution from polygenic risk. Comprehensive assessment of the genetic determinants of QTc prolongation includes incorporation of both polygenic and monogenic risk.
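The three percentages reported for individuals with QTc>480 ms (23.7% with either risk factor, 3.4% monogenic, 21% top PRS decile) are linked by inclusion-exclusion, which gives a rough estimate of how many carry both. The sketch below is only an arithmetic illustration: the helper name `overlap_percent` is hypothetical, and because the abstract's percentages are rounded, the result is approximate.

```python
def overlap_percent(pct_a, pct_b, pct_either):
    """Inclusion-exclusion: P(A and B) = P(A) + P(B) - P(A or B)."""
    return pct_a + pct_b - pct_either

# Percentages taken from the abstract above (rounded as reported).
both = overlap_percent(pct_a=3.4, pct_b=21.0, pct_either=23.7)
print(f"Approximate share with both risk factors: {both:.1f}%")
```

Under these rounded inputs the implied overlap is under 1%, which suggests that monogenic carriers and top-decile PRS individuals are largely distinct groups in this sample.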
Subjects
Genome-Wide Association Study , Long QT Syndrome , Electrocardiography , Heterozygote , Humans , Long QT Syndrome/diagnosis , Long QT Syndrome/genetics , Multifactorial Inheritance , Whole Genome Sequencing
ABSTRACT
Variation in levels of the human metabolome reflects changes in homeostasis, providing a window into health and disease. The genetic impact on circulating metabolites in Hispanics, a population with high cardiometabolic disease burden, is largely unknown. We conducted genome-wide association analyses on 640 circulating metabolites in 3,926 Hispanic Community Health Study/Study of Latinos participants. The estimated heritability for 640 metabolites ranged between 0%-54% with a median at 2.5%. We discovered 46 variant-metabolite pairs (p value < 1.2 × 10⁻¹⁰, minor allele frequency ≥ 1%, proportion of variance explained [PEV] mean = 3.4%, PEV range = 1%-22%) with generalized effects in two population-based studies and confirmed 301 known locus-metabolite associations. Half of the identified variants with generalized effect were located in genes, including five nonsynonymous variants. We identified co-localization with the expression quantitative trait loci at 105 discovered and 151 known locus-metabolite sets. rs5855544, upstream of SLC51A, was associated with higher levels of three steroid sulfates and co-localized with expression levels of SLC51A in several tissues. Mendelian randomization (MR) analysis identified several metabolites associated with coronary heart disease (CHD) and type 2 diabetes. For example, two variants located in or near CYP4F2 (rs2108622 and rs79400241, respectively), involved in vitamin E metabolism, were associated with the levels of octadecanedioate and vitamin E metabolites (gamma-CEHC and gamma-CEHC glucuronide); MR analysis showed that genetically high levels of these metabolites were associated with lower odds of CHD. Our findings document the genetic architecture of circulating metabolites in an underrepresented Hispanic/Latino community, shedding light on disease etiology.
Subjects
Coronary Disease/genetics , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome, Human , Metabolome/genetics , Quantitative Trait Loci , Adult , Chromans/metabolism , Cohort Studies , Coronary Disease/diagnosis , Coronary Disease/ethnology , Coronary Disease/metabolism , Cytochrome P450 Family 4/genetics , Cytochrome P450 Family 4/metabolism , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/ethnology , Diabetes Mellitus, Type 2/metabolism , Female , Gene Expression , Genome-Wide Association Study , Hispanic or Latino , Humans , Male , Membrane Transport Proteins/genetics , Membrane Transport Proteins/metabolism , Middle Aged , Phenotype , Polymorphism, Single Nucleotide , Propionates/metabolism , Public Health , Quantitative Trait, Heritable , Vitamin E/metabolism
ABSTRACT
Genetic risk score (GRS) analysis is a popular approach to derive individual risk prediction models for complex diseases. In venous thrombosis (VT), this type of analysis should integrate information at the ABO blood group locus, which is one of the major susceptibility loci. However, there is no consensus about which single nucleotide polymorphisms (SNPs) must be investigated when properly assessing association between ABO locus and VT risk. Using comprehensive haplotype analyses of ABO blood group tagging SNPs in 5425 cases and 8445 controls from 6 studies, we demonstrate that using only rs8176719 (tagging O1) to correctly assess the impact of ABO locus on VT risk is suboptimal, because 5% of rs8176719-delG carriers do not have an increased risk of developing VT. Instead, we recommend the use of 4 SNPs, rs2519093 (tagging A1), rs1053878 (A2), rs8176743 (B), and rs41302905 (O2), when assessing the impact of ABO locus on VT risk to avoid any risk misestimation. Compared with the O1 haplotype, the A2 haplotype is associated with a modest increase in VT risk (odds ratio, ~1.2), the A1 and B haplotypes are associated with an ~1.8-fold increased risk, whereas the O2 haplotype tends to be slightly protective (odds ratio, ~0.80). In addition, although the A1 and B blood groups are associated with increased von Willebrand factor and factor VIII plasma levels, only the A1 blood group is associated with ICAM levels, but in an opposite direction, leaving additional avenues to be explored to fully understand the spectrum of biological effects mediated by ABO locus on cardiovascular traits.
Subjects
ABO Blood-Group System/genetics , Cardiovascular Diseases/pathology , Genetic Predisposition to Disease , Haplotypes , Polymorphism, Single Nucleotide , Venous Thrombosis/pathology , Aged , Cardiovascular Diseases/etiology , Cardiovascular Diseases/metabolism , Factor VIII/metabolism , Female , Genome-Wide Association Study , Humans , Male , Phenotype , Prognosis , Risk Factors , Venous Thrombosis/etiology , Venous Thrombosis/metabolism , von Willebrand Factor/metabolism
ABSTRACT
BACKGROUND AND PURPOSE: Clonal hematopoiesis of indeterminate potential (CHIP) is a novel age-related risk factor for cardiovascular disease-related morbidity and mortality. The association of CHIP with risk of incident ischemic stroke was reported previously in an exploratory analysis including a small number of incident stroke cases without replication and lack of stroke subphenotyping. The purpose of this study was to discover whether CHIP is a risk factor for ischemic or hemorrhagic stroke. METHODS: We utilized plasma genome sequence data of blood DNA to identify CHIP in 78 752 individuals from 8 prospective cohorts and biobanks. We then assessed the association of CHIP and commonly mutated individual CHIP driver genes (DNMT3A, TET2, and ASXL1) with any stroke, ischemic stroke, and hemorrhagic stroke. RESULTS: CHIP was associated with an increased risk of total stroke (hazard ratio, 1.14 [95% CI, 1.03-1.27]; P=0.01) after adjustment for age, sex, and race. We observed associations of CHIP with risk of hemorrhagic stroke (hazard ratio, 1.24 [95% CI, 1.01-1.51]; P=0.04) and with small vessel ischemic stroke subtypes. In gene-specific association results, TET2 showed the strongest association with total stroke and ischemic stroke, whereas DNMT3A and TET2 were each associated with increased risk of hemorrhagic stroke. CONCLUSIONS: CHIP is associated with an increased risk of stroke, particularly with hemorrhagic and small vessel ischemic stroke. Future studies clarifying the relationship between CHIP and subtypes of stroke are needed.
Subjects
Clonal Hematopoiesis/physiology , Hemorrhagic Stroke/epidemiology , Ischemic Stroke/epidemiology , Adult , Aged , Aged 80 and over , Clonal Hematopoiesis/genetics , DNA Methyltransferase 3A/genetics , DNA-Binding Proteins/genetics , Dioxygenases/genetics , Female , Hemorrhagic Stroke/genetics , Hemorrhagic Stroke/physiopathology , Humans , Incidence , Ischemic Stroke/genetics , Ischemic Stroke/physiopathology , Male , Middle Aged , Prevalence , Repressor Proteins/genetics , Risk
ABSTRACT
Set-based analysis that jointly tests the association of variants in a group has emerged as a popular tool for analyzing rare and low-frequency variants in sequencing studies. The existing set-based tests can suffer significant power loss when only a small proportion of variants are causal, and their power can be sensitive to the number, effect sizes, and effect directions of the causal variants and to the choice of weights. Here we propose an aggregated Cauchy association test (ACAT), a general, powerful, and computationally efficient p value combination method for boosting power in sequencing studies. First, by combining variant-level p values, we use ACAT to construct a set-based test (ACAT-V) that is particularly powerful in the presence of only a small number of causal variants in a variant set. Second, by combining different variant-set-level p values, we use ACAT to construct an omnibus test (ACAT-O) that combines the strengths of multiple complementary set-based tests, including the burden test, sequence kernel association test (SKAT), and ACAT-V. Through analysis of extensively simulated data and the whole-genome sequencing data from the Atherosclerosis Risk in Communities (ARIC) study, we demonstrate that ACAT-V complements the SKAT and the burden test, and that ACAT-O has substantially more robust and higher power than the alternative tests.
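The Cauchy combination at the core of ACAT can be illustrated with a minimal sketch. Each p value is mapped to a Cauchy-distributed score with tan((0.5 - p)π), the scores are averaged with non-negative weights, and the average is mapped back to a p value through the Cauchy tail. The function name `acat` and the default of equal weights are assumptions made for illustration; the abstract does not describe the weighting scheme or any implementation details.

```python
import math


def acat(p_values, weights=None):
    """Combine p values with a weighted Cauchy combination (illustrative sketch).

    Assumptions (not from the abstract): equal weights by default, and
    every p value lies strictly between 0 and 1.
    """
    if not p_values:
        raise ValueError("at least one p value is required")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p values must lie strictly between 0 and 1")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    total_weight = sum(weights)
    # Weighted average of Cauchy-transformed p values.
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, p_values)) / total_weight
    # Upper tail of the standard Cauchy distribution at t.
    # atan2(1, t) / pi equals 0.5 - atan(t) / pi but stays accurate for large t.
    return math.atan2(1.0, t) / math.pi
```

With a single p value the function returns that value unchanged, and one small p value among otherwise unremarkable ones still yields a small combined p value, which matches the sparse-signal setting that the abstract describes for ACAT-V.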
Subjects
Algorithms , Disease/genetics , Genetic Association Studies/methods , Genetic Variation , Human Genome , Genetic Models , DNA Sequence Analysis/methods , Computer Simulation , Statistical Data Interpretation , Humans
ABSTRACT
Whole-genome sequencing (WGS) studies are being widely conducted in order to identify rare variants associated with human diseases and disease-related traits. Classical single-marker association analyses for rare variants have limited power, and variant-set-based analyses are commonly used by researchers for analyzing rare variants. However, existing variant-set-based approaches need to pre-specify genetic regions for analysis; hence, they are not directly applicable to WGS data because of the large number of intergenic and intron regions that consist of a massive number of non-coding variants. The commonly used sliding-window method requires the pre-specification of fixed window sizes, which are often unknown a priori, are difficult to specify in practice, and are subject to limitations given that the sizes of genetic-association regions are likely to vary across the genome and phenotypes. We propose a computationally efficient and dynamic scan-statistic method (Scan the Genome [SCANG]) for analyzing WGS data; this method flexibly detects the sizes and the locations of rare-variant association regions without the need to specify a fixed window size in advance. The proposed method controls for the genome-wise type I error rate and accounts for the linkage disequilibrium among genetic variants. It allows the detected sizes of rare-variant association regions to vary across the genome. Through extensive simulation studies that consider a wide variety of scenarios, we show that SCANG substantially outperforms several alternative methods for detecting rare-variant associations while controlling for the genome-wise type I error rates. We illustrate SCANG by analyzing the WGS lipids data from the Atherosclerosis Risk in Communities (ARIC) study.
Subjects
Algorithms , Computational Biology/methods , Genetic Variation , Human Genome , Genome-Wide Association Study , Whole Genome Sequencing/methods , Humans , Linkage Disequilibrium , Genetic Models
ABSTRACT
Hemoglobin A1c (HbA1c) is widely used to diagnose diabetes and assess glycemic control in individuals with diabetes. However, nonglycemic determinants, including genetic variation, may influence how accurately HbA1c reflects underlying glycemia. Analyzing the NHLBI Trans-Omics for Precision Medicine (TOPMed) sequence data in 10,338 individuals from five studies and four ancestries (6,158 Europeans, 3,123 African-Americans, 650 Hispanics, and 407 East Asians), we confirmed five regions associated with HbA1c (GCK in Europeans and African-Americans, HK1 in Europeans and Hispanics, FN3K and/or FN3KRP in Europeans, and G6PD in African-Americans and Hispanics) and we identified an African-ancestry-specific low-frequency variant (rs1039215 in HBG2 and HBE1, minor allele frequency (MAF) = 0.03). The most associated G6PD variant (rs1050828-T, p.Val98Met, MAF = 12% in African-Americans, MAF = 2% in Hispanics) lowered HbA1c (-0.88% in hemizygous males, -0.34% in heterozygous females) and explained 23% of HbA1c variance in African-Americans and 4% in Hispanics. Additionally, we identified a rare distinct G6PD coding variant (rs76723693, p.Leu353Pro, MAF = 0.5%; -0.98% in hemizygous males, -0.46% in heterozygous females) and detected significant association with HbA1c when aggregating rare missense variants in G6PD. We observed similar magnitude and direction of effects for rs1039215 (HBG2) and rs76723693 (G6PD) in the two largest TOPMed African American cohorts, and we replicated the rs76723693 association in the UK Biobank African-ancestry participants. These variants in G6PD and HBG2 were monomorphic in the European and Asian samples. African or Hispanic ancestry individuals carrying G6PD variants may be underdiagnosed for diabetes when screened with HbA1c. Thus, assessment of these variants should be considered for incorporation into precision medicine approaches for diabetes diagnosis.
Subjects
Diabetes Mellitus/diagnosis , Diabetes Mellitus/genetics , Genetic Variation , Glycated Hemoglobin/genetics , Population Groups/genetics , Precision Medicine , Cohort Studies , Female , Humans , Male , Single Nucleotide Polymorphism
ABSTRACT
With advances in whole-genome sequencing (WGS) technology, more advanced statistical methods for testing genetic association with rare variants are being developed. Methods in which variants are grouped for analysis are also known as variant-set, gene-based, and aggregate unit tests. The burden test and sequence kernel association test (SKAT) are two widely used variant-set tests, which were originally developed for samples of unrelated individuals and were later extended to family data with known pedigree structures. However, computationally efficient and powerful variant-set tests are needed to make analyses tractable in large-scale WGS studies with complex study samples. In this paper, we propose the variant-set mixed model association tests (SMMAT) for continuous and binary traits using the generalized linear mixed model framework. These tests can be applied to large-scale WGS studies involving samples with population structure and relatedness, such as in the National Heart, Lung, and Blood Institute's Trans-Omics for Precision Medicine (TOPMed) program. SMMATs share the same null model for different variant sets, and a virtue of this null model, which includes covariates only, is that it needs to be fitted only once for all tests in each genome-wide analysis. Simulation studies show that all the proposed SMMATs correctly control type I error rates for both continuous and binary traits in the presence of population structure and relatedness. We also illustrate our tests in a real-data example analyzing plasma fibrinogen levels in the TOPMed program (n = 23,763), using the Analysis Commons, a cloud-based computing platform.
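The idea of fitting the null model once and reusing it across variant sets can be sketched with a deliberately simplified burden score test. This is not the SMMAT implementation: the sketch assumes unrelated individuals, a continuous trait, and an intercept-only null model, with no random effects for relatedness and no further covariates. The function names `fit_null` and `burden_score_test` are hypothetical.

```python
import math


def fit_null(trait):
    """Fit an intercept-only null model once (simplifying assumption).

    Returns the residuals and the residual variance, which every
    variant-set test below reuses without refitting.
    """
    n = len(trait)
    mean = sum(trait) / n
    residuals = [value - mean for value in trait]
    sigma2 = sum(r * r for r in residuals) / (n - 1)
    return residuals, sigma2


def burden_score_test(genotypes, residuals, sigma2, weights=None):
    """Burden score test p value for one variant set (illustrative only).

    genotypes: one row per individual, one allele count per variant.
    """
    n_variants = len(genotypes[0])
    if weights is None:
        weights = [1.0] * n_variants  # equal weights are an assumption
    # Collapse each individual's variants into one weighted burden score.
    burden = [sum(w * g for w, g in zip(weights, row)) for row in genotypes]
    burden_mean = sum(burden) / len(burden)
    centered = [b - burden_mean for b in burden]
    score = sum(c * r for c, r in zip(centered, residuals))
    variance = sigma2 * sum(c * c for c in centered)
    if variance == 0.0:
        return 1.0  # no genotype variation, hence no evidence of association
    statistic = score * score / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(statistic / 2.0))
```

In use, `fit_null` is called once per trait and its output is passed to `burden_score_test` for every variant set, so the per-set cost is limited to a few sums over individuals.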