Results 1 - 20 of 144
1.
Cell ; 182(5): 1198-1213.e14, 2020 09 03.
Article in English | MEDLINE | ID: mdl-32888493

ABSTRACT

Most loci identified by GWASs have been found in populations of European ancestry (EUR). In trans-ethnic meta-analyses for 15 hematological traits in 746,667 participants, including 184,535 non-EUR individuals, we identified 5,552 trait-variant associations at p < 5 × 10⁻⁹, including 71 novel associations not found in EUR populations. We also identified 28 additional novel variants in ancestry-specific, non-EUR meta-analyses, including an IL7 missense variant in South Asians associated with lymphocyte count in vivo and IL-7 secretion levels in vitro. Fine-mapping prioritized variants annotated as functional and generated 95% credible sets that were 30% smaller when using the trans-ethnic as opposed to the EUR-only results. We explored the clinical significance and predictive value of trans-ethnic variants in multiple populations and compared genetic architecture and the effect of natural selection on these blood phenotypes between populations. Altogether, our results for hematological traits highlight the value of a more global representation of populations in genetic studies.


Subjects
Asian People/genetics; Mutation, Missense/genetics; Polymorphism, Single Nucleotide/genetics; White People/genetics; Genetics; Genome-Wide Association Study/methods; HEK293 Cells; Humans; Interleukin-7/genetics; Phenotype
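
The abstract of record 1 reports 95% credible sets from fine-mapping without describing how they were built. One common construction, shown here only as a minimal sketch, ranks variants by posterior probability and keeps adding them until the cumulative mass reaches the coverage level. The variant names and probabilities below are hypothetical and are not taken from the study.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest ranked list of variants whose posterior mass reaches `coverage`.

    `posteriors` maps a variant ID to its posterior probability of being causal.
    Probabilities are renormalised so they sum to 1 within the locus.
    """
    total = sum(posteriors.values())
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, mass = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        mass += prob / total
        if mass >= coverage:
            break
    return chosen


# Hypothetical posteriors at one locus (illustrative values only).
eur_only = {"var_a": 0.35, "var_b": 0.25, "var_c": 0.20, "var_d": 0.12, "var_e": 0.08}
trans_ethnic = {"var_a": 0.80, "var_b": 0.17, "var_c": 0.01, "var_d": 0.01, "var_e": 0.01}

eur_set = credible_set(eur_only)
trans_set = credible_set(trans_ethnic)
print(len(eur_set), len(trans_set))
```

In this toy locus the sharper trans-ethnic posteriors give a smaller set, which is the kind of shrinkage the abstract describes in aggregate.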
2.
Cell ; 182(5): 1214-1231.e11, 2020 09 03.
Article in English | MEDLINE | ID: mdl-32888494

ABSTRACT

Blood cells play essential roles in human health, underpinning physiological processes such as immunity, oxygen transport, and clotting, which when perturbed cause a significant global health burden. Here we integrate data from UK Biobank and a large-scale international collaborative effort, including data for 563,085 European ancestry participants, and discover 5,106 new genetic variants independently associated with 29 blood cell phenotypes covering a range of variation impacting hematopoiesis. We holistically characterize the genetic architecture of hematopoiesis, assess the relevance of the omnigenic model to blood cell phenotypes, delineate relevant hematopoietic cell states influenced by regulatory genetic variants and gene networks, identify novel splice-altering variants mediating the associations, and assess the polygenic prediction potential for blood traits and clinical disorders at the interface of complex and Mendelian genetics. These results show the power of large-scale blood cell trait GWAS to interrogate clinically meaningful variants across a wide allelic spectrum of human variation.


Subjects
Genetic Predisposition to Disease/genetics; Multifactorial Inheritance/genetics; Female; Gene Regulatory Networks/genetics; Genome-Wide Association Study/methods; Hematopoiesis/genetics; Humans; Male; Phenotype; Polymorphism, Single Nucleotide/genetics
3.
Am J Hum Genet ; 111(5): 990-995, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38636510

ABSTRACT

Since genotype imputation was introduced, researchers have been relying on the estimated imputation quality from imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted as Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration, which leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for predominantly European and African ancestral individuals based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest MagicalRsq-X outperforms Rsq in almost every setting, with 7.3%-14.4% improvement in squared Pearson correlation with true R², corresponding to 85-218K variant gains. We further developed a metric to quantify the genetic distances of a target cohort relative to a reference cohort and showed that this metric largely explained the performance of MagicalRsq-X models. Finally, we found MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would be missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well and poorly imputed lower-frequency variants.


Subjects
Gene Frequency; Genotype; Polymorphism, Single Nucleotide; Software; Humans; Cohort Studies; Linkage Disequilibrium; Genome-Wide Association Study/methods; Genome, Human; Quality Control; Machine Learning; Whole Genome Sequencing/standards; Whole Genome Sequencing/methods
4.
Hum Mol Genet ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38747556

ABSTRACT

Inflammation biomarkers can provide valuable insight into the role of inflammatory processes in many diseases and conditions. Sequencing-based analyses of such biomarkers can also serve as an exemplar of the genetic architecture of quantitative traits. To evaluate the biological insight that can be provided by a multi-ancestry, whole-genome-based association study, we performed a comprehensive analysis of 21 inflammation biomarkers from up to 38,465 individuals with whole-genome sequencing from the Trans-Omics for Precision Medicine (TOPMed) program (with varying sample size by trait, where the minimum sample size was n = 737 for MMP-1). We identified 22 distinct single-variant associations across 6 traits (E-selectin, intercellular adhesion molecule 1, interleukin-6, lipoprotein-associated phospholipase A2 activity and mass, and P-selectin) that remained significant after conditioning on previously identified associations for these inflammatory biomarkers. We further expanded upon known biomarker associations by pairing the single-variant analysis with a rare variant set-based analysis that further identified 19 significant rare variant set-based associations with 5 traits. These signals were distinct from both significant single-variant association signals within TOPMed and genetic signals observed in prior studies, demonstrating the complementary value of performing both single and rare variant analyses when analyzing quantitative traits. We also confirm several previously reported signals from semi-quantitative proteomics platforms. Many of these signals demonstrate the extensive allelic heterogeneity and ancestry-differentiated variant-trait associations common for inflammation biomarkers, a characteristic we hypothesize will be increasingly observed with well-powered, large-scale analyses of complex traits.

5.
Blood ; 143(18): 1845-1855, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38320121

ABSTRACT

Coagulation factor VIII (FVIII) and its carrier protein von Willebrand factor (VWF) are critical to coagulation and platelet aggregation. We leveraged whole-genome sequence data from the Trans-Omics for Precision Medicine (TOPMed) program along with TOPMed-based imputation of genotypes in additional samples to identify genetic associations with circulating FVIII and VWF levels in a single-variant meta-analysis, including up to 45,289 participants. Gene-based aggregate tests were implemented in TOPMed. We identified 3 candidate causal genes and tested their functional effect on FVIII release from human liver endothelial cells (HLECs) and VWF release from human umbilical vein endothelial cells (HUVECs). Mendelian randomization was also performed to provide evidence for causal associations of FVIII and VWF with thrombotic outcomes. We identified associations (P < 5 × 10⁻⁹) at 7 new loci for FVIII (ST3GAL4, CLEC4M, B3GNT2, ASGR1, F12, KNG1, and TREM1/NCR2) and 1 for VWF (B3GNT2). VWF, ABO, and STAB2 were associated with FVIII and VWF in gene-based analyses. Multiphenotype analysis of FVIII and VWF identified another 3 new loci, including PDIA3. Silencing of B3GNT2 and the previously reported CD36 gene decreased release of FVIII by HLECs, whereas silencing of B3GNT2, CD36, and PDIA3 decreased release of VWF by HUVECs. Mendelian randomization supports causal association of higher FVIII and VWF with increased risk of thrombotic outcomes. Seven new loci were identified for FVIII and 1 for VWF, with evidence supporting causal associations of FVIII and VWF with thrombotic outcomes. B3GNT2, CD36, and PDIA3 modulate the release of FVIII and/or VWF in vitro.


Subjects
Cell Adhesion Molecules; Factor VIII; Kininogens; Lectins, C-Type; Receptors, Cell Surface; von Willebrand Factor; Humans; von Willebrand Factor/genetics; von Willebrand Factor/metabolism; Factor VIII/genetics; Factor VIII/metabolism; Polymorphism, Single Nucleotide; Human Umbilical Vein Endothelial Cells/metabolism; Mendelian Randomization Analysis; Genome-Wide Association Study; Thrombosis/genetics; Thrombosis/blood; Genetic Association Studies; Male; Endothelial Cells/metabolism; Female
6.
PLoS Genet ; 19(5): e1010517, 2023 05.
Article in English | MEDLINE | ID: mdl-37216410

ABSTRACT

Integrative approaches that simultaneously model multi-omics data have gained increasing popularity because they provide holistic system biology views of multiple or all components in a biological system of interest. Canonical correlation analysis (CCA) is a correlation-based integrative method designed to extract latent features shared between multiple assays by finding the linear combinations of features, referred to as canonical variables (CVs), within each assay that achieve maximal across-assay correlation. Although widely acknowledged as a powerful approach for multi-omics data, CCA has not been systematically applied to multi-omics data in large cohort studies, which have only recently become available. Here, we adapted sparse multiple CCA (SMCCA), a widely used derivative of CCA, to proteomics and methylomics data from the Multi-Ethnic Study of Atherosclerosis (MESA) and Jackson Heart Study (JHS). To tackle challenges encountered when applying SMCCA to MESA and JHS, our adaptations include the incorporation of the Gram-Schmidt (GS) algorithm with SMCCA to improve orthogonality among CVs, and the development of Sparse Supervised Multiple CCA (SSMCCA) to allow supervised integration analysis for more than two assays. Effective application of SMCCA to the two real datasets reveals important findings. Applying our SMCCA-GS to MESA and JHS, we identified strong associations between blood cell counts and protein abundance, suggesting that adjustment of blood cell composition should be considered in protein-based association studies. Importantly, CVs obtained from two independent cohorts also demonstrate transferability across the cohorts. For example, proteomic CVs learned from JHS, when transferred to MESA, explain similar amounts of blood cell count phenotypic variance in MESA, explaining 39.0%-50.0% of the variation in JHS and 38.9%-49.1% in MESA. Similar transferability was observed for other omics-CV-trait pairs. This suggests that biologically meaningful and cohort-agnostic variation is captured by CVs. We anticipate that applying our SMCCA-GS and SSMCCA on various cohorts would help identify cohort-agnostic biologically meaningful relationships between multi-omics data and phenotypic traits.


Subjects
Canonical Correlation Analysis; Proteomics; Humans; Proteomics/methods; Multiomics; Cohort Studies
7.
Genet Epidemiol ; 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940271

ABSTRACT

In most Proteome-Wide Association Studies (PWAS), variants near the protein-coding gene (±1 Mb), also known as cis single nucleotide polymorphisms (SNPs), are used to predict protein levels, which are then tested for association with phenotypes. However, proteins can be regulated through variants outside of the cis region. An intermediate GWAS step to identify protein quantitative trait loci (pQTL) allows for the inclusion of trans SNPs outside the cis region in protein-level prediction models. Here, we assess the prediction of 540 proteins in 1002 individuals from the Women's Health Initiative (WHI), split equally into a GWAS set, an elastic net training set, and a testing set. We compared the testing r² between measured and predicted protein levels using this proposed approach to the testing r² using only cis SNPs. The two methods usually resulted in similar testing r², but some proteins showed a significant increase in testing r² with our method. For example, for cartilage acidic protein 1, the testing r² increased from 0.101 to 0.351. We also demonstrate reproducible findings for predicted protein association with lipid and blood cell traits in WHI participants without proteomics data and in UK Biobank utilizing our PWAS weights.

8.
Hum Mol Genet ; 32(6): 1048-1060, 2023 03 06.
Article in English | MEDLINE | ID: mdl-36444934

ABSTRACT

Diabetic kidney disease (DKD) is recognized as an important public health challenge. However, its genomic mechanisms are poorly understood. To identify rare variants for DKD, we conducted a whole-exome sequencing (WES) study leveraging large cohorts well-phenotyped for chronic kidney disease and diabetes. Our two-stage WES study included 4372 European and African ancestry participants from the Chronic Renal Insufficiency Cohort and Atherosclerosis Risk in Communities studies (stage 1) and 11,487 multi-ancestry Trans-Omics for Precision Medicine participants (stage 2). Generalized linear mixed models, which accounted for genetic relatedness and adjusted for age, sex and ancestry, were used to test associations between single variants and DKD. Gene-based aggregate rare variant analyses were conducted using an optimized sequence kernel association test implemented within our mixed model framework. We identified four novel exome-wide significant DKD-related loci through initiating diabetes. In single-variant analyses, participants carrying a rare, in-frame insertion in the DIS3L2 gene (rs141560952) exhibited a 193-fold increased odds [95% confidence interval (CI): 33.6, 1105] of DKD compared with noncarriers (P = 3.59 × 10⁻⁹). Likewise, each copy of a low-frequency KRT6B splice-site variant (rs425827) conferred a 5.31-fold higher odds (95% CI: 3.06, 9.21) of DKD (P = 2.72 × 10⁻⁹). Aggregate gene-based analyses further identified ERAP2 (P = 4.03 × 10⁻⁸) and NPEPPS (P = 1.51 × 10⁻⁷), which are both expressed in the kidney and implicated in renin-angiotensin-aldosterone system modulated immune response. In the largest WES study of DKD, we identified novel rare variant loci attaining exome-wide significance. These findings provide new insights into the molecular mechanisms underlying DKD.


Subjects
Diabetes Mellitus; Diabetic Nephropathies; Renal Insufficiency, Chronic; Humans; Aminopeptidases; Diabetic Nephropathies/genetics; Exome Sequencing; Kidney; Renal Insufficiency, Chronic/genetics
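
The odds ratios, confidence intervals, and P values quoted in the abstract of record 8 can be sanity-checked against each other. The sketch below assumes a symmetric Wald interval on the log-odds scale, which is an assumption of this example rather than something the abstract states; the study used generalized linear mixed models, so its reported P values need not match a Wald approximation exactly. The function name `wald_summary` is hypothetical.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_summary(odds_ratio, ci_low, ci_high):
    """Back out the log-OR standard error, z score and two-sided P value
    from an odds ratio and its 95% CI, assuming a Wald interval."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    z = log_or / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    # For a symmetric interval on the log scale, the OR is the geometric
    # mean of the CI bounds.
    midpoint = math.sqrt(ci_low * ci_high)
    return {"se": se, "z": z, "p": p_two_sided, "geometric_midpoint": midpoint}


# Values quoted in the abstract.
dis3l2 = wald_summary(193, 33.6, 1105)
krt6b = wald_summary(5.31, 3.06, 9.21)

for name, res in (("DIS3L2", dis3l2), ("KRT6B", krt6b)):
    print(name, f"midpoint={res['geometric_midpoint']:.2f}", f"P~{res['p']:.2e}")
```

Under this assumption both intervals are centred on the reported odds ratios and imply P values of the same order of magnitude as those in the abstract.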
9.
Am J Hum Genet ; 109(11): 1986-1997, 2022 11 03.
Article in English | MEDLINE | ID: mdl-36198314

ABSTRACT

Whole-genome sequencing (WGS) is the gold standard for fully characterizing genetic variation but is still prohibitively expensive for large samples. To reduce costs, many studies sequence only a subset of individuals or genomic regions, and genotype imputation is used to infer genotypes for the remaining individuals or regions without sequencing data. However, not all variants can be well imputed, and the current state-of-the-art imputation quality metric, denoted as standard Rsq, is poorly calibrated for lower-frequency variants. Here, we propose MagicalRsq, a machine-learning-based method that integrates variant-level imputation and population genetics statistics, to provide a better calibrated imputation quality metric. Leveraging WGS data from the Cystic Fibrosis Genome Project (CFGP), and whole-exome sequence data from UK Biobank (UKB), we performed comprehensive experiments to evaluate the performance of MagicalRsq compared to standard Rsq for partially sequenced studies. We found that MagicalRsq aligns better with true R² than standard Rsq in almost every situation evaluated, for both European and African ancestry samples. For example, when applying models trained from 1,992 CFGP sequenced samples to an independent 3,103 samples with no sequencing but TOPMed imputation from array genotypes, MagicalRsq, compared to standard Rsq, achieved net gains of 1.4 million rare, 117k low-frequency, and 18k common variants, where the net gain is the additional number of variants correctly distinguished by MagicalRsq over standard Rsq. MagicalRsq can serve as an improved post-imputation quality metric and will benefit downstream analysis by better distinguishing well-imputed variants from those poorly imputed. MagicalRsq is freely available on GitHub.


Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Humans; Genome-Wide Association Study/methods; Polymorphism, Single Nucleotide/genetics; Calibration; Genotype; Machine Learning
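
The "true R2" mentioned in the abstract of record 9 is conventionally the squared Pearson correlation between imputed allele dosages and genotypes observed by sequencing at the same variant. The sketch below illustrates that calculation for a single variant under this assumption; it is not code from MagicalRsq, and the dosage and genotype vectors are invented for illustration.

```python
def squared_pearson(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("variant is monomorphic in one of the inputs")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data for one variant in eight individuals.
sequenced_genotypes = [0, 0, 1, 0, 2, 1, 0, 0]                  # observed allele counts
well_imputed_dosages = [0.0, 0.1, 0.9, 0.0, 1.9, 1.0, 0.1, 0.0]
poorly_imputed_dosages = [0.3, 0.2, 0.3, 0.4, 0.5, 0.2, 0.4, 0.3]

good_r2 = squared_pearson(well_imputed_dosages, sequenced_genotypes)
poor_r2 = squared_pearson(poorly_imputed_dosages, sequenced_genotypes)
print(round(good_r2, 3), round(poor_r2, 3))
```

A calibrated quality metric is one whose value tracks this quantity closely, including for rare variants where few individuals carry the minor allele.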
10.
Am J Hum Genet ; 109(6): 1175-1181, 2022 06 02.
Article in English | MEDLINE | ID: mdl-35504290

ABSTRACT

Current publicly available tools that allow rapid exploration of linkage disequilibrium (LD) between markers (e.g., HaploReg and LDlink) are based on whole-genome sequence (WGS) data from 2,504 individuals in the 1000 Genomes Project. Here, we present TOP-LD, an online tool to explore LD inferred with high-coverage (∼30×) WGS data from 15,578 individuals in the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. TOP-LD provides a significant upgrade compared to current LD tools, as the TOPMed WGS data provide a more comprehensive representation of genetic variation than the 1000 Genomes data, particularly for rare variants and in the specific populations that we analyzed. For example, TOP-LD encompasses LD information for 150.3, 62.2, and 36.7 million variants for European, African, and East Asian ancestral samples, respectively, offering 2.6- to 9.1-fold increase in variant coverage compared to HaploReg 4.0 or LDlink. In addition, TOP-LD includes tens of thousands of structural variants (SVs). We demonstrate the value of TOP-LD in fine-mapping at the GGT1 locus associated with gamma glutamyltransferase in the African ancestry participants in UK Biobank. Beyond fine-mapping, TOP-LD can facilitate a wide range of applications that are based on summary statistics and estimates of LD. TOP-LD is freely available online.


Subjects
Genome-Wide Association Study; Precision Medicine; Asian People; Humans; Linkage Disequilibrium/genetics; Polymorphism, Single Nucleotide/genetics; Whole Genome Sequencing
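
For readers unfamiliar with the LD statistics that tools such as the one in record 10 report, the sketch below computes the two standard pairwise measures, r² and |D'|, from phased haplotypes at two biallelic sites. It is a textbook calculation on hypothetical haplotypes, not a description of how TOP-LD itself was implemented.

```python
def ld_stats(haplotypes):
    """Return (r2, abs_d_prime) for two biallelic sites.

    `haplotypes` is a list of (a, b) tuples, one per phased haplotype,
    where a and b are 0/1 alleles at the first and second site.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # allele-1 frequency, site A
    p_b = sum(b for _, b in haplotypes) / n          # allele-1 frequency, site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # raw disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    r2 = d * d / denom
    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r2, abs(d) / d_max


# Hypothetical haplotypes: perfectly correlated sites, then independent sites.
perfect = [(1, 1)] * 4 + [(0, 0)] * 6
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]

perfect_r2, perfect_dprime = ld_stats(perfect)
indep_r2, indep_dprime = ld_stats(independent)
print(perfect_r2, perfect_dprime, indep_r2, indep_dprime)
```

Because both statistics depend on allele frequencies, estimates for rare variants are only stable with large reference samples, which is the motivation the abstract gives for using TOPMed data.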
11.
Nat Methods ; 19(12): 1599-1611, 2022 12.
Article in English | MEDLINE | ID: mdl-36303018

ABSTRACT

Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.


Subjects
Genome-Wide Association Study; Genome; Humans; Genome-Wide Association Study/methods; Whole Genome Sequencing/methods; Phenotype; Genetic Variation
12.
PLoS Genet ; 18(1): e1009984, 2022 01.
Article in English | MEDLINE | ID: mdl-35100265

ABSTRACT

Existing studies of chromatin conformation have primarily focused on potential enhancers interacting with gene promoters. By contrast, the interactivity of promoters per se, while equally critical to understanding transcriptional control, has been largely unexplored, particularly in a cell type-specific manner for blood lineage cell types. In this study, we leverage promoter capture Hi-C data across a compendium of blood lineage cell types to identify and characterize cell type-specific super-interactive promoters (SIPs). Notably, promoter-interacting regions (PIRs) of SIPs are more likely to overlap with cell type-specific ATAC-seq peaks and GWAS variants for relevant blood cell traits than PIRs of non-SIPs. Moreover, PIRs of cell type-specific SIPs show enriched heritability of relevant blood cell trait(s), and are more enriched with GWAS variants associated with blood cell traits compared to PIRs of non-SIPs. Further, SIP genes tend to be expressed at a higher level in the corresponding cell type. Importantly, SIP subnetworks incorporating cell type-specific SIPs and ATAC-seq peaks help interpret GWAS variants. Examples include GWAS variants associated with platelet count near the megakaryocyte SIP gene EPHB3 and variants associated with lymphocyte count near the naive CD4 T-cell SIP gene ETS1. Interestingly, around 25.7%-39.6% of blood cell trait GWAS variants residing in SIP PIR regions disrupt transcription factor binding motifs. Importantly, our analysis shows the potential of using promoter-centric analyses of chromatin spatial organization data to identify biologically important genes and their regulatory regions.


Subjects
Blood Cells/metabolism; Cell Lineage/genetics; Gene Regulatory Networks; Promoter Regions, Genetic; Genome-Wide Association Study; Humans; Proto-Oncogene Protein c-ets-1/genetics; Receptor, EphB3/genetics
13.
Hum Mol Genet ; 31(14): 2333-2347, 2022 07 21.
Article in English | MEDLINE | ID: mdl-35138379

ABSTRACT

Previous genome-wide association studies (GWAS) of hematological traits have identified over 10,000 distinct trait-specific risk loci. However, at these loci, the underlying causal mechanisms remain incompletely characterized. To elucidate novel biology and better understand causal mechanisms at known loci, we performed a transcriptome-wide association study (TWAS) of 29 hematological traits in 399,835 UK Biobank (UKB) participants of European ancestry using gene expression prediction models trained from whole blood RNA-seq data in 922 individuals. We discovered 557 gene-trait associations for hematological traits distinct from previously reported GWAS variants in European populations. Among the 557 associations, 301 were available for replication in a cohort of 141,286 participants of European ancestry from the Million Veteran Program. Of these 301 associations, 108 replicated at a strict Bonferroni-adjusted threshold (α = 0.05/301). Using our TWAS results, we systematically assigned 4261 out of 16,900 previously identified hematological trait GWAS variants to putative target genes. Compared to coloc, our TWAS results show reduced specificity and increased sensitivity in external datasets to assign variants to target genes.


Subjects
Genome-Wide Association Study; Transcriptome; Biological Specimen Banks; Blood Cells; Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Humans; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Transcriptome/genetics; United Kingdom
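
The replication threshold quoted in the abstract of record 13 (α = 0.05/301) is a Bonferroni correction: the family-wise error rate is divided by the number of associations tested. The short sketch below works out that threshold and applies it to a handful of made-up replication P values; the gene-trait labels and P values are hypothetical and only show the mechanics.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate at `alpha`."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def replicated(p_values, alpha, n_tests):
    """Return the labels whose P value falls below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [label for label, p in p_values.items() if p < threshold]


# Threshold used in the abstract: 0.05 / 301, roughly 1.66e-4.
threshold = bonferroni_threshold(0.05, 301)

# Hypothetical replication P values for three gene-trait pairs.
example_p = {"gene1-trait1": 2.0e-6, "gene2-trait2": 1.0e-3, "gene3-trait3": 1.5e-4}
hits = replicated(example_p, alpha=0.05, n_tests=301)
print(f"{threshold:.3e}", hits)
```

Note that the divisor is the 301 associations available for replication, not the 557 discovered, since only the former were actually tested in the second cohort.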
14.
Am J Hum Genet ; 108(10): 1836-1851, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34582791

ABSTRACT

Many common and rare variants associated with hematologic traits have been discovered through imputation on large-scale reference panels. However, the majority of genome-wide association studies (GWASs) have been conducted in Europeans, and determining causal variants has proved challenging. We performed a GWAS of total leukocyte, neutrophil, lymphocyte, monocyte, eosinophil, and basophil counts generated from 109,563,748 variants in the autosomes and the X chromosome in the Trans-Omics for Precision Medicine (TOPMed) program, which included data from 61,802 individuals of diverse ancestry. We discovered and replicated 7 leukocyte trait associations, including (1) the association between a chromosome X, pseudo-autosomal region (PAR), noncoding variant located between cytokine receptor genes (CSF2RA and CRLF2) and lower eosinophil count; and (2) associations between single variants found predominantly among African Americans at the S1PR3 (9q22.1) and HBB (11p15.4) loci and monocyte and lymphocyte counts, respectively. We further provide evidence indicating that the newly discovered eosinophil-lowering chromosome X PAR variant might be associated with reduced susceptibility to common allergic diseases such as atopic dermatitis and asthma. Additionally, we found a burden of very rare FLT3 (13q12.2) variants associated with monocyte counts. Together, these results emphasize the utility of whole-genome sequencing in diverse samples in identifying associations missed by European-ancestry-driven GWASs.


Subjects
Asthma/epidemiology; Biomarkers/metabolism; Dermatitis, Atopic/epidemiology; Leukocytes/pathology; Polymorphism, Single Nucleotide; Pulmonary Disease, Chronic Obstructive/epidemiology; Quantitative Trait Loci; Asthma/genetics; Asthma/metabolism; Asthma/pathology; Dermatitis, Atopic/genetics; Dermatitis, Atopic/metabolism; Dermatitis, Atopic/pathology; Genetic Predisposition to Disease; Genome, Human; Genome-Wide Association Study; Humans; National Heart, Lung, and Blood Institute (U.S.); Phenotype; Prognosis; Proteome/analysis; Proteome/metabolism; Pulmonary Disease, Chronic Obstructive/genetics; Pulmonary Disease, Chronic Obstructive/metabolism; Pulmonary Disease, Chronic Obstructive/pathology; United Kingdom/epidemiology; United States/epidemiology; Whole Genome Sequencing
15.
Am J Hum Genet ; 108(5): 874-893, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33887194

ABSTRACT

Whole-genome sequencing (WGS), a powerful tool for detecting novel coding and non-coding disease-causing variants, has largely been applied to clinical diagnosis of inherited disorders. Here we leveraged WGS data in up to 62,653 ethnically diverse participants from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program and assessed statistical association of variants with seven red blood cell (RBC) quantitative traits. We discovered 14 single variant-RBC trait associations at 12 genomic loci, which have not been reported previously. Several of the RBC trait-variant associations (RPN1, ELL2, MIDN, HBB, HBA1, PIEZO1, and G6PD) were replicated in independent GWAS datasets imputed to the TOPMed reference panel. Most of these discovered variants are rare/low frequency, and several are observed disproportionately among non-European ancestry (African, Hispanic/Latino, or East Asian) populations. We identified a 3 bp indel p.Lys2169del (g.88717175_88717177TCT[4]) (common only in the Ashkenazi Jewish population) of PIEZO1, a gene responsible for the Mendelian red cell disorder hereditary xerocytosis (MIM: 194380), associated with higher mean corpuscular hemoglobin concentration (MCHC). In stepwise conditional analysis and in gene-based rare variant aggregated association analysis, we identified several of the variants in HBB, HBA1, TMPRSS6, and G6PD that represent the carrier state for known coding, promoter, or splice site loss-of-function variants that cause inherited RBC disorders. Finally, we applied base and nuclease editing to demonstrate that the sentinel variant rs112097551 (nearest gene RPN1) acts through a cis-regulatory element that exerts long-range control of the gene RUVBL1, which is essential for hematopoiesis. Together, these results demonstrate the utility of WGS in ethnically diverse population-based samples and gene editing for expanding knowledge of the genetic architecture of quantitative hematologic traits and suggest a continuum between complex trait and Mendelian red cell disorders.


Subjects
Erythrocytes/metabolism; Erythrocytes/pathology; Genome-Wide Association Study; National Heart, Lung, and Blood Institute (U.S.)/organization & administration; Phenotype; Adult; Aged; Chromosomes, Human, Pair 16/genetics; Datasets as Topic; Female; Gene Editing; Genetic Variation/genetics; HEK293 Cells; Humans; Male; Middle Aged; Quality Control; Reproducibility of Results; United States
16.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34882196

ABSTRACT

Multiple statistical methods for aggregate association testing have been developed for whole-genome sequencing (WGS) data. Many of these methods aggregate variants in a given genomic window and ignore existing knowledge when defining test regions, so many identified regions are not clearly linked to genes, which limits biological understanding. Functional information from new technologies (such as Hi-C and its derivatives), which can help link enhancers to their effector genes, can be leveraged to predefine variant sets for aggregate testing in WGS data. Here, we propose the eSCAN (scan the enhancers) method for genome-wide assessment of enhancer regions in sequencing studies, combining the advantages of dynamic window selection in SCANG (SCAN the Genome), a previously developed method, with the advantages of incorporating putative regulatory regions from annotation. eSCAN, by searching in putative enhancers, increases statistical power and aids mechanistic interpretation, as demonstrated by extensive simulation studies. We also apply eSCAN for blood cell traits using NHLBI Trans-Omics for Precision Medicine WGS data. Results from real data analysis show that eSCAN is able to capture more significant signals, and these signals are of shorter length (indicating higher resolution fine-mapping capability) and drive association of larger regions detected by other methods.


Subjects
Genome-Wide Association Study; Genome; Genome-Wide Association Study/methods; Genomics; Regulatory Sequences, Nucleic Acid; Whole Genome Sequencing/methods
17.
Alzheimers Dement ; 20(3): 1913-1922, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38153336

ABSTRACT

INTRODUCTION: We examined the associations of midlife (1990-1992, mean age 57) and late-life (2011-2013, mean age 75) nonalcoholic fatty liver disease (NAFLD) and aminotransferase with incident dementia risk through 2019 in the Atherosclerosis Risk in Communities (ARIC) Study. METHODS: We characterized NAFLD using the fatty liver index and fibrosis-4, and we categorized aminotransferase using the optimal equal-hazard ratio (HR) approach. We estimated HRs for incident dementia ascertained from multiple data sources. RESULTS: Adjusted for demographics, alcohol consumption, and kidney function, individuals with low, intermediate, and high liver fibrosis in midlife (HRs: 1.45, 1.40, and 2.25, respectively), but not at older age, had higher dementia risks than individuals without fatty liver. A U-shaped association was observed for alanine aminotransferase with dementia risk, which was more pronounced in late-life assessment. DISCUSSION: Our findings highlight the dementia burden in highly prevalent NAFLD and the important feature of late-life aminotransferase as a surrogate biomarker linking liver hypometabolism to dementia. HIGHLIGHTS: Although evidence of liver involvement in dementia development has been documented in animal studies, the evidence in humans is limited. Midlife NAFLD raised dementia risk proportionate to severity. Late-life NAFLD was not associated with a high risk of dementia. Low alanine aminotransferase was associated with an elevated dementia risk, especially when measured in late life.


Subjects
Alzheimer Disease , Non-alcoholic Fatty Liver Disease , Humans , Middle Aged , Aged , Alzheimer Disease/epidemiology , Non-alcoholic Fatty Liver Disease/epidemiology , Alanine Transaminase , Alcohol Drinking , Risk Factors
18.
Diabetologia ; 66(7): 1273-1288, 2023 07.
Article in English | MEDLINE | ID: mdl-37148359

ABSTRACT

AIMS/HYPOTHESIS: The Latino population has been systematically underrepresented in large-scale genetic analyses, and previous studies have relied on the imputation of ungenotyped variants based on the 1000 Genomes (1000G) imputation panel, which results in suboptimal capture of low-frequency or Latino-enriched variants. The National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) released the largest multi-ancestry genotype reference panel representing a unique opportunity to analyse rare genetic variations in the Latino population. We hypothesise that a more comprehensive analysis of low/rare variation using the TOPMed panel would improve our knowledge of the genetics of type 2 diabetes in the Latino population. METHODS: We evaluated the TOPMed imputation performance using genotyping array and whole-exome sequence data in six Latino cohorts. To evaluate the ability of TOPMed imputation to increase the number of identified loci, we performed a Latino type 2 diabetes genome-wide association study (GWAS) meta-analysis in 8150 individuals with type 2 diabetes and 10,735 control individuals and replicated the results in six additional cohorts including whole-genome sequence data from the All of Us cohort. RESULTS: Compared with imputation with 1000G, the TOPMed panel improved the identification of rare and low-frequency variants. We identified 26 genome-wide significant signals including a novel variant (minor allele frequency 1.7%; OR 1.37, p=3.4 × 10⁻⁹). A Latino-tailored polygenic score constructed from our data and GWAS data from East Asian and European populations improved the prediction accuracy in a Latino target dataset, explaining up to 7.6% of the type 2 diabetes risk variance. CONCLUSIONS/INTERPRETATION: Our results demonstrate the utility of TOPMed imputation for identifying low-frequency variants in understudied populations, leading to the discovery of novel disease associations and the improvement of polygenic scores.
DATA AVAILABILITY: Full summary statistics are available through the Common Metabolic Diseases Knowledge Portal (https://t2d.hugeamp.org/downloads.html) and through the GWAS catalog (https://www.ebi.ac.uk/gwas/, accession ID: GCST90255648). Polygenic score (PS) weights for each ancestry are available via the PGS catalog (https://www.pgscatalog.org, publication ID: PGP000445, score IDs: PGS003443, PGS003444 and PGS003445).


Subjects
Diabetes Mellitus, Type 2 , Population Health , Humans , Genome-Wide Association Study , Diabetes Mellitus, Type 2/genetics , Precision Medicine , Genotype , Hispanic or Latino/genetics , Polymorphism, Single Nucleotide/genetics
19.
Genet Epidemiol ; 46(1): 3-16, 2022 02.
Article in English | MEDLINE | ID: mdl-34779012

ABSTRACT

Hematological measures are important intermediate clinical phenotypes for many acute and chronic diseases and are highly heritable. Although genome-wide association studies (GWAS) have identified thousands of loci containing trait-associated variants, the causal genes underlying these associations are often uncertain. To better understand the underlying genetic regulatory mechanisms, we performed a transcriptome-wide association study (TWAS) to systematically investigate the association between genetically predicted gene expression and hematological measures in 54,542 Europeans from the Genetic Epidemiology Research on Aging cohort. We found 239 significant gene-trait associations with hematological measures; we replicated 71 associations at p < 0.05 in a TWAS meta-analysis consisting of up to 35,900 Europeans from the Women's Health Initiative, Atherosclerosis Risk in Communities Study, and BioMe Biobank. Additionally, we attempted to refine this list of candidate genes by performing conditional analyses, adjusting for individual variants previously associated with hematological measures, and performed further fine-mapping of TWAS loci. To facilitate interpretation of our findings, we designed an R Shiny application to interactively visualize our TWAS results by integrating them with additional genetic data sources (GWAS, TWAS from multiple reference panels, conditional analyses, known GWAS variants, etc.). Our results and application highlight frequently overlooked TWAS challenges and illustrate the complexity of TWAS fine-mapping.


Subjects
Genome-Wide Association Study , Transcriptome , Blood Cells , Female , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci
20.
Circulation ; 145(18): 1398-1411, 2022 05 03.
Article in English | MEDLINE | ID: mdl-35387486

ABSTRACT

BACKGROUND: SARS-CoV-2, the causal agent of COVID-19, enters human cells using the ACE2 (angiotensin-converting enzyme 2) protein as a receptor. ACE2 is thus key to the infection and treatment of the coronavirus. ACE2 is highly expressed in the heart and respiratory and gastrointestinal tracts, playing important regulatory roles in the cardiovascular and other biological systems. However, the genetic basis of the ACE2 protein levels is not well understood. METHODS: We have conducted the largest genome-wide association meta-analysis of plasma ACE2 levels in >28 000 individuals of the SCALLOP Consortium (Systematic and Combined Analysis of Olink Proteins). We summarize the cross-sectional epidemiological correlates of circulating ACE2. Using the summary statistics-based high-definition likelihood method, we estimate relevant genetic correlations with cardiometabolic phenotypes, COVID-19, and other human complex traits and diseases. We perform causal inference of soluble ACE2 on vascular disease outcomes and COVID-19 severity using mendelian randomization. We also perform in silico functional analysis by integrating with other types of omics data. RESULTS: We identified 10 loci, including 8 novel, capturing 30% of the heritability of the protein. We detected that plasma ACE2 was genetically correlated with vascular diseases, severe COVID-19, and a wide range of human complex diseases and medications. An X-chromosome cis-protein quantitative trait loci-based mendelian randomization analysis suggested a causal effect of elevated ACE2 levels on COVID-19 severity (odds ratio, 1.63 [95% CI, 1.10-2.42]; P=0.01), hospitalization (odds ratio, 1.52 [95% CI, 1.05-2.21]; P=0.03), and infection (odds ratio, 1.60 [95% CI, 1.08-2.37]; P=0.02). Tissue- and cell type-specific transcriptomic and epigenomic analysis revealed that the ACE2 regulatory variants were enriched for DNA methylation sites in blood immune cells. 
CONCLUSIONS: Human plasma ACE2 shares a genetic basis with cardiovascular disease, COVID-19, and other related diseases. The genetic architecture of the ACE2 protein is mapped, providing a useful resource for further biological and clinical studies on this coronavirus receptor.


Subjects
Angiotensin-Converting Enzyme 2 , COVID-19 , Angiotensin-Converting Enzyme 2/genetics , COVID-19/genetics , Cross-Sectional Studies , Genome-Wide Association Study , Humans , Receptors, Coronavirus , SARS-CoV-2