ABSTRACT
Primary open-angle glaucoma (POAG), the leading cause of irreversible blindness worldwide, disproportionately affects individuals of African ancestry. We conducted a genome-wide association study (GWAS) for POAG in 11,275 individuals of African ancestry (6,003 cases; 5,272 controls). We detected 46 risk loci associated with POAG at genome-wide significance. Replication and post-GWAS analyses, including functionally informed fine-mapping, multiple trait co-localization, and in silico validation, implicated two previously undescribed variants (rs1666698 mapping to DBF4P2; rs34957764 mapping to ROCK1P1) and one previously associated variant (rs11824032 mapping to ARHGEF12) as likely causal. For individuals of African ancestry, a polygenic risk score (PRS) for POAG from our mega-analysis (African ancestry individuals) outperformed a PRS from summary statistics of a much larger GWAS derived from European ancestry individuals. This study quantifies the genetic architecture similarities and differences between African and non-African ancestry populations for this blinding disease.
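As a rough illustration of what a polygenic risk score computes, the sketch below sums risk-allele dosages weighted by per-allele effect sizes. It is a minimal sketch under stated assumptions, not the scoring method used in the study: the function name, the variant identifiers, and all weights and dosages are hypothetical.

```python
from typing import Mapping


def polygenic_risk_score(dosages: Mapping[str, float],
                         weights: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele dosages over variants found in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[variant] * dosages[variant] for variant in shared)


# Hypothetical per-allele log-odds weights (not values from any study).
example_weights = {"variant_a": 0.10, "variant_b": 0.05, "variant_c": -0.08}
# Hypothetical dosages: copies of the effect allele carried (0 to 2).
example_dosages = {"variant_a": 2.0, "variant_b": 1.0, "variant_c": 0.0}

score = polygenic_risk_score(example_dosages, example_weights)
```

In this form, weights estimated in one ancestry group can be swapped for weights estimated in another, which is the kind of comparison the abstract reports.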
Subject(s)
Genome-Wide Association Study , Open-Angle Glaucoma , Humans , Genetic Predisposition to Disease , Open-Angle Glaucoma/genetics , Black People/genetics , Single Nucleotide Polymorphism/genetics
ABSTRACT
Whole-genome sequencing (WGS), a powerful tool for detecting novel coding and non-coding disease-causing variants, has largely been applied to clinical diagnosis of inherited disorders. Here we leveraged WGS data in up to 62,653 ethnically diverse participants from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program and assessed statistical association of variants with seven red blood cell (RBC) quantitative traits. We discovered 14 single variant-RBC trait associations at 12 genomic loci, which have not been reported previously. Several of the RBC trait-variant associations (RPN1, ELL2, MIDN, HBB, HBA1, PIEZO1, and G6PD) were replicated in independent GWAS datasets imputed to the TOPMed reference panel. Most of these discovered variants are rare/low frequency, and several are observed disproportionately among non-European Ancestry (African, Hispanic/Latino, or East Asian) populations. We identified a 3 bp indel p.Lys2169del (g.88717175_88717177TCT[4]) (common only in the Ashkenazi Jewish population) of PIEZO1, a gene responsible for the Mendelian red cell disorder hereditary xerocytosis (MIM: 194380), associated with higher mean corpuscular hemoglobin concentration (MCHC). In stepwise conditional analysis and in gene-based rare variant aggregated association analysis, we identified several of the variants in HBB, HBA1, TMPRSS6, and G6PD that represent the carrier state for known coding, promoter, or splice site loss-of-function variants that cause inherited RBC disorders. Finally, we applied base and nuclease editing to demonstrate that the sentinel variant rs112097551 (nearest gene RPN1) acts through a cis-regulatory element that exerts long-range control of the gene RUVBL1 which is essential for hematopoiesis. 
Together, these results demonstrate the utility of WGS in ethnically diverse population-based samples and gene editing for expanding knowledge of the genetic architecture of quantitative hematologic traits and suggest a continuum between complex trait and Mendelian red cell disorders.
Subject(s)
Erythrocytes/metabolism , Erythrocytes/pathology , Genome-Wide Association Study , National Heart, Lung, and Blood Institute (U.S.)/organization & administration , Phenotype , Adult , Aged , Human Chromosome Pair 16/genetics , Datasets as Topic , Female , Gene Editing , Genetic Variation/genetics , HEK293 Cells , Humans , Male , Middle Aged , Quality Control , Reproducibility of Results , United States
ABSTRACT
Many common and rare variants associated with hematologic traits have been discovered through imputation on large-scale reference panels. However, the majority of genome-wide association studies (GWASs) have been conducted in Europeans, and determining causal variants has proved challenging. We performed a GWAS of total leukocyte, neutrophil, lymphocyte, monocyte, eosinophil, and basophil counts generated from 109,563,748 variants in the autosomes and the X chromosome in the Trans-Omics for Precision Medicine (TOPMed) program, which included data from 61,802 individuals of diverse ancestry. We discovered and replicated 7 leukocyte trait associations, including (1) the association between a chromosome X, pseudo-autosomal region (PAR), noncoding variant located between cytokine receptor genes (CSF2RA and CRLF2) and lower eosinophil count; and (2) associations between single variants found predominantly among African Americans at the S1PR3 (9q22.1) and HBB (11p15.4) loci and monocyte and lymphocyte counts, respectively. We further provide evidence indicating that the newly discovered eosinophil-lowering chromosome X PAR variant might be associated with reduced susceptibility to common allergic diseases such as atopic dermatitis and asthma. Additionally, we found a burden of very rare FLT3 (13q12.2) variants associated with monocyte counts. Together, these results emphasize the utility of whole-genome sequencing in diverse samples in identifying associations missed by European-ancestry-driven GWASs.
Subject(s)
Asthma/epidemiology , Biomarkers/metabolism , Atopic Dermatitis/epidemiology , Leukocytes/pathology , Single Nucleotide Polymorphism , Chronic Obstructive Pulmonary Disease/epidemiology , Quantitative Trait Loci , Asthma/genetics , Asthma/metabolism , Asthma/pathology , Atopic Dermatitis/genetics , Atopic Dermatitis/metabolism , Atopic Dermatitis/pathology , Genetic Predisposition to Disease , Human Genome , Genome-Wide Association Study , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Prognosis , Proteome/analysis , Proteome/metabolism , Chronic Obstructive Pulmonary Disease/genetics , Chronic Obstructive Pulmonary Disease/metabolism , Chronic Obstructive Pulmonary Disease/pathology , United Kingdom/epidemiology , United States/epidemiology , Whole Genome Sequencing
ABSTRACT
BACKGROUND: In the last decade, genomic studies have identified and replicated thousands of genetic associations with measures of health and disease and contributed to the understanding of the etiology of a variety of health conditions. Proteins are key biomarkers in clinical medicine and often drug-therapy targets. Like genomics, proteomics can advance our understanding of biology. METHODS AND RESULTS: In the setting of the Cardiovascular Health Study (CHS), a cohort study of older adults, an aptamer-based method that has high sensitivity for low-abundance proteins was used to assay 4979 proteins in frozen, stored plasma from 3188 participants (61% women, mean age 74 years). CHS provides active support, including central analysis, for seven phenotype-specific working groups (WGs). Each CHS WG is led by one or two senior investigators and includes 10 to 20 early or mid-career scientists. In this setting of mentored access, the proteomic data and analytic methods are widely shared with the WGs and investigators so that they may evaluate associations between baseline levels of circulating proteins and the incidence of a variety of health outcomes in prospective cohort analyses. We describe the design of CHS, the CHS Proteomics Study, characteristics of participants, quality control measures, and structural characteristics of the data provided to CHS WGs. We additionally highlight plans for validation and replication of novel proteomic associations. CONCLUSION: The CHS Proteomics Study offers an opportunity for collaborative data sharing to improve our understanding of the etiology of a variety of health conditions in older adults.
Subject(s)
Information Dissemination , Proteomics , Biomarkers , Cohort Studies , Female , Humans , Male , Prospective Studies , Proteomics/methods
ABSTRACT
Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are limited. In addition, these populations have more complex linkage disequilibrium structure. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with genome-wide genotyping array data. We demonstrated that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhanced gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11-34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels. Impressively, even for extremely rare variants with minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~21,600 African-ancestry and ~21,700 Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC [p = 8.8 × 10⁻¹⁵] in African populations, rs11549407 with lower HGB [p = 1.5 × 10⁻¹²] and HCT [p = 8.8 × 10⁻¹⁰] in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel rare variant associations not previously detected in similarly sized genome-wide studies of under-represented African and Hispanic/Latino populations.
Subject(s)
Black or African American/genetics , Hispanic or Latino/genetics , Precision Medicine/methods , Whole Genome Sequencing/methods , beta-Globins/genetics , Adult , Aged , Aged 80 and over , Computational Biology/methods , Genetic Databases , Female , Gene Frequency , Genetic Predisposition to Disease , Population Genetics , Genome-Wide Association Study , Genotyping Techniques , Humans , Linkage Disequilibrium , Male , Middle Aged , United States
ABSTRACT
BACKGROUND: Total serum IgE (tIgE) is an important intermediate phenotype of allergic disease. Whole genome genetic association studies across ancestries may identify important determinants of IgE. OBJECTIVE: We aimed to increase understanding of genetic variants affecting tIgE production across the ancestry and allergic disease spectrum by leveraging data from the National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine program; the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA); and the Atopic Dermatitis Research Network (N = 21,901). METHODS: We performed genome-wide association within strata of study, disease, and ancestry groups, and we combined results via a meta-regression approach that models heterogeneity attributable to ancestry. We also tested for association between HLA alleles called from whole genome sequence data and tIgE, assessing replication of associations in HLA alleles called from genotype array data. RESULTS: We identified 6 loci at genome-wide significance (P < 5 × 10⁻⁹), including 4 loci previously reported as genome-wide significant for tIgE, as well as new regions in chr11q13.5 and chr15q22.2, which were also identified in prior genome-wide association studies of atopic dermatitis and asthma. In the HLA allele association study, HLA-A*02:01 was associated with decreased tIgE level (P_discovery = 2 × 10⁻⁴; P_replication = 5 × 10⁻⁴; P_discovery+replication = 4 × 10⁻⁷), and HLA-DQB1*03:02 was strongly associated with decreased tIgE level in Hispanic/Latino ancestry populations (P_Hispanic/Latino discovery+replication = 8 × 10⁻⁸). CONCLUSION: We performed the largest genome-wide association study and HLA association study of tIgE focused on ancestrally diverse populations and found several known tIgE and allergic disease loci that are relevant in non-European ancestry populations.
Subject(s)
Asthma/genetics , Atopic Dermatitis/genetics , Ethnicity , Genotype , HLA-A2 Antigen/genetics , HLA-DQ beta-Chains/genetics , Adolescent , Adult , Aged , Child , Preschool Child , Female , Gene Frequency , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Immunoglobulin E/blood , Male , Middle Aged , National Heart, Lung, and Blood Institute (U.S.) , United States , Whole Genome Sequencing , Young Adult
ABSTRACT
Prior GWAS have identified loci associated with red blood cell (RBC) traits in populations of European, African, and Asian ancestry. These studies have not included individuals with an Amerindian ancestral background, such as Hispanics/Latinos, nor evaluated the full spectrum of genomic variation beyond single nucleotide variants. Using a custom genotyping array enriched for Amerindian ancestral content and 1000 Genomes imputation, we performed GWAS in 12,502 participants of Hispanic Community Health Study and Study of Latinos (HCHS/SOL) for hematocrit, hemoglobin, RBC count, RBC distribution width (RDW), and RBC indices. Approximately 60% of previously reported RBC trait loci generalized to HCHS/SOL Hispanics/Latinos, including African ancestral alpha- and beta-globin gene variants. In addition to the known 3.8kb alpha-globin copy number variant, we identified an Amerindian ancestral association in an alpha-globin regulatory region on chromosome 16p13.3 for mean corpuscular volume and mean corpuscular hemoglobin. We also discovered and replicated three genome-wide significant variants in previously unreported loci for RDW (SLC12A2 rs17764730, PSMB5 rs941718), and hematocrit (PROX1 rs3754140). Among the proxy variants at the SLC12A2 locus we identified rs3812049, located in a bi-directional promoter between SLC12A2 (which encodes a red cell membrane ion-transport protein) and an upstream anti-sense long-noncoding RNA, LINC01184, as the likely causal variant. We further demonstrate that disruption of the regulatory element harboring rs3812049 affects transcription of SLC12A2 and LINC01184 in human erythroid progenitor cells. Together, these results reinforce the importance of genetic study of diverse ancestral populations, in particular Hispanics/Latinos.
Subject(s)
Homeodomain Proteins/genetics , Proteasome Endopeptidase Complex/genetics , Long Noncoding RNA/genetics , Solute Carrier Family 12 Member 2/genetics , Tumor Suppressor Proteins/genetics , alpha-Globins/genetics , Erythrocyte Count , Erythrocytes , Female , Genome-Wide Association Study , Hemoglobins/genetics , Hispanic or Latino/genetics , Humans , Male , Single Nucleotide Polymorphism , beta-Globins/genetics
ABSTRACT
Platelets play an essential role in hemostasis and thrombosis. We performed a genome-wide association study of platelet count in 12,491 participants of the Hispanic Community Health Study/Study of Latinos by using a mixed-model method that accounts for admixture and family relationships. We discovered and replicated associations with five genes (ACTN1, ETV7, GABBR1-MOG, MEF2C, and ZBTB9-BAK1). Our strongest association was with Amerindian-specific variant rs117672662 (p value = 1.16 × 10⁻²⁸) in ACTN1, a gene implicated in congenital macrothrombocytopenia. rs117672662 exhibited allelic differences in transcriptional activity and protein binding in hematopoietic cells. Our results underscore the value of diverse populations to extend insights into the allelic architecture of complex traits.
Subject(s)
Genetic Association Studies/methods , Genetic Loci , Hispanic or Latino/genetics , Platelet Count , Actinin/genetics , Adolescent , Adult , Aged , Alleles , Gene Frequency , Genotype , Genotyping Techniques , Humans , MEF2 Transcription Factors/genetics , Membrane Proteins/genetics , Middle Aged , Phenotype , Single Nucleotide Polymorphism , GABA-B Receptors/genetics , Young Adult
ABSTRACT
US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, and Central and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a "genetic-analysis group" variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness.
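Genomic inflation, mentioned above, is commonly summarized by the factor lambda: the median of the observed 1-degree-of-freedom chi-square test statistics divided by the median expected under the null. The sketch below shows that conventional definition only. It is an illustrative assumption about how such a factor can be computed, not the authors' pipeline, and the function and constant names are hypothetical.

```python
from statistics import median
from typing import Sequence

# Approximate median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats: Sequence[float]) -> float:
    """Return lambda: observed median chi-square over the null median.

    Values near 1 suggest little inflation; values well above 1 suggest
    unmodeled structure, relatedness, or variance heterogeneity.
    """
    if not chi2_stats:
        raise ValueError("at least one test statistic is required")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Hypothetical statistics whose median is twice the null median.
lambda_example = genomic_inflation([0.1, 2 * CHI2_1DF_MEDIAN, 5.0])
```

Comparing this factor before and after a modeling change, such as allowing group-specific residual variances, gives one simple check on whether the change reduced inflation.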
Subject(s)
Genetic Variation , Hispanic or Latino/genetics , Genome-Wide Association Study , Humans , United States
ABSTRACT
The plasma glycoprotein von Willebrand factor (VWF) exhibits fivefold antigen level variation across the normal human population determined by both genetic and environmental factors. Low levels of VWF are associated with bleeding and elevated levels with increased risk for thrombosis, myocardial infarction, and stroke. To identify additional genetic determinants of VWF antigen levels and to minimize the impact of age and illness-related environmental factors, we performed genome-wide association analysis in two young and healthy cohorts (n = 1,152 and n = 2,310) and identified signals at ABO (P < 7.9E-139) and VWF (P < 5.5E-16), consistent with previous reports. Additionally, linkage analysis based on sibling structure within the cohorts, identified significant signals at chromosome 2q12-2p13 (LOD score 5.3) and at the ABO locus on chromosome 9q34 (LOD score 2.9) that explained 19.2% and 24.5% of the variance in VWF levels, respectively. Given its strong effect, the linkage region on chromosome 2 could harbor a potentially important determinant of bleeding and thrombosis risk. The absence of a chromosome 2 association signal in this or previous association studies suggests a causative gene harboring many genetic variants that are individually rare, but in aggregate common. These results raise the possibility that similar loci could explain a significant portion of the "missing heritability" for other complex genetic traits.
Subject(s)
Human Chromosome Pair 2/genetics , Human Chromosome Pair 9/genetics , Genetic Linkage/genetics , Quantitative Trait Loci/genetics , von Willebrand Factor/genetics , ABO Blood-Group System/genetics , Adolescent , Adult , Age Factors , Computational Biology , Gene Frequency , Genome-Wide Association Study , Genotype , Haplotypes/genetics , Humans , Lod Score , Single Nucleotide Polymorphism/genetics , Principal Component Analysis , Sex Factors , von Willebrand Factor/metabolism
ABSTRACT
A major concern for all copy number variation (CNV) detection algorithms is their reliability and repeatability. However, it is difficult to evaluate the reliability of CNV-calling strategies due to the lack of gold-standard data that would tell us which CNVs are real. We propose that if CNVs are called in duplicate samples, or inherited from parent to child, then these can be considered validated CNVs. We used two large family-based genome-wide association study (GWAS) datasets from the GENEVA consortium to look at concordance rates of CNV calls between duplicate samples, parent-child pairs, and unrelated pairs. Our goal was to make recommendations for ways to filter and use CNV calls in GWAS datasets that do not include family data. We used PennCNV as our primary CNV-calling algorithm, and tested CNV calls using different datasets and marker sets, and with various filters on CNVs and samples. Using the Illumina core HumanHap550 single nucleotide polymorphism (SNP) set, we saw duplicate concordance rates of approximately 55% and parent-child transmission rates of approximately 28% in our datasets. GC model adjustment and sample quality filtering had little effect on these reliability measures. Stratification on CNV size and DNA sample type did have some effect. Overall, our results show that it is probably not possible to find a CNV-calling strategy (including filtering and algorithm) that will give us a set of "reliable" CNV calls using current chip technologies. But if we understand the error process, we can still use CNV calls appropriately in genetic association studies.
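One way to make a duplicate concordance rate concrete is sketched below: a CNV call in one sample counts as concordant if the duplicate sample has a call of the same type with sufficient reciprocal overlap. The abstract does not state the matching rule used, so the 50% reciprocal-overlap threshold, the half-open coordinate convention, and all names here are assumptions for illustration.

```python
from typing import NamedTuple, Sequence


class CNV(NamedTuple):
    chrom: str
    start: int  # inclusive; assumed to be less than end
    end: int    # exclusive
    kind: str   # "del" or "dup"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Smaller of the two fractions of each call covered by the other."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def concordance_rate(calls_a: Sequence[CNV], calls_b: Sequence[CNV],
                     min_overlap: float = 0.5) -> float:
    """Fraction of calls in calls_a that have a matching call in calls_b."""
    if not calls_a:
        return 0.0
    matched = sum(
        any(reciprocal_overlap(a, b) >= min_overlap for b in calls_b)
        for a in calls_a
    )
    return matched / len(calls_a)


# Hypothetical calls from two runs of the same DNA sample.
run_1 = [CNV("1", 100, 200, "del"), CNV("2", 500, 600, "dup")]
run_2 = [CNV("1", 120, 210, "del")]
rate = concordance_rate(run_1, run_2)
```

The rate is asymmetric (it is computed relative to the first call set), so a study would typically report it in both directions or average the two.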
Subject(s)
Algorithms , DNA Copy Number Variations , Genome-Wide Association Study , Age Factors , Case-Control Studies , Child , Dental Caries/genetics , Female , Humans , Male , Pedigree , Single Nucleotide Polymorphism
ABSTRACT
GWASTools is an R/Bioconductor package for quality control and analysis of genome-wide association studies (GWAS). GWASTools brings the interactive capability and extensive statistical libraries of R to GWAS. Data are stored in NetCDF format to accommodate extremely large datasets that cannot fit within R's memory limits. The documentation includes instructions for converting data from multiple formats, including variants called from sequencing. GWASTools provides a convenient interface for linking genotypes and intensity data with sample and single nucleotide polymorphism annotation.
Subject(s)
Genome-Wide Association Study/standards , Single Nucleotide Polymorphism , Software , Genotype , Humans , Quality Control
ABSTRACT
There is common consensus that data sharing accelerates science. Data sharing enhances the utility of data and promotes the creation and competition of scientific ideas. Within the Alzheimer's disease and related dementias (ADRD) community, data types and modalities are spread across many organizations, geographies, and governance structures. The ADRD community is not alone in facing these challenges; however, the problem is even more difficult because of the need to share complex biomarker data from centers around the world. Heavy-handed data sharing mandates have, to date, been met with limited success and often outright resistance. Interest in making data Findable, Accessible, Interoperable, and Reusable (FAIR) has often resulted in centralized platforms. However, when data governance and sovereignty structures do not allow the movement of data, other methods, such as federation, must be pursued. Implementation of fully federated data approaches is not without its challenges. The user experience may become more complicated, and federated analysis of unstructured data types remains challenging. Advancement in federated data sharing should be accompanied by improvement in federated learning methodologies so that federated data sharing becomes functionally equivalent to direct access to record-level data. In this article, we discuss federated data sharing approaches implemented by three data platforms in the ADRD field: Dementias Platform UK (DPUK) in 2014, the Global Alzheimer's Association Interactive Network (GAAIN) in 2012, and the Alzheimer's Disease Data Initiative (ADDI) in 2020. We conclude by addressing open questions that the research community needs to solve together.
ABSTRACT
Coronary artery calcification (CAC) is a measure of atherosclerosis and a well-established predictor of coronary artery disease (CAD) events. Here we describe a genome-wide association study (GWAS) of CAC in 22,400 participants from multiple ancestral groups. We confirmed associations with four known loci and identified two additional loci associated with CAC (ARSE and MMP16), with evidence of significant associations in replication analyses for both novel loci. Functional assays of ARSE and MMP16 in human vascular smooth muscle cells (VSMCs) demonstrate that ARSE is a promoter of VSMC calcification and VSMC phenotype switching from a contractile to a calcifying or osteogenic phenotype. Furthermore, we show that the association of variants near ARSE with reduced CAC is likely explained by reduced ARSE expression with the G allele of enhancer variant rs5982944. Our study highlights ARSE as an important contributor to atherosclerotic vascular calcification, and a potential drug target for vascular calcific disease.
ABSTRACT
How race, ethnicity, and ancestry are used in genomic research has wide-ranging implications for how research is translated into clinical care and incorporated into public understanding. Correlation between race and genetic ancestry contributes to unresolved complexity for the scientific community, as illustrated by heterogeneous definitions and applications of these variables. Here, we offer commentary and recommendations on the use of race, ethnicity, and ancestry across the arc of genetic research, including data harmonization, analysis, and reporting. While informed by our experiences as researchers affiliated with the NHLBI Trans-Omics for Precision Medicine (TOPMed) program, these recommendations are applicable to basic and translational genomic research in diverse populations with genome-wide data. Moving forward, considerable collaborative effort will be required to ensure that race, ethnicity, and ancestry are described and used appropriately to generate scientific knowledge that yields broad and equitable benefit.
ABSTRACT
Genome-wide association studies have identified thousands of single nucleotide variants and small indels that contribute to variation in hematologic traits. While structural variants are known to cause rare blood or hematopoietic disorders, the genome-wide contribution of structural variants to quantitative blood cell trait variation is unknown. Here we utilized whole genome sequencing data in ancestrally diverse participants of the NHLBI Trans-Omics for Precision Medicine program (N = 50,675) to detect structural variants associated with hematologic traits. Using single variant tests, we assessed the association of common and rare structural variants with red cell-, white cell-, and platelet-related quantitative traits and observed 21 independent signals (12 common and 9 rare) reaching genome-wide significance. The majority of these associations (N = 18) replicated in independent datasets. In genome-editing experiments, we provide evidence that a deletion associated with lower monocyte counts leads to disruption of an S1PR3 monocyte enhancer and decreased S1PR3 expression.
Subject(s)
Blood Cells , Genome-Wide Association Study , Humans , Whole Genome Sequencing
ABSTRACT
Genetic studies on telomere length (TL) are important for understanding age-related diseases. Prior GWAS for leukocyte TL have been limited to European and Asian populations. Here, we report the first sequencing-based association study for TL across ancestrally diverse individuals (European, African, Asian, and Hispanic/Latino) from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. We used whole genome sequencing (WGS) of whole blood for variant genotype calling and the bioinformatic estimation of TL in n = 109,122 individuals. We identified 59 sentinel variants (p-value < 5 × 10⁻⁹) in 36 loci associated with TL, including 20 newly associated loci (13 were replicated in external datasets). There was little evidence of effect size heterogeneity across populations. Fine-mapping at OBFC1 indicated the independent signals colocalized with cell-type specific eQTLs for OBFC1 (STN1). Using a multi-variant gene-based approach, we identified two genes newly implicated in TL, DCLRE1B (SNM1B) and PARN. In PheWAS, we demonstrated our TL polygenic trait scores (PTS) were associated with increased risk of cancer-related phenotypes.
ABSTRACT
Purpose: POAG is the leading cause of irreversible blindness in African Americans. In this study, we quantitatively assess the association of autosomal ancestry with POAG risk in a large cohort of self-identified African Americans. Methods: Subjects recruited to the Primary Open-Angle African American Glaucoma Genetics (POAAGG) study were classified as glaucoma cases or controls by fellowship-trained glaucoma specialists. POAAGG subjects were genotyped using the MEGA Ex array (discovery cohort, n = 3830; replication cohort, n = 2135). Population structure was interrogated using principal component analysis in the context of the 1000 Genomes Project superpopulations. Results: The majority of POAAGG samples lie on an axis between African and European superpopulations, with great variation in admixture. Cases had a significantly lower mean value of the ancestral component q0 than controls for both cohorts (P = 6.14 × 10⁻⁴; P = 3 × 10⁻⁶), consistent with higher degree of African ancestry. Among POAG cases, higher African ancestry was also associated with thinner central corneal thickness (P = 2 × 10⁻⁴). Admixture mapping showed that local genetic ancestry was not a significant risk factor for POAG. A polygenic risk score, comprised of 23 glaucoma-associated single nucleotide polymorphisms from the NHGRI-EBI genome-wide association study catalog, was significant in both cohorts (P < 0.001), suggesting that both known POAG single nucleotide polymorphisms and an omnigenic ancestry effect influence POAG risk. Conclusions: In sum, the POAAGG study population is very admixed, with a higher degree of African ancestry associated with an increased POAG risk. Further analyses should consider social and environmental factors as possible confounding factors for disease predisposition.