1.
Cell ; 187(2): 464-480.e10, 2024 01 18.
Article in English | MEDLINE | ID: mdl-38242088

ABSTRACT

Primary open-angle glaucoma (POAG), the leading cause of irreversible blindness worldwide, disproportionately affects individuals of African ancestry. We conducted a genome-wide association study (GWAS) for POAG in 11,275 individuals of African ancestry (6,003 cases; 5,272 controls). We detected 46 risk loci associated with POAG at genome-wide significance. Replication and post-GWAS analyses, including functionally informed fine-mapping, multiple trait co-localization, and in silico validation, implicated two previously undescribed variants (rs1666698 mapping to DBF4P2; rs34957764 mapping to ROCK1P1) and one previously associated variant (rs11824032 mapping to ARHGEF12) as likely causal. For individuals of African ancestry, a polygenic risk score (PRS) for POAG from our mega-analysis (African ancestry individuals) outperformed a PRS from summary statistics of a much larger GWAS derived from European ancestry individuals. This study quantifies the genetic architecture similarities and differences between African and non-African ancestry populations for this blinding disease.


Subject(s)
Genome-Wide Association Study , Glaucoma, Open-Angle , Humans , Genetic Predisposition to Disease , Glaucoma, Open-Angle/genetics , Black People/genetics , Polymorphism, Single Nucleotide/genetics
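The polygenic risk score (PRS) mentioned in the abstract above is, in its simplest form, a weighted sum of risk-allele dosages. The sketch below illustrates that idea only; the function name `polygenic_risk_score`, the variant IDs, and all numeric values are hypothetical placeholders and are not taken from the study, whose actual PRS construction is not described here.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Weighted sum of risk-allele dosages over variants present in both maps.

    dosages: variant ID -> risk-allele dosage (0..2) for one individual.
    weights: variant ID -> per-allele effect size (e.g. a log odds ratio).
    """
    shared = dosages.keys() & weights.keys()
    return sum(dosages[v] * weights[v] for v in shared)


# Placeholder values for illustration only (not from the study).
example_weights = {"var_a": 0.10, "var_b": -0.05, "var_c": 0.20}
example_dosages = {"var_a": 2.0, "var_b": 1.0, "var_d": 1.0}

# Only var_a and var_b appear in both maps, so only they contribute.
score = polygenic_risk_score(example_dosages, example_weights)
```

Variants missing from either map are silently skipped in this sketch; a real pipeline would more likely report them, since missing variants change the scale of the score.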
2.
Hum Mol Genet ; 31(3): 347-361, 2022 02 03.
Article in English | MEDLINE | ID: mdl-34553764

ABSTRACT

Platelets play a key role in thrombosis and hemostasis. Platelet count (PLT) and mean platelet volume (MPV) are highly heritable quantitative traits, with hundreds of genetic signals previously identified, mostly in European ancestry populations. We here utilize whole genome sequencing (WGS) from NHLBI's Trans-Omics for Precision Medicine initiative (TOPMed) in a large multi-ethnic sample to further explore common and rare variation contributing to PLT (n = 61 200) and MPV (n = 23 485). We identified and replicated secondary signals at MPL (rs532784633) and PECAM1 (rs73345162), both more common in African ancestry populations. We also observed rare variation in Mendelian platelet-related disorder genes influencing variation in platelet traits in TOPMed cohorts (not enriched for blood disorders). For example, association of GP9 with lower PLT and higher MPV was partly driven by a pathogenic Bernard-Soulier syndrome variant (rs5030764, p.Asn61Ser), and the signals at TUBB1 and CD36 were partly driven by loss of function variants not annotated as pathogenic in ClinVar (rs199948010 and rs571975065). However, residual signal remained for these gene-based signals after adjusting for lead variants, suggesting that additional variants in Mendelian genes with impacts in general population cohorts remain to be identified. Gene-based signals were also identified at several genome-wide association study identified loci for genes not annotated for Mendelian platelet disorders (PTPRH, TET2, CHEK2), with somatic variation driving the result at TET2. These results highlight the value of WGS in populations of diverse genetic ancestry to identify novel regulatory and coding signals, even for well-studied traits like platelet traits.


Subject(s)
Genome-Wide Association Study , Precision Medicine , Blood Platelets , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Polymorphism, Single Nucleotide , Precision Medicine/methods , United States
3.
Am J Hum Genet ; 108(5): 874-893, 2021 05 06.
Article in English | MEDLINE | ID: mdl-33887194

ABSTRACT

Whole-genome sequencing (WGS), a powerful tool for detecting novel coding and non-coding disease-causing variants, has largely been applied to clinical diagnosis of inherited disorders. Here we leveraged WGS data in up to 62,653 ethnically diverse participants from the NHLBI Trans-Omics for Precision Medicine (TOPMed) program and assessed statistical association of variants with seven red blood cell (RBC) quantitative traits. We discovered 14 single variant-RBC trait associations at 12 genomic loci, which have not been reported previously. Several of the RBC trait-variant associations (RPN1, ELL2, MIDN, HBB, HBA1, PIEZO1, and G6PD) were replicated in independent GWAS datasets imputed to the TOPMed reference panel. Most of these discovered variants are rare/low frequency, and several are observed disproportionately among non-European ancestry (African, Hispanic/Latino, or East Asian) populations. We identified a 3 bp indel p.Lys2169del (g.88717175_88717177TCT[4]) (common only in the Ashkenazi Jewish population) of PIEZO1, a gene responsible for the Mendelian red cell disorder hereditary xerocytosis (MIM: 194380), associated with higher mean corpuscular hemoglobin concentration (MCHC). In stepwise conditional analysis and in gene-based rare variant aggregated association analysis, we identified several of the variants in HBB, HBA1, TMPRSS6, and G6PD that represent the carrier state for known coding, promoter, or splice site loss-of-function variants that cause inherited RBC disorders. Finally, we applied base and nuclease editing to demonstrate that the sentinel variant rs112097551 (nearest gene RPN1) acts through a cis-regulatory element that exerts long-range control of the gene RUVBL1, which is essential for hematopoiesis. Together, these results demonstrate the utility of WGS in ethnically diverse population-based samples and gene editing for expanding knowledge of the genetic architecture of quantitative hematologic traits and suggest a continuum between complex trait and Mendelian red cell disorders.


Subject(s)
Erythrocytes/metabolism , Erythrocytes/pathology , Genome-Wide Association Study , National Heart, Lung, and Blood Institute (U.S.)/organization & administration , Phenotype , Adult , Aged , Chromosomes, Human, Pair 16/genetics , Datasets as Topic , Female , Gene Editing , Genetic Variation/genetics , HEK293 Cells , Humans , Male , Middle Aged , Quality Control , Reproducibility of Results , United States
4.
Am J Hum Genet ; 108(10): 1836-1851, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34582791

ABSTRACT

Many common and rare variants associated with hematologic traits have been discovered through imputation on large-scale reference panels. However, the majority of genome-wide association studies (GWASs) have been conducted in Europeans, and determining causal variants has proved challenging. We performed a GWAS of total leukocyte, neutrophil, lymphocyte, monocyte, eosinophil, and basophil counts generated from 109,563,748 variants in the autosomes and the X chromosome in the Trans-Omics for Precision Medicine (TOPMed) program, which included data from 61,802 individuals of diverse ancestry. We discovered and replicated 7 leukocyte trait associations, including (1) the association between a chromosome X, pseudo-autosomal region (PAR), noncoding variant located between cytokine receptor genes (CSF2RA and CRLF2) and lower eosinophil count; and (2) associations between single variants found predominantly among African Americans at the S1PR3 (9q22.1) and HBB (11p15.4) loci and monocyte and lymphocyte counts, respectively. We further provide evidence indicating that the newly discovered eosinophil-lowering chromosome X PAR variant might be associated with reduced susceptibility to common allergic diseases such as atopic dermatitis and asthma. Additionally, we found a burden of very rare FLT3 (13q12.2) variants associated with monocyte counts. Together, these results emphasize the utility of whole-genome sequencing in diverse samples in identifying associations missed by European-ancestry-driven GWASs.


Subject(s)
Asthma/epidemiology , Biomarkers/metabolism , Dermatitis, Atopic/epidemiology , Leukocytes/pathology , Polymorphism, Single Nucleotide , Pulmonary Disease, Chronic Obstructive/epidemiology , Quantitative Trait Loci , Asthma/genetics , Asthma/metabolism , Asthma/pathology , Dermatitis, Atopic/genetics , Dermatitis, Atopic/metabolism , Dermatitis, Atopic/pathology , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Prognosis , Proteome/analysis , Proteome/metabolism , Pulmonary Disease, Chronic Obstructive/genetics , Pulmonary Disease, Chronic Obstructive/metabolism , Pulmonary Disease, Chronic Obstructive/pathology , United Kingdom/epidemiology , United States/epidemiology , Whole Genome Sequencing
5.
Eur J Epidemiol ; 38(6): 605-615, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37099244

ABSTRACT

Data discovery, the ability to find datasets relevant to an analysis, increases scientific opportunity, improves rigour and accelerates activity. Rapid growth in the depth, breadth, quantity and availability of data provides unprecedented opportunities and challenges for data discovery. A potential tool for increasing the efficiency of data discovery, particularly across multiple datasets, is data harmonisation. A set of 124 variables, identified as being of broad interest to neurodegeneration, were harmonised using the C-Surv data model. Harmonisation strategies used were simple calibration, algorithmic transformation and standardisation to the Z-distribution. Widely used data conventions, optimised for inclusiveness rather than aetiological precision, were used as harmonisation rules. The harmonisation scheme was applied to data from four diverse population cohorts. Of the 120 variables that were found in the datasets, correspondence between the harmonised data schema and cohort-specific data models was complete or close for 111 (93%). For the remainder, harmonisation was possible with a marginal loss of granularity. Although harmonisation is not an exact science, sufficient comparability across datasets was achieved to enable data discovery with relatively little loss of informativeness. This provides a basis for further work extending harmonisation to a larger variable list, applying the harmonisation to further datasets, and incentivising the development of data discovery tools.


Subject(s)
Datasets as Topic , Knowledge Discovery , Humans , Reference Standards
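Two of the harmonisation strategies named in the abstract above, simple calibration and standardisation to the Z-distribution, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `calibrate` and `standardise_to_z` are hypothetical, the example values are invented, calibration is assumed to be a linear unit conversion, and the population standard deviation is assumed (the abstract does not say which estimator the C-Surv harmonisation used).

```python
from statistics import mean, pstdev
from typing import List, Sequence


def calibrate(values: Sequence[float], scale: float, offset: float = 0.0) -> List[float]:
    """Simple calibration, assumed here to be a linear unit conversion."""
    return [x * scale + offset for x in values]


def standardise_to_z(values: Sequence[float]) -> List[float]:
    """Return z-scores, (x - mean) / standard deviation, within one cohort."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption of this sketch
    if sigma == 0:
        raise ValueError("cannot standardise a constant variable")
    return [(x - mu) / sigma for x in values]


# Invented example: body weight recorded in pounds in one cohort.
weights_lb = [150.0, 180.0, 210.0]
weights_kg = calibrate(weights_lb, scale=0.45359237)  # pounds to kilograms
weights_z = standardise_to_z(weights_kg)
```

Calibration keeps the variable in interpretable units, whereas z-scores trade units for comparability, which is one way harmonisation can lose some granularity.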
6.
Eur J Epidemiol ; 37(7): 755-765, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35790642

ABSTRACT

BACKGROUND: In the last decade, genomic studies have identified and replicated thousands of genetic associations with measures of health and disease and contributed to the understanding of the etiology of a variety of health conditions. Proteins are key biomarkers in clinical medicine and often drug-therapy targets. Like genomics, proteomics can advance our understanding of biology. METHODS AND RESULTS: In the setting of the Cardiovascular Health Study (CHS), a cohort study of older adults, an aptamer-based method that has high sensitivity for low-abundance proteins was used to assay 4979 proteins in frozen, stored plasma from 3188 participants (61% women, mean age 74 years). CHS provides active support, including central analysis, for seven phenotype-specific working groups (WGs). Each CHS WG is led by one or two senior investigators and includes 10 to 20 early or mid-career scientists. In this setting of mentored access, the proteomic data and analytic methods are widely shared with the WGs and investigators so that they may evaluate associations between baseline levels of circulating proteins and the incidence of a variety of health outcomes in prospective cohort analyses. We describe the design of CHS, the CHS Proteomics Study, characteristics of participants, quality control measures, and structural characteristics of the data provided to CHS WGs. We additionally highlight plans for validation and replication of novel proteomic associations. CONCLUSION: The CHS Proteomics Study offers an opportunity for collaborative data sharing to improve our understanding of the etiology of a variety of health conditions in older adults.


Subject(s)
Information Dissemination , Proteomics , Biomarkers , Cohort Studies , Female , Humans , Male , Prospective Studies , Proteomics/methods
7.
PLoS Genet ; 15(12): e1008500, 2019 12.
Article in English | MEDLINE | ID: mdl-31869403

ABSTRACT

Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are limited. In addition, these populations have more complex linkage disequilibrium structure. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with genome-wide genotyping array data. We demonstrated that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhanced gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11-34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels. Impressively, even for extremely rare variants with minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~21,600 African-ancestry and ~21,700 Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC [p = 8.8 × 10⁻¹⁵] in African populations, rs11549407 with lower HGB [p = 1.5 × 10⁻¹²] and HCT [p = 8.8 × 10⁻¹⁰] in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel rare variant associations not previously detected in similarly sized genome-wide studies of under-represented African and Hispanic/Latino populations.


Subject(s)
Black or African American/genetics , Hispanic or Latino/genetics , Precision Medicine/methods , Whole Genome Sequencing/methods , beta-Globins/genetics , Adult , Aged , Aged, 80 and over , Computational Biology/methods , Databases, Genetic , Female , Gene Frequency , Genetic Predisposition to Disease , Genetics, Population , Genome-Wide Association Study , Genotyping Techniques , Humans , Linkage Disequilibrium , Male , Middle Aged , United States
8.
J Allergy Clin Immunol ; 148(6): 1589-1595, 2021 12.
Article in English | MEDLINE | ID: mdl-34536413

ABSTRACT

BACKGROUND: Total serum IgE (tIgE) is an important intermediate phenotype of allergic disease. Whole genome genetic association studies across ancestries may identify important determinants of IgE. OBJECTIVE: We aimed to increase understanding of genetic variants affecting tIgE production across the ancestry and allergic disease spectrum by leveraging data from the National Heart, Lung and Blood Institute Trans-Omics for Precision Medicine program; the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA); and the Atopic Dermatitis Research Network (N = 21,901). METHODS: We performed genome-wide association within strata of study, disease, and ancestry groups, and we combined results via a meta-regression approach that models heterogeneity attributable to ancestry. We also tested for association between HLA alleles called from whole genome sequence data and tIgE, assessing replication of associations in HLA alleles called from genotype array data. RESULTS: We identified 6 loci at genome-wide significance (P < 5 × 10⁻⁹), including 4 loci previously reported as genome-wide significant for tIgE, as well as new regions in chr11q13.5 and chr15q22.2, which were also identified in prior genome-wide association studies of atopic dermatitis and asthma. In the HLA allele association study, HLA-A*02:01 was associated with decreased tIgE level (P_discovery = 2 × 10⁻⁴; P_replication = 5 × 10⁻⁴; P_discovery+replication = 4 × 10⁻⁷), and HLA-DQB1*03:02 was strongly associated with decreased tIgE level in Hispanic/Latino ancestry populations (P_Hispanic/Latino discovery+replication = 8 × 10⁻⁸). CONCLUSION: We performed the largest genome-wide association study and HLA association study of tIgE focused on ancestrally diverse populations and found several known tIgE and allergic disease loci that are relevant in non-European ancestry populations.


Subject(s)
Asthma/genetics , Dermatitis, Atopic/genetics , Ethnicity , Genotype , HLA-A2 Antigen/genetics , HLA-DQ beta-Chains/genetics , Adolescent , Adult , Aged , Child , Child, Preschool , Female , Gene Frequency , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Immunoglobulin E/blood , Male , Middle Aged , National Heart, Lung, and Blood Institute (U.S.) , United States , Whole Genome Sequencing , Young Adult
9.
PLoS Genet ; 13(4): e1006760, 2017 04.
Article in English | MEDLINE | ID: mdl-28453575

ABSTRACT

Prior GWAS have identified loci associated with red blood cell (RBC) traits in populations of European, African, and Asian ancestry. These studies have not included individuals with an Amerindian ancestral background, such as Hispanics/Latinos, nor evaluated the full spectrum of genomic variation beyond single nucleotide variants. Using a custom genotyping array enriched for Amerindian ancestral content and 1000 Genomes imputation, we performed GWAS in 12,502 participants of Hispanic Community Health Study and Study of Latinos (HCHS/SOL) for hematocrit, hemoglobin, RBC count, RBC distribution width (RDW), and RBC indices. Approximately 60% of previously reported RBC trait loci generalized to HCHS/SOL Hispanics/Latinos, including African ancestral alpha- and beta-globin gene variants. In addition to the known 3.8kb alpha-globin copy number variant, we identified an Amerindian ancestral association in an alpha-globin regulatory region on chromosome 16p13.3 for mean corpuscular volume and mean corpuscular hemoglobin. We also discovered and replicated three genome-wide significant variants in previously unreported loci for RDW (SLC12A2 rs17764730, PSMB5 rs941718), and hematocrit (PROX1 rs3754140). Among the proxy variants at the SLC12A2 locus we identified rs3812049, located in a bi-directional promoter between SLC12A2 (which encodes a red cell membrane ion-transport protein) and an upstream anti-sense long-noncoding RNA, LINC01184, as the likely causal variant. We further demonstrate that disruption of the regulatory element harboring rs3812049 affects transcription of SLC12A2 and LINC01184 in human erythroid progenitor cells. Together, these results reinforce the importance of genetic study of diverse ancestral populations, in particular Hispanics/Latinos.


Subject(s)
Homeodomain Proteins/genetics , Proteasome Endopeptidase Complex/genetics , RNA, Long Noncoding/genetics , Solute Carrier Family 12, Member 2/genetics , Tumor Suppressor Proteins/genetics , alpha-Globins/genetics , Erythrocyte Count , Erythrocytes , Female , Genome-Wide Association Study , Hemoglobins/genetics , Hispanic or Latino/genetics , Humans , Male , Polymorphism, Single Nucleotide , beta-Globins/genetics
10.
Am J Hum Genet ; 98(2): 229-42, 2016 Feb 04.
Article in English | MEDLINE | ID: mdl-26805783

ABSTRACT

Platelets play an essential role in hemostasis and thrombosis. We performed a genome-wide association study of platelet count in 12,491 participants of the Hispanic Community Health Study/Study of Latinos by using a mixed-model method that accounts for admixture and family relationships. We discovered and replicated associations with five genes (ACTN1, ETV7, GABBR1-MOG, MEF2C, and ZBTB9-BAK1). Our strongest association was with Amerindian-specific variant rs117672662 (p value = 1.16 × 10⁻²⁸) in ACTN1, a gene implicated in congenital macrothrombocytopenia. rs117672662 exhibited allelic differences in transcriptional activity and protein binding in hematopoietic cells. Our results underscore the value of diverse populations to extend insights into the allelic architecture of complex traits.


Subject(s)
Genetic Association Studies/methods , Genetic Loci , Hispanic or Latino/genetics , Platelet Count , Actinin/genetics , Adolescent , Adult , Aged , Alleles , Gene Frequency , Genotype , Genotyping Techniques , Humans , MEF2 Transcription Factors/genetics , Membrane Proteins/genetics , Middle Aged , Phenotype , Polymorphism, Single Nucleotide , Receptors, GABA-B/genetics , Young Adult
11.
Am J Hum Genet ; 98(1): 165-84, 2016 Jan 07.
Article in English | MEDLINE | ID: mdl-26748518

ABSTRACT

US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, and Central and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a "genetic-analysis group" variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness.


Subject(s)
Genetic Variation , Hispanic or Latino/genetics , Genome-Wide Association Study , Humans , United States
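The genomic inflation referred to in the abstract above is conventionally summarised by the factor λ_GC: the median observed association test statistic divided by the median expected under the null. The sketch below assumes 1-degree-of-freedom chi-square statistics and that conventional definition; the abstract does not state how inflation was computed, and the function name `genomic_inflation` and the example statistics are hypothetical.

```python
from statistics import median
from typing import Sequence

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats: Sequence[float]) -> float:
    """Lambda_GC: observed median test statistic over the null median."""
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


# Invented statistics whose median equals the null median (no inflation).
null_like = [0.1, 0.3, 0.4549364, 0.9, 2.5]
lambda_null = genomic_inflation(null_like)

# Scaling every statistic by 1.2 inflates lambda by the same factor.
lambda_inflated = genomic_inflation([1.2 * s for s in null_like])
```

A value near 1 suggests well-calibrated test statistics, while values well above 1 are usually read as a sign of unmodelled structure such as relatedness or heterogeneous variance.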
13.
Proc Natl Acad Sci U S A ; 110(2): 588-93, 2013 Jan 08.
Article in English | MEDLINE | ID: mdl-23267103

ABSTRACT

The plasma glycoprotein von Willebrand factor (VWF) exhibits fivefold antigen level variation across the normal human population determined by both genetic and environmental factors. Low levels of VWF are associated with bleeding and elevated levels with increased risk for thrombosis, myocardial infarction, and stroke. To identify additional genetic determinants of VWF antigen levels and to minimize the impact of age and illness-related environmental factors, we performed genome-wide association analysis in two young and healthy cohorts (n = 1,152 and n = 2,310) and identified signals at ABO (P < 7.9E-139) and VWF (P < 5.5E-16), consistent with previous reports. Additionally, linkage analysis based on sibling structure within the cohorts identified significant signals at chromosome 2q12-2p13 (LOD score 5.3) and at the ABO locus on chromosome 9q34 (LOD score 2.9) that explained 19.2% and 24.5% of the variance in VWF levels, respectively. Given its strong effect, the linkage region on chromosome 2 could harbor a potentially important determinant of bleeding and thrombosis risk. The absence of a chromosome 2 association signal in this or previous association studies suggests a causative gene harboring many genetic variants that are individually rare, but in aggregate common. These results raise the possibility that similar loci could explain a significant portion of the "missing heritability" for other complex genetic traits.


Subject(s)
Chromosomes, Human, Pair 2/genetics , Chromosomes, Human, Pair 9/genetics , Genetic Linkage/genetics , Quantitative Trait Loci/genetics , von Willebrand Factor/genetics , ABO Blood-Group System/genetics , Adolescent , Adult , Age Factors , Computational Biology , Gene Frequency , Genome-Wide Association Study , Genotype , Haplotypes/genetics , Humans , Lod Score , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis , Sex Factors , von Willebrand Factor/metabolism
14.
Hum Mol Genet ; 22(17): 3583-96, 2013 Sep 01.
Article in English | MEDLINE | ID: mdl-23575227

ABSTRACT

Newborns characterized as large and small for gestational age are at risk for increased mortality and morbidity during the first year of life as well as for obesity and dysglycemia as children and adults. The intrauterine environment and fetal genes contribute to the fetal size at birth. To define the genetic architecture underlying the newborn size, we performed a genome-wide association study (GWAS) in 4281 newborns in four ethnic groups from the Hyperglycemia and Adverse Pregnancy Outcome Study. We tested for association with newborn anthropometric traits (birth length, head circumference, birth weight, percent fat mass and sum of skinfolds) and newborn metabolic traits (cord glucose and C-peptide) under three models. Model 1 adjusted for field center, ancestry, neonatal gender, gestational age at delivery, parity, maternal age at oral glucose tolerance test (OGTT); Model 2 adjusted for Model 1 covariates, maternal body mass index (BMI) at OGTT, maternal height at OGTT, maternal mean arterial pressure at OGTT, maternal smoking and drinking; Model 3 adjusted for Model 2 covariates, maternal glucose and C-peptide at OGTT. Strong evidence for association was observed with measures of newborn adiposity (sum of skinfolds model 3 Z-score 7.356, P = 1.90×10⁻¹³, and to a lesser degree fat mass and birth weight) and a region on Chr3q25.31 mapping between CCNL and LEKR1. These findings were replicated in an independent cohort of 2296 newborns. This region has previously been shown to be associated with birth weight in Europeans. The current study suggests that association of this locus with birth weight is secondary to an effect on fat as opposed to lean body mass.


Subject(s)
Adiposity/genetics , Birth Weight/genetics , Chromosomes, Human, Pair 3/genetics , Cyclins/genetics , Ethnicity/genetics , Proteinase Inhibitory Proteins, Secretory/genetics , Racial Groups/genetics , Asian People/genetics , Black People/genetics , Body Mass Index , Caribbean Region , Cohort Studies , Female , Genome-Wide Association Study , Humans , Infant, Newborn , Linear Models , Male , Mexican Americans/genetics , Pregnancy , Serine Peptidase Inhibitor Kazal-Type 5 , Thailand , White People/genetics
15.
Nat Commun ; 15(1): 6742, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39112488

ABSTRACT

The mechanisms underlying the selective regional vulnerability to neurodegeneration in Huntington's disease (HD) have not been fully defined. To explore the role of astrocytes in this phenomenon, we used single-nucleus and bulk RNAseq, lipidomics, HTT gene CAG repeat-length measurements, and multiplexed immunofluorescence on HD and control post-mortem brains. We identified genes that correlated with CAG repeat length, which were enriched in astrocyte genes, and lipidomic signatures that implicated poly-unsaturated fatty acids in sensitizing neurons to cell death. Because astrocytes play essential roles in lipid metabolism, we explored the heterogeneity of astrocytic states in both protoplasmic and fibrous-like (CD44+) astrocytes. Significantly, one protoplasmic astrocyte state showed high levels of metallothioneins and was correlated with the selective vulnerability of distinct striatal neuronal populations. When modeled in vitro, this state improved the viability of HD-patient-derived spiny projection neurons. Our findings uncover key roles of astrocytic states in protecting against neurodegeneration in HD.


Subject(s)
Astrocytes , Huntington Disease , Neurons , Huntington Disease/metabolism , Huntington Disease/genetics , Huntington Disease/pathology , Astrocytes/metabolism , Astrocytes/pathology , Humans , Neurons/metabolism , Huntingtin Protein/genetics , Huntingtin Protein/metabolism , Male , Female , Lipidomics/methods , Middle Aged , Metallothionein/metabolism , Metallothionein/genetics , Brain/metabolism , Brain/pathology , Lipid Metabolism , Aged , Multiomics
16.
Genet Epidemiol ; 36(3): 253-62, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22714937

ABSTRACT

A major concern for all copy number variation (CNV) detection algorithms is their reliability and repeatability. However, it is difficult to evaluate the reliability of CNV-calling strategies due to the lack of gold-standard data that would tell us which CNVs are real. We propose that if CNVs are called in duplicate samples, or inherited from parent to child, then these can be considered validated CNVs. We used two large family-based genome-wide association study (GWAS) datasets from the GENEVA consortium to look at concordance rates of CNV calls between duplicate samples, parent-child pairs, and unrelated pairs. Our goal was to make recommendations for ways to filter and use CNV calls in GWAS datasets that do not include family data. We used PennCNV as our primary CNV-calling algorithm, and tested CNV calls using different datasets and marker sets, and with various filters on CNVs and samples. Using the Illumina core HumanHap550 single nucleotide polymorphism (SNP) set, we saw duplicate concordance rates of approximately 55% and parent-child transmission rates of approximately 28% in our datasets. GC model adjustment and sample quality filtering had little effect on these reliability measures. Stratification on CNV size and DNA sample type did have some effect. Overall, our results show that it is probably not possible to find a CNV-calling strategy (including filtering and algorithm) that will give us a set of "reliable" CNV calls using current chip technologies. But if we understand the error process, we can still use CNV calls appropriately in genetic association studies.


Subject(s)
Algorithms , DNA Copy Number Variations , Genome-Wide Association Study , Age Factors , Case-Control Studies , Child , Dental Caries/genetics , Female , Humans , Male , Pedigree , Polymorphism, Single Nucleotide
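The duplicate concordance rate discussed in the abstract above can be illustrated as the fraction of CNV calls in one sample that are matched by a call in its duplicate. The sketch below is an assumption-laden illustration, not the study's procedure: the `CnvCall` type, the any-overlap matching rule, and the example coordinates are all hypothetical, and the abstract does not specify how matching calls were defined.

```python
from typing import List, NamedTuple


class CnvCall(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # inclusive
    kind: str   # "gain" or "loss"


def overlaps(a: CnvCall, b: CnvCall) -> bool:
    """True when two calls share a chromosome, a type, and at least one base."""
    return (
        a.chrom == b.chrom
        and a.kind == b.kind
        and a.start <= b.end
        and b.start <= a.end
    )


def concordance_rate(calls: List[CnvCall], duplicate_calls: List[CnvCall]) -> float:
    """Fraction of calls in one sample matched by an overlapping duplicate call."""
    if not calls:
        return 0.0
    matched = sum(any(overlaps(c, d) for d in duplicate_calls) for c in calls)
    return matched / len(calls)


# Invented calls: the first pair overlaps; the second differs in type.
sample = [CnvCall("1", 100, 200, "loss"), CnvCall("2", 500, 900, "gain")]
duplicate = [CnvCall("1", 150, 250, "loss"), CnvCall("2", 500, 900, "loss")]
rate = concordance_rate(sample, duplicate)
```

Stricter rules, such as requiring reciprocal overlap above some fraction, would lower the rate; the choice of rule is part of what makes reliability figures hard to compare across studies.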
17.
Hum Mol Genet ; 20(24): 5012-23, 2011 Dec 15.
Article in English | MEDLINE | ID: mdl-21926416

ABSTRACT

We performed a multistage genome-wide association study of melanoma. In a discovery cohort of 1804 melanoma cases and 1026 controls, we identified loci at chromosomes 15q13.1 (HERC2/OCA2 region) and 16q24.3 (MC1R) regions that reached genome-wide significance within this study and also found strong evidence for genetic effects on susceptibility to melanoma from markers on chromosome 9p21.3 in the p16/ARF region and on chromosome 1q21.3 (ARNT/LASS2/ANXA9 region). The most significant single-nucleotide polymorphisms (SNPs) in the 15q13.1 locus (rs1129038 and rs12913832) lie within a genomic region that has profound effects on eye and skin color; notably, 50% of variability in eye color is associated with variation in the SNP rs12913832. Because eye and skin colors vary across European populations, we further evaluated the associations of the significant SNPs after carefully adjusting for European substructure. We also evaluated the top 10 most significant SNPs by using data from three other genome-wide scans. Additional in silico data provided replication of the findings from the most significant region on chromosome 1q21.3 rs7412746 (P = 6 × 10⁻¹⁰). Together, these data identified several candidate genes for additional studies to identify causal variants predisposing to increased risk for developing melanoma.


Subject(s)
Genetic Loci/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Melanoma/genetics , Skin Neoplasms/genetics , Case-Control Studies , Chromosomes, Human, Pair 1/genetics , Genetic Markers , Guanine Nucleotide Exchange Factors/genetics , Humans , Meta-Analysis as Topic , Pigmentation/genetics , Polymorphism, Single Nucleotide/genetics , Ubiquitin-Protein Ligases
18.
Bioinformatics ; 28(24): 3329-31, 2012 Dec 15.
Article in English | MEDLINE | ID: mdl-23052040

ABSTRACT

GWASTools is an R/Bioconductor package for quality control and analysis of genome-wide association studies (GWAS). GWASTools brings the interactive capability and extensive statistical libraries of R to GWAS. Data are stored in NetCDF format to accommodate extremely large datasets that cannot fit within R's memory limits. The documentation includes instructions for converting data from multiple formats, including variants called from sequencing. GWASTools provides a convenient interface for linking genotypes and intensity data with sample and single nucleotide polymorphism annotation.


Subject(s)
Genome-Wide Association Study/standards , Polymorphism, Single Nucleotide , Software , Genotype , Humans , Quality Control
19.
Front Neuroinform ; 17: 1175689, 2023.
Article in English | MEDLINE | ID: mdl-37304174

ABSTRACT

There is common consensus that data sharing accelerates science. Data sharing enhances the utility of data and promotes the creation and competition of scientific ideas. Within the Alzheimer's disease and related dementias (ADRD) community, data types and modalities are spread across many organizations, geographies, and governance structures. The ADRD community is not alone in facing these challenges; however, the problem is even more difficult because of the need to share complex biomarker data from centers around the world. Heavy-handed data sharing mandates have, to date, been met with limited success and often outright resistance. Interest in making data Findable, Accessible, Interoperable, and Reusable (FAIR) has often resulted in centralized platforms. However, when data governance and sovereignty structures do not allow the movement of data, other methods, such as federation, must be pursued. Implementation of fully federated data approaches is not without its challenges. The user experience may become more complicated, and federated analysis of unstructured data types remains challenging. Advancement in federated data sharing should be accompanied by improvement in federated learning methodologies so that federated data sharing becomes functionally equivalent to direct access to record-level data. In this article, we discuss federated data sharing approaches implemented by three data platforms in the ADRD field: Dementias Platform UK (DPUK) in 2014, the Global Alzheimer's Association Interactive Network (GAAIN) in 2012, and the Alzheimer's Disease Data Initiative (ADDI) in 2020. We conclude by addressing open questions that the research community needs to solve together.

20.
bioRxiv ; 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37745577

ABSTRACT

Huntington disease (HD) is an incurable neurodegenerative disease characterized by neuronal loss and astrogliosis. One hallmark of HD is the selective neuronal vulnerability of striatal medium spiny neurons. To date, the underlying mechanisms of this selective vulnerability have not been fully defined. Here, we employed a multi-omic approach including single nucleus RNAseq (snRNAseq), bulk RNAseq, lipidomics, HTT gene CAG repeat length measurements, and multiplexed immunofluorescence on post-mortem brain tissue from multiple brain regions of HD and control donors. We defined a signature of genes that is driven by CAG repeat length and found it enriched in astrocytic and microglial genes. Moreover, weighted gene correlation network analysis showed loss of connectivity of astrocytic and microglial modules in HD and identified modules that correlated with CAG repeat length, which further implicated inflammatory pathways and metabolism. We performed lipidomic analysis of HD and control brains and identified several lipid species that correlate with HD grade, including ceramides and very long chain fatty acids. Integration of lipidomics and bulk transcriptomics identified a consensus gene signature that correlates with HD grade and HD lipidomic abnormalities and implicated the unfolded protein response pathway. Because astrocytes are critical for brain lipid metabolism and play important roles in regulating inflammation, we analyzed our snRNAseq dataset with an emphasis on astrocyte pathology. We found two main astrocyte types that spanned multiple brain regions; these types correspond to protoplasmic astrocytes and fibrous-like, CD44-positive astrocytes. HD pathology was differentially associated with these cell types in a region-specific manner. One protoplasmic astrocyte cluster showed high expression of metallothionein genes; the depletion of this cluster positively correlated with the depletion of vulnerable medium spiny neurons in the caudate nucleus.
We confirmed that metallothioneins were increased in cingulate HD astrocytes but were unchanged or even decreased in caudate astrocytes. We combined existing genome-wide association studies (GWAS) with a GWAS conducted on HD patients from the original Venezuelan cohort and identified a single-nucleotide polymorphism in the metallothionein gene locus associated with delayed age of onset. Functional studies found that metallothionein-overexpressing astrocytes were better able to buffer glutamate and were neuroprotective of patient-derived, directly reprogrammed HD medium spiny neurons (MSNs) as well as against rotenone-induced neuronal death in vitro. Finally, we found that metallothionein-overexpressing astrocytes increased the phagocytic activity of microglia in vitro and increased the expression of genes involved in fatty acid binding. Together, we identified an astrocytic phenotype that is regionally enriched in less vulnerable brain regions and that can be leveraged to protect neurons in HD.
