ABSTRACT
INTRODUCTION: Alzheimer's disease (AD) is a common disorder of the elderly that is both highly heritable and genetically heterogeneous. METHODS: We investigated the association of AD with both common variants and aggregates of rare coding and non-coding variants in 13,371 individuals of diverse ancestry with whole genome sequencing (WGS) data. RESULTS: Pooled-population analyses of all individuals identified genetic variants at apolipoprotein E (APOE) and BIN1 associated with AD (p < 5 × 10⁻⁸). Subgroup-specific analyses identified a haplotype on chromosome 14 including PSEN1 associated with AD in Hispanics, further supported by aggregate testing of rare coding and non-coding variants in the region. Common variants in LINC00320 were associated with AD in Black individuals (p = 1.9 × 10⁻⁹). Finally, we observed rare non-coding variants in the promoter of TOMM40 distinct from APOE in pooled-population analyses (p = 7.2 × 10⁻⁸). DISCUSSION: We observed that complementary pooled-population and subgroup-specific analyses offered unique insights into the genetic architecture of AD. HIGHLIGHTS: We determined the association of genetic variants with Alzheimer's disease (AD) using 13,371 individuals of diverse ancestry with whole genome sequencing (WGS) data. We identified genetic variants at apolipoprotein E (APOE), BIN1, PSEN1, and LINC00320 associated with AD. We observed rare non-coding variants in the promoter of TOMM40 distinct from APOE.
ABSTRACT
Progressive supranuclear palsy (PSP), a rare Parkinsonian disorder, is characterized by problems with movement, balance, and cognition. PSP differs from Alzheimer's disease (AD) and other diseases, displaying abnormal microtubule-associated protein tau by both neuronal and glial cell pathologies. Genetic contributors may mediate these differences; however, the genetics of PSP remain underexplored. Here we conduct the largest genome-wide association study (GWAS) of PSP, which includes 2779 cases (2595 neuropathologically confirmed) and 5584 controls, and identify six independent PSP susceptibility loci with genome-wide significant (P < 5 × 10⁻⁸) associations, including five known loci (MAPT, MOBP, STX6, RUNX2, SLCO1A2) and one novel locus (C4A). Integration with cell type-specific epigenomic annotations reveals an oligodendrocytic signature that might distinguish PSP from AD and Parkinson's disease in subsequent studies. Candidate PSP risk gene prioritization using expression quantitative trait loci (eQTLs) identifies oligodendrocyte-specific effects on gene expression in half of the genome-wide significant loci, and an association with C4A expression in brain tissue, which may be driven by increased C4A copy number. Finally, histological studies demonstrate tau aggregates in oligodendrocytes that colocalize with C4 (complement) deposition. Integrating GWAS with functional studies, epigenomic and eQTL analyses, we identify potential causal roles for variation in MOBP, STX6, RUNX2, SLCO1A2, and C4A in PSP pathogenesis.
Subjects
Genetic Predisposition to Disease , Genome-Wide Association Study , Quantitative Trait Loci , Progressive Supranuclear Palsy , tau Proteins , Humans , Progressive Supranuclear Palsy/genetics , Progressive Supranuclear Palsy/pathology , Progressive Supranuclear Palsy/metabolism , Aged , Male , Female , tau Proteins/genetics , tau Proteins/metabolism , Transcriptome , Single Nucleotide Polymorphism , Neuroglia/metabolism , Neuroglia/pathology , Aged 80 and over , Oligodendroglia/metabolism , Oligodendroglia/pathology , Middle Aged , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Alzheimer Disease/metabolism , Case-Control Studies , Myelin Proteins
ABSTRACT
Detecting structural variants (SVs) in whole-genome sequencing poses significant challenges. We present a protocol for variant calling, merging, genotyping, sensitivity analysis, and laboratory validation for generating a high-quality SV call set in whole-genome sequencing from the Alzheimer's Disease Sequencing Project comprising 578 individuals from 111 families. Employing two complementary pipelines, Scalpel and Parliament, for SV/indel calling, we assessed sensitivity through sample replicates (N = 9) with in silico variant spike-ins. We developed a novel metric, D-score, to evaluate caller specificity for deletions. The accuracy of deletions was evaluated by Sanger sequencing. We generated a high-quality call set of 152,301 deletions of diverse sizes. Sanger sequencing validated 114 of 146 detected deletions (78.1%). Scalpel excelled in accuracy for deletions ≤100 bp, whereas Parliament was optimal for deletions >900 bp. Overall, 83.0% and 72.5% of calls by Scalpel and Parliament were validated, respectively, including all 11 deletions called by both Parliament and Scalpel between 101 and 900 bp. Our flexible protocol successfully generated a high-quality deletion call set and a truth set of Sanger sequencing-validated deletions with precise breakpoints spanning 1-17,000 bp.
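The validation figures and the size-dependent caller performance reported above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `validation_rate` and `preferred_caller` are hypothetical, and treating the 101-900 bp range as "require both callers" is an assumption drawn from the statement that all 11 deletions called by both tools in that range were validated; it is not presented as the protocol's actual decision rule.

```python
def validation_rate(validated: int, tested: int) -> float:
    """Percentage of tested calls that were confirmed by Sanger sequencing."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * validated / tested


def preferred_caller(deletion_bp: int) -> str:
    """Suggest a caller for a deletion of the given size (hypothetical rule).

    Size ranges follow the abstract: Scalpel for <=100 bp, Parliament for
    >900 bp. The intermediate range is an assumption made for illustration.
    """
    if deletion_bp < 1:
        raise ValueError("deletion size must be at least 1 bp")
    if deletion_bp <= 100:
        return "Scalpel"
    if deletion_bp > 900:
        return "Parliament"
    return "Scalpel+Parliament"  # 101-900 bp: keep calls made by both tools


# Overall validation rate from the abstract: 114 of 146 deletions.
overall = round(validation_rate(114, 146), 1)
```

With the reported counts, `overall` evaluates to 78.1, matching the percentage given in the abstract.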
Subjects
Alzheimer Disease , Humans , Alzheimer Disease/genetics , Whole Genome Sequencing/methods
ABSTRACT
INTRODUCTION: Clinical research in Alzheimer's disease (AD) lacks cohort diversity despite being a global health crisis. The Asian Cohort for Alzheimer's Disease (ACAD) was formed to address underrepresentation of Asians in research, and limited understanding of how genetics and non-genetic/lifestyle factors impact this multi-ethnic population. METHODS: The ACAD started fully recruiting in October 2021 with one central coordination site, eight recruitment sites, and two analysis sites. We developed a comprehensive study protocol for outreach and recruitment, an extensive data collection packet, and a centralized data management system, in English, Chinese, Korean, and Vietnamese. RESULTS: ACAD has recruited 606 participants with an additional 900 expressing interest in enrollment since program inception. DISCUSSION: ACAD's traction indicates the feasibility of recruiting Asians for clinical research to enhance understanding of AD risk factors. ACAD will recruit > 5000 participants to identify genetic and non-genetic/lifestyle AD risk factors, establish blood biomarker levels for AD diagnosis, and facilitate clinical trial readiness. HIGHLIGHTS: The Asian Cohort for Alzheimer's Disease (ACAD) promotes awareness of under-investment in clinical research for Asians. We are recruiting Asian Americans and Canadians for novel insights into Alzheimer's disease. We describe culturally appropriate recruitment strategies and data collection protocol. ACAD addresses challenges of recruitment from heterogeneous Asian subcommunities. We aim to implement a successful recruitment program that enrolls across three Asian subcommunities.
Subjects
Alzheimer Disease , North American People , Humans , Alzheimer Disease/genetics , Pilot Projects , Asian/genetics , Canada , Risk Factors
ABSTRACT
The heterogeneity of whole-exome sequencing (WES) data generation methods presents a challenge to joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer's Disease Sequencing Project. The joint-genotyped variant call format (VCF) file contains only positions within the union of capture kits. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits in different studies. We identified 8.2 million autosomal variants. Of these, 96.82% are high quality and are located in 28,579 Ensembl transcripts. Of the variants, 41% are intronic and 1.8% have CADD > 30, indicating high predicted pathogenicity. Here we show that our new strategy can generate high-quality data from these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.
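Restricting the joint-called VCF to "the union of capture kits" amounts to merging the target intervals of all kits and keeping only positions that fall inside the merged set. The sketch below illustrates that idea under stated assumptions: intervals are half-open, BED-style `(chrom, start, end)` tuples, and the names `union_of_targets` and `in_union` are hypothetical helpers, not functions from the authors' pipeline.

```python
from collections import defaultdict


def union_of_targets(kits):
    """Merge target intervals from several capture kits into their union.

    kits: iterable of lists of (chrom, start, end) tuples, half-open.
    Returns a dict mapping chrom -> sorted list of merged (start, end).
    """
    by_chrom = defaultdict(list)
    for kit in kits:
        for chrom, start, end in kit:
            by_chrom[chrom].append((start, end))

    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:  # overlaps or touches the previous one
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [tuple(iv) for iv in out]
    return merged


def in_union(merged, chrom, pos):
    """True if a 0-based position lies inside any merged target interval."""
    return any(start <= pos < end for start, end in merged.get(chrom, []))


# Toy example with two made-up kits.
kit_a = [("chr1", 100, 200), ("chr1", 500, 600)]
kit_b = [("chr1", 150, 300), ("chr2", 10, 20)]
targets = union_of_targets([kit_a, kit_b])
```

A linear scan per lookup is enough for a sketch; a production pipeline would more plausibly rely on an indexed interval structure or an existing interval tool.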
Subjects
Alzheimer Disease , Humans , Exome , Computational Biology , Data Accuracy , Genotype
ABSTRACT
INTRODUCTION: The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site Alzheimer's Genomics Database (GenomicsDB) is a public knowledge base of Alzheimer's disease (AD) genetic datasets and genomic annotations. METHODS: GenomicsDB uses a custom systems architecture to adopt and enforce rigorous standards that facilitate harmonization of AD-relevant genome-wide association study summary statistics datasets with functional annotations, including over 230 million annotated variants from the AD Sequencing Project. RESULTS: GenomicsDB generates interactive reports compiled from the harmonized datasets and annotations. These reports contextualize AD-risk associations in a broader functional genomic setting and summarize them in the context of functionally annotated genes and variants. DISCUSSION: Created to make AD-genetics knowledge more accessible to AD researchers, the GenomicsDB is designed to guide users unfamiliar with genetic data in not only exploring but also interpreting this ever-growing volume of data. Scalable and interoperable with other genomics resources using data technology standards, the GenomicsDB can serve as a central hub for research and data analysis on AD and related dementias. HIGHLIGHTS: The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) offers to the public a unique, disease-centric collection of AD-relevant GWAS summary statistics datasets. Interpreting these data is challenging and requires significant bioinformatics expertise to standardize datasets and harmonize them with functional annotations on genome-wide scales. The NIAGADS Alzheimer's GenomicsDB helps overcome these challenges by providing a user-friendly public knowledge base for AD-relevant genetics that shares harmonized, annotated summary statistics datasets from the NIAGADS repository in an interpretable, easily searchable format.
Subjects
Alzheimer Disease , United States , Humans , Alzheimer Disease/genetics , Genome-Wide Association Study , National Institute on Aging (U.S.) , Genomics , Factual Databases , Genetic Predisposition to Disease/genetics
ABSTRACT
Alzheimer's Disease (AD) is a common disorder of the elderly that is both highly heritable and genetically heterogeneous. Here, we investigated the association between AD and both common variants and aggregates of rare coding and noncoding variants in 13,371 individuals of diverse ancestry with whole genome sequence (WGS) data. Pooled-population analyses identified genetic variants in or near APOE, BIN1, and LINC00320 significantly associated with AD (p < 5 × 10⁻⁸). Population-specific analyses identified a haplotype on chromosome 14 including PSEN1 associated with AD in Hispanics, further supported by aggregate testing of rare coding and noncoding variants in this region. Finally, we observed suggestive associations (p < 5 × 10⁻⁵) of aggregates of rare coding variants in ABCA7 among non-Hispanic Whites (p = 5.4 × 10⁻⁶), and rare noncoding variants in the promoter of TOMM40 distinct from APOE in pooled-population analyses (p = 7.2 × 10⁻⁸). Complementary pooled-population and population-specific analyses offered unique insights into the genetic architecture of AD.
ABSTRACT
Limited ancestral diversity has impaired our ability to detect risk variants more prevalent in non-European ancestry groups in genome-wide association studies (GWAS). We constructed and analyzed a multi-ancestry GWAS dataset in the Alzheimer's Disease (AD) Genetics Consortium (ADGC) to test for novel shared and ancestry-specific AD susceptibility loci and evaluate underlying genetic architecture in 37,382 non-Hispanic White (NHW), 6,728 African American, 8,899 Hispanic (HIS), and 3,232 East Asian individuals, performing within-ancestry fixed-effects meta-analysis followed by a cross-ancestry random-effects meta-analysis. We identified 13 loci with cross-ancestry associations, including known loci at/near CR1, BIN1, TREM2, CD2AP, PTK2B, CLU, SHARPIN, MS4A6A, PICALM, ABCA7, and APOE, and two novel loci not previously reported at 11p12 (LRRC4C) and 12q24.13 (LHX5-AS1). Reflecting the power of diverse ancestry in GWAS, we observed the SHARPIN locus using 7.1% of the sample size of the original discovering single-ancestry GWAS (n = 788,989). We additionally identified three genome-wide significant (GWS) ancestry-specific loci at/near PTPRK (P = 2.4 × 10⁻⁸) and GRB14 (P = 1.7 × 10⁻⁸) in HIS, and KIAA0825 (P = 2.9 × 10⁻⁸) in NHW. Pathway analysis implicated multiple amyloid regulation pathways (strongest with adjusted P = 1.6 × 10⁻⁴) and the classical complement pathway (adjusted P = 1.3 × 10⁻³). Genes at/near our novel loci have known roles in neuronal development (LRRC4C, LHX5-AS1, and PTPRK) and insulin receptor activity regulation (GRB14). These findings provide compelling support for using traditionally underrepresented populations for gene discovery, even with smaller sample sizes.
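Two quantitative points in this abstract can be illustrated with a short Python sketch. First, a within-ancestry fixed-effects meta-analysis is commonly implemented with inverse-variance weighting; the abstract does not state which estimator was used, so the function `fixed_effects_meta` below is an assumption shown for illustration, and the cross-ancestry random-effects step is not covered. Second, the four ancestry group sizes listed above sum to a total that appears consistent with the quoted 7.1% of the earlier GWAS sample size.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis (assumed method).

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, z_score).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal length")
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Sample sizes as listed in the abstract.
cohort_sizes = {
    "NHW": 37382,
    "African American": 6728,
    "HIS": 8899,
    "East Asian": 3232,
}
total_n = sum(cohort_sizes.values())
fraction_of_prior_gwas = total_n / 788989  # prior single-ancestry GWAS n
```

Here `total_n` is 56,241, and `fraction_of_prior_gwas` rounds to 7.1% when expressed as a percentage.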
ABSTRACT
BACKGROUND: Recent Alzheimer's disease (AD) genetics findings from genome-wide association studies (GWAS) span progressively larger and more diverse populations and outcomes. Currently, there is no up-to-date resource providing harmonized and searchable information on all AD genetic associations found by GWAS, nor linking the reported genetic variants and genes with functional and genomic annotations. OBJECTIVE: To create an integrated, harmonized, and literature-derived collection of population-specific AD genetic associations. METHODS: We developed the Alzheimer's Disease Variant Portal (ADVP), an extensive collection of associations curated from >200 GWAS publications from the Alzheimer's Disease Genetics Consortium and other consortia. Genetic associations were systematically extracted, harmonized, and annotated from both the genome-wide significant and suggestive loci reported in these publications. To ensure consistent representation of AD genetic findings, all the extracted genetic association information was harmonized across specifically designed publication, variant, and association categories. RESULTS: ADVP V1.0 (February 2021) catalogs 6,990 associations related to disease risk, expression quantitative traits, endophenotypes, or neuropathology. This extensive harmonization effort led to a catalog containing >900 loci, >1,800 variants, >80 cohorts, and 8 populations. In addition, ADVP provides investigators with a seamless integration of genomic and publicly available functional annotations across multiple databases per harmonized variant and gene records, thus facilitating further understanding and analyses of these genetics findings. CONCLUSION: ADVP is a valuable resource for investigators to quickly and systematically explore high-confidence AD genetic findings and provides insights into population-specific AD genetic architecture. ADVP is continually maintained and enhanced by NIAGADS and is freely accessible at https://advp.niagads.org.
Subjects
Alzheimer Disease , Genome-Wide Association Study , Alzheimer Disease/genetics , Endophenotypes , Genetic Predisposition to Disease/genetics , Humans , Single Nucleotide Polymorphism
ABSTRACT
Alzheimer's Disease (AD) is a progressive neurologic disease and the most common form of dementia. While the causes of AD are not completely understood, genetics plays a key role in the etiology of AD, and thus finding genetic factors holds the potential to uncover novel AD mechanisms. For this study, we focus on copy number variation (CNV) detection and burden analysis. Leveraging whole-genome sequence (WGS) data released by Alzheimer's Disease Sequencing Project (ADSP), we developed a scalable bioinformatics pipeline to identify CNVs. This pipeline was applied to 1,737 AD cases and 2,063 cognitively normal controls. As a result, we observed 237,306 and 42,767 deletions and duplications, respectively, with an average of 2,255 deletions and 1,820 duplications per subject. The burden tests show that Non-Hispanic-White cases on average have 16 more duplications than controls do (p-value 2e-6), and Hispanic cases have larger deletions than controls do (p-value 6.8e-5).
ABSTRACT
Importance: Compared with non-Hispanic White individuals, African American individuals from the same community are approximately twice as likely to develop Alzheimer disease. Despite this disparity, the largest Alzheimer disease genome-wide association studies to date have been conducted in non-Hispanic White individuals. In the largest association analyses of Alzheimer disease in African American individuals, ABCA7, TREM2, and an intergenic locus at 5q35 were previously implicated. Objective: To identify additional risk loci in African American individuals by increasing the sample size and using the African Genome Resource panel. Design, Setting, and Participants: This genome-wide association meta-analysis used case-control and family-based data sets from the Alzheimer Disease Genetics Consortium. There were multiple recruitment sites throughout the United States that included individuals with Alzheimer disease and controls of African American ancestry. Analysis began October 2018 and ended September 2019. Main Outcomes and Measures: Diagnosis of Alzheimer disease. Results: A total of 2784 individuals with Alzheimer disease (1944 female [69.8%]) and 5222 controls (3743 female [71.7%]) were analyzed (mean [SD] age at last evaluation, 74.2 [13.6] years). Associations with 4 novel common loci centered near the intracellular glycoprotein trafficking gene EDEM1 (3p26; P = 8.9 × 10⁻⁷), near the immune response gene ALCAM (3q13; P = 9.3 × 10⁻⁷), within GPC6 (13q31; P = 4.1 × 10⁻⁷), a gene critical for recruitment of glutamatergic receptors to the neuronal membrane, and within VRK3 (19q13.33; P = 3.5 × 10⁻⁷), a gene involved in glutamate neurotoxicity, were identified. In addition, several loci associated with rare variants, including a genome-wide significant intergenic locus near IGF1R at 15q26 (P = 1.7 × 10⁻⁹) and 6 additional loci with suggestive significance (P ≤ 5 × 10⁻⁷) such as API5 at 11p12 (P = 8.8 × 10⁻⁸) and RBFOX1 at 16p13 (P = 5.4 × 10⁻⁷) were identified. Gene expression data from brain tissue demonstrate association of ALCAM, ARAP1, GPC6, and RBFOX1 with brain β-amyloid load. Of 25 known loci associated with Alzheimer disease in non-Hispanic White individuals, only APOE, ABCA7, TREM2, BIN1, CD2AP, FERMT2, and WWOX were implicated at a nominal significance level or stronger in African American individuals. Pathway analyses strongly support the notion that immunity, lipid processing, and intracellular trafficking pathways underlying Alzheimer disease in African American individuals overlap with those observed in non-Hispanic White individuals. A new pathway emerging from these analyses is the kidney system, suggesting a novel mechanism for Alzheimer disease that needs further exploration. Conclusions and Relevance: While the major pathways involved in Alzheimer disease etiology in African American individuals are similar to those in non-Hispanic White individuals, the disease-associated loci within these pathways differ.
Subjects
Alzheimer Disease/genetics , Black or African American/genetics , Genetic Predisposition to Disease/genetics , Aged , Female , Genetic Loci , Genome-Wide Association Study , Humans , Male , Middle Aged
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
Risk for late-onset Alzheimer's disease (LOAD), the most prevalent dementia, is partially driven by genetics. To identify LOAD risk loci, we performed a large genome-wide association meta-analysis of clinically diagnosed LOAD (94,437 individuals). We confirm 20 previous LOAD risk loci and identify five new genome-wide loci (IQCK, ACE, ADAM10, ADAMTS1, and WWOX), two of which (ADAM10, ACE) were identified in a recent genome-wide association (GWAS)-by-familial-proxy of Alzheimer's or dementia. Fine-mapping of the human leukocyte antigen (HLA) region confirms the neurological and immune-mediated disease haplotype HLA-DR15 as a risk factor for LOAD. Pathway analysis implicates immunity, lipid metabolism, tau binding proteins, and amyloid precursor protein (APP) metabolism, showing that genetic variants affecting APP and Aβ processing are associated not only with early-onset autosomal dominant Alzheimer's disease but also with LOAD. Analyses of risk genes and pathways show enrichment for rare variants (P = 1.32 × 10⁻⁷), indicating that additional rare variants remain to be identified. We also identify important genetic correlations between LOAD and traits such as family history of dementia and education.
Subjects
Alzheimer Disease/genetics , Amyloid beta-Peptides/genetics , Genetic Loci/genetics , Genetic Predisposition to Disease/genetics , Immunity/genetics , Lipids/genetics , tau Proteins/genetics , Aged , Case-Control Studies , Female , Genetic Testing/methods , Genome-Wide Association Study/methods , Haplotypes/genetics , Humans , Lipid Metabolism/genetics , Male
ABSTRACT
SUMMARY: We report VCPA, our SNP/Indel Variant Calling Pipeline and data management tool used for the analysis of whole genome and exome sequencing (WGS/WES) for the Alzheimer's Disease Sequencing Project. VCPA consists of two independent but linkable components: pipeline and tracking database. The pipeline, implemented using the Workflow Description Language and fully optimized for the Amazon elastic compute cloud environment, includes steps from aligning raw sequence reads to variant calling using GATK. The tracking database allows users to view job running status in real time and visualize >100 quality metrics per genome. VCPA is functionally equivalent to the CCDG/TOPMed pipeline. Users can use the pipeline and the dockerized database to process large WGS/WES datasets on Amazon cloud with minimal configuration. AVAILABILITY AND IMPLEMENTATION: VCPA is released under the MIT license and is available for academic and nonprofit use for free. The pipeline source code and step-by-step instructions are available from the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (http://www.niagads.org/VCPA). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Alzheimer Disease , Data Management , Genomics , High-Throughput Nucleotide Sequencing , Humans , Software
ABSTRACT
We identified rare coding variants associated with Alzheimer's disease in a three-stage case-control study of 85,133 subjects. In stage 1, we genotyped 34,174 samples using a whole-exome microarray. In stage 2, we tested associated variants (P < 1 × 10⁻⁴) in 35,962 independent samples using de novo genotyping and imputed genotypes. In stage 3, we used an additional 14,997 samples to test the most significant stage 2 associations (P < 5 × 10⁻⁸) using imputed genotypes. We observed three new genome-wide significant nonsynonymous variants associated with Alzheimer's disease: a protective variant in PLCG2 (rs72824905: p.Pro522Arg, P = 5.38 × 10⁻¹⁰, odds ratio (OR) = 0.68, minor allele frequency (MAF) in cases = 0.0059, MAF in controls = 0.0093), a risk variant in ABI3 (rs616338: p.Ser209Phe, P = 4.56 × 10⁻¹⁰, OR = 1.43, MAF in cases = 0.011, MAF in controls = 0.008), and a new genome-wide significant variant in TREM2 (rs143332484: p.Arg62His, P = 1.55 × 10⁻¹⁴, OR = 1.67, MAF in cases = 0.0143, MAF in controls = 0.0089), a known susceptibility gene for Alzheimer's disease. These protein-altering changes are in genes highly expressed in microglia and highlight an immune-related protein-protein interaction network enriched for previously identified risk genes in Alzheimer's disease. These genetic findings provide additional evidence that the microglia-mediated innate immune response contributes directly to the development of Alzheimer's disease.
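As a rough sanity check on the direction and size of the reported effects, a crude allelic odds ratio can be computed directly from the case and control minor allele frequencies. The sketch below is illustrative only: `allelic_odds_ratio` is a hypothetical helper, and the crude values are not expected to match the reported ORs exactly, since those presumably come from the study's multi-stage, model-based analysis rather than from pooled allele frequencies.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Crude allelic odds ratio from minor allele frequencies.

    Odds of carrying the minor allele in cases divided by the same odds
    in controls. No covariate adjustment or stratification is applied.
    """
    for maf in (maf_cases, maf_controls):
        if not 0.0 < maf < 1.0:
            raise ValueError("allele frequencies must lie strictly in (0, 1)")
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# (MAF in cases, MAF in controls, reported OR) as given in the abstract.
variants = {
    "PLCG2 p.Pro522Arg": (0.0059, 0.0093, 0.68),
    "ABI3 p.Ser209Phe": (0.011, 0.008, 1.43),
    "TREM2 p.Arg62His": (0.0143, 0.0089, 1.67),
}

crude = {
    name: allelic_odds_ratio(cases, controls)
    for name, (cases, controls, _reported) in variants.items()
}
# Direction check: crude and reported ORs fall on the same side of 1.
same_direction = {
    name: (crude[name] < 1.0) == (variants[name][2] < 1.0) for name in variants
}
```

For all three variants the crude estimate points in the same direction as the reported OR (protective for PLCG2, risk-increasing for ABI3 and TREM2).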