1.
Nat Biotechnol ; 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38671154

ABSTRACT

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits and are linked to over 60 disease phenotypes. However, they are often excluded from at-scale studies because of challenges with variant calling and representation, as well as a lack of a genome-wide standard. Here, to promote the development of TR methods, we created a catalog of TR regions and explored TR properties across 86 haplotype-resolved long-read human assemblies. We curated variants from the Genome in a Bottle (GIAB) HG002 individual to create a TR dataset to benchmark existing and future TR analysis methods. We also present an improved variant comparison method that handles variants greater than 4 bp in length and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ~24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 'truth-set' TR benchmark. We demonstrate the utility of this pipeline across short-read and long-read technologies.

2.
medRxiv ; 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38496498

ABSTRACT

Less than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
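The read N50 quoted above (54 kbp) is a length-weighted summary statistic: the length L such that reads of length L or longer contain at least half of all sequenced bases. As a minimal sketch of that definition, and not the consortium's own pipeline, it could be computed as follows. The function name `read_n50` and the toy read lengths are hypothetical and for illustration only.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths.

    The N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("no read lengths given")
    ordered = sorted(lengths, reverse=True)  # longest reads first
    half = sum(ordered) / 2                  # half of the total bases
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Toy read lengths in bp (illustrative values, not data from the study).
toy_lengths = [2_000, 3_000, 4_000, 5_000, 6_000]
print(read_n50(toy_lengths))
```

Because the statistic is weighted by bases rather than by read count, a small number of very long reads can raise the N50 well above the median read length.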

3.
bioRxiv ; 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37961319

ABSTRACT

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits, and are linked to over 60 disease phenotypes. However, their complexity often excludes them from at-scale studies due to challenges with variant calling, representation, and lack of a genome-wide standard. To promote TR methods development, we create a comprehensive catalog of TR regions and explore its properties across 86 samples. We then curate variants from the GIAB HG002 individual to create a tandem repeat benchmark. We also present a variant comparison method that handles small and large alleles and varying allelic representation. The 8.1% of the genome covered by the TR catalog holds ∼24.9% of variants per individual, including 124,728 small and 17,988 large variants for the GIAB HG002 TR benchmark. We work with the GIAB community to demonstrate the utility of this benchmark across short and long read technologies.

4.
Bioinformatics ; 39(5)2023 05 04.
Article in English | MEDLINE | ID: mdl-37171891

ABSTRACT

SUMMARY: Increases in the cohort size in long-read sequencing projects necessitate more efficient software for quality assessment and processing of sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. Here, we describe novel tools for summarizing experiments, filtering datasets, and visualizing phased alignment results, as well as updates to the NanoPack software suite. AVAILABILITY AND IMPLEMENTATION: The cramino, chopper, kyber, and phasius tools are written in Rust and available as executable binaries without requiring installation or managing dependencies. Binaries built on musl are available for broad compatibility. NanoPlot and NanoComp are written in Python3. Links to the separate tools and their documentation can be found at https://github.com/wdecoster/nanopack. All tools are compatible with Linux, Mac OS, and the MS Windows Subsystem for Linux and are released under the MIT license. The repositories include test data, and the tools are continuously tested using GitHub Actions and can be installed with the conda dependency manager.


Subject(s)
Nanopores , Software , Humans , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Documentation
5.
Microbiol Spectr ; 11(1): e0306122, 2023 02 14.
Article in English | MEDLINE | ID: mdl-36475894

ABSTRACT

Acinetobacter baumannii is an opportunistic pathogenic bacterium prioritized by WHO and CDC because of its increasing antibiotic resistance. Heterogeneity among strains represents the hallmark of A. baumannii bacteria. We wondered to what extent extensively used strains, so-called reference strains, reflect the dynamic nature and intrinsic heterogeneity of these bacteria. We analyzed multiple phenotypic traits of 43 nonredundant, modern, and multidrug-resistant, extensively drug-resistant, and pandrug-resistant clinical isolates and broadly used strains of A. baumannii. Comparison of these isolates at the genetic and phenotypic levels confirmed a high degree of heterogeneity. Importantly, we observed that a significant portion of modern clinical isolates strongly differs from several historically established strains in the light of colony morphology, cellular density, capsule production, natural transformability, and in vivo virulence. The significant differences between modern clinical isolates of A. baumannii and established strains could hamper the study of A. baumannii, especially concerning its virulence and resistance mechanisms. Hence, we propose a variable collection of modern clinical isolates that are characterized at the genetic and phenotypic levels, covering a wide range of the phenotypic spectrum, with six different macrocolony type groups, from avirulent to hypervirulent phenotypes, and with naturally noncapsulated to hypermucoid strains, with intermediate phenotypes as well. Strain-specific mechanistic observations remain interesting per se, and established "reference" strains have undoubtedly been shown to be very useful to study basic mechanisms of A. baumannii biology. However, any study based on a specific strain of A. baumannii should be compared to modern and clinically relevant isolates. 
IMPORTANCE Acinetobacter baumannii is a bacterium prioritized by the CDC and WHO because of its increasing antibiotic resistance, leading to treatment failures. The hallmark of this pathogen is the high heterogeneity observed among isolates, due to a very dynamic genome. In this context, we tested if a subset of broadly used isolates, considered "reference" strains, was reflecting the genetic and phenotypic diversity found among currently circulating clinical isolates. We observed that the so-called reference strains do not cover the whole diversity of the modern clinical isolates. While formerly established strains successfully generated a strong base of knowledge in the A. baumannii field and beyond, our study shows that a rational choice of strain, related to a specific biological question, should be taken into consideration. Any data obtained with historically established strains should also be compared to modern and clinically relevant isolates, especially concerning drug screening, resistance, and virulence contexts.


Subject(s)
Acinetobacter Infections , Acinetobacter baumannii , Humans , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/therapeutic use , Microbial Sensitivity Tests , Acinetobacter Infections/microbiology , Phenotype , Drug Resistance, Multiple, Bacterial/genetics
6.
F1000Res ; 11: 530, 2022.
Article in English | MEDLINE | ID: mdl-36262335

ABSTRACT

In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNAnexus Structural Variation hackathon. The goal of the hackathon was to advance research on structural variants (SVs) by prototyping and iterating on open-source software. This led to nine hackathon projects focused on diverse genomics research interests, including various SV discovery and genotyping methods, SV sequence reconstruction, and clinically relevant structural variation, including SARS-CoV-2 variants. Repositories for the projects that participated in the hackathon are available at https://github.com/collaborativebioinformatics.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Genomics , Software
7.
Antimicrob Agents Chemother ; 66(9): e0089222, 2022 09 20.
Article in English | MEDLINE | ID: mdl-35969073

ABSTRACT

In this study, we characterize a new collection that comprises multidrug-resistant (MDR), extensively drug-resistant (XDR), pandrug-resistant (PDR), and carbapenem-resistant modern clinical isolates of Acinetobacter baumannii collected from hospitals through national microbiological surveillance in Belgium. Bacterial isolates (n = 43) were subjected to whole-genome sequencing (WGS), combining Illumina (MiSeq) and Nanopore (MinION) technologies, from which high-quality genomes (chromosome and plasmids) were de novo assembled. Antimicrobial susceptibility testing was performed along with genome analyses, which identified intrinsic and acquired resistance determinants along with their genetic environments and vehicles. Furthermore, the bacterial isolates were compared to the most prevalent A. baumannii sequence type 2 (ST2) (Pasteur scheme) genomes available from the BIGSdb database. Of the 43 strains, 40 carried determinants of resistance to carbapenems; blaOXA-23 (n = 29) was the most abundant acquired antimicrobial resistance gene, with 39 isolates encoding at least two different types of OXA enzymes. According to the Pasteur scheme, the majority of the isolates were globally disseminated clones of ST2 (n = 25), while less frequent sequence types included ST636 (n = 6), ST1 (n = 4), ST85 and ST78 (n = 2 each), and ST604, ST215, ST158, and ST10 (n = 1 each). Using the Oxford typing scheme, we identified 22 STs, including two novel types (ST2454 and ST2455). While the majority (26/29) of blaOXA-23 genes were chromosomally carried, all blaOXA-72 genes were plasmid borne. Our results show the presence of high-risk clones of A. baumannii within Belgian health care facilities with frequent occurrences of genes encoding carbapenemases, highlighting the crucial need for constant surveillance.


Subject(s)
Acinetobacter Infections , Acinetobacter baumannii , Acinetobacter Infections/microbiology , Anti-Bacterial Agents/pharmacology , Bacterial Proteins/genetics , Carbapenems/pharmacology , Drug Resistance, Multiple, Bacterial/genetics , Genomics , Humans , Interleukin-1 Receptor-Like 1 Protein/genetics , Microbial Sensitivity Tests , Multilocus Sequence Typing , beta-Lactamases/genetics
8.
F1000Res ; 10: 246, 2021.
Article in English | MEDLINE | ID: mdl-34621504

ABSTRACT

In October 2020, 62 scientists from nine nations worked together remotely in the Second Baylor College of Medicine & DNAnexus hackathon, focusing on related topics in structural variation, pan-genomes, and SARS-CoV-2 research. The overarching focus was to assess the current status of the field, identify the remaining challenges, and determine how to combine the strengths of the different interests to drive research and method development forward. Over the four days, eight groups each designed and developed new open-source methods to improve the identification and analysis of variations among species, including humans and SARS-CoV-2. These included improvements in SV calling, genotyping, annotation, and filtering, together with advancements in benchmarking existing methods. Furthermore, groups focused on the diversity of SARS-CoV-2. Daily discussion summaries and methods are publicly available at https://github.com/collaborativebioinformatics and provide valuable insights for both participants and the research community.


Subject(s)
COVID-19 , SARS-CoV-2 , Animals , Genome, Viral , Humans , Vertebrates
9.
Nat Rev Genet ; 22(9): 572-587, 2021 09.
Article in English | MEDLINE | ID: mdl-34050336

ABSTRACT

Long-read sequencing technologies have now reached a level of accuracy and yield that allows their application to variant detection at a scale of tens to thousands of samples. Concomitant with the development of new computational tools, the first population-scale studies involving long-read sequencing have emerged over the past 2 years and, given the continuous advancement of the field, many more are likely to follow. In this Review, we survey recent developments in population-scale long-read sequencing, highlight potential challenges of a scaled-up approach and provide guidance regarding experimental design. We provide an overview of current long-read sequencing platforms, variant calling methodologies and approaches for de novo assemblies and reference-based mapping approaches. Furthermore, we summarize strategies for variant validation, genotyping and predicting functional impact and emphasize challenges remaining in achieving long-read sequencing at a population scale.


Subject(s)
Computational Biology/methods , Genome, Human , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Industrial Development/trends , Sequence Analysis, DNA/methods , Humans
10.
Front Cell Dev Biol ; 9: 664317, 2021.
Article in English | MEDLINE | ID: mdl-33968938

ABSTRACT

Inactivating variants as well as a missense variant in the centrosomal CEP78 gene have been identified in autosomal recessive cone-rod dystrophy with hearing loss (CRDHL), a rare syndromic inherited retinal disease distinct from Usher syndrome. Apart from this, a complex structural variant (SV) implicating CEP78 has been reported in CRDHL. Here we aimed to expand the genetic architecture of typical CRDHL by the identification of complex SVs of the CEP78 region and characterization of their underlying mechanisms. Approaches used for the identification of the SVs are shallow whole-genome sequencing (sWGS) combined with quantitative polymerase chain reaction (PCR) and long-range PCR, or ExomeDepth analysis on whole-exome sequencing (WES) data. Targeted or whole-genome nanopore long-read sequencing (LRS) was used to delineate breakpoint junctions at the nucleotide level. For all SV cases, the effect of the SVs on CEP78 expression was assessed using quantitative PCR on patient-derived RNA. Apart from two novel canonical CEP78 splice variants and a frameshifting single-nucleotide variant (SNV), two SVs affecting CEP78 were identified in three unrelated individuals with CRDHL: a heterozygous total gene deletion of 235 kb and a partial gene deletion of 15 kb in a heterozygous and homozygous state, respectively. Assessment of the molecular consequences of the SVs on patient material demonstrated a loss-of-function effect. Delineation and characterization of the 15-kb deletion using targeted LRS revealed the previously described complex CEP78 SV, suggestive of a recurrent genomic rearrangement. A founder haplotype was demonstrated for the latter SV in cases of Belgian and British origin, respectively. The novel 235-kb deletion was delineated using whole-genome LRS. Breakpoint analysis showed microhomology and pointed to a replication-based underlying mechanism. 
Moreover, data mining of bulk and single-cell human and mouse transcriptional datasets, together with CEP78 immunostaining on human retina, linked the CEP78 expression domain with its phenotypic manifestations. Overall, this study supports that the CEP78 locus is prone to distinct SVs and that SV analysis should be considered in a genetic workup of CRDHL. Finally, it demonstrated the power of sWGS and both targeted and whole-genome LRS in identifying and characterizing complex SVs in patients with ocular diseases.

11.
Bioinformatics ; 36(10): 3236-3238, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32053166

ABSTRACT

SUMMARY: Modified nucleotides play a crucial role in gene expression regulation. Here, we describe methplotlib, a tool developed for the visualization of modified nucleotides detected from Oxford Nanopore Technologies sequencing platforms, together with additional scripts for statistical analysis of allele-specific modification within-subjects and differential modification frequency across subjects. AVAILABILITY AND IMPLEMENTATION: The methplotlib command-line tool is written in Python3, is compatible with Linux, Mac OS and the MS Windows 10 Subsystem for Linux and released under the MIT license. The source code can be found at https://github.com/wdecoster/methplotlib and can be installed from PyPI and bioconda. Our repository includes test data, and the tool is continuously tested at travis-ci.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Nanopore Sequencing , Nanopores , Humans , Nucleotides , Sequence Analysis, DNA , Software
12.
NAR Genom Bioinform ; 2(1): lqz027, 2020 Mar.
Article in English | MEDLINE | ID: mdl-33575574

ABSTRACT

Long-read sequencing has substantial advantages for structural variant discovery and phasing of variants compared to short-read technologies, but the required and optimal read length has not been assessed. In this work, we used long reads simulated from human genomes and evaluated structural variant discovery and variant phasing using current best practice bioinformatics methods. We determined that optimal discovery of structural variants from human genomes can be obtained with reads of minimally 20 kb. Haplotyping variants across genes only reaches its optimum from reads of 100 kb. These findings are important for the design of future long-read sequencing projects.

13.
Genome Biol ; 20(1): 239, 2019 11 14.
Article in English | MEDLINE | ID: mdl-31727106

ABSTRACT

Technological limitations have hindered the large-scale genetic investigation of tandem repeats in disease. We show that long-read sequencing with a single Oxford Nanopore Technologies PromethION flow cell per individual achieves 30× human genome coverage and enables accurate assessment of tandem repeats including the 10,000-bp Alzheimer's disease-associated ABCA7 VNTR. The Guppy "flip-flop" base caller and tandem-genotypes tandem repeat caller are efficient for large-scale tandem repeat assessment, but base calling and alignment challenges persist. We present NanoSatellite, which analyzes tandem repeats directly on electric current data and improves calling of GC-rich tandem repeats, expanded alleles, and motif interruptions.


Subject(s)
Genome, Human , Genomics/methods , High-Throughput Nucleotide Sequencing , Tandem Repeat Sequences , ATP-Binding Cassette Transporters/genetics , Algorithms , Feasibility Studies , Humans , Minisatellite Repeats
14.
Genome Res ; 29(7): 1178-1187, 2019 07.
Article in English | MEDLINE | ID: mdl-31186302

ABSTRACT

We sequenced the genome of the Yoruban reference individual NA19240 on the long-read sequencing platform Oxford Nanopore PromethION for evaluation and benchmarking of recently published aligners and germline structural variant calling tools, as well as a comparison with the performance of structural variant calling from short-read sequencing data. The structural variant caller Sniffles after NGMLR or minimap2 alignment provides the most accurate results, but additional confidence or sensitivity can be obtained by a combination of multiple variant callers. Sensitive and fast results can be obtained by minimap2 for alignment and a combination of Sniffles and SVIM for variant identification. We describe a scalable workflow for identification, annotation, and characterization of tens of thousands of structural variants from long-read genome sequencing of an individual or population. By discussing the results of this well-characterized reference individual, we provide an approximation of what can be expected in future long-read sequencing studies aiming for structural variant identification.


Subject(s)
Genetic Variation , Genome, Human , Sequence Analysis, DNA/instrumentation , Benchmarking , Cell Line, Tumor , Computational Biology , Humans
15.
Trends Biotechnol ; 37(9): 973-982, 2019 09.
Article in English | MEDLINE | ID: mdl-30902345

ABSTRACT

A substantial amount of structural variation in the human genome remains uninvestigated due to the limitations of existing technologies, the presence of repetitive sequences, and the complexity of a diploid genome. New technologies have been developed, increasing resolution and appreciation of structural variation and how it affects human diversity and disease. The genetic etiology of most patients with complex disorders such as neurodegenerative brain diseases is not yet elucidated, complicating disease diagnosis, genetic counseling, and understanding of underlying pathological mechanisms needed to develop therapeutic interventions. Here, we focus on innovative progress and opportunities provided by the newest methods such as linked read sequencing, strand-specific sequencing, and long-read sequencing. Finally, we describe a strategy for generating a comprehensive catalog of structural variations across populations.


Subject(s)
Genetic Variation , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Base Sequence , Humans
16.
Acta Neuropathol ; 137(6): 901-918, 2019 06.
Article in English | MEDLINE | ID: mdl-30874922

ABSTRACT

Emerging evidence suggested a converging mechanism in neurodegenerative brain diseases (NBD) involving early neuronal network dysfunctions and alterations in the homeostasis of neuronal firing as culprits of neurodegeneration. In this study, we used paired-end short-read and direct long-read whole genome sequencing to investigate an unresolved autosomal dominant dementia family significantly linked to 7q36. We identified and validated a chromosomal inversion of ca. 4 Mb, segregating on the disease haplotype and disrupting the coding sequence of the dipeptidyl-peptidase 6 gene (DPP6). DPP6 resequencing identified significantly more rare variants (nonsense, frameshift, and missense) in early-onset Alzheimer's disease (EOAD, p = 0.03, OR = 2.21, 95% CI 1.05-4.82) and frontotemporal dementia (FTD, p = 0.006, OR = 2.59, 95% CI 1.28-5.49) patient cohorts. DPP6 is a type II transmembrane protein with a highly structured extracellular domain and is mainly expressed in brain, where it binds to the potassium channel Kv4.2, enhancing its expression, regulating its gating properties and controlling the dendritic excitability of hippocampal neurons. Using in vitro modeling, we showed that the missense variants found in patients destabilize DPP6 and reduce its membrane expression (p < 0.001 and p < 0.0001), leading to a loss of protein. Reduced DPP6 and/or Kv4.2 expression was also detected in brain tissue of missense variant carriers. Loss of DPP6 is known to cause neuronal hyperexcitability and behavioral alterations in Dpp6-KO mice. Taken together, the results of our genomic, genetic, expression and modeling analyses provided direct evidence supporting the involvement of DPP6 loss in dementia. We propose that loss-of-function variants have a higher penetrance and disease impact, whereas the missense variants have a variable risk contribution to disease that can vary from high to low penetrance. 
Our findings on DPP6, as a novel gene in dementia, strengthen the involvement of neuronal hyperexcitability and alteration in the homeostasis of neuronal firing as a disease mechanism to further investigate.


Subject(s)
Chromosome Inversion , Dementia/genetics , Dipeptidyl-Peptidases and Tripeptidyl-Peptidases/deficiency , Mutation , Nerve Tissue Proteins/deficiency , Neurodegenerative Diseases/genetics , Neurons/physiology , Potassium Channels/deficiency , Action Potentials/physiology , Adult , Aged , Chromosomes, Human, Pair 7/genetics , Dementia/physiopathology , Dipeptidyl-Peptidases and Tripeptidyl-Peptidases/genetics , Dipeptidyl-Peptidases and Tripeptidyl-Peptidases/physiology , Female , Genes, Dominant , Homeostasis , Humans , Male , Middle Aged , Nerve Tissue Proteins/genetics , Nerve Tissue Proteins/physiology , Neurodegenerative Diseases/physiopathology , Pedigree , Penetrance , Polymorphism, Single Nucleotide , Potassium Channels/genetics , Potassium Channels/physiology , Protein Stability , Protein Transport , Synaptic Transmission , Whole Genome Sequencing
17.
Neurobiol Aging ; 67: 84-94, 2018 07.
Article in English | MEDLINE | ID: mdl-29653316

ABSTRACT

We previously reported a granulin (GRN) null mutation, originating from a common founder, in multiple Belgian families with frontotemporal dementia. Here, we used data of a 10-year follow-up study to describe in detail the clinical heterogeneity observed in this extended founder pedigree. We identified 85 patients and 40 unaffected mutation carriers, belonging to 29 branches of the founder pedigree. Most patients (74.4%) were diagnosed with frontotemporal dementia, while others had a clinical diagnosis of unspecified dementia, Alzheimer's dementia or Parkinson's disease. The observed clinical heterogeneity can guide clinical diagnosis, genetic testing, and counseling of mutation carriers. Onset of initial symptomatology is highly variable, ranging from age 45 to 80 years. Analysis of known modifiers suggested effects of GRN rs5848, microtubule-associated protein tau H1/H2, and chromosome 9 open reading frame 72 G4C2 repeat length on onset age but explained only a minor fraction of the variability. In contrast, the extended GRN founder family is a valuable source for identifying other onset age modifiers based on exome or genome sequences. These modifiers might be interesting targets for developing disease-modifying therapies.


Subject(s)
Frontotemporal Dementia/genetics , Genetic Association Studies , Intercellular Signaling Peptides and Proteins/genetics , Loss of Function Mutation , Adult , Age of Onset , Aged , Aged, 80 and over , Belgium , Dimethylhydrazines , Female , Follow-Up Studies , Humans , Male , Middle Aged , Pedigree , Progranulins , Propionates
18.
Bioinformatics ; 34(15): 2666-2669, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29547981

ABSTRACT

Summary: Here we describe NanoPack, a set of tools developed for visualization and processing of long-read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. Availability and implementation: The NanoPack tools are written in Python3 and released under the GNU GPL3.0 License. The source code can be found at https://github.com/wdecoster/nanopack, together with links to separate scripts and their documentation. The scripts are compatible with Linux, Mac OS and the MS Windows 10 subsystem for Linux and are available as a graphical user interface, a web service at http://nanoplot.bioinf.be and command line tools. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Software , Escherichia coli/genetics
19.
Acta Neuropathol ; 134(3): 475-487, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28447221

ABSTRACT

Premature termination codon (PTC) mutations in the ATP-Binding Cassette, Sub-Family A, Member 7 gene (ABCA7) have recently been identified as an intermediate-to-high penetrant risk factor for late-onset Alzheimer's disease (LOAD). High variability, however, is observed in downstream ABCA7 mRNA and protein expression, disease penetrance, and onset age, indicative of unknown modifying factors. Here, we investigated the prevalence and disease penetrance of ABCA7 PTC mutations in a large early onset AD (EOAD)-control cohort, and examined the effect on transcript level with comprehensive third-generation long-read sequencing. We characterized the ABCA7 coding sequence with next-generation sequencing in 928 EOAD patients and 980 matched control individuals. With MetaSKAT rare variant association analysis, we observed a fivefold enrichment (p = 0.0004) of PTC mutations in EOAD patients (3%) versus controls (0.6%). Ten novel PTC mutations were only observed in patients, and PTC mutation carriers in general had an increased familial AD load. In addition, we observed nominal risk-reducing trends for three common coding variants. Seven PTC mutations were further analyzed using targeted long-read cDNA sequencing on an Oxford Nanopore MinION platform. PTC-containing transcripts for each investigated PTC mutation were observed at varying proportions (5-41% of the total read count), implying incomplete nonsense-mediated mRNA decay (NMD). Furthermore, we distinguished and phased several previously unknown alternative splicing events (up to 30% of transcripts). In conjunction with PTC mutations, several of these novel ABCA7 isoforms have the potential to rescue deleterious PTC effects. In conclusion, ABCA7 PTC mutations play a substantial role in EOAD, warranting genetic screening of ABCA7 in genetically unexplained patients. 
Long-read cDNA sequencing revealed both varying degrees of NMD and transcript-modifying events, which may influence ABCA7 dosage and disease severity and may create opportunities for therapeutic interventions in AD.


Subject(s)
ATP-Binding Cassette Transporters/genetics , Alzheimer Disease/genetics , Genetic Predisposition to Disease , Mutation , Polymorphism, Single Nucleotide , Adult , Age of Onset , Aged , Female , Genetic Association Studies , Humans , Male , Middle Aged