1.
Cell ; 185(11): 1986-2005.e26, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35525246

ABSTRACT

Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1 retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or retrotransposons. Since flanking repeats promote non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 10⁻⁴ per locus per generation. Recurrent inversions exhibit a sex-chromosomal bias and co-localize with genomic disorder critical regions. We propose that inversion recurrence results in an elevated number of heterozygous carriers and structural SD diversity, which increases mutability in the population and predisposes specific haplotypes to disease-causing CNVs.


Subject(s)
Chromosome Inversion , Segmental Duplications, Genomic , Chromosome Inversion/genetics , DNA Copy Number Variations/genetics , Genome, Human , Genomics , Humans
2.
Cell ; 176(6): 1310-1324.e10, 2019 03 07.
Article in English | MEDLINE | ID: mdl-30827684

ABSTRACT

DNA rearrangements resulting in human genome structural variants (SVs) are caused by diverse mutational mechanisms. We used long- and short-read sequencing technologies to investigate end products of de novo chromosome 17p11.2 rearrangements and query the molecular mechanisms underlying both recurrent and non-recurrent events. Evidence for an increased rate of clustered single-nucleotide variant (SNV) mutation in cis with non-recurrent rearrangements was found. Indel and SNV formation are associated with both copy-number gains and losses of 17p11.2, occur up to ∼1 Mb away from the breakpoint junctions, and favor C > G transversion substitutions; results suggest that single-stranded DNA is formed during the genesis of the SV and provide compelling support for a microhomology-mediated break-induced replication (MMBIR) mechanism for SV formation. Our data show an additional mutational burden of MMBIR consisting of hypermutation confined to the locus and manifesting as SNVs and indels predominantly within genes.


Subject(s)
Chromosomes, Human, Pair 17 , Mutation , Abnormalities, Multiple/genetics , Chromosome Breakpoints , Chromosome Disorders/genetics , Chromosome Duplication/genetics , DNA Copy Number Variations , DNA Repair/genetics , DNA Replication , Gene Rearrangement , Genome, Human , Genomic Structural Variation , Humans , INDEL Mutation , Models, Genetic , Polymorphism, Single Nucleotide , Recombination, Genetic , Sequence Analysis, DNA/methods , Smith-Magenis Syndrome/genetics
3.
Nature ; 621(7978): 355-364, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37612510

ABSTRACT

The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date¹ and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes². Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established¹ boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.


Subject(s)
Chromosomes, Human, Y , Evolution, Molecular , Humans , Male , Chromosomes, Human, Y/genetics , Genome, Human/genetics , Genomics , Mutation Rate , Phenotype , Euchromatin/genetics , Pseudogenes , Genetic Variation/genetics , Chromosomes, Human, X/genetics , Pseudoautosomal Regions/genetics
4.
Genome Res ; 34(1): 7-19, 2024 02 07.
Article in English | MEDLINE | ID: mdl-38176712

ABSTRACT

High-quality genome assemblies and sophisticated algorithms have increased sensitivity for a wide range of variant types, and breakpoint accuracy for structural variants (SVs, ≥50 bp) has improved to near base pair precision. Despite these advances, many SV breakpoint locations are subject to systematic bias affecting variant representation. To understand why SV breakpoints are inconsistent across samples, we reanalyzed 64 phased haplotypes constructed from long-read assemblies released by the Human Genome Structural Variation Consortium (HGSVC). We identify 882 SV insertions and 180 SV deletions with variable breakpoints not anchored in tandem repeats (TRs) or segmental duplications (SDs). SVs called from aligned sequencing reads increase breakpoint disagreements by 2×-16×. Sequence accuracy had a minimal impact on breakpoints, but we observe a strong effect of ancestry. We confirm that SNP and indel polymorphisms are enriched at shifted breakpoints and are also absent from variant callsets. Breakpoint homology increases the likelihood of imprecise SV calls and the distance they are shifted, and tandem duplications are the most heavily affected SVs. Because graph genome methods normalize SV calls across samples, we investigated graphs generated by two different methods and find the resulting breakpoints are subject to other technical biases affecting breakpoint accuracy. The breakpoint inconsistencies we characterize affect ∼5% of the SVs called in a human genome and can impact variant interpretation and annotation. These limitations underscore a need for algorithm development to improve SV databases, mitigate the impact of ancestry on breakpoints, and increase the value of callsets for investigating breakpoint features.


Subject(s)
Algorithms , Genome, Human , Humans , Sequence Analysis , Genomic Structural Variation , Bias , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing
5.
Genome Res ; 33(12): 2029-2040, 2023 12 27.
Article in English | MEDLINE | ID: mdl-38190646

ABSTRACT

Advances in long-read sequencing (LRS) technologies continue to make whole-genome sequencing more complete, affordable, and accurate. LRS provides significant advantages over short-read sequencing approaches, including phased de novo genome assembly, access to previously excluded genomic regions, and discovery of more complex structural variants (SVs) associated with disease. Limitations remain with respect to cost, scalability, and platform-dependent read accuracy, and the tradeoffs between sequence coverage and sensitivity of variant discovery are important experimental considerations for the application of LRS. We compare the genetic variant-calling precision and recall of Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) HiFi platforms over a range of sequence coverages. For read-based applications, LRS sensitivity begins to plateau around 12-fold coverage with a majority of variants called with reasonable accuracy (F1 score above 0.5), and both platforms perform well for SV detection. Genome assembly increases variant-calling precision and recall of SVs and indels in HiFi data sets, with HiFi outperforming ONT in quality as measured by the F1 score of assembly-based variant call sets. While both technologies continue to evolve, our work offers guidance to design cost-effective experimental strategies that do not compromise on discovering novel biology.


Subject(s)
Genomics , Nanopores , INDEL Mutation , Whole Genome Sequencing
6.
Nat Methods ; 19(10): 1230-1233, 2022 10.
Article in English | MEDLINE | ID: mdl-36109679

ABSTRACT

Complex structural variants (CSVs) encompass multiple breakpoints and are often missed or misinterpreted. We developed SVision, a deep-learning-based multi-object-recognition framework, to automatically detect and characterize CSVs from long-read sequencing data. SVision outperforms current callers at identifying the internal structure of complex events and has revealed 80 high-quality CSVs with 25 distinct structures from an individual genome. SVision directly detects CSVs without matching known structures, allowing sensitive detection of both common and previously uncharacterized complex rearrangements.


Subject(s)
Deep Learning , Genome , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA
7.
Cell ; 141(7): 1159-70, 2010 Jun 25.
Article in English | MEDLINE | ID: mdl-20602998

ABSTRACT

Highly active (i.e., "hot") long interspersed element-1 (LINE-1 or L1) sequences comprise the bulk of retrotransposition activity in the human genome; however, the abundance of hot L1s in the human population remains largely unexplored. Here, we used a fosmid-based, paired-end DNA sequencing strategy to identify 68 full-length L1s that are differentially present among individuals but are absent from the human genome reference sequence. The majority of these L1s were highly active in a cultured cell retrotransposition assay. Genotyping 26 elements revealed that two L1s are only found in Africa and that two more are absent from the H952 subset of the Human Genome Diversity Panel. Therefore, these results suggest that hot L1s are more abundant in the human population than previously appreciated, and that ongoing L1 retrotransposition continues to be a major source of interindividual genetic variation.


Subject(s)
Genome, Human , Long Interspersed Nucleotide Elements , Base Sequence , Gene Frequency , Genetics, Population , Humans , Molecular Sequence Data , Phylogeny
8.
Trends Genet ; 37(8): 717-729, 2021 08.
Article in English | MEDLINE | ID: mdl-33199048

ABSTRACT

Mutation of the human genome results in three classes of genomic variation: single nucleotide variants; short insertions or deletions; and large structural variants (SVs). Some mutations occur during normal processes, such as meiotic recombination or B cell development, and others result from DNA replication or aberrant repair of breaks in sequence-specific contexts. Regardless of mechanism, mutations are subject to selection, and some hotspots can manifest in disease. Here, we discuss genomic regions prone to mutation, mechanisms contributing to mutation susceptibility, and the processes leading to their accumulation in normal and somatic genomes. With further, more accurate human genome sequencing, additional mutation hotspots, mechanistic details of their formation, and the relevance of hotspots to evolution and disease are likely to be discovered.


Subject(s)
Genome, Human/genetics , Genomics , Mutation/genetics , DNA Replication/genetics , Genomic Structural Variation/genetics , Humans , Polymorphism, Single Nucleotide/genetics , Recombination, Genetic/genetics
9.
Genome Res ; 28(8): 1228-1242, 2018 08.
Article in English | MEDLINE | ID: mdl-29907612

ABSTRACT

Alu elements, the short interspersed element numbering more than 1 million copies per human genome, can mediate the formation of copy number variants (CNVs) between substrate pairs. These Alu/Alu-mediated rearrangements (AAMRs) can result in pathogenic variants that cause diseases. To investigate the impact of AAMR on gene variation and human health, we first characterized Alus that are involved in mediating CNVs (CNV-Alus) and observed that these Alus tend to be evolutionarily younger. We then computationally generated, with the assistance of a supercomputer, a test data set consisting of 78 million Alu pairs and predicted ∼18% of them are potentially susceptible to AAMR. We further determined the relative risk of AAMR in 12,074 OMIM genes using the count of predicted CNV-Alu pairs and experimentally validated the predictions with 89 samples selected by correlating predicted hotspots with a database of CNVs identified by clinical chromosomal microarrays (CMAs) on the genomes of approximately 54,000 subjects. We fine-mapped 47 duplications, 40 deletions, and two complex rearrangements and examined a total of 52 breakpoint junctions of simple CNVs. Overall, 94% of the candidate breakpoints were at least partially Alu mediated. We successfully predicted all (100%) of Alu pairs that mediated deletions (n = 21) and achieved an 87% positive predictive value overall when including AAMR-generated deletions and duplications. We provided a tool, AluAluCNVpredictor, for assessing AAMR hotspots and their role in human disease. These results demonstrate the utility of our predictive model and provide insights into the genomic features and molecular mechanisms underlying AAMR.


Subject(s)
Alu Elements/genetics , DNA Copy Number Variations/genetics , Genomic Instability/genetics , Gene Duplication/genetics , Genome, Human/genetics , Humans , Sequence Deletion
10.
PLoS Biol ; 16(3): e2003067, 2018 03.
Article in English | MEDLINE | ID: mdl-29505568

ABSTRACT

Human long interspersed element-1 (L1) retrotransposons contain an internal RNA polymerase II promoter within their 5' untranslated region (UTR) and encode two proteins (ORF1p and ORF2p) required for their mobilization (i.e., retrotransposition). The evolutionary success of L1 relies on the continuous retrotransposition of full-length L1 mRNAs. Previous studies identified functional splice donor (SD), splice acceptor (SA), and polyadenylation sequences in L1 mRNA and provided evidence that a small number of spliced L1 mRNAs retrotransposed in the human genome. Here, we demonstrate that the retrotransposition of intra-5'UTR or 5'UTR/ORF1 spliced L1 mRNAs leads to the generation of spliced integrated retrotransposed elements (SpIREs). We identified a new intra-5'UTR SpIRE that is ten times more abundant than previously identified SpIREs. Functional analyses demonstrated that both intra-5'UTR and 5'UTR/ORF1 SpIREs lack cis-acting transcription factor binding sites and exhibit reduced promoter activity. The 5'UTR/ORF1 SpIREs also produce nonfunctional ORF1p variants. Finally, we demonstrate that sequence changes within the L1 5'UTR over evolutionary time, which permitted L1 to evade the repressive effects of a host protein, can lead to the generation of new L1 splicing events, which, upon retrotransposition, generate a new SpIRE subfamily. We conclude that splicing inhibits L1 retrotransposition, that SpIREs generally represent evolutionary "dead-ends" in the L1 retrotransposition process, that mutations within the L1 5'UTR alter L1 splicing dynamics, and that retrotransposition of the resultant spliced transcripts can generate interindividual genomic variation.


Subject(s)
Evolution, Molecular , Genome, Human , Long Interspersed Nucleotide Elements/genetics , Retroelements/genetics , HeLa Cells , Humans , Polymorphism, Genetic , Promoter Regions, Genetic , RNA Splicing , RNA, Messenger/metabolism
11.
Chromosome Res ; 28(1): 31-47, 2020 03.
Article in English | MEDLINE | ID: mdl-31907725

ABSTRACT

Structural variant (SV) differences between human genomes can cause germline and mosaic disease as well as inter-individual variation. De-regulation of accurate DNA repair and genomic surveillance mechanisms results in a large number of SVs in cancer. Analysis of the DNA sequences at SV breakpoints can help identify pathways of mutagenesis and regions of the genome that are more susceptible to rearrangement. Large-scale SV analyses have been enabled by high-throughput genome-level sequencing on humans in the past decade. These studies have shed light on the mechanisms and prevalence of complex genomic rearrangements. Recent advancements in both sequencing and other mapping technologies as well as calling algorithms for detection of genomic rearrangements have helped propel SV detection into population-scale studies, and have begun to elucidate previously inaccessible regions of the genome. Here, we discuss the genomic organization of simple and complex SVs, the molecular mechanisms of their formation, and various ways to detect them. We also introduce methods for characterizing SVs and their consequences on human genomes.


Subject(s)
Genome, Human , Genomic Structural Variation , Genomics/methods , Chromosome Aberrations , Chromosome Banding , Chromosome Mapping , Comparative Genomic Hybridization , Computational Biology/methods , Genomics/standards , High-Throughput Nucleotide Sequencing , Humans , Reproducibility of Results
12.
Am J Hum Genet ; 97(5): 691-707, 2015 Nov 05.
Article in English | MEDLINE | ID: mdl-26544804

ABSTRACT

The genomic duplication associated with Potocki-Lupski syndrome (PTLS) maps in close proximity to the duplication associated with Charcot-Marie-Tooth disease type 1A (CMT1A). PTLS is characterized by hypotonia, failure to thrive, reduced body weight, intellectual disability, and autistic features. CMT1A is a common autosomal dominant distal symmetric peripheral polyneuropathy. The key dosage-sensitive genes RAI1 and PMP22 are respectively associated with PTLS and CMT1A. Recurrent duplications accounting for the majority of subjects with these conditions are mediated by nonallelic homologous recombination between distinct low-copy repeat (LCR) substrates. The LCRs flanking a contiguous genomic interval encompassing both RAI1 and PMP22 do not share extensive homology; thus, duplications encompassing both loci are rare and potentially generated by a different mutational mechanism. We characterized genomic rearrangements that simultaneously duplicate PMP22 and RAI1, including nine potential complex genomic rearrangements, in 23 subjects by high-resolution array comparative genomic hybridization and breakpoint junction sequencing. Insertions and microhomologies were found at the breakpoint junctions, suggesting potential replicative mechanisms for rearrangement formation. At the breakpoint junctions of these nonrecurrent rearrangements, enrichment of repetitive DNA sequences was observed, indicating that they might predispose to genomic instability and rearrangement. Clinical evaluation revealed blended PTLS and CMT1A phenotypes with a potential earlier onset of neuropathy. Moreover, additional clinical findings might be observed due to the extra duplicated material included in the rearrangements. Our genomic analysis suggests replicative mechanisms as a predominant mechanism underlying PMP22-RAI1 contiguous gene duplications and provides further evidence supporting the role of complex genomic architecture in genomic instability.


Subject(s)
Charcot-Marie-Tooth Disease/genetics , Chromosome Disorders/genetics , Chromosome Duplication/genetics , Chromosomes, Human, Pair 17/genetics , Gene Duplication , Gene Rearrangement , Myelin Proteins/genetics , Transcription Factors/genetics , Abnormalities, Multiple/genetics , Abnormalities, Multiple/pathology , Charcot-Marie-Tooth Disease/pathology , Child , Child, Preschool , Chromosome Disorders/pathology , Comparative Genomic Hybridization , Female , Follow-Up Studies , Genome, Human , Genomics/methods , Humans , Infant , Male , Models, Genetic , Phenotype , Prognosis , Recombination, Genetic , Trans-Activators
13.
PLoS Genet ; 11(12): e1005686, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26641089

ABSTRACT

Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases; about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual's susceptibility to acquiring disease-associated alleles.


Subject(s)
Adaptor Proteins, Signal Transducing/genetics , Evolution, Molecular , Genome, Human , Kidney Diseases, Cystic/congenital , Membrane Proteins/genetics , Alleles , Animals , Comparative Genomic Hybridization , Cytoskeletal Proteins , Gene Dosage , Gene Rearrangement , Genomic Structural Variation , Haplotypes , Humans , Kidney Diseases, Cystic/genetics , Kidney Diseases, Cystic/pathology , Primates
14.
PLoS Genet ; 11(3): e1005050, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25749076

ABSTRACT

Inverted repeats (IRs) can facilitate structural variation as crucibles of genomic rearrangement. Complex duplication-inverted triplication-duplication (DUP-TRP/INV-DUP) rearrangements that contain breakpoint junctions within IRs have been recently associated with both MECP2 duplication syndrome (MIM#300260) and Pelizaeus-Merzbacher disease (PMD, MIM#312080). We investigated 17 unrelated PMD subjects with copy number gains at the PLP1 locus including triplication and quadruplication of specific genomic intervals; 16/17 were found to have a DUP-TRP/INV-DUP rearrangement product. An IR distal to PLP1 facilitates DUP-TRP/INV-DUP formation as well as an inversion structural variation found frequently amongst normal individuals. We show that a homology- or homeology-driven replicative mechanism of DNA repair can apparently mediate template switches within stretches of microhomology. Moreover, we provide evidence that quadruplication and potentially higher order amplification of a genomic interval can occur in a manner consistent with rolling circle amplification as predicted by the microhomology-mediated break induced replication (MMBIR) model.


Subject(s)
Gene Duplication , Myelin Proteolipid Protein/genetics , Pelizaeus-Merzbacher Disease/genetics , Chromosome Breakpoints , Chromosome Inversion , Gene Dosage , Humans
15.
J Allergy Clin Immunol ; 139(1): 232-245, 2017 01.
Article in English | MEDLINE | ID: mdl-27577878

ABSTRACT

BACKGROUND: Primary immunodeficiency diseases (PIDDs) are clinically and genetically heterogeneous disorders thus far associated with mutations in more than 300 genes. The clinical phenotypes derived from distinct genotypes can overlap. Genetic etiology can be a prognostic indicator of disease severity and can influence treatment decisions. OBJECTIVE: We sought to investigate the ability of whole-exome screening methods to detect disease-causing variants in patients with PIDDs. METHODS: Patients with PIDDs from 278 families from 22 countries were investigated by using whole-exome sequencing. Computational copy number variant (CNV) prediction pipelines and an exome-tiling chromosomal microarray were also applied to identify intragenic CNVs. Analytic approaches initially focused on 475 known or candidate PIDD genes but were nonexclusive and further tailored based on clinical data, family history, and immunophenotyping. RESULTS: A likely molecular diagnosis was achieved in 110 (40%) unrelated probands. Clinical diagnosis was revised in about half (60/110) and management was directly altered in nearly a quarter (26/110) of families based on molecular findings. Twelve PIDD-causing CNVs were detected, including 7 smaller than 30 kb that would not have been detected with conventional diagnostic CNV arrays. CONCLUSION: This high-throughput genomic approach enabled detection of disease-related variants in unexpected genes; permitted detection of low-grade constitutional, somatic, and revertant mosaicism; and provided evidence of a mutational burden in mixed PIDD immunophenotypes.


Subject(s)
Immunologic Deficiency Syndromes/genetics , Adolescent , Adult , Aged , Child , Child, Preschool , DNA Copy Number Variations , Female , Genomics , High-Throughput Nucleotide Sequencing , Humans , Infant , Male , Middle Aged , Young Adult
16.
Hum Mol Genet ; 24(14): 4061-77, 2015 Jul 15.
Article in English | MEDLINE | ID: mdl-25908615

ABSTRACT

Alu repetitive elements are known to be major contributors to genome instability by generating Alu-mediated copy-number variants (CNVs). Most of the reported Alu-mediated CNVs are simple deletions and duplications, and the mechanism underlying Alu-Alu-mediated rearrangement has been attributed to non-allelic homologous recombination (NAHR). Chromosome 17 at the p13.3 genomic region lacks extensive low-copy repeat architecture; however, it is highly enriched for Alu repetitive elements, with a fraction of 30% of total sequence annotated in the human reference genome, compared with the 10% genome-wide and 18% on chromosome 17. We conducted mechanistic studies of the 17p13.3 CNVs by performing high-density oligonucleotide array comparative genomic hybridization, specifically interrogating the 17p13.3 region with ∼150 bp per probe density; CNV breakpoint junctions were mapped to nucleotide resolution by polymerase chain reaction and Sanger sequencing. Studied rearrangements include 5 interstitial deletions, 14 tandem duplications, 7 terminal deletions and 13 complex genomic rearrangements (CGRs). Within the 17p13.3 region, Alu-Alu-mediated rearrangements were identified in 80% of the interstitial deletions, 46% of the tandem duplications and 50% of the CGRs, indicating that this mechanism was a major contributor for formation of breakpoint junctions. Our studies suggest that Alu repetitive elements facilitate formation of non-recurrent CNVs, CGRs and other structural aberrations of chromosome 17 at p13.3. The common observation of Alu-mediated rearrangement in CGRs and breakpoint junction sequences analysis further demonstrates that this type of mechanism is unlikely to be attributable to NAHR, but rather may be due to a recombination-coupled DNA replicative repair process.


Subject(s)
Alu Elements/genetics , Chromosomes, Human, Pair 17/genetics , DNA Copy Number Variations , Alleles , Base Sequence , Comparative Genomic Hybridization , Female , Gene Duplication , Gene Rearrangement , Genome, Human , Genomic Instability , Genomics , Homologous Recombination , Humans , Male , Molecular Sequence Data , Segmental Duplications, Genomic , Sequence Deletion
17.
Am J Hum Genet ; 95(2): 143-61, 2014 Aug 07.
Article in English | MEDLINE | ID: mdl-25065914

ABSTRACT

Intragenic copy-number variants (CNVs) contribute to the allelic spectrum of both Mendelian and complex disorders. Although pathogenic deletions and duplications in SPAST (mutations in which cause autosomal-dominant spastic paraplegia 4 [SPG4]) have been described, their origins and molecular consequences remain obscure. We mapped breakpoint junctions of 54 SPAST CNVs at nucleotide resolution. Diverse combinations of exons are deleted or duplicated, highlighting the importance of particular exons for spastin function. Of the 54 CNVs, 38 (70%) appear to be mediated by an Alu-based mechanism, suggesting that the Alu-rich genomic architecture of SPAST renders this locus susceptible to various genome rearrangements. Analysis of breakpoint Alus further informs a model of Alu-mediated CNV formation characterized by small CNV size and potential involvement of mechanisms other than homologous recombination. Twelve deletions (22%) overlap part of SPAST and a portion of a nearby, directly oriented gene, predicting novel chimeric genes in these subjects' genomes. cDNA from a subject with a SPAST final exon deletion contained multiple SPAST:SLC30A6 fusion transcripts, indicating that SPAST CNVs can have transcriptional effects beyond the gene itself. SLC30A6 has been implicated in Alzheimer disease, so these fusion gene data could explain a report of spastic paraplegia and dementia cosegregating in a family with deletion of the final exon of SPAST. Our findings provide evidence that the Alu genomic architecture of SPAST predisposes to diverse CNV alleles with distinct transcriptional, and possibly phenotypic, consequences. Moreover, we provide further mechanistic insights into Alu-mediated copy-number change that are extendable to other loci.


Subject(s)
Adenosine Triphosphatases/genetics , Alu Elements/genetics , Cation Transport Proteins/genetics , DNA Copy Number Variations/genetics , Spastic Paraplegia, Hereditary/genetics , Base Sequence , Cell Line, Transformed , Genotype , Humans , Protein Isoforms/genetics , Recombinant Fusion Proteins/genetics , Sequence Analysis, DNA , Sequence Deletion , Spastin
18.
Am J Hum Genet ; 95(1): 96-107, 2014 Jul 03.
Article in English | MEDLINE | ID: mdl-24931394

ABSTRACT

Human phosphoglucomutase 3 (PGM3) catalyzes the conversion of N-acetyl-glucosamine (GlcNAc)-6-phosphate into GlcNAc-1-phosphate during the synthesis of uridine diphosphate (UDP)-GlcNAc, a sugar nucleotide critical to multiple glycosylation pathways. We identified three unrelated children with recurrent infections, congenital leukopenia including neutropenia, B and T cell lymphopenia, and progression to bone marrow failure. Whole-exome sequencing demonstrated deleterious mutations in PGM3 in all three subjects, delineating their disease to be due to an unsuspected congenital disorder of glycosylation (CDG). Functional studies of the disease-associated PGM3 variants in E. coli cells demonstrated reduced PGM3 activity for all mutants tested. Two of the three children had skeletal anomalies resembling Desbuquois dysplasia: short stature, brachydactyly, dysmorphic facial features, and intellectual disability. However, these additional features were absent in the third child, showing the clinical variability of the disease. Two children received hematopoietic stem cell transplantation of cord blood and bone marrow from matched related donors; both had successful engraftment and correction of neutropenia and lymphopenia. We define PGM3-CDG as a treatable immunodeficiency, document the power of whole-exome sequencing in gene discoveries for rare disorders, and illustrate the utility of genomic analyses in studying combined and variable phenotypes.


Subject(s)
Bone Diseases, Developmental/genetics , Congenital Disorders of Glycosylation/genetics , Immunologic Deficiency Syndromes/genetics , Mutation , Phosphoglucomutase/genetics , Female , Humans , Male , Pedigree
19.
Genet Med ; 18(5): 443-51, 2016 05.
Article in English | MEDLINE | ID: mdl-26378787

ABSTRACT

PURPOSE: Charcot-Marie-Tooth (CMT) disease is a heterogeneous group of genetic disorders of the peripheral nervous system. Copy-number variants (CNVs) contribute significantly to CMT, as duplication of PMP22 underlies the majority of CMT1 cases. We hypothesized that CNVs and/or single-nucleotide variants (SNVs) might exist in patients with CMT with an unknown molecular genetic etiology. METHODS: Two hundred patients with CMT, negative for both SNV mutations in several CMT genes and for CNVs involving PMP22, were screened for CNVs by high-resolution oligonucleotide array comparative genomic hybridization. Whole-exome sequencing was conducted on individuals with rare, potentially pathogenic CNVs. RESULTS: Putatively causative CNVs were identified in five subjects (~2.5%); four of the five map to known neuropathy genes. Breakpoint sequencing revealed Alu-Alu-mediated junctions as a predominant contributor. Exome sequencing identified MFN2 SNVs in two of the individuals. CONCLUSION: Neuropathy-associated CNV outside of the PMP22 locus is rare in CMT. Nevertheless, there is potential clinical utility in testing for CNVs and exome sequencing in CMT cases negative for the CMT1A duplication. These findings suggest that complex phenotypes including neuropathy can potentially be caused by a combination of SNVs and CNVs affecting more than one disease-associated locus and contributing to a mutational burden.


Subject(s)
Charcot-Marie-Tooth Disease/genetics , GTP Phosphohydrolases/genetics , Mitochondrial Proteins/genetics , Myelin Proteins/genetics , Polyneuropathies/genetics , Adult , Age of Onset , Charcot-Marie-Tooth Disease/physiopathology , Child, Preschool , Comparative Genomic Hybridization , DNA Copy Number Variations/genetics , Exome/genetics , Female , Genetic Predisposition to Disease , High-Throughput Nucleotide Sequencing/methods , Humans , Male , Motor Neurons/metabolism , Motor Neurons/pathology , Myelin P0 Protein/genetics , Neural Conduction/genetics , Polymorphism, Single Nucleotide/genetics , Polyneuropathies/physiopathology
20.
BMC Genomics ; 16: 214, 2015 Mar 19.
Article in English | MEDLINE | ID: mdl-25887218

ABSTRACT

BACKGROUND: Generation of long (>5 kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high. RESULTS: We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki-Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants. CONCLUSIONS: The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation, could also benefit from this technology.


Subject(s)
Genomics/methods , Chromosome Aberrations , Gene Library , Gene Rearrangement , Genetic Association Studies/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Workflow