Results 1 - 20 of 712
1.
Cell ; 175(4): 1074-1087.e18, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30388444

ABSTRACT

Mutation rates along the genome are highly variable and influenced by several chromatin features. Here, we addressed how nucleosomes, the most pervasive chromatin structure in eukaryotes, affect the generation of mutations. We discovered that within nucleosomes, the somatic mutation rate across several tumor cohorts exhibits a strong 10 base pair (bp) periodicity. This periodic pattern tracks the alternation of the DNA minor groove facing toward and away from the histones. The strength and phase of the mutation rate periodicity are determined by the mutational processes active in tumors. We uncovered similar periodic patterns in the genetic variation among human and Arabidopsis populations, also detectable in their divergence from close species, indicating that the same principles underlie germline and somatic mutation rates. We propose that differential DNA damage and repair processes dependent on the minor groove orientation in nucleosome-bound DNA contribute to the 10-bp periodicity in AT/CG content in eukaryotic genomes.
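
As a rough illustration of how the strength of a 10-bp periodicity could be quantified, the sketch below computes the fraction of variance in a positional count profile that is carried by a sinusoid of a chosen period. It is a minimal sketch, not the method used in the article: the function name `periodic_power`, the input layout (one count per position within the nucleosome) and the synthetic profile are all assumptions made for this example.

```python
import cmath
import math


def periodic_power(counts, period=10):
    """Fraction of the variance in `counts` carried by a sinusoid of `period`.

    `counts` is a hypothetical list with one mutation count per position.
    The value lies between 0 and 1 when len(counts) is a multiple of `period`.
    """
    n = len(counts)
    mean = sum(counts) / n
    centered = [c - mean for c in counts]
    total = sum(x * x for x in centered)
    if total == 0:
        return 0.0  # a flat profile has no periodic component
    # Single-frequency discrete Fourier coefficient at 1/period cycles per bp.
    coeff = sum(
        x * cmath.exp(-2j * math.pi * i / period) for i, x in enumerate(centered)
    )
    return 2 * abs(coeff) ** 2 / (n * total)


# Synthetic profile (assumption): 140 positions with a pure 10-bp oscillation.
profile = [100 + 10 * math.cos(2 * math.pi * i / 10) for i in range(140)]
print(round(periodic_power(profile, period=10), 6))
print(round(periodic_power(profile, period=7), 6))
```

On real data the profile would be noisy, so the value at period 10 would be compared against neighbouring periods or a permutation-based background rather than read as an absolute number.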


Subject(s)
DNA/genetics , Germ-Line Mutation , Mutation Rate , Nucleosomes/genetics , Arabidopsis/genetics , DNA/chemistry , GC Rich Sequence , Genetic Variation , Nucleic Acid Conformation , Nucleosomes/chemistry
2.
Mol Cell ; 73(4): 803-814.e6, 2019 02 21.
Article in English | MEDLINE | ID: mdl-30639243

ABSTRACT

Intron retention (IR) has emerged as an important mechanism of gene expression control, but the factors controlling IR events remain poorly understood. We observed consistent IR in one intron of the Irf7 gene and identified BUD13 as an RNA-binding protein that acts at this intron to increase the amount of successful splicing. Deficiency in BUD13 was associated with increased IR, decreased mature Irf7 transcript and protein levels, and consequently a dampened type I interferon response, which compromised the ability of BUD13-deficient macrophages to withstand vesicular stomatitis virus (VSV) infection. Global analysis of BUD13 knockdown and BUD13 cross-linking to RNA revealed a subset of introns that share many characteristics with the one found in Irf7 and are spliced in a BUD13-dependent manner. Deficiency of BUD13 led to decreased mature transcript from genes containing such introns. Thus, by acting as an antagonist to IR, BUD13 facilitates the expression of genes at which IR occurs.


Subject(s)
Interferon Regulatory Factor-7/metabolism , Interferon Type I/metabolism , Introns , Macrophages/metabolism , RNA-Binding Proteins/metabolism , Vesicular Stomatitis/metabolism , Vesicular stomatitis Indiana virus/pathogenicity , Animals , Binding Sites , Chlorocebus aethiops , GC Rich Sequence , HEK293 Cells , Host-Pathogen Interactions , Humans , Interferon Regulatory Factor-7/genetics , Interferon Type I/immunology , Macrophages/immunology , Macrophages/virology , Mice, Inbred C57BL , Protein Binding , RNA Splice Sites , RNA Splicing , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA-Binding Proteins/genetics , Vero Cells , Vesicular Stomatitis/genetics , Vesicular Stomatitis/immunology , Vesicular Stomatitis/virology , Vesicular stomatitis Indiana virus/immunology
3.
Nucleic Acids Res ; 52(10): 5928-5949, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38412259

ABSTRACT

A GGGGCC (G4C2) hexanucleotide repeat expansion in C9ORF72 causes amyotrophic lateral sclerosis and frontotemporal dementia (C9ALS/FTD), while a CGG trinucleotide repeat expansion in FMR1 leads to the neurodegenerative disorder Fragile X-associated tremor/ataxia syndrome (FXTAS). These GC-rich repeats form RNA secondary structures that support repeat-associated non-AUG (RAN) translation of toxic proteins that contribute to disease pathogenesis. Here we assessed whether these same repeats might trigger stalling and interfere with translational elongation. We find that depletion of ribosome-associated quality control (RQC) factors NEMF, LTN1 and ANKZF1 markedly boosts RAN translation product accumulation from both G4C2 and CGG repeats, while overexpression of these factors reduces RAN production in both reporter assays and C9ALS/FTD patient iPSC-derived neurons. We also detected partially made products from both G4C2 and CGG repeats whose abundance increased with RQC factor depletion. Repeat RNA sequence, rather than amino acid content, is central to the impact of RQC factor depletion on RAN translation, suggesting a role for RNA secondary structure in these processes. Together, these findings suggest that ribosomal stalling and RQC pathway activation during RAN translation inhibit the generation of toxic RAN products. We propose augmenting RQC activity as a therapeutic strategy in GC-rich repeat expansion disorders.


Subject(s)
Amyotrophic Lateral Sclerosis , C9orf72 Protein , Frontotemporal Dementia , Protein Biosynthesis , Ribosomal Proteins , Trinucleotide Repeat Expansion , Humans , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/metabolism , Ataxia , C9orf72 Protein/genetics , C9orf72 Protein/metabolism , DNA Repeat Expansion/genetics , Fragile X Mental Retardation Protein/genetics , Fragile X Mental Retardation Protein/metabolism , Fragile X Syndrome/genetics , Fragile X Syndrome/metabolism , Frontotemporal Dementia/genetics , Frontotemporal Dementia/metabolism , GC Rich Sequence , HEK293 Cells , Induced Pluripotent Stem Cells/metabolism , Neurons/metabolism , Ribosomes/metabolism , Ribosomes/genetics , Tremor , Trinucleotide Repeat Expansion/genetics , Ribosomal Proteins/metabolism
4.
Proc Natl Acad Sci U S A ; 118(11)2021 03 16.
Article in English | MEDLINE | ID: mdl-33836575

ABSTRACT

Technological advances have allowed improvements in genome reference sequence assemblies. Here, we combined long- and short-read sequence resources to assemble the genome of a female Great Dane dog. This assembly has improved continuity compared to the existing Boxer-derived (CanFam3.1) reference genome. Annotation of the Great Dane assembly identified 22,182 protein-coding gene models and 7,049 long noncoding RNAs, including 49 protein-coding genes not present in the CanFam3.1 reference. The Great Dane assembly spans the majority of sequence gaps in the CanFam3.1 reference and illustrates that 2,151 gaps overlap the transcription start site of a predicted protein-coding gene. Moreover, a subset of the resolved gaps, which have an 80.95% median GC content, localize to transcription start sites and recombination hotspots more often than expected by chance, suggesting the stable canine recombinational landscape has shaped genome architecture. Alignment of the Great Dane and CanFam3.1 assemblies identified 16,834 deletions and 15,621 insertions, as well as 2,665 deletions and 3,493 insertions located on secondary contigs. These structural variants are dominated by retrotransposon insertion/deletion polymorphisms and include 16,221 dimorphic canine short interspersed elements (SINECs) and 1,121 dimorphic long interspersed element-1 sequences (LINE-1_Cfs). Analysis of sequences flanking the 3' end of LINE-1_Cfs (i.e., LINE-1_Cf 3'-transductions) suggests multiple retrotransposition-competent LINE-1_Cfs segregate among dog populations. Consistent with this conclusion, we demonstrate that a canine LINE-1_Cf element with intact open reading frames can retrotranspose its own RNA and that of a SINEC_Cf consensus sequence in cultured human cells, implicating ongoing retrotransposon activity as a driver of canine genetic variation.


Subject(s)
Dogs/genetics , GC Rich Sequence , Genome , Interspersed Repetitive Sequences , Animals , Dogs/classification , Long Interspersed Nucleotide Elements , Short Interspersed Nucleotide Elements , Species Specificity
5.
Trends Genet ; 36(2): 81-92, 2020 02.
Article in English | MEDLINE | ID: mdl-31837826

ABSTRACT

The presence of microsatellite repeat expansions within genes is associated with >30 neurological diseases. Of interest, (GGGGCC)>30-repeats within C9orf72 are associated with amyotrophic lateral sclerosis and frontotemporal dementia (ALS/FTD). These expansions can be 100s to 1000s of units long. Thus, it is perplexing how RNA-polymerase II (RNAPII) can successfully transcribe them. Recent investigations focusing on GGGGCC-transcription have identified specific, canonical complexes that may promote RNAPII-transcription at these GC-rich microsatellites: the DSIF complex and PAF1C. These complexes may be important for resolving the unique secondary structures formed by GGGGCC-DNA during transcription. Importantly, this process can produce potentially toxic repeat-containing RNA that can encode potentially toxic peptides, impacting neuron function and health. Understanding how transcription of these repeats occurs has implications for therapeutics in multiple diseases.


Subject(s)
C9orf72 Protein/genetics , DNA Repeat Expansion/genetics , Transcription Factors/genetics , Transcription, Genetic , Amyotrophic Lateral Sclerosis/genetics , Frontotemporal Dementia/genetics , Frontotemporal Dementia/pathology , GC Rich Sequence/genetics , Humans , Microsatellite Repeats/genetics , Neurons/metabolism , Neurons/pathology , Peptides/genetics , RNA/biosynthesis , RNA/genetics , RNA Polymerase II/genetics
6.
Nature ; 549(7673): 519-522, 2017 09 28.
Article in English | MEDLINE | ID: mdl-28959963

ABSTRACT

The characterization of mutational processes that generate sequence diversity in the human genome is of paramount importance both to medical genetics and to evolutionary studies. To understand how the age and sex of transmitting parents affect de novo mutations, here we sequence 1,548 Icelanders, their parents, and, for a subset of 225, at least one child, to 35× genome-wide coverage. We find 108,778 de novo mutations, both single nucleotide polymorphisms and indels, and determine the parent of origin of 42,961. The number of de novo mutations from mothers increases by 0.37 per year of age (95% CI 0.32-0.43), a quarter of the 1.51 per year from fathers (95% CI 1.45-1.57). The number of clustered mutations increases faster with the mother's age than with the father's, and the genomic span of maternal de novo mutation clusters is greater than that of paternal ones. The types of de novo mutation from mothers change substantially with age, with a 0.26% (95% CI 0.19-0.33%) decrease in cytosine-phosphate-guanine to thymine-phosphate-guanine (CpG>TpG) de novo mutations and a 0.33% (95% CI 0.28-0.38%) increase in C>G de novo mutations per year, respectively. Remarkably, these age-related changes are not distributed uniformly across the genome. A striking example is a 20 megabase region on chromosome 8p, with a maternal C>G mutation rate that is up to 50-fold greater than the rest of the genome. The age-related accumulation of maternal non-crossover gene conversions also mostly occurs within these regions. Increased sequence diversity and linkage disequilibrium of C>G variants within regions affected by excess maternal mutations indicate that the underlying mutational process has persisted in humans for thousands of years. Moreover, the regional excess of C>G variation in humans is largely shared by chimpanzees, less by gorillas, and is almost absent from orangutans. This demonstrates that sequence diversity in humans results from evolving interactions between age, sex, mutation type, and genomic location.


Subject(s)
Aging/genetics , Germ-Line Mutation/genetics , Maternal Age , Mutagenesis , Parents , Paternal Age , Adolescent , Adult , Aged , Animals , Child , Chromosomes, Human, Pair 8/genetics , Evolution, Molecular , Female , GC Rich Sequence , Genome, Human/genetics , Gorilla gorilla/genetics , Humans , INDEL Mutation , Iceland , Linkage Disequilibrium/genetics , Male , Middle Aged , Mutation Rate , Pan troglodytes/genetics , Polymorphism, Single Nucleotide , Pongo/genetics , Young Adult
7.
Nature ; 550(7674): 124-127, 2017 10 05.
Article in English | MEDLINE | ID: mdl-28953888

ABSTRACT

Vertebrate genomes exhibit marked CG suppression, that is, lower than expected numbers of 5'-CG-3' dinucleotides. This feature is likely to be due to C-to-T mutations that have accumulated over hundreds of millions of years, driven by CG-specific DNA methyl transferases and spontaneous methyl-cytosine deamination. Many RNA viruses of vertebrates that are not substrates for DNA methyl transferases mimic the CG suppression of their hosts. This property of viral genomes is unexplained. Here we show, using synonymous mutagenesis, that CG suppression is essential for HIV-1 replication. The deleterious effect of CG dinucleotides on HIV-1 replication was cumulative, associated with cytoplasmic RNA depletion, and was exerted by CG dinucleotides in both translated and non-translated exonic RNA sequences. A focused screen using small inhibitory RNAs revealed that zinc-finger antiviral protein (ZAP) inhibited virion production by cells infected with CG-enriched HIV-1. Crucially, HIV-1 mutants containing segments whose CG content mimicked random nucleotide sequence were defective in unmanipulated cells, but replicated normally in ZAP-deficient cells. Crosslinking-immunoprecipitation-sequencing assays demonstrated that ZAP binds directly and selectively to RNA sequences containing CG dinucleotides. These findings suggest that ZAP exploits host CG suppression to identify non-self RNA. The dinucleotide composition of HIV-1, and perhaps other RNA viruses, appears to have adapted to evade this host defence.
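
"Lower than expected" CG content is commonly expressed as an observed/expected ratio: the number of CG dinucleotides, multiplied by the sequence length, divided by the product of the C and G counts. The sketch below shows that calculation under stated assumptions: the function name `cpg_obs_exp` and the toy sequences are invented for illustration, input is assumed to use the DNA alphabet (an RNA sequence would need U handled separately, although U does not enter the formula), and nothing here is taken from the article's own analysis.

```python
def cpg_obs_exp(seq):
    """Observed/expected ratio of 5'-CG-3' dinucleotides in `seq`.

    Expected count is (C count * G count) / length, so values below 1
    indicate CG suppression and values above 1 indicate CG enrichment.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # ratio is undefined without both bases; report 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    cg = seq.count("CG")
    return cg * n / (c * g)


# Toy sequences (assumptions), chosen so the arithmetic is easy to follow.
print(cpg_obs_exp("CCGG"))   # 1 CG, 2 C, 2 G, length 4
print(cpg_obs_exp("CGCG"))   # 2 CG, 2 C, 2 G, length 4
print(cpg_obs_exp("ACAGT"))  # no CG dinucleotide
```

Short sequences give unstable ratios, so in practice the statistic would be computed over whole genomes or long windows.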


Subject(s)
Dinucleoside Phosphates/genetics , GC Rich Sequence/genetics , HIV-1/genetics , HIV-1/immunology , RNA, Viral/genetics , RNA, Viral/immunology , Cell Line , Cytoplasm/genetics , Cytoplasm/virology , HIV-1/growth & development , Humans , Immunoprecipitation , Mutagenesis , Mutation , Protein Binding , RNA, Viral/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Virus Replication/genetics
8.
Nucleic Acids Res ; 49(16): 9174-9193, 2021 09 20.
Article in English | MEDLINE | ID: mdl-34417622

ABSTRACT

To investigate how exogenous DNA concatemerizes to form episomal artificial chromosomes (ACs), acquire equal segregation ability and maintain stable holocentromeres, we injected DNA sequences with different features, including sequences that are repetitive or complex, and sequences with different AT-contents, into the gonad of Caenorhabditis elegans to form ACs in embryos, and monitored AC mitotic segregation. We demonstrated that AT-poor sequences (26% AT-content) delayed the acquisition of segregation competency of newly formed ACs. We also co-injected fragmented Saccharomyces cerevisiae genomic DNA, differentially expressed fluorescent markers and a ubiquitously expressed selectable marker to construct a less repetitive, more complex AC. We sequenced the whole genome of a strain which propagates this AC through multiple generations, and de novo assembled the AC sequences. We discovered that CENP-A(HCP-3) domains/peaks are distributed along the AC, as in endogenous chromosomes, suggesting a holocentric architecture. We found that CENP-A(HCP-3) binds to the unexpressed marker genes and many fragmented yeast sequences, but is excluded from the yeast extremely high-AT-content centromeric and mitochondrial DNA (> 83% AT-content) on the AC. We identified A-rich motifs in CENP-A(HCP-3) domains/peaks on the AC and on endogenous chromosomes, which have some similarity with each other and similarity to some non-germline transcription factor binding sites.


Subject(s)
Chromosome Segregation , Chromosomes, Artificial/genetics , Mitosis , Animals , Caenorhabditis elegans , Caenorhabditis elegans Proteins/metabolism , Centromere/genetics , Centromere/metabolism , GC Rich Sequence , Heat-Shock Proteins/metabolism , Protein Binding , Saccharomyces cerevisiae
9.
Genome Res ; 29(6): 896-906, 2019 06.
Article in English | MEDLINE | ID: mdl-31152051

ABSTRACT

Compared to coding sequences, untranslated regions of the transcriptome are not well conserved, and functional annotation of these sequences is challenging. Global relationships between nucleotide composition of 3' UTR sequences and their sequence conservation have been appreciated since mammalian genomes were first sequenced, but the functional relevance of these patterns remains unknown. We systematically measured the effect on gene expression of the sequences of more than 25,000 RNA-binding protein (RBP) binding sites in primary mouse T cells using a massively parallel reporter assay. GC-rich sequences were destabilizing of reporter mRNAs and come from more rapidly evolving regions of the genome. These sequences were more likely to be folded in vivo and contain a number of structural motifs that reduced accumulation of a heterologous reporter protein. Comparison of full-length 3' UTR sequences across vertebrate phylogeny revealed that strictly conserved 3' UTRs were GC-poor and enriched in genes associated with organismal development. In contrast, rapidly evolving 3' UTRs tended to be GC-rich and derived from genes involved in metabolism and immune responses. Cell-essential genes had lower GC content in their 3' UTRs, suggesting a connection between unstructured mRNA noncoding sequences and optimal protein production. By reducing gene expression, GC-rich RBP-occupied sequences act as a rapidly evolving substrate for gene regulatory interactions.


Subject(s)
3' Untranslated Regions , Base Composition , Conserved Sequence , Gene Expression Regulation , Gene Expression , Genes, Reporter , RNA, Messenger/genetics , Animals , Base Sequence , Evolution, Molecular , GC Rich Sequence , Humans , Mice , Nucleic Acid Conformation , RNA Stability , RNA, Messenger/chemistry
10.
Nature ; 530(7591): 441-6, 2016 Feb 25.
Article in English | MEDLINE | ID: mdl-26863196

ABSTRACT

Gene expression can be regulated post-transcriptionally through dynamic and reversible RNA modifications. A recent noteworthy example is N(6)-methyladenosine (m(6)A), which affects messenger RNA (mRNA) localization, stability, translation and splicing. Here we report on a new mRNA modification, N(1)-methyladenosine (m(1)A), that occurs on thousands of different gene transcripts in eukaryotic cells, from yeast to mammals, at an estimated average transcript stoichiometry of 20% in humans. Employing newly developed sequencing approaches, we show that m(1)A is enriched around the start codon upstream of the first splice site: it preferentially decorates more structured regions around canonical and alternative translation initiation sites, is dynamic in response to physiological conditions, and correlates positively with protein production. These unique features are highly conserved in mouse and human cells, strongly indicating a functional role for m(1)A in promoting translation of methylated mRNA.


Subject(s)
Adenosine/analogs & derivatives , RNA, Messenger/metabolism , 5' Untranslated Regions/genetics , Adenosine/metabolism , Animals , Base Sequence , Cell Line , Cell Line, Tumor , Codon, Initiator/genetics , Conserved Sequence , Epigenesis, Genetic , Evolution, Molecular , GC Rich Sequence/genetics , Humans , Methylation , Mice , Organ Specificity , Peptide Chain Initiation, Translational/genetics , RNA Splice Sites/genetics , RNA, Messenger/genetics , Saccharomyces cerevisiae , Transcriptome/genetics
11.
Nucleic Acids Res ; 48(4): 1748-1763, 2020 02 28.
Article in English | MEDLINE | ID: mdl-31930331

ABSTRACT

The double-helical structure of DNA results from canonical base pairing and stacking interactions. However, variations from steady-state conformations resulting from mechanical perturbations in cells have physiological relevance but their dependence on sequence remains unclear. Here, we use molecular dynamics simulations showing sequence differences result in markedly different structural motifs upon physiological twisting and stretching. We simulate overextension on different sequences of DNA ((AA)12, (AT)12, (CC)12 and (CG)12) with supercoiling densities at 200 and 50 mM salt concentrations. We find that DNA denatures in the majority of stretching simulations, surprisingly including those with over-twisted DNA. GC-rich sequences are observed to be more stable than AT-rich ones, with the specific response dependent on the base pair order. Furthermore, we find that (AT)12 forms stable periodic structures with non-canonical hydrogen bonds in some regions and non-canonical stacking in others, whereas (CG)12 forms a stacking motif of four base pairs independent of supercoiling density. Our results demonstrate that 20-30% DNA extension is sufficient for breaking B-DNA around and significantly above cellular supercoiling, and that the DNA sequence is crucial for understanding structural changes under mechanical stress. Our findings have important implications for the activities of protein machinery interacting with DNA in all cells.


Subject(s)
Base Pairing/genetics , Base Sequence/genetics , DNA/chemistry , Biophysical Phenomena , DNA/genetics , GC Rich Sequence/genetics , Hydrogen Bonding , Molecular Dynamics Simulation , Molecular Structure , Nucleic Acid Conformation
12.
Nucleic Acids Res ; 48(6): 3103-3118, 2020 04 06.
Article in English | MEDLINE | ID: mdl-32025695

ABSTRACT

Micro (mi)RNAs are 20-22nt long non-coding RNA molecules involved in post-transcriptional silencing of targets having high base-pair complementarity. Plant miRNAs are processed from long Pol II-transcripts with specific stem-loop structures by Dicer-like (DCL) 1 protein. Although there were reports indicating how a specific region is selected for miRNA biogenesis, molecular details were unclear. Here, we show that the presence of specific GC-rich sequence signature within miRNA/miRNA* region is required for the precise miRNA biogenesis. The involvement of GC-rich signatures in precise processing and abundance of miRNAs was confirmed through detailed molecular and functional analysis. Consistent with the presence of the miRNA-specific GC signature, target RNAs of miRNAs also possess conserved complementary sequence signatures in their miRNA binding motifs. The selection of these GC signatures was dependent on an RNA binding protein partner of DCL1 named HYL1. Finally, we demonstrate a direct application of this discovery for enhancing the abundance and efficiency of artificial miRNAs that are popular in plant functional genomic studies.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Cell Cycle Proteins/genetics , MicroRNAs/biosynthesis , RNA-Binding Proteins/genetics , Ribonuclease III/genetics , Conserved Sequence/genetics , GC Rich Sequence/genetics , Gene Expression Regulation, Plant/genetics , MicroRNAs/genetics , RNA, Plant/genetics , RNA-Binding Motifs/genetics
13.
Proc Natl Acad Sci U S A ; 116(48): 24303-24309, 2019 11 26.
Article in English | MEDLINE | ID: mdl-31719195

ABSTRACT

Infection of animal cells by numerous viruses is detected and countered by a variety of means, including recognition of nonself nucleic acids. The zinc finger antiviral protein (ZAP) depletes cytoplasmic RNA that is recognized as foreign in mammalian cells by virtue of its elevated CG dinucleotide content compared with endogenous mRNAs. Here, we determined a crystal structure of a protein-RNA complex containing the N-terminal, 4-zinc finger human (h) ZAP RNA-binding domain (RBD) and a CG dinucleotide-containing RNA target. The structure reveals in molecular detail how hZAP is able to bind selectively to CG-rich RNA. Specifically, the 4 zinc fingers create a basic patch on the hZAP RBD surface. The highly basic second zinc finger contains a pocket that selectively accommodates CG dinucleotide bases. Structure-guided mutagenesis, cross-linking immunoprecipitation sequencing assays, and RNA affinity assays show that the structurally defined CG-binding pocket is not required for RNA binding per se in human cells. However, the pocket is a crucial determinant of high-affinity, specific binding to CG dinucleotide-containing RNA. Moreover, variations in RNA-binding specificity among a panel of CG-binding pocket mutants quantitatively predict their selective antiviral activity against a CG-enriched HIV-1 strain. Overall, the hZAP RBD RNA structure provides an atomic-level explanation for how ZAP selectively targets foreign, CG-rich RNA.


Subject(s)
GC Rich Sequence , RNA, Viral/metabolism , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , Repressor Proteins/chemistry , Repressor Proteins/metabolism , Binding Sites , Crystallography, X-Ray , Fluorescence Polarization , HEK293 Cells , HIV-1/genetics , Humans , Models, Molecular , Mutagenesis , Mutation , Protein Domains , RNA, Viral/chemistry , RNA-Binding Proteins/genetics , Repressor Proteins/genetics , Zinc Fingers
14.
Brain ; 143(1): 222-233, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31819945

ABSTRACT

Essential tremor is one of the most common movement disorders. Despite its high prevalence and heritability, the genetic aetiology of essential tremor remains elusive. Up to now, only a few genes/loci have been identified, but these genes have not been replicated in other essential tremor families or cohorts. Here we report a genetic study in a cohort of 197 Chinese pedigrees clinically diagnosed with essential tremor. Using a comprehensive strategy combining linkage analysis, whole-exome sequencing, long-read whole-genome sequencing, repeat-primed polymerase chain reaction and GC-rich polymerase chain reaction, we identified an abnormal GGC repeat expansion in the 5' region of the NOTCH2NLC gene that co-segregated with disease in 11 essential tremor families (5.58%) from our cohort. Clinically, probands that had an abnormal GGC repeat expansion were found to have more severe tremor phenotypes and lower activities of daily living ability. Obvious genetic anticipation was also detected in these 11 essential tremor-positive families. These results indicate that abnormal GGC repeat expansion in the 5' region of the NOTCH2NLC gene is associated with essential tremor, and provide strong evidence that essential tremor is a family of diseases with high clinical and genetic heterogeneities.


Subject(s)
Asian People/genetics , Essential Tremor/genetics , Trinucleotide Repeat Expansion/genetics , Adult , Aged , Female , GC Rich Sequence , Genetic Linkage , Humans , Intranuclear Inclusion Bodies/genetics , Intranuclear Inclusion Bodies/ultrastructure , Male , Microscopy, Electron , Middle Aged , Neurodegenerative Diseases/genetics , Pedigree , Polymerase Chain Reaction , Skin/ultrastructure , Exome Sequencing , Whole Genome Sequencing
15.
Nature ; 517(7536): 608-11, 2015 Jan 29.
Article in English | MEDLINE | ID: mdl-25383537

ABSTRACT

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome, 78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Genomics , Sequence Analysis, DNA/methods , Chromosome Inversion/genetics , Chromosomes, Human, Pair 10/genetics , Cloning, Molecular , GC Rich Sequence/genetics , Haploidy , Humans , Mutagenesis, Insertional/genetics , Reference Standards , Tandem Repeat Sequences/genetics
16.
Nucleic Acids Res ; 47(5): e25, 2019 03 18.
Article in English | MEDLINE | ID: mdl-30590705

ABSTRACT

Dysregulated protein synthesis is a major underlying cause of many neurodevelopmental diseases including fragile X syndrome. In order to capture subtle but biologically significant differences in translation in these disorders, a robust technique is required. One powerful tool to study translational control is ribosome profiling, which is based on deep sequencing of mRNA fragments protected from ribonuclease (RNase) digestion by ribosomes. However, this approach has been mainly applied to rapidly dividing cells where translation is active and large amounts of starting material are readily available. The application of ribosome profiling to low-input brain tissue where translation is modest and gene expression changes between genotypes are expected to be small has not been carefully evaluated. Using hippocampal tissue from wild type and fragile X mental retardation 1 (Fmr1) knockout mice, we show that variable RNase digestion can lead to significant sample batch effects. We also establish GC content and ribosome footprint length as quality control metrics for RNase digestion. We performed RNase titration experiments for low-input samples to identify optimal conditions for this critical step that is often improperly conducted. Our data reveal that optimal RNase digestion is essential to ensure high quality and reproducibility of ribosome profiling for low-input brain tissue.
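
The two quality control metrics named above, GC content and footprint length, are simple to compute from a set of read sequences. The following is a minimal sketch under stated assumptions: `footprint_qc` and its return layout are hypothetical names chosen for this example, reads are assumed to be plain sequence strings already trimmed of adapters, and the thresholds that would flag over- or under-digestion are not given in the abstract and are therefore left out.

```python
from collections import Counter


def footprint_qc(reads):
    """Summarise pooled GC content and length distribution of footprint reads.

    `reads` is a hypothetical list of read sequences (strings).
    Returns a dict with the pooled GC fraction and a length histogram.
    """
    total_bases = 0
    gc_bases = 0
    lengths = Counter()
    for read in reads:
        read = read.upper()
        total_bases += len(read)
        gc_bases += read.count("G") + read.count("C")
        lengths[len(read)] += 1
    gc_content = gc_bases / total_bases if total_bases else 0.0
    return {"gc_content": gc_content, "length_counts": dict(lengths)}


# Toy reads (assumptions); real footprints are typically around 30 nt long.
summary = footprint_qc(["GGCC", "AATT", "GCAT"])
print(summary)
```

Comparing these summaries across samples from the same batch is one plausible way to spot the variable digestion the abstract describes.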


Subject(s)
Brain/metabolism , Disease Models, Animal , Fragile X Syndrome/genetics , RNA, Messenger/analysis , RNA, Messenger/genetics , Ribosomes/genetics , Ribosomes/metabolism , Animals , Base Sequence , Female , Fragile X Syndrome/metabolism , GC Rich Sequence , Male , Mice , Quality Control , RNA, Messenger/metabolism , Ribonucleases/metabolism
17.
PLoS Genet ; 14(10): e1007467, 2018 10.
Article in English | MEDLINE | ID: mdl-30356280

ABSTRACT

Structural features of genomes, including the three-dimensional arrangement of DNA in the nucleus, are increasingly seen as key contributors to the regulation of gene expression. However, studies on how genome structure and nuclear organisation influence transcription have so far been limited to a handful of model species. This narrow focus limits our ability to draw general conclusions about the ways in which three-dimensional structures are encoded, and to integrate information from three-dimensional data to address a broader gamut of biological questions. Here, we generate a complete and gapless genome sequence for the filamentous fungus, Epichloë festucae. We use Hi-C data to examine the three-dimensional organisation of the genome, and RNA-seq data to investigate how Epichloë genome structure contributes to the suite of transcriptional changes needed to maintain symbiotic relationships with the grass host. Our results reveal a genome in which very repeat-rich blocks of DNA with discrete boundaries are interspersed by gene-rich sequences that are almost repeat-free. In contrast to other species reported to date, the three-dimensional structure of the genome is anchored by these repeat blocks, which act to isolate transcription in neighbouring gene-rich regions. Genes that are differentially expressed in planta are enriched near the boundaries of these repeat-rich blocks, suggesting that their three-dimensional orientation partly encodes and regulates the symbiotic relationship formed by this organism.


Subject(s)
DNA, Fungal/genetics , Epichloe/genetics , Gene Expression Regulation, Fungal , Genome, Fungal/genetics , Repetitive Sequences, Nucleic Acid/genetics , AT Rich Sequence/genetics , DNA, Fungal/chemistry , Fungal Proteins/genetics , GC Rich Sequence/genetics , Gene Expression Profiling/methods , Hyphae/genetics , Sequence Analysis, DNA/methods , Symbiosis/genetics
18.
Int J Mol Sci ; 22(23)2021 Dec 03.
Article in English | MEDLINE | ID: mdl-34884879

ABSTRACT

MiR-143 plays an important role in hepatocellular carcinoma and liver fibrosis by inhibiting hepatoma cell proliferation. DNA methyltransferase 3 alpha (DNMT3a), as a target of miR-143, regulates the development of primary organic solid tumors through DNA methylation mechanisms. However, the effect of miR-143 on DNA methylation profiles in liver is unclear. In this study, we used Whole-Genome Bisulfite Sequencing (WGBS) to detect differentially methylated regions (DMRs), and investigated DMR-related genes and their enriched pathways in relation to miR-143. We found that methylated cytosines increased by 0.19% in the miR-143 knock-out (KO) liver fed with a high-fat diet (HFD), compared with the wild type (WT). Furthermore, compared with the WT group, the KO group showed lower CG methylation levels in CG islands (CGIs) and promoters, and hypermethylation in CGI shores, 5'UTRs, exons, introns, 3'UTRs, and repeat regions. A total of 984 DMRs were identified between the WT and KO groups, consisting of 559 hypermethylated and 425 hypomethylated DMRs. Furthermore, DMR-related genes were enriched in metabolism pathways such as carbon metabolism (serine hydroxymethyltransferase 2 (Shmt2), acyl-Coenzyme A dehydrogenase medium chain (Acadm)), arginine and proline metabolism (spermine synthase (Sms), proline dehydrogenase (Prodh2)) and purine metabolism (phosphoribosyl pyrophosphate synthetase 2 (Prps2)). In summary, we are the first to report the change in whole-genome methylation levels in miR-143-null mouse liver through WGBS, and provide an experimental basis for clinical diagnosis and treatment in liver diseases, indicating that miR-143 may be a potential therapeutic target and biomarker for liver damage-associated diseases and hepatocellular carcinoma.


Subject(s)
DNA Methylation , Liver/metabolism , MicroRNAs/metabolism , Whole Genome Sequencing , Animals , CpG Islands , Epigenesis, Genetic , Epigenomics , GC Rich Sequence , Genome , High-Throughput Nucleotide Sequencing , Male , Mice , Mice, Knockout , MicroRNAs/genetics , Promoter Regions, Genetic , Sulfites
19.
BMC Genomics ; 21(1): 376, 2020 May 29.
Article in English | MEDLINE | ID: mdl-32471448

ABSTRACT

BACKGROUND: Parasitoid wasps have fascinating life cycles and play an important role in trophic networks, yet little is known about their genome content and function. Parasitoids that infect aphids are an important group with the potential for biological control. Their success depends on adapting to develop inside aphids and overcoming both host aphid defenses and their protective endosymbionts. RESULTS: We present the de novo genome assemblies, detailed annotation, and comparative analysis of two closely related parasitoid wasps that target pest aphids: Aphidius ervi and Lysiphlebus fabarum (Hymenoptera: Braconidae: Aphidiinae). The genomes are small (139 and 141 Mbp) and the most AT-rich reported thus far for any arthropod (GC content: 25.8 and 23.8%). This nucleotide bias is accompanied by skewed codon usage and is stronger in genes with adult-biased expression. AT-richness may be the consequence of reduced genome size, a near absence of DNA methylation, and energy efficiency. We identify missing desaturase genes, whose absence may underlie mimicry in the cuticular hydrocarbon profile of L. fabarum. We highlight key gene groups including those underlying venom composition, chemosensory perception, and sex determination, as well as potential losses in immune pathway genes. CONCLUSIONS: These findings are of fundamental interest for insect evolution and biological control applications. They provide a strong foundation for further functional studies into coevolution between parasitoids and their hosts. Both genomes are available at https://bipaa.genouest.org.


Subject(s)
Aphids/genetics , Genomics , Wasps/genetics , Animals , Aphids/immunology , DNA Methylation/genetics , GC Rich Sequence , Insect Proteins/genetics , Sex Determination Processes/genetics , Venoms/genetics , Wasps/immunology
20.
Genome Res ; 27(3): 407-418, 2017 03.
Article in English | MEDLINE | ID: mdl-27940950

ABSTRACT

Up-frameshift protein 1 (UPF1) is an ATP-dependent RNA helicase that has essential roles in RNA surveillance and in post-transcriptional gene regulation by promoting the degradation of mRNAs. Previous studies revealed that UPF1 is associated with the 3' untranslated region (UTR) of target mRNAs via as-yet-unknown sequence features. Herein, we aimed to identify characteristic sequence features of UPF1 targets. We identified 246 UPF1 targets by measuring RNA stabilization upon UPF1 depletion and by identifying mRNAs that associate with UPF1. By analyzing RNA footprint data of phosphorylated UPF1 and two CLIP-seq datasets of UPF1, we found that the 3' UTRs, but not the 5' UTRs or open reading frames, of UPF1 targets have GC-rich motifs embedded in high GC-content regions. Reporter gene experiments revealed that GC-rich motifs in UPF1 targets were indispensable for UPF1-mediated mRNA decay. These findings highlight the important features of UPF1 target 3' UTRs.


Subject(s)
3' Untranslated Regions , GC Rich Sequence , Nonsense Mediated mRNA Decay , RNA Helicases/metabolism , RNA, Messenger/metabolism , Trans-Activators/metabolism , HeLa Cells , Humans , RNA Helicases/genetics , RNA, Messenger/chemistry , Trans-Activators/genetics