ABSTRACT
Genome editing occurs in the context of chromatin, which is heterogeneous in structure and function across the genome. Chromatin heterogeneity is thought to affect genome editing efficiency, but this has been challenging to quantify due to the presence of confounding variables. Here, we develop a method that exploits the allele-specific chromatin status of imprinted genes in order to address this problem in cycling mouse embryonic stem cells (mESCs). Because maternal and paternal alleles of imprinted genes have identical DNA sequence and are situated in the same nucleus, allele-specific differences in the frequency and spectrum of mutations induced by CRISPR-Cas9 can be unequivocally attributed to epigenetic mechanisms. We found that heterochromatin can impede mutagenesis, but to a degree that depends on other key experimental parameters. Mutagenesis was impeded by up to 7-fold when Cas9 exposure was brief and when intracellular Cas9 expression was low. In contrast, the outcome of mutagenic DNA repair was unaffected by chromatin state, with similar efficiencies of homology-directed repair (HDR) and deletion spectra on maternal and paternal chromosomes. Combined, our data show that heterochromatin imposes a permeable barrier that influences the kinetics, but not the endpoint, of CRISPR-Cas9 genome editing and suggest that therapeutic applications involving low-level Cas9 exposure will be particularly affected by chromatin status.
Subject(s)
DNA Repair/physiology , Heterochromatin/genetics , Heterochromatin/physiology , Animals , Base Sequence , CRISPR-Cas Systems/genetics , CRISPR-Cas Systems/physiology , DNA Breaks, Double-Stranded , DNA Repair/genetics , Endonucleases/metabolism , Gene Editing/methods , Genome , Mice , Mice, Inbred C57BL , Mouse Embryonic Stem Cells/physiology , Mutagenesis, Insertional , Mutagens , Mutation/genetics , Recombinational DNA Repair/physiology , Sequence Deletion
ABSTRACT
BACKGROUND: The study of genomic variation has provided key insights into the functional role of mutations. Predominantly, studies have focused on single nucleotide variants (SNVs), which are relatively easy to detect and can be described with rich mathematical models. However, it has been observed that genomes are highly plastic, and that whole regions can be moved, removed or duplicated in bulk. These structural variants (SVs) have been shown to have significant impact on phenotype, but their study has been held back by the combinatorial complexity of the underlying models. RESULTS: We describe here a general model of structural variation that encompasses both balanced rearrangements and arbitrary copy-number variants (CNVs). CONCLUSIONS: In this model, we show that the space of possible evolutionary histories that explain the structural differences between any two genomes can be sampled ergodically.
ABSTRACT
Although it is known that the methylation of DNA in 5' promoters suppresses gene expression, the role of DNA methylation in gene bodies is unclear. In mammals, tissue- and cell type-specific methylation is present in a small percentage of 5' CpG island (CGI) promoters, whereas a far greater proportion occurs across gene bodies, coinciding with highly conserved sequences. Tissue-specific intragenic methylation might reduce, or, paradoxically, enhance transcription elongation efficiency. Capped analysis of gene expression (CAGE) experiments also indicate that transcription commonly initiates within and between genes. To investigate the role of intragenic methylation, we generated a map of DNA methylation from the human brain encompassing 24.7 million of the 28 million CpG sites. From the dense, high-resolution coverage of CpG islands, the majority of methylated CpG islands were shown to be in intragenic and intergenic regions, whereas less than 3% of CpG islands in 5' promoters were methylated. The CpG islands in all three locations overlapped with RNA markers of transcription initiation, and unmethylated CpG islands also overlapped significantly with trimethylation of H3K4, a histone modification enriched at promoters. The general and CpG-island-specific patterns of methylation are conserved in mouse tissues. An in-depth investigation of the human SHANK3 locus and its mouse homologue demonstrated that this tissue-specific DNA methylation regulates intragenic promoter activity in vitro and in vivo. These methylation-regulated, alternative transcripts are expressed in a tissue- and cell type-specific manner, and are expressed differentially within a single cell type from distinct brain regions. These results support a major role for intragenic methylation in regulating cell context-specific alternative promoters in gene bodies.
Subject(s)
Brain/metabolism , Conserved Sequence/genetics , DNA Methylation , Promoter Regions, Genetic/genetics , Animals , Brain/anatomy & histology , Brain/cytology , Carrier Proteins/genetics , Cell Line , CpG Islands/genetics , DNA, Intergenic/genetics , DNA, Intergenic/metabolism , Frontal Lobe/metabolism , Gene Expression Regulation , Histones/genetics , Histones/metabolism , Humans , Male , Mice , Mice, Inbred C57BL , Microfilament Proteins , Middle Aged , Nerve Tissue Proteins , Organ Specificity , Transcription, Genetic/genetics
ABSTRACT
Cytosine methylation, a common form of DNA modification that antagonizes transcription, is found at transposons and repeats in vertebrates, plants and fungi. Here we have mapped DNA methylation in the entire Arabidopsis thaliana genome at high resolution. DNA methylation covers transposons and is present within a large fraction of A. thaliana genes. Methylation within genes is conspicuously biased away from gene ends, suggesting a dependence on RNA polymerase transit. Genic methylation is strongly influenced by transcription: moderately transcribed genes are most likely to be methylated, whereas genes at either extreme are least likely. In turn, transcription is influenced by methylation: short methylated genes are poorly expressed, and loss of methylation in the body of a gene leads to enhanced transcription. Our results indicate that genic transcription and DNA methylation are closely interwoven processes.
Subject(s)
Arabidopsis/genetics , Chromosome Mapping/methods , DNA Methylation , Transcription, Genetic , Gene Expression Regulation, Plant , Genes, Plant , Genome, Plant , Models, Biological , Oligonucleotide Array Sequence Analysis , Plants, Genetically Modified , Transcriptional Elongation Factors/physiology
ABSTRACT
Eukaryotic chromatin is separated into functional domains differentiated by post-translational histone modifications, histone variants and DNA methylation. Methylation is associated with repression of transcriptional initiation in plants and animals, and is frequently found in transposable elements. Proper methylation patterns are crucial for eukaryotic development, and aberrant methylation-induced silencing of tumour suppressor genes is a common feature of human cancer. In contrast to methylation, the histone variant H2A.Z is preferentially deposited by the Swr1 ATPase complex near 5' ends of genes where it promotes transcriptional competence. How DNA methylation and H2A.Z influence transcription remains largely unknown. Here we show that in the plant Arabidopsis thaliana regions of DNA methylation are quantitatively deficient in H2A.Z. Exclusion of H2A.Z is seen at sites of DNA methylation in the bodies of actively transcribed genes and in methylated transposons. Mutation of the MET1 DNA methyltransferase, which causes both losses and gains of DNA methylation, engenders opposite changes (gains and losses) in H2A.Z deposition, whereas mutation of the PIE1 subunit of the Swr1 complex that deposits H2A.Z leads to genome-wide hypermethylation. Our findings indicate that DNA methylation can influence chromatin structure and effect gene silencing by excluding H2A.Z, and that H2A.Z protects genes from DNA methylation.
Subject(s)
Arabidopsis/genetics , Arabidopsis/metabolism , Chromatin/metabolism , DNA Methylation , Histones/metabolism , Arabidopsis/enzymology , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Chromatin/genetics , DNA (Cytosine-5-)-Methyltransferases/genetics , DNA (Cytosine-5-)-Methyltransferases/metabolism , Gene Expression Regulation, Plant , Gene Silencing , Mutation , Transcription Factors/genetics , Transcription Factors/metabolism , Transcription, Genetic
ABSTRACT
BACKGROUND: Structural variants (SVs) are known to play important roles in a variety of cancers, but their origins and functional consequences are still poorly understood. Many SVs are thought to emerge from errors in the repair processes following DNA double strand breaks (DSBs). RESULTS: We used experimentally quantified DSB frequencies in cell lines with matched chromatin and sequence features to derive the first quantitative genome-wide models of DSB susceptibility. These models are accurate and provide novel insights into the mutational mechanisms generating DSBs. Models trained in one cell type can be successfully applied to others, but a substantial proportion of DSBs appear to reflect cell type-specific processes. Using model predictions as a proxy for susceptibility to DSBs in tumors, many SV-enriched regions appear to be poorly explained by selectively neutral mutational bias alone. A substantial number of these regions show unexpectedly high SV breakpoint frequencies given their predicted susceptibility to mutation and are therefore credible targets of positive selection in tumors. These putatively positively selected SV hotspots are enriched for genes previously shown to be oncogenic. In contrast, several hundred regions across the genome show unexpectedly low levels of SVs, given their relatively high susceptibility to mutation. These novel coldspot regions appear to be subject to purifying selection in tumors and are enriched for active promoters and enhancers. CONCLUSIONS: We conclude that models of DSB susceptibility offer a rigorous approach to the inference of SVs putatively subject to selection in tumors.
Subject(s)
DNA Breaks, Double-Stranded , Genomic Structural Variation , Models, Genetic , Neoplasms/genetics , Humans , K562 Cells , MCF-7 Cells , Regression Analysis
ABSTRACT
BACKGROUND: Retroposed processed gene transcripts are an important source of material for new gene formation on evolutionary timescales. Most prior work on gene retrocopy discovery compared copies in reference genome assemblies to their source genes. Here, we explore gene retrocopy insertion polymorphisms (GRIPs) that are present in the germlines of individual humans, mice, and chimpanzees, and we identify novel gene retrocopy insertions in cancerous somatic tissues that are absent from patient-matched non-cancer genomes. RESULTS: Through analysis of whole-genome sequence data, we found evidence for 48 GRIPs in the genomes of one or more humans sequenced as part of the 1000 Genomes Project and The Cancer Genome Atlas, but which were not in the human reference assembly. Similarly, we found evidence for 755 GRIPs at distinct locations in one or more of 17 inbred mouse strains but which were not in the mouse reference assembly, and 19 GRIPs across a cohort of 10 chimpanzee genomes, which were not in the chimpanzee reference genome assembly. Many of these insertions are new members of existing gene families whose source genes are highly and widely expressed, and the majority have detectable hallmarks of processed gene retrocopy formation. We estimate the rate of novel gene retrocopy insertions in humans and chimps at roughly one new gene retrocopy insertion for every 6,000 individuals. CONCLUSIONS: We find that gene retrocopy polymorphisms are a widespread phenomenon, present a multi-species analysis of these events, and provide a method for their ascertainment.
Subject(s)
Genome/genetics , Genomic Structural Variation/genetics , Mammals/genetics , RNA, Messenger/genetics , Retroelements/genetics , Animals , Gene Ontology , Genome, Human/genetics , Humans , Mice , Molecular Sequence Annotation , Mutagenesis, Insertional/genetics , Neoplasms/genetics , Pan troglodytes/genetics , RNA, Messenger/metabolism
ABSTRACT
Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylated DNA immunoprecipitation sequencing (MeDIP-seq) and methylated DNA binding domain sequencing (MBD-seq). We applied all four methods to biological replicates of human embryonic stem cells to assess their genome-wide CpG coverage, resolution, cost, concordance and the influence of CpG density and genomic context. The methylation levels assessed by the two bisulfite methods were concordant (their difference did not exceed a given threshold) for 82% of CpGs and 99% of the non-CpG cytosines. Using binary methylation calls, the two enrichment methods were 99% concordant and regions assessed by all four methods were 97% concordant. We combined MeDIP-seq with methylation-sensitive restriction enzyme sequencing (MRE-seq) for comprehensive methylome coverage at lower cost. This, along with RNA-seq and ChIP-seq of the ES cells, enabled us to detect regions with allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression.
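The concordance measure mentioned in this abstract (two methods agree at a site when their methylation levels differ by no more than a given threshold) can be illustrated with a minimal sketch. The abstract does not state the threshold, so the default of 0.25 is purely an assumption, and the function name, site labels and input layout (a mapping from site to methylation level in [0, 1]) are hypothetical choices for illustration, not the authors' implementation.

```python
def concordance(levels_a, levels_b, threshold=0.25):
    """Fraction of shared sites whose methylation levels agree.

    levels_a, levels_b: dicts mapping a site label to a methylation
    level in [0, 1], one dict per method (hypothetical layout).
    threshold: largest absolute difference still counted as
    concordant; 0.25 is an assumed value, not taken from the abstract.
    """
    # Only sites covered by both methods can be compared.
    shared = levels_a.keys() & levels_b.keys()
    if not shared:
        raise ValueError("the two methods share no sites")
    agreeing = sum(
        1 for site in shared
        if abs(levels_a[site] - levels_b[site]) <= threshold
    )
    return agreeing / len(shared)


# Toy data: four shared sites, one of which differs by 0.5.
method_a = {"chr1:100": 0.9, "chr1:200": 0.1, "chr1:300": 0.5,
            "chr1:400": 0.8, "chr1:500": 0.3}
method_b = {"chr1:100": 0.8, "chr1:200": 0.6, "chr1:300": 0.5,
            "chr1:400": 0.7}
print(concordance(method_a, method_b))
```

Sites covered by only one method are ignored here, so the result describes agreement over the shared sites only and says nothing about differences in coverage.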
Subject(s)
Alleles , DNA Methylation/genetics , Epigenesis, Genetic , Sequence Analysis, DNA/methods , Cell Line , CpG Islands/genetics , Cytosine/metabolism , Embryonic Stem Cells/metabolism , Gene Expression Regulation , Humans , Sulfites/metabolism
ABSTRACT
Cytosine DNA methylation is considered to be a stable epigenetic mark, but active demethylation has been observed in both plants and animals. In Arabidopsis thaliana, DNA glycosylases of the DEMETER (DME) family remove methylcytosines from DNA. Demethylation by DME is necessary for genomic imprinting, and demethylation by a related protein, REPRESSOR OF SILENCING1, prevents gene silencing in a transgenic background. However, the extent and function of demethylation by DEMETER-LIKE (DML) proteins in WT plants is not known. Using genome-tiling microarrays, we mapped DNA methylation in mutant and WT plants and identified 179 loci actively demethylated by DML enzymes. Mutations in DML genes lead to locus-specific DNA hypermethylation. Reintroducing WT DML genes restores most loci to the normal pattern of methylation, although at some loci, hypermethylated epialleles persist. Of loci demethylated by DML enzymes, >80% are near or overlap genes. Genic demethylation by DML enzymes primarily occurs at the 5' and 3' ends, a pattern opposite to the overall distribution of WT DNA methylation. Our results show that demethylation by DML DNA glycosylases edits the patterns of DNA methylation within the Arabidopsis genome to protect genes from potentially deleterious methylation.
Subject(s)
Arabidopsis/genetics , DNA Methylation , DNA, Plant/metabolism , Genome, Plant , 5-Methylcytosine/metabolism , Arabidopsis/enzymology , Arabidopsis Proteins/genetics , DNA Glycosylases/genetics , Genetic Markers , Genomic Imprinting , N-Glycosyl Hydrolases/genetics , Trans-Activators/genetics
ABSTRACT
Mitochondria in the oxidizing environment of the maize (Zea mays) root quiescent center (QC) are altered in function, but otherwise structurally normal. Compared to mitochondria in the adjacent, rapidly dividing cells of the proximal root tissues, mitochondria in the QC show marked reductions in the activities of tricarboxylic acid cycle enzymes. Pyruvate dehydrogenase activity was not detected in the QC. Use of several mitochondrial membrane potential (ΔΨm) sensing probes indicated a depolarization of the mitochondrial membrane in the QC, which suggests a reduction in the capacity of QC mitochondria to generate ATP and NADH. We postulate that modifications of mitochondrial function are central to the establishment and maintenance of the QC.