ABSTRACT
The cistrome is the complete set of transcription factor (TF) binding sites (cis-elements) in an organism, while an epicistrome incorporates tissue-specific DNA chemical modifications and TF-specific chemical sensitivities into these binding profiles. Robust methods to construct comprehensive cistrome and epicistrome maps are critical for elucidating complex transcriptional networks that underlie growth, behavior, and disease. Here, we describe DNA affinity purification sequencing (DAP-seq), a high-throughput TF binding site discovery method that interrogates genomic DNA with in-vitro-expressed TFs. Using DAP-seq, we defined the Arabidopsis cistrome by resolving motifs and peaks for 529 TFs. Because genomic DNA used in DAP-seq retains 5-methylcytosines, we determined that >75% (248/327) of Arabidopsis TFs surveyed were methylation sensitive, a property that strongly impacts the epicistrome landscape. DAP-seq datasets also yielded insight into the biology and binding site architecture of numerous TFs, demonstrating the value of DAP-seq for cost-effective cistromic and epicistromic annotation in any organism.
Subject(s)
Arabidopsis/genetics , DNA, Plant/genetics , Genome, Plant , Response Elements , Sequence Analysis, DNA/methods , Transcription Factors/metabolism , Amino Acid Motifs , DNA, Plant/metabolism , Epigenesis, Genetic , Indoleacetic Acids/metabolism , Plant Proteins/genetics
ABSTRACT
Sun-loving plants have the ability to detect and avoid shading through sensing of both blue and red light wavelengths. Higher plant cryptochromes (CRYs) control how plants modulate growth in response to changes in blue light. For growth under a canopy, where blue light is diminished, CRY1 and CRY2 perceive this change and respond by directly contacting two bHLH transcription factors, PIF4 and PIF5. These factors are also known to be controlled by phytochromes, the red/far-red photoreceptors; however, transcriptome analyses indicate that the gene regulatory programs induced by the different light wavelengths are distinct. Our results indicate that CRYs signal by modulating PIF activity genome wide and that these factors integrate binding of different plant photoreceptors to facilitate growth changes under different light conditions.
Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Basic Helix-Loop-Helix Transcription Factors/metabolism , Cryptochromes/metabolism , Arabidopsis/growth & development , Arabidopsis/radiation effects , Gene Expression , Hypocotyl/growth & development , Light , Phytochrome B/metabolism
ABSTRACT
The epigenome orchestrates genome accessibility, functionality, and three-dimensional structure. Because epigenetic variation can impact transcription and thus phenotypes, it may contribute to adaptation. Here, we report 1,107 high-quality single-base resolution methylomes and 1,203 transcriptomes from the 1001 Genomes collection of Arabidopsis thaliana. Although the genetic basis of methylation variation is highly complex, geographic origin is a major predictor of genome-wide DNA methylation levels and of altered gene expression caused by epialleles. Comparison to cistrome and epicistrome datasets identifies associations between transcription factor binding sites, methylation, nucleotide variation, and co-expression modules. Physical maps for nine of the most diverse genomes reveal how transposons and other structural variants shape the epigenome, with dramatic effects on immunity genes. The 1001 Epigenomes Project provides a comprehensive resource for understanding how variation in DNA methylation contributes to molecular and non-molecular phenotypes in natural populations of the most studied model plant.
Subject(s)
Arabidopsis/genetics , Epigenesis, Genetic , DNA Methylation , Epigenomics , Gene Expression Regulation, Plant , Genome, Plant , Transcriptome
ABSTRACT
Cytosine DNA methylation is essential in brain development and is implicated in various neurological disorders. Understanding DNA methylation diversity across the entire brain in a spatial context is fundamental for a complete molecular atlas of brain cell types and their gene regulatory landscapes. Here we used single-nucleus methylome sequencing (snmC-seq3) and multi-omic sequencing (snm3C-seq)1 technologies to generate 301,626 methylomes and 176,003 chromatin conformation-methylome joint profiles from 117 dissected regions throughout the adult mouse brain. Using iterative clustering and integrating with companion whole-brain transcriptome and chromatin accessibility datasets, we constructed a methylation-based cell taxonomy with 4,673 cell groups and 274 cross-modality-annotated subclasses. We identified 2.6 million differentially methylated regions across the genome that represent potential gene regulation elements. Notably, we observed spatial cytosine methylation patterns on both genes and regulatory elements in cell types within and across brain regions. Brain-wide spatial transcriptomics data validated the association of spatial epigenetic diversity with transcription and improved the anatomical mapping of our epigenetic datasets. Furthermore, chromatin conformation diversities occurred in important neuronal genes and were highly associated with DNA methylation and transcription changes. Brain-wide cell-type comparisons enabled the construction of regulatory networks that incorporate transcription factors, regulatory elements and their potential downstream gene targets. Finally, intragenic DNA methylation and chromatin conformation patterns predicted alternative gene isoform expression observed in a whole-brain SMART-seq2 dataset. Our study establishes a brain-wide, single-cell DNA methylome and 3D multi-omic atlas and provides a valuable resource for comprehending the cellular-spatial and regulatory genome diversity of the mouse brain.
Subject(s)
Brain , DNA Methylation , Epigenome , Multiomics , Single-Cell Analysis , Animals , Mice , Brain/cytology , Brain/metabolism , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Cytosine/metabolism , Datasets as Topic , Transcription Factors/metabolism , Transcription, Genetic
ABSTRACT
Divergence of cis-regulatory elements drives species-specific traits1, but how this manifests in the evolution of the neocortex at the molecular and cellular level remains unclear. Here we investigated the gene regulatory programs in the primary motor cortex of human, macaque, marmoset and mouse using single-cell multiomics assays, generating gene expression, chromatin accessibility, DNA methylome and chromosomal conformation profiles from a total of over 200,000 cells. From these data, we show evidence that divergence of transcription factor expression corresponds to species-specific epigenome landscapes. We find that conserved and divergent gene regulatory features are reflected in the evolution of the three-dimensional genome. Transposable elements contribute to nearly 80% of the human-specific candidate cis-regulatory elements in cortical cells. Through machine learning, we develop sequence-based predictors of candidate cis-regulatory elements in different species and demonstrate that the genomic regulatory syntax is highly preserved from rodents to primates. Finally, we show that epigenetic conservation combined with sequence similarity helps to uncover functional cis-regulatory elements and enhances our ability to interpret genetic variants contributing to neurological disease and traits.
Subject(s)
Conserved Sequence , Evolution, Molecular , Gene Expression Regulation , Gene Regulatory Networks , Mammals , Neocortex , Animals , Humans , Mice , Callithrix/genetics , Chromatin/genetics , Chromatin/metabolism , Conserved Sequence/genetics , DNA Methylation , DNA Transposable Elements/genetics , Epigenome , Gene Expression Regulation/genetics , Macaca/genetics , Mammals/genetics , Motor Cortex/cytology , Motor Cortex/metabolism , Multiomics , Neocortex/cytology , Neocortex/metabolism , Regulatory Sequences, Nucleic Acid/genetics , Single-Cell Analysis , Transcription Factors/metabolism , Genetic Variation/genetics
ABSTRACT
Single-cell analyses parse the brain's billions of neurons into thousands of 'cell-type' clusters residing in different brain structures1. Many cell types mediate their functions through targeted long-distance projections allowing interactions between specific cell types. Here we used epi-retro-seq2 to link single-cell epigenomes and cell types to long-distance projections for 33,034 neurons dissected from 32 different regions projecting to 24 different targets (225 source-to-target combinations) across the whole mouse brain. We highlight uses of these data for interrogating principles relating projection types to transcriptomics and epigenomics, and for addressing hypotheses about cell types and connections related to genetics. We provide an overall synthesis with 926 statistical comparisons of discriminability of neurons projecting to each target for every source. We integrate this dataset into the larger BRAIN Initiative Cell Census Network atlas, composed of millions of neurons, to link projection cell types to consensus clusters. Integration with spatial transcriptomics further assigns projection-enriched clusters to smaller source regions than the original dissections. We exemplify this by presenting in-depth analyses of projection neurons from the hypothalamus, thalamus, hindbrain, amygdala and midbrain to provide insights into properties of those cell types, including differentially expressed genes, their associated cis-regulatory elements and transcription-factor-binding motifs, and neurotransmitter use.
Subject(s)
Brain , Epigenomics , Neural Pathways , Neurons , Animals , Mice , Amygdala , Brain/cytology , Brain/metabolism , Consensus Sequence , Datasets as Topic , Gene Expression Profiling , Hypothalamus/cytology , Mesencephalon/cytology , Neural Pathways/cytology , Neurons/metabolism , Neurotransmitter Agents/metabolism , Regulatory Sequences, Nucleic Acid , Rhombencephalon/cytology , Single-Cell Analysis , Thalamus/cytology , Transcription Factors/metabolism
ABSTRACT
Epigenetic mechanisms have been proposed to play crucial roles in mammalian development, but their precise functions are only partially understood. To investigate epigenetic regulation of embryonic development, we differentiated human embryonic stem cells into mesendoderm, neural progenitor cells, trophoblast-like cells, and mesenchymal stem cells and systematically characterized DNA methylation, chromatin modifications, and the transcriptome in each lineage. We found that promoters that are active in early developmental stages tend to be CG rich and mainly engage H3K27me3 upon silencing in nonexpressing lineages. By contrast, promoters for genes expressed preferentially at later stages are often CG poor and primarily employ DNA methylation upon repression. Interestingly, the early developmental regulatory genes are often located in large genomic domains that are generally devoid of DNA methylation in most lineages, which we termed DNA methylation valleys (DMVs). Our results suggest that distinct epigenetic mechanisms regulate early and late stages of ES cell differentiation.
Subject(s)
DNA Methylation , Embryonic Stem Cells/metabolism , Epigenomics , Gene Expression Regulation, Developmental , Animals , Cell Differentiation , Chromatin/metabolism , CpG Islands , Embryonic Stem Cells/cytology , Histones/metabolism , Humans , Methylation , Neoplasms/genetics , Promoter Regions, Genetic , Zebrafish/embryology
ABSTRACT
Mammalian brain cells show remarkable diversity in gene expression, anatomy and function, yet the regulatory DNA landscape underlying this extensive heterogeneity is poorly understood. Here we carry out a comprehensive assessment of the epigenomes of mouse brain cell types by applying single-nucleus DNA methylation sequencing1,2 to profile 103,982 nuclei (including 95,815 neurons and 8,167 non-neuronal cells) from 45 regions of the mouse cortex, hippocampus, striatum, pallidum and olfactory areas. We identified 161 cell clusters with distinct spatial locations and projection targets. We constructed taxonomies of these epigenetic types, annotated with signature genes, regulatory elements and transcription factors. These features indicate the potential regulatory landscape supporting the assignment of putative cell types and reveal repetitive usage of regulators in excitatory and inhibitory cells for determining subtypes. The DNA methylation landscape of excitatory neurons in the cortex and hippocampus varied continuously along spatial gradients. Using this deep dataset, we constructed an artificial neural network model that precisely predicts single neuron cell-type identity and brain area spatial location. Integration of high-resolution DNA methylomes with single-nucleus chromatin accessibility data3 enabled prediction of high-confidence enhancer-gene interactions for all identified cell types, which were subsequently validated by cell-type-specific chromatin conformation capture experiments4. By combining multi-omic datasets (DNA methylation, chromatin contacts, and open chromatin) from single nuclei and annotating the regulatory genome of hundreds of cell types in the mouse brain, our DNA methylation atlas establishes the epigenetic basis for neuronal diversity and spatial organization throughout the mouse cerebrum.
Subject(s)
Brain/cytology , DNA Methylation , Epigenome , Epigenomics , Neurons/classification , Neurons/metabolism , Single-Cell Analysis , Animals , Atlases as Topic , Brain/metabolism , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Cytosine/chemistry , Cytosine/metabolism , Datasets as Topic , Dentate Gyrus/cytology , Enhancer Elements, Genetic/genetics , Gene Expression Profiling , Hippocampus/cytology , Hippocampus/metabolism , Male , Mice , Mice, Inbred C57BL , Models, Biological , Neural Pathways , Neurons/cytology
ABSTRACT
Neuronal cell types are classically defined by their molecular properties, anatomy and functions. Although recent advances in single-cell genomics have led to high-resolution molecular characterization of cell type diversity in the brain1, neuronal cell types are often studied out of the context of their anatomical properties. To improve our understanding of the relationship between molecular and anatomical features that define cortical neurons, here we combined retrograde labelling with single-nucleus DNA methylation sequencing to link neural epigenomic properties to projections. We examined 11,827 single neocortical neurons from 63 cortico-cortical and cortico-subcortical long-distance projections. Our results showed unique epigenetic signatures of projection neurons that correspond to their laminar and regional location and projection patterns. On the basis of their epigenomes, intra-telencephalic cells that project to different cortical targets could be further distinguished, and some layer 5 neurons that project to extra-telencephalic targets (L5 ET) formed separate clusters that aligned with their axonal projections. Such separation varied between cortical areas, which suggests that there are area-specific differences in L5 ET subtypes, which were further validated by anatomical studies. Notably, a population of cortico-cortical projection neurons clustered with L5 ET rather than intra-telencephalic neurons, which suggests that a population of L5 ET cortical neurons projects to both targets. We verified the existence of these neurons by dual retrograde labelling and anterograde tracing of cortico-cortical projection neurons, which revealed axon terminals in extra-telencephalic targets including the thalamus, superior colliculus and pons. These findings highlight the power of single-cell epigenomic approaches to connect the molecular properties of neurons with their anatomical and projection properties.
Subject(s)
Cerebral Cortex/cytology , Cerebral Cortex/metabolism , Epigenome , Epigenomics , Neural Pathways , Neurons/classification , Neurons/metabolism , Animals , Brain Mapping , Female , Male , Mice , Neurons/cytology
ABSTRACT
The primary motor cortex (M1) is essential for voluntary fine-motor control and is functionally conserved across mammals1. Here, using high-throughput transcriptomic and epigenomic profiling of more than 450,000 single nuclei in humans, marmoset monkeys and mice, we demonstrate a broadly conserved cellular makeup of this region, with similarities that mirror evolutionary distance and are consistent between the transcriptome and epigenome. The core conserved molecular identities of neuronal and non-neuronal cell types allow us to generate a cross-species consensus classification of cell types, and to infer conserved properties of cell types across species. Despite the overall conservation, however, many species-dependent specializations are apparent, including differences in cell-type proportions, gene expression, DNA methylation and chromatin state. Few cell-type marker genes are conserved across species, revealing a short list of candidate genes and regulatory mechanisms that are responsible for conserved features of homologous cell types, such as the GABAergic chandelier cells. This consensus transcriptomic classification allows us to use patch-seq (a combination of whole-cell patch-clamp recordings, RNA sequencing and morphological characterization) to identify corticospinal Betz cells from layer 5 in non-human primates and humans, and to characterize their highly specialized physiology and anatomy. These findings highlight the robust molecular underpinnings of cell-type diversity in M1 across mammals, and point to the genes and regulatory pathways responsible for the functional identity of cell types and their species-specific adaptations.
Subject(s)
Motor Cortex/cytology , Neurons/classification , Single-Cell Analysis , Animals , Atlases as Topic , Callithrix/genetics , Epigenesis, Genetic , Epigenomics , Female , GABAergic Neurons/cytology , GABAergic Neurons/metabolism , Gene Expression Profiling , Glutamates/metabolism , Humans , In Situ Hybridization, Fluorescence , Male , Mice , Middle Aged , Motor Cortex/anatomy & histology , Neurons/cytology , Neurons/metabolism , Organ Specificity , Phylogeny , Species Specificity , Transcriptome
ABSTRACT
Single-cell transcriptomics can provide quantitative molecular signatures for large, unbiased samples of the diverse cell types in the brain1-3. With the proliferation of multi-omics datasets, a major challenge is to validate and integrate results into a biological understanding of cell-type organization. Here we generated transcriptomes and epigenomes from more than 500,000 individual cells in the mouse primary motor cortex, a structure that has an evolutionarily conserved role in locomotion. We developed computational and statistical methods to integrate multimodal data and quantitatively validate cell-type reproducibility. The resulting reference atlas, containing over 56 neuronal cell types that are highly replicable across analysis methods, sequencing technologies and modalities, is a comprehensive molecular and genomic account of the diverse neuronal and non-neuronal cell types in the mouse primary motor cortex. The atlas includes a population of excitatory neurons that resemble pyramidal cells in layer 4 in other cortical regions4. We further discovered thousands of concordant marker genes and gene regulatory elements for these cell types. Our results highlight the complex molecular regulation of cell types in the brain and will directly enable the design of reagents to target specific cell types in the mouse primary motor cortex for functional analysis.
Subject(s)
Epigenomics , Gene Expression Profiling , Motor Cortex/cytology , Neurons/classification , Single-Cell Analysis , Transcriptome , Animals , Atlases as Topic , Datasets as Topic , Epigenesis, Genetic , Female , Male , Mice , Motor Cortex/anatomy & histology , Neurons/cytology , Neurons/metabolism , Organ Specificity , Reproducibility of Results
ABSTRACT
Cytosine DNA methylation is essential for mammalian development but understanding of its spatiotemporal distribution in the developing embryo remains limited1,2. Here, as part of the mouse Encyclopedia of DNA Elements (ENCODE) project, we profiled 168 methylomes from 12 mouse tissues or organs at 9 developmental stages from embryogenesis to adulthood. We identified 1,808,810 genomic regions that showed variations in CG methylation by comparing the methylomes of different tissues or organs from different developmental stages. These DNA elements predominantly lose CG methylation during fetal development, whereas the trend is reversed after birth. During late stages of fetal development, non-CG methylation accumulated within the bodies of key developmental transcription factor genes, coinciding with their transcriptional repression. Integration of genome-wide DNA methylation, histone modification and chromatin accessibility data enabled us to predict 461,141 putative developmental tissue-specific enhancers, the human orthologues of which were enriched for disease-associated genetic variants. These spatiotemporal epigenome maps provide a resource for studies of gene regulation during tissue or organ progression, and a starting point for investigating regulatory elements that are involved in human developmental disorders.
Subject(s)
DNA Methylation , Epigenome , Fetus/embryology , Fetus/metabolism , Animals , Animals, Newborn , Chromatin/genetics , Chromatin/metabolism , Disease/genetics , Down-Regulation , Enhancer Elements, Genetic/genetics , Epigenetic Repression , Female , Gene Silencing , Humans , Mice , Mice, Inbred C57BL , Models, Animal , Molecular Sequence Annotation , Polymorphism, Single Nucleotide , Spatio-Temporal Analysis
ABSTRACT
Like other complex multicellular organisms, plants are composed of different cell types with specialized shapes and functions. For example, most laminar leaves consist of multiple photosynthetic cell types. These cell types include the palisade mesophyll, which typically forms one or more cell layers on the adaxial side of the leaf. Despite their importance for photosynthesis, we know little about how palisade cells differ at the molecular level from other photosynthetic cell types. To address this gap, we have used a combination of cell-specific profiling using fluorescence-activated cell sorting and single-cell RNA-sequencing methods to generate a transcriptional blueprint of the palisade mesophyll in Arabidopsis thaliana leaves. We find that despite their unique morphology, palisade cells are otherwise transcriptionally similar to other photosynthetic cell types. Nevertheless, we show that some genes in the phenylpropanoid biosynthesis pathway both have palisade-enriched expression and are light-regulated. Phenylpropanoid gene activity in the palisade was required for production of the ultraviolet (UV)-B protectant sinapoylmalate, which may protect the palisade and/or other leaf cells against damaging UV light. These findings improve our understanding of how different photosynthetic cell types in the leaf can function uniquely to optimize leaf performance, despite their transcriptional similarities.
Subject(s)
Arabidopsis , Ultraviolet Rays , Light , Photosynthesis , Plant Leaves
ABSTRACT
Dynamic three-dimensional chromatin conformation is a critical mechanism for gene regulation during development and disease. Despite this, profiling of three-dimensional genome structure from complex tissues with cell-type specific resolution remains challenging. Recent efforts have demonstrated that cell-type specific epigenomic features can be resolved in complex tissues using single-cell assays. However, it remains unclear whether single-cell chromatin conformation capture (3C) or Hi-C profiles can effectively identify cell types and reconstruct cell-type specific chromatin conformation maps. To address these challenges, we have developed single-nucleus methyl-3C sequencing to capture chromatin organization and DNA methylation information and robustly separate heterogeneous cell types. Applying this method to >4,200 single human brain prefrontal cortex cells, we reconstruct cell-type specific chromatin conformation maps from 14 cortical cell types. These datasets reveal the genome-wide association between cell-type specific chromatin conformation and differential DNA methylation, suggesting pervasive interactions between epigenetic processes regulating gene expression.
Subject(s)
DNA Methylation , Genome, Human , Single-Cell Analysis , Algorithms , Chromatin/metabolism , Datasets as Topic , Epigenesis, Genetic , Gene Expression Regulation , Genome-Wide Association Study , Humans
ABSTRACT
The bacterium Agrobacterium tumefaciens has been the workhorse in plant genome engineering. Customized replacement of native tumor-inducing (Ti) plasmid elements enabled insertion of a sequence of interest called Transfer-DNA (T-DNA) into any plant genome. Although these transfer mechanisms are well understood, detailed understanding of structure and epigenomic status of insertion events was limited by current technologies. Here we applied two single-molecule technologies and analyzed Arabidopsis thaliana lines from three widely used T-DNA insertion collections (SALK, SAIL and WISC). Optical maps for four randomly selected T-DNA lines revealed between one and seven insertions/rearrangements, and the length of individual insertions ranged from 27 to 236 kilobases. De novo nanopore sequencing-based assemblies for two segregating lines partially resolved T-DNA structures and revealed multiple translocations and exchange of chromosome arm ends. For the current TAIR10 reference genome, nanopore contigs corrected 83% of non-centromeric misassemblies. The unprecedented contiguous nucleotide-level resolution enabled an in-depth study of the epigenome at T-DNA insertion sites. SALK_059379 line T-DNA insertions were enriched for 24nt small interfering RNAs (siRNA) and dense cytosine DNA methylation, resulting in transgene silencing via the RNA-directed DNA methylation pathway. In contrast, SAIL_232 line T-DNA insertions are predominantly targeted by 21/22nt siRNAs, with DNA methylation and silencing limited to a reporter, but not the resistance gene. Additionally, we profiled the H3K4me3, H3K27me3 and H2A.Z chromatin environments around T-DNA insertions using ChIP-seq in SALK_059379, SAIL_232 and five additional T-DNA lines. We discovered various effects ranging from complete loss of chromatin marks to the de novo incorporation of H2A.Z and trimethylation of H3K4 and H3K27 around the T-DNA integration sites.
This study provides new insights into the structural impact of inserting foreign fragments into plant genomes and demonstrates the utility of state-of-the-art long-range sequencing technologies to rapidly identify unanticipated genomic changes.
Subject(s)
DNA Methylation/genetics , DNA, Bacterial/genetics , DNA, Plant/genetics , Epigenesis, Genetic/genetics , Agrobacterium tumefaciens/genetics , Arabidopsis/genetics , Chromosome Mapping , Chromosomes, Plant/genetics , Genome, Plant/genetics , Mutagenesis, Insertional/genetics , Plant Tumor-Inducing Plasmids/genetics , Plants, Genetically Modified/genetics , Plants, Genetically Modified/growth & development , Transformation, Genetic
ABSTRACT
Understanding the diversity of human tissues is fundamental to disease and requires linking genetic information, which is identical in most of an individual's cells, with epigenetic mechanisms that could have tissue-specific roles. Surveys of DNA methylation in human tissues have established a complex landscape including both tissue-specific and invariant methylation patterns. Here we report high coverage methylomes that catalogue cytosine methylation in all contexts for the major human organ systems, integrated with matched transcriptomes and genomic sequence. By combining these diverse data types with each individual's phased genome, we identified widespread tissue-specific differential CG methylation (mCG), partially methylated domains, allele-specific methylation and transcription, and the unexpected presence of non-CG methylation (mCH) in almost all human tissues. mCH correlated with tissue-specific functions, and using this mark, we made novel predictions of genes that escape X-chromosome inactivation in specific tissues. Overall, DNA methylation in several genomic contexts varies substantially among human tissues.