ABSTRACT
Current catalogs of regulatory sequences in the human genome are still incomplete and lack cell type resolution. To profile the activity of gene regulatory elements in diverse cell types and tissues in the human body, we applied single-cell chromatin accessibility assays to 30 adult human tissue types from multiple donors. We integrated these datasets with previous single-cell chromatin accessibility data from 15 fetal tissue types to reveal the status of open chromatin for ~1.2 million candidate cis-regulatory elements (cCREs) in 222 distinct cell types comprising >1.3 million nuclei. We used these chromatin accessibility maps to delineate cell-type-specificity of fetal and adult human cCREs and to systematically interpret the noncoding variants associated with complex human traits and diseases. This rich resource provides a foundation for the analysis of gene regulatory programs in human cell types across tissues, life stages, and organ systems.
Subject(s)
Chromatin/metabolism , Human Genome , Single-Cell Analysis , Adult , Cluster Analysis , Fetus/metabolism , Genetic Variation , Genome-Wide Association Study , Humans , Organ Specificity , Phylogeny , Nucleic Acid Regulatory Sequences/genetics , Risk Factors
ABSTRACT
Spalt-like transcription factor 1 (SALL1) is a critical regulator of organogenesis and microglia identity. Here we demonstrate that disruption of a conserved microglia-specific super-enhancer interacting with the Sall1 promoter results in complete and specific loss of Sall1 expression in microglia. By determining the genomic binding sites of SALL1 and leveraging Sall1 enhancer knockout mice, we provide evidence for functional interactions between SALL1 and SMAD4 required for microglia-specific gene expression. SMAD4 binds directly to the Sall1 super-enhancer and is required for Sall1 expression, consistent with an evolutionarily conserved requirement of the TGFβ and SMAD homologs Dpp and Mad for cell-specific expression of Spalt in the Drosophila wing. Unexpectedly, SALL1 in turn promotes binding and function of SMAD4 at microglia-specific enhancers while simultaneously suppressing binding of SMAD4 to enhancers of genes that become inappropriately activated in enhancer knockout microglia, thereby enforcing microglia-specific functions of the TGFβ-SMAD signaling axis.
Subject(s)
Microglia , Transcription Factors , Animals , Mice , Binding Sites , DNA , Knockout Mice , Microglia/metabolism , Genetic Promoter Regions/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Transforming Growth Factor beta/metabolism
ABSTRACT
Non-coding genetic variation is a major driver of phenotypic diversity and allows the investigation of mechanisms that control gene expression. Here, we systematically investigated the effects of >50 million variations from five strains of mice on mRNA, nascent transcription, transcription start sites, and transcription factor binding in resting and activated macrophages. We observed substantial differences associated with distinct molecular pathways. Evaluating genetic variation provided evidence for roles of ~100 TFs in shaping lineage-determining factor binding. Unexpectedly, a substantial fraction of strain-specific factor binding could not be explained by local mutations. Integration of genomic features with chromatin interaction data provided evidence for hundreds of connected cis-regulatory domains associated with differences in transcription factor binding and gene expression. This system and the >250 datasets establish a substantial new resource for investigation of how genetic variation affects cellular phenotypes.
Subject(s)
Genetic Variation , Macrophages/metabolism , Transcription Factors/metabolism , Animals , Binding Sites , Bone Marrow Cells/cytology , CCAAT-Enhancer-Binding Protein-beta/genetics , CCAAT-Enhancer-Binding Protein-beta/metabolism , Cluster Analysis , Genetic Enhancer Elements/genetics , Female , Gene Expression Regulation/drug effects , Lipopolysaccharides/pharmacology , Macrophages/cytology , Macrophages/drug effects , Male , Mice , Inbred BALB C Mice , Inbred C57BL Mice , Inbred NOD Mice , Genetic Promoter Regions , Protein Binding , Proto-Oncogene Proteins/genetics , Proto-Oncogene Proteins/metabolism , Trans-Activators/genetics , Trans-Activators/metabolism , Transcription Factors/genetics
ABSTRACT
Histone H3 lysine 4 mono-methylation (H3K4me1) marks poised or active enhancers. KMT2C (MLL3) and KMT2D (MLL4) catalyze H3K4me1, but their histone methyltransferase activities are largely dispensable for transcription during early embryogenesis in mammals. To better understand the role of H3K4me1 in enhancer function, we analyzed dynamic enhancer-promoter (E-P) interactions and gene expression during neural differentiation of mouse embryonic stem cells. We found that KMT2C/D catalytic activities were required for H3K4me1 and E-P contacts only at a subset of candidate enhancers that are induced upon neural differentiation. By contrast, a majority of enhancers retained H3K4me1 in KMT2C/D catalytic mutant cells. Surprisingly, H3K4me1 signals at these KMT2C/D-independent sites were reduced after acute depletion of KMT2B, resulting in aggravated transcriptional defects. Our observations therefore implicate KMT2B in the catalysis of H3K4me1 at enhancers and provide additional support for an active role of H3K4me1 in enhancer-promoter interactions and transcription in mammalian cells.
Subject(s)
Cell Differentiation , Genetic Enhancer Elements , Histone-Lysine N-Methyltransferase , Histones , Lysine/analogs & derivatives , Mouse Embryonic Stem Cells , Genetic Promoter Regions , Animals , Mice , Histones/metabolism , Histones/genetics , Histone-Lysine N-Methyltransferase/genetics , Histone-Lysine N-Methyltransferase/metabolism , Mouse Embryonic Stem Cells/metabolism , Mouse Embryonic Stem Cells/cytology , Transcriptional Activation , Methylation , Developmental Gene Expression Regulation , Myeloid-Lymphoid Leukemia Protein/metabolism , Myeloid-Lymphoid Leukemia Protein/genetics , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/genetics
ABSTRACT
Animal development depends on not only the linear genome sequence that embeds millions of cis-regulatory elements, but also the three-dimensional (3D) chromatin architecture that orchestrates the interplay between cis-regulatory elements and their target genes. Compared to our knowledge of the cis-regulatory sequences, the understanding of the 3D genome organization in human and other eukaryotes is still limited. Recent advances in technologies to map the 3D genome architecture have greatly accelerated the pace of discovery. Here, we review emerging concepts of chromatin organization in mammalian cells, discuss the dynamics of chromatin conformation during development, and highlight important roles for chromatin organization in cancer and other human diseases.
Subject(s)
Genome , Mammals/genetics , Animals , Disease/genetics , Gene Expression Regulation , Humans , Neoplasms/genetics
ABSTRACT
The four-dimensional nucleome (4DN) consortium studies the architecture of the genome and the nucleus in space and time. We summarize progress by the consortium and highlight the development of technologies for (1) mapping genome folding and identifying roles of nuclear components and bodies, proteins, and RNA, (2) characterizing nuclear organization with time or single-cell resolution, and (3) imaging of nuclear organization. With these tools, the consortium has provided over 2,000 public datasets. Integrative computational models based on these data are starting to reveal connections between genome structure and function. We then present a forward-looking perspective and outline current aims to (1) delineate dynamics of nuclear architecture at different timescales, from minutes to weeks as cells differentiate, in populations and in single cells, (2) characterize cis-determinants and trans-modulators of genome organization, (3) test functional consequences of changes in cis- and trans-regulators, and (4) develop predictive models of genome structure and function.
Subject(s)
Cell Nucleus , Genome , Genome/genetics , Cell Nucleus/genetics , Cell Nucleus/metabolism , Chromatin/metabolism
ABSTRACT
Structural variations are common in the human genome, but their contributions to human diseases have been hard to define. Lupiáñez et al. demonstrate that some structural variants can interrupt chromatin topology, resulting in ectopic enhancer-promoter interactions, altered spatiotemporal gene expression patterns, and developmental disorders.
Subject(s)
Animal Disease Models , Genetic Enhancer Elements , Gene Expression Regulation , Animals , Humans
ABSTRACT
CTCF and the associated cohesin complex play a central role in insulator function and higher-order chromatin organization of mammalian genomes. Recent studies identified a correlation between the orientation of CTCF-binding sites (CBSs) and chromatin loops. To test the functional significance of this observation, we combined CRISPR/Cas9-based genomic-DNA-fragment editing with chromosome-conformation-capture experiments to show that the location and relative orientations of CBSs determine the specificity of long-range chromatin looping in mammalian genomes, using protocadherin (Pcdh) and β-globin as model genes. Inversion of CBS elements within the Pcdh enhancer reconfigures the topology of chromatin loops between the distal enhancer and target promoters and alters gene-expression patterns. Thus, although enhancers can function in an orientation-independent manner in reporter assays, in the native chromosome context, the orientation of at least some enhancers carrying CBSs can determine both the architecture of topological chromatin domains and enhancer/promoter specificity. These findings reveal how 3D chromosome architecture can be encoded by linear genome sequences.
Subject(s)
Chromosomes/metabolism , Genetic Techniques , Repressor Proteins/metabolism , Animals , Binding Sites , CCCTC-Binding Factor , Cadherins/genetics , Cell Cycle Proteins/metabolism , Non-Histone Chromosomal Proteins/metabolism , Chromosomes/chemistry , Clustered Regularly Interspaced Short Palindromic Repeats , DNA/chemistry , Genetic Enhancer Elements , Gene Expression , Human Genome , Humans , K562 Cells , Mice , Genetic Promoter Regions , beta-Globins/genetics , Cohesins
ABSTRACT
The heart, which is the first organ to develop, is highly dependent on its form to function [1,2]. However, how diverse cardiac cell types spatially coordinate to create the complex morphological structures that are crucial for heart function remains unclear. Here we integrated single-cell RNA-sequencing with high-resolution multiplexed error-robust fluorescence in situ hybridization to resolve the identity of the cardiac cell types that develop the human heart. This approach also provided a spatial mapping of individual cells that enables illumination of their organization into cellular communities that form distinct cardiac structures. We discovered that many of these cardiac cell types further specified into subpopulations exclusive to specific communities, which support their specialization according to the cellular ecosystem and anatomical region. In particular, ventricular cardiomyocyte subpopulations displayed an unexpected complex laminar organization across the ventricular wall and formed, with other cell subpopulations, several cellular communities. Interrogating cell-cell interactions within these communities using in vivo conditional genetic mouse models and in vitro human pluripotent stem cell systems revealed multicellular signalling pathways that orchestrate the spatial organization of cardiac cell subpopulations during ventricular wall morphogenesis. These detailed findings into the cellular social interactions and specialization of cardiac cell types constructing and remodelling the human heart offer new insights into structural heart diseases and the engineering of complex multicellular tissues for human heart repair.
Subject(s)
Body Patterning , Heart , Myocardium , Animals , Humans , Mice , Heart/anatomy & histology , Heart/embryology , Heart Diseases/metabolism , Heart Diseases/pathology , Heart Ventricles/anatomy & histology , Heart Ventricles/cytology , Heart Ventricles/embryology , Fluorescence In Situ Hybridization , Animal Models , Myocardium/cytology , Cardiac Myocytes/cytology , Cardiac Myocytes/metabolism , Single-Cell Gene Expression Analysis
ABSTRACT
Cell type-specific gene expression patterns and dynamics during development or in disease are controlled by cis-regulatory elements (CREs), such as promoters and enhancers. Distinct classes of CREs can be characterized by their epigenomic features, including DNA methylation, chromatin accessibility, combinations of histone modifications and conformation of local chromatin. Tremendous progress has been made in cataloguing CREs in the human genome using bulk transcriptomic and epigenomic methods. However, single-cell epigenomic and multi-omic technologies have the potential to provide deeper insight into cell type-specific gene regulatory programmes as well as into how they change during development, in response to environmental cues and through disease pathogenesis. Here, we highlight recent advances in single-cell epigenomic methods and analytical tools and discuss their readiness for human tissue profiling.
Subject(s)
Epigenomics , Nucleic Acid Regulatory Sequences , Humans , Chromatin/genetics , Genetic Promoter Regions , DNA Methylation
ABSTRACT
Genome-wide association studies (GWAS) have linked hundreds of thousands of sequence variants in the human genome to common traits and diseases. However, translating this knowledge into a mechanistic understanding of disease-relevant biology remains challenging, largely because such variants are predominantly in non-protein-coding sequences that still lack functional annotation at cell-type resolution. Recent advances in single-cell epigenomics assays have enabled the generation of cell type-, subtype- and state-resolved maps of the epigenome in heterogeneous human tissues. These maps have facilitated cell type-specific annotation of candidate cis-regulatory elements and their gene targets in the human genome, enhancing our ability to interpret the genetic basis of common traits and diseases.
Subject(s)
Epigenomics , Genome-Wide Association Study , Humans , Nucleic Acid Regulatory Sequences , Human Genome , Phenotype , Single Nucleotide Polymorphism
ABSTRACT
Cytosine DNA methylation is essential in brain development and is implicated in various neurological disorders. Understanding DNA methylation diversity across the entire brain in a spatial context is fundamental for a complete molecular atlas of brain cell types and their gene regulatory landscapes. Here we used single-nucleus methylome sequencing (snmC-seq3) and multi-omic sequencing (snm3C-seq) [1] technologies to generate 301,626 methylomes and 176,003 chromatin conformation-methylome joint profiles from 117 dissected regions throughout the adult mouse brain. Using iterative clustering and integrating with companion whole-brain transcriptome and chromatin accessibility datasets, we constructed a methylation-based cell taxonomy with 4,673 cell groups and 274 cross-modality-annotated subclasses. We identified 2.6 million differentially methylated regions across the genome that represent potential gene regulation elements. Notably, we observed spatial cytosine methylation patterns on both genes and regulatory elements in cell types within and across brain regions. Brain-wide spatial transcriptomics data validated the association of spatial epigenetic diversity with transcription and improved the anatomical mapping of our epigenetic datasets. Furthermore, chromatin conformation diversities occurred in important neuronal genes and were highly associated with DNA methylation and transcription changes. Brain-wide cell-type comparisons enabled the construction of regulatory networks that incorporate transcription factors, regulatory elements and their potential downstream gene targets. Finally, intragenic DNA methylation and chromatin conformation patterns predicted alternative gene isoform expression observed in a whole-brain SMART-seq2 dataset. Our study establishes a brain-wide, single-cell DNA methylome and 3D multi-omic atlas and provides a valuable resource for comprehending the cellular-spatial and regulatory genome diversity of the mouse brain.
Subject(s)
Brain , DNA Methylation , Epigenome , Multiomics , Single-Cell Analysis , Animals , Mice , Brain/cytology , Brain/metabolism , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Cytosine/metabolism , Datasets as Topic , Transcription Factors/metabolism , Genetic Transcription
ABSTRACT
Divergence of cis-regulatory elements drives species-specific traits [1], but how this manifests in the evolution of the neocortex at the molecular and cellular level remains unclear. Here we investigated the gene regulatory programs in the primary motor cortex of human, macaque, marmoset and mouse using single-cell multiomics assays, generating gene expression, chromatin accessibility, DNA methylome and chromosomal conformation profiles from a total of over 200,000 cells. From these data, we show evidence that divergence of transcription factor expression corresponds to species-specific epigenome landscapes. We find that conserved and divergent gene regulatory features are reflected in the evolution of the three-dimensional genome. Transposable elements contribute to nearly 80% of the human-specific candidate cis-regulatory elements in cortical cells. Through machine learning, we develop sequence-based predictors of candidate cis-regulatory elements in different species and demonstrate that the genomic regulatory syntax is highly preserved from rodents to primates. Finally, we show that epigenetic conservation combined with sequence similarity helps to uncover functional cis-regulatory elements and enhances our ability to interpret genetic variants contributing to neurological disease and traits.
Subject(s)
Conserved Sequence , Molecular Evolution , Gene Expression Regulation , Gene Regulatory Networks , Mammals , Neocortex , Animals , Humans , Mice , Callithrix/genetics , Chromatin/genetics , Chromatin/metabolism , Conserved Sequence/genetics , DNA Methylation , DNA Transposable Elements/genetics , Epigenome , Gene Expression Regulation/genetics , Macaca/genetics , Mammals/genetics , Motor Cortex/cytology , Motor Cortex/metabolism , Multiomics , Neocortex/cytology , Neocortex/metabolism , Nucleic Acid Regulatory Sequences/genetics , Single-Cell Analysis , Transcription Factors/metabolism , Genetic Variation/genetics
ABSTRACT
Recent advances in single-cell technologies have led to the discovery of thousands of brain cell types; however, our understanding of the gene regulatory programs in these cell types is far from complete [1-4]. Here we report a comprehensive atlas of candidate cis-regulatory DNA elements (cCREs) in the adult mouse brain, generated by analysing chromatin accessibility in 2.3 million individual brain cells from 117 anatomical dissections. The atlas includes approximately 1 million cCREs and their chromatin accessibility across 1,482 distinct brain cell populations, adding over 446,000 cCREs to the most recent such annotation in the mouse genome. The mouse brain cCREs are moderately conserved in the human brain. The mouse-specific cCREs, specifically those identified from a subset of cortical excitatory neurons, are strongly enriched for transposable elements, suggesting a potential role for transposable elements in the emergence of new regulatory programs and neuronal diversity. Finally, we infer the gene regulatory networks in over 260 subclasses of mouse brain cells and develop deep-learning models to predict the activities of gene regulatory elements in different brain cell types from the DNA sequence alone. Our results provide a resource for the analysis of cell-type-specific gene regulation programs in both mouse and human brains.
Subject(s)
Brain , Chromatin , Single-Cell Analysis , Animals , Humans , Mice , Brain/cytology , Brain/metabolism , Cerebral Cortex/cytology , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Deep Learning , DNA Transposable Elements/genetics , Gene Regulatory Networks/genetics , Neurons/metabolism
ABSTRACT
Chromosomes of eukaryotes adopt highly dynamic and complex hierarchical structures in the nucleus. The three-dimensional (3D) organization of chromosomes profoundly affects DNA replication, transcription and the repair of DNA damage. Thus, a thorough understanding of nuclear architecture is fundamental to the study of nuclear processes in eukaryotic cells. Recent years have seen rapid proliferation of technologies to investigate genome organization and function. Here, we review experimental and computational methodologies for 3D genome analysis, with special focus on recent advances in high-throughput chromatin conformation capture (3C) techniques and data analysis.
Subject(s)
Chromatin/ultrastructure , Animals , Chromosome Mapping , Chromosomes/ultrastructure , Computer Simulation , Humans , Molecular Models
ABSTRACT
As the second dimension to the genome, the epigenome contains key information specific to every cell type. Thousands of human epigenome maps have been produced in recent years thanks to the rapid development of high-throughput epigenome mapping technologies. In this review, we discuss the current epigenome mapping toolkit and the utility of epigenome maps. We focus particularly on mapping of DNA methylation, chromatin modification state, and chromatin structures, and emphasize the use of epigenome maps to delineate human gene regulatory sequences and developmental programs. We also provide a perspective on the progress of the epigenomics field and challenges ahead.
Subject(s)
Genetic Epigenesis , Epigenomics/methods , Human Genome , Animals , Chromatin/chemistry , DNA Methylation , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing/methods , Humans , DNA Sequence Analysis/methods
ABSTRACT
Epigenetic mechanisms have been proposed to play crucial roles in mammalian development, but their precise functions are only partially understood. To investigate epigenetic regulation of embryonic development, we differentiated human embryonic stem cells into mesendoderm, neural progenitor cells, trophoblast-like cells, and mesenchymal stem cells and systematically characterized DNA methylation, chromatin modifications, and the transcriptome in each lineage. We found that promoters that are active in early developmental stages tend to be CG rich and mainly engage H3K27me3 upon silencing in nonexpressing lineages. By contrast, promoters for genes expressed preferentially at later stages are often CG poor and primarily employ DNA methylation upon repression. Interestingly, the early developmental regulatory genes are often located in large genomic domains that are generally devoid of DNA methylation in most lineages, which we termed DNA methylation valleys (DMVs). Our results suggest that distinct epigenetic mechanisms regulate early and late stages of ES cell differentiation.
Subject(s)
DNA Methylation , Embryonic Stem Cells/metabolism , Epigenomics , Developmental Gene Expression Regulation , Animals , Cell Differentiation , Chromatin/metabolism , CpG Islands , Embryonic Stem Cells/cytology , Histones/metabolism , Humans , Methylation , Neoplasms/genetics , Genetic Promoter Regions , Zebrafish/embryology
ABSTRACT
Single-cell omics technologies have revolutionized the study of gene regulation in complex tissues. A major computational challenge in analyzing these datasets is to project the large-scale and high-dimensional data into low-dimensional space while retaining the relative relationships between cells. This low dimension embedding is necessary to decompose cellular heterogeneity and reconstruct cell-type-specific gene regulatory programs. Traditional dimensionality reduction techniques, however, face challenges in computational efficiency and in comprehensively addressing cellular diversity across varied molecular modalities. Here we introduce a nonlinear dimensionality reduction algorithm, embodied in the Python package SnapATAC2, which not only achieves a more precise capture of single-cell omics data heterogeneities but also ensures efficient runtime and memory usage, scaling linearly with the number of cells. Our algorithm demonstrates exceptional performance, scalability and versatility across diverse single-cell omics datasets, including single-cell assay for transposase-accessible chromatin using sequencing, single-cell RNA sequencing, single-cell Hi-C and single-cell multi-omics datasets, underscoring its utility in advancing single-cell analysis.
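To make the idea of projecting cells into a low-dimensional space concrete, the following is a minimal sketch of a generic spectral embedding built from a cosine-similarity graph. It is not SnapATAC2's implementation or API: the function names `cosine_affinity` and `spectral_axis`, the toy input, and the use of dense power iteration are all assumptions made for illustration, and this dense version scales quadratically rather than linearly with the number of cells.

```python
import math

def cosine_affinity(cells):
    """Pairwise cosine similarity between cells, with a zero diagonal."""
    norms = [math.sqrt(sum(v * v for v in c)) or 1.0 for c in cells]
    n = len(cells)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(cells[i], cells[j]))
            W[i][j] = W[j][i] = dot / (norms[i] * norms[j])
    return W

def spectral_axis(cells, n_iter=200):
    """First non-trivial spectral coordinate of each cell (toy version)."""
    W = cosine_affinity(cells)
    n = len(W)
    deg = [sum(row) or 1.0 for row in W]
    # Symmetric normalisation S = D^-1/2 W D^-1/2. Its top eigenvector is
    # proportional to sqrt(deg) and carries no cluster information.
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    total = math.sqrt(sum(deg))
    trivial = [math.sqrt(d) / total for d in deg]
    v = [float(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(n_iter):
        # Remove the trivial component, then apply (S + I) / 2 so that all
        # eigenvalues are non-negative and power iteration finds the largest.
        proj = sum(a * t for a, t in zip(v, trivial))
        v = [a - proj * t for a, t in zip(v, trivial)]
        v = [0.5 * (sum(S[i][j] * v[j] for j in range(n)) + v[i])
             for i in range(n)]
        norm = math.sqrt(sum(a * a for a in v)) or 1.0
        v = [a / norm for a in v]
    return v
```

On a toy matrix with two groups of cells that share only one open feature, the returned coordinate takes opposite signs in the two groups, which is the behaviour a clustering step would rely on.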
Subject(s)
Algorithms , Chromatin , Single-Cell Analysis/methods
ABSTRACT
Differential methylation of the two parental genomes in placental mammals is essential for genomic imprinting and embryogenesis. To systematically study this epigenetic process, we have generated a base-resolution, allele-specific DNA methylation (ASM) map in the mouse genome. We find parent-of-origin dependent (imprinted) ASM at 1,952 CG dinucleotides. These imprinted CGs form 55 discrete clusters including virtually all known germline differentially methylated regions (DMRs) and 23 previously unknown DMRs, with some occurring at microRNA genes. We also identify sequence-dependent ASM at 131,765 CGs. Interestingly, methylation at these sites exhibits a strong dependence on the immediate adjacent bases, allowing us to define a conserved sequence preference for the mammalian DNA methylation machinery. Finally, we report a surprising presence of non-CG methylation in the adult mouse brain, with some showing evidence of imprinting. Our results provide a resource for understanding the mechanisms of imprinting and allele-specific gene expression in mammalian cells.
Subject(s)
Cerebral Cortex/metabolism , DNA Methylation , Genomic Imprinting , Alleles , Animals , CpG Islands , Female , Genome-Wide Association Study , Male , Mice , 129 Strain Mice
ABSTRACT
The study of 5-hydroxymethylcytosine (5hmC) has been hampered by the lack of a method to map it at single-base resolution on a genome-wide scale. Affinity purification-based methods cannot precisely locate 5hmC nor accurately determine its relative abundance at each modified site. Here we present a genome-wide approach, Tet-assisted bisulfite sequencing (TAB-Seq), that when combined with traditional bisulfite sequencing can be used for mapping 5hmC at base resolution and quantifying the relative abundance of 5hmC as well as 5mC. Application of this method to embryonic stem cells not only confirms widespread distribution of 5hmC in the mammalian genome but also reveals sequence bias and strand asymmetry at 5hmC sites. We observe high levels of 5hmC and reciprocally low levels of 5mC near but not on transcription factor-binding sites. Additionally, the relative abundance of 5hmC varies significantly among distinct functional sequence elements, suggesting different mechanisms for 5hmC deposition and maintenance.
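The combination of the two assays can be illustrated with a small arithmetic sketch. It assumes the standard reading of each assay: traditional bisulfite sequencing reports 5mC and 5hmC together as unconverted cytosine, whereas TAB-Seq reports only 5hmC, so 5mC is estimated by subtraction. The function name `estimate_5mc_5hmc` and the read counts are hypothetical, and the sketch ignores non-conversion and protection-efficiency corrections that a real analysis would need.

```python
def estimate_5mc_5hmc(bs_c, bs_total, tab_c, tab_total):
    """Estimate (5mC, 5hmC) fractions at a single cytosine.

    bs_c / bs_total   : unconverted C reads and coverage in bisulfite-seq
    tab_c / tab_total : unconverted C reads and coverage in TAB-Seq
    """
    if bs_total <= 0 or tab_total <= 0:
        raise ValueError("coverage must be positive in both assays")
    modified = bs_c / bs_total   # bisulfite-seq signal: 5mC + 5hmC
    hmc = tab_c / tab_total      # TAB-Seq signal: 5hmC only
    # Sampling noise can make the difference negative, so clamp at zero.
    mc = max(modified - hmc, 0.0)
    return mc, hmc

# Hypothetical site: 16 of 20 reads unconverted in bisulfite-seq,
# 3 of 20 reads unconverted in TAB-Seq.
mc, hmc = estimate_5mc_5hmc(16, 20, 3, 20)
```

Clamping at zero is a design choice for low-coverage sites; a statistical model of both read counts would be preferable when quantifying abundance across many sites.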