ABSTRACT
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE [1] and Roadmap Epigenomics [2] data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9% and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
Subject(s)
DNA/genetics , Databases, Genetic , Genome/genetics , Genomics , Molecular Sequence Annotation , Registries , Regulatory Sequences, Nucleic Acid/genetics , Animals , Chromatin/genetics , Chromatin/metabolism , DNA/chemistry , DNA Footprinting , DNA Methylation/genetics , DNA Replication Timing , Deoxyribonuclease I/metabolism , Genome, Human , Histones/metabolism , Humans , Mice , Mice, Transgenic , RNA-Binding Proteins/genetics , Transcription, Genetic/genetics , Transposases/metabolism
ABSTRACT
Dysregulation of MLL complex-mediated histone methylation plays a pivotal role in gene expression associated with diseases, but little is known about cellular factors modulating MLL complex activity. Here, we report that SON, previously known as an RNA splicing factor, controls MLL complex-mediated transcriptional initiation. SON binds to DNA near transcription start sites, interacts with menin, and inhibits MLL complex assembly, resulting in decreased H3K4me3 and transcriptional repression. Importantly, alternatively spliced short isoforms of SON are markedly upregulated in acute myeloid leukemia. The short isoforms compete with full-length SON for chromatin occupancy but lack the menin-binding ability, thereby antagonizing full-length SON function in transcriptional repression while not impairing full-length SON-mediated RNA splicing. Furthermore, overexpression of a short isoform of SON enhances replating potential of hematopoietic progenitors. Our findings define SON as a fine-tuner of the MLL-menin interaction and reveal short SON overexpression as a marker indicating aberrant transcriptional initiation in leukemia.
Subject(s)
DNA-Binding Proteins/genetics , Histone-Lysine N-Methyltransferase/biosynthesis , Leukemia, Myeloid, Acute/genetics , Myeloid-Lymphoid Leukemia Protein/biosynthesis , Proto-Oncogene Proteins/genetics , Transcription, Genetic , Alternative Splicing/genetics , Cell Line, Tumor , Chromatin/genetics , DNA-Binding Proteins/biosynthesis , Gene Expression Regulation, Leukemic , Histone-Lysine N-Methyltransferase/genetics , Humans , Leukemia, Myeloid, Acute/pathology , Methylation , Minor Histocompatibility Antigens , Myeloid-Lymphoid Leukemia Protein/genetics , Protein Binding , Protein Isoforms/genetics , Proto-Oncogene Proteins/metabolism
ABSTRACT
As studies of DNA methylation increase in scope, it has become evident that methylation has a complex relationship with gene expression, plays an important role in defining cell types, and is disrupted in many diseases. We describe large-scale single-base resolution DNA methylation profiling on a diverse collection of 82 human cell lines and tissues using reduced representation bisulfite sequencing (RRBS). Analysis integrating RNA-seq and ChIP-seq data illuminates the functional role of this dynamic mark. Loci that are hypermethylated across cancer types are enriched for sites bound by NANOG in embryonic stem cells, which supports and expands the model of a stem/progenitor cell signature in cancer. CpGs that are hypomethylated across cancer types are concentrated in megabase-scale domains that occur near the telomeres and centromeres of chromosomes, are depleted of genes, and are enriched for cancer-specific EZH2 binding and H3K27me3 (repressive chromatin). In noncancer samples, there are cell-type specific methylation signatures preserved in primary cell lines and tissues as well as methylation differences induced by cell culture. The relationship between methylation and expression is context-dependent, and we find that CpG-rich enhancers bound by EP300 in the bodies of expressed genes are unmethylated despite the dense gene-body methylation surrounding them. Non-CpG cytosine methylation occurs in human somatic tissue, is particularly prevalent in brain tissue, and is reproducible across many individuals. This study provides an atlas of DNA methylation across diverse and well-characterized samples and enables new discoveries about DNA methylation and its role in gene regulation and disease.
Subject(s)
CpG Islands , DNA Methylation , Cell Line, Tumor , Chromatin , Gene Expression Profiling , Gene Expression Regulation , Humans , Oligonucleotide Array Sequence Analysis , Promoter Regions, Genetic , Sequence Alignment , Sequence Analysis, DNA , Sulfites/metabolism
ABSTRACT
ALS results from the selective and progressive degeneration of motor neurons. Although the underlying disease mechanisms remain unknown, glial cells have been implicated in ALS disease progression. Here, we examine the effects of glial cell/motor neuron interactions on gene expression using the hSOD1(G93A) (the G93A allele of the human superoxide dismutase gene) mouse model of ALS. We detect striking cell autonomous and nonautonomous changes in gene expression in cocultured motor neurons and glia, revealing that the two cell types profoundly affect each other. In addition, we found a remarkable concordance between the cell culture data and expression profiles of whole spinal cords and acutely isolated spinal cord cells during disease progression in the G93A mouse model, providing validation of the cell culture approach. Bioinformatics analyses identified changes in the expression of specific genes and signaling pathways that may contribute to motor neuron degeneration in ALS, among which are TGF-β signaling pathways.
Subject(s)
Amyotrophic Lateral Sclerosis/pathology , Astrocytes/pathology , Motor Neurons/pathology , Animals , Disease Models, Animal , Gene Expression , Humans , Mice , Proteoglycans/metabolism , Receptors, Transforming Growth Factor beta/metabolism , Spinal Cord/enzymology , Spinal Cord/metabolism , Superoxide Dismutase/genetics , Superoxide Dismutase/metabolism , Up-Regulation
ABSTRACT
Chromatin immunoprecipitation coupled with DNA sequencing (ChIP-seq) is the major contemporary method for mapping in vivo protein-DNA interactions in the genome. It identifies sites of transcription factor, cofactor and RNA polymerase occupancy, as well as the distribution of histone marks. Consortia such as the ENCyclopedia Of DNA Elements (ENCODE) have produced large datasets using manual protocols. However, future measurements of hundreds of additional factors in many cell types and physiological states call for higher throughput and consistency afforded by automation. Such automation advances, when provided by multiuser facilities, could also improve the quality and efficiency of individual small-scale projects. The immunoprecipitation process has become rate-limiting, and is a source of substantial variability when performed manually. Here we report a fully automated robotic ChIP (R-ChIP) pipeline that allows up to 96 reactions. A second bottleneck is the dearth of renewable ChIP-validated immune reagents, which do not yet exist for most mammalian transcription factors. We used R-ChIP to screen new mouse monoclonal antibodies raised against p300, a histone acetylase, well-known as a marker of active enhancers, for which ChIP-competent monoclonal reagents have been lacking. We identified, validated for ChIP-seq, and made publicly available a monoclonal reagent called ENCITp300-1.