ABSTRACT
The basic body plan and major physiological axes have been highly conserved during mammalian evolution, yet only a small fraction of the human genome sequence appears to be subject to evolutionary constraint. To quantify cis- versus trans-acting contributions to mammalian regulatory evolution, we performed genomic DNase I footprinting of the mouse genome across 25 cell and tissue types, collectively defining ~8.6 million transcription factor (TF) occupancy sites at nucleotide resolution. Here we show that mouse TF footprints conjointly encode a regulatory lexicon that is ~95% similar to that derived from human TF footprints. However, only ~20% of mouse TF footprints have human orthologues. Despite substantial turnover of the cis-regulatory landscape, nearly half of all pairwise regulatory interactions connecting mouse TF genes have been maintained in orthologous human cell types through evolutionary innovation of TF recognition sequences. Furthermore, the higher-level organization of mouse TF-to-TF connections into cellular network architectures is nearly identical to that of human. Our results indicate that evolutionary selection on mammalian gene regulation is targeted chiefly at the level of trans-regulatory circuitry, enabling and potentiating cis-regulatory plasticity.
Subject(s)
Conserved Sequence/genetics; Evolution, Molecular; Mammals/genetics; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/genetics; Transcription Factors/metabolism; Animals; DNA Footprinting; Gene Expression Regulation, Developmental/genetics; Gene Regulatory Networks/genetics; Humans; Mice
ABSTRACT
The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
Subject(s)
Genome/genetics; Genomics; Mice/genetics; Molecular Sequence Annotation; Animals; Cell Lineage/genetics; Chromatin/genetics; Chromatin/metabolism; Conserved Sequence/genetics; DNA Replication/genetics; Deoxyribonuclease I/metabolism; Gene Expression Regulation/genetics; Gene Regulatory Networks/genetics; Genome-Wide Association Study; Humans; RNA/genetics; Regulatory Sequences, Nucleic Acid/genetics; Species Specificity; Transcription Factors/metabolism; Transcriptome/genetics
ABSTRACT
Regulatory factor binding to genomic DNA protects the underlying sequence from cleavage by DNase I, leaving nucleotide-resolution footprints. Using genomic DNase I footprinting across 41 diverse cell and tissue types, we detected 45 million transcription factor occupancy events within regulatory regions, representing differential binding to 8.4 million distinct short sequence elements. Here we show that this small genomic sequence compartment, roughly twice the size of the exome, encodes an expansive repertoire of conserved recognition sequences for DNA-binding proteins that nearly doubles the size of the human cis-regulatory lexicon. We find that genetic variants affecting allelic chromatin states are concentrated in footprints, and that these elements are preferentially sheltered from DNA methylation. High-resolution DNase I cleavage patterns mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein-DNA interfaces, indicating that transcription factor structure has been evolutionarily imprinted on the human genome sequence. We identify a stereotyped 50-base-pair footprint that precisely defines the site of transcript origination within thousands of human promoters. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation and pluripotency.
Subject(s)
DNA Footprinting; DNA/genetics; Encyclopedias as Topic; Genome, Human/genetics; Molecular Sequence Annotation; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/metabolism; DNA Methylation; DNA-Binding Proteins/metabolism; Deoxyribonuclease I/metabolism; Genomic Imprinting; Genomics; Humans; Polymorphism, Single Nucleotide/genetics; Transcription Initiation Site
ABSTRACT
BACKGROUND: An improved understanding of the regulation of the fetal hemoglobin genes holds promise for the development of targeted therapeutic approaches for fetal hemoglobin induction in the β-hemoglobinopathies. Although recent studies have uncovered trans-acting factors necessary for this regulation, limited insight has been gained into the cis-regulatory elements involved. METHODS: We identified three families with unusual patterns of hemoglobin expression, suggestive of deletions in the locus of the β-globin gene (β-globin locus). We performed array comparative genomic hybridization to map these deletions and confirmed breakpoints by means of polymerase-chain-reaction assays and DNA sequencing. We compared these deletions, along with previously mapped deletions, and studied the trans-acting factors binding to these sites in the β-globin locus by using chromatin immunoprecipitation. RESULTS: We found a new (δβ)⁰-thalassemia deletion and a rare hereditary persistence of fetal hemoglobin deletion with identical downstream breakpoints. Comparison of the two deletions resulted in the identification of a small intergenic region required for γ-globin (fetal hemoglobin) gene silencing. We mapped a Kurdish β⁰-thalassemia deletion, which retains the required intergenic region, deletes other surrounding sequences, and maintains fetal hemoglobin silencing. By comparing these deletions and other previously mapped deletions, we elucidated a 3.5-kb intergenic region near the 5' end of the δ-globin gene that is necessary for γ-globin silencing. We found that a critical fetal hemoglobin silencing factor, BCL11A, and its partners bind within this region in the chromatin of adult erythroid cells. CONCLUSIONS: By studying three families with unusual deletions in the β-globin locus, we identified an intergenic region near the δ-globin gene that is necessary for fetal hemoglobin silencing. (Funded by the National Institutes of Health and others.)
Subject(s)
Fetal Hemoglobin/genetics; Gene Expression Regulation; beta-Globins/genetics; beta-Thalassemia/genetics; Adult; Child; Chromatin Assembly and Disassembly; Female; Gene Deletion; Gene Silencing; Humans; Male; Pedigree; Phenotype; Trans-Activators
ABSTRACT
The β-globin locus control region (LCR) is necessary for high-level β-globin gene transcription and differentiation-dependent relocation of the β-globin locus from the nuclear periphery to the central nucleoplasm and to foci of hyperphosphorylated Pol II "transcription factories" (TFys). To determine the contribution of individual LCR DNaseI hypersensitive sites (HSs) to transcription and nuclear location, in the present study, we compared β-globin gene activity and location in erythroid cells derived from mice with deletions of individual HSs, deletions of 2 HSs, and deletion of the whole LCR and found all of the HSs had a similar spectrum of activities, albeit to different degrees. Each HS acts as an independent module to activate expression in an additive manner, and this is correlated with relocation away from the nuclear periphery. In contrast, HSs have redundant activities with respect to association with TFys and the probability that an allele is actively transcribed, as measured by primary RNA transcript FISH. The limiting effect on RNA levels occurs after β-globin genes associate with TFys, at which time HSs contribute to the amount of RNA arising from each burst of transcription by stimulating transcriptional elongation.
Subject(s)
Cell Nucleus/metabolism; Locus Control Region/genetics; Nucleoplasmins/metabolism; Transcription, Genetic/physiology; beta-Globins/genetics; Animals; Erythroid Cells/metabolism; Gene Deletion; Gene Expression Regulation, Developmental/physiology; Mice; Mice, Transgenic; RNA, Messenger/genetics; beta-Globins/metabolism
ABSTRACT
Active gene promoters are associated with covalent histone modifications, such as hyperacetylation, which can modulate chromatin structure and stabilize binding of transcription factors that recognize these modifications. At the beta-globin locus and several other loci, however, histone hyperacetylation extends beyond the promoter, over tens of kilobases; we term such patterns of histone modifications "hyperacetylated domains." Little is known of either the mechanism by which these domains form or their function. Here, we show that domain formation within the murine beta-globin locus occurs before either high-level gene expression or erythroid commitment. Analysis of beta-globin alleles harboring deletions of promoters or the locus control region demonstrates that these sequences are not required for domain formation, suggesting the existence of additional regulatory sequences within the locus. Deletion of embryonic globin gene promoters, however, resulted in the formation of a hyperacetylated domain over these genes in definitive erythroid cells, where they are otherwise inactive. Finally, sequences within beta-globin domains exhibit hyperacetylation in a context-dependent manner, and domains are maintained when transcriptional elongation is inhibited. These data narrow the range of possible mechanisms by which hyperacetylated domains form.
Subject(s)
Embryo, Mammalian/embryology; Gene Expression Regulation, Developmental/physiology; Histones/metabolism; Promoter Regions, Genetic/physiology; Quantitative Trait Loci/physiology; beta-Globins/biosynthesis; Acetylation; Animals; Mice; Protein Structure, Tertiary/physiology
ABSTRACT
Functional assessment of disease-associated sequence variation at non-coding regulatory elements is complicated by their high degree of context sensitivity to both the local chromatin and nuclear environments. Allelic profiling of DNA accessibility across individuals has shown that only a select minority of sequence variation affects transcription factor (TF) occupancy, yet low sequence diversity in human populations means that no experimental assessment is available for the majority of disease-associated variants. Here we describe high-resolution in vivo maps of allelic DNA accessibility in liver, kidney, lung and B cells from 5 increasingly diverged strains of F1 hybrid mice. The high density of heterozygous sites in these hybrids enables precise quantification of effect size and cell-type specificity for hundreds of thousands of variants throughout the mouse genome. We show that chromatin-altering variants delineate characteristic sensitivity profiles for hundreds of TF motifs. We develop a compendium of TF-specific sensitivity profiles accounting for genomic context effects. Finally, we link maps of allelic accessibility to allelic transcript levels in the same samples. This work provides a foundation for quantitative prediction of cell-type specific effects of non-coding variation on TF activity, which will facilitate both fine-mapping and systems-level analyses of common disease-associated variation in human genomes.
Subject(s)
DNA/genetics; Alleles; Animals; Binding Sites/genetics; Chromatin/genetics; Chromatin/metabolism; Chromosome Mapping; DNA/metabolism; Female; Gene Expression Regulation; Genetic Variation; Genome, Human; Humans; Hybridization, Genetic; Male; Mice; Mice, 129 Strain; Mice, Inbred C3H; Mice, Inbred C57BL; Organ Specificity/genetics; Penetrance; Regulatory Sequences, Nucleic Acid; Transcription Factors/metabolism
ABSTRACT
To study the evolutionary dynamics of regulatory DNA, we mapped >1.3 million deoxyribonuclease I-hypersensitive sites (DHSs) in 45 mouse cell and tissue types, and systematically compared these with human DHS maps from orthologous compartments. We found that the mouse and human genomes have undergone extensive cis-regulatory rewiring that combines branch-specific evolutionary innovation and loss with widespread repurposing of conserved DHSs to alternative cell fates, and that this process is mediated by turnover of transcription factor (TF) recognition elements. Despite pervasive evolutionary remodeling of the location and content of individual cis-regulatory regions, within orthologous mouse and human cell types the global fraction of regulatory DNA bases encoding recognition sites for each TF has been strictly conserved. Our findings provide new insights into the evolutionary forces shaping mammalian regulatory DNA landscapes.
Subject(s)
Conserved Sequence; DNA/genetics; Evolution, Molecular; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/metabolism; Animals; Base Sequence; Deoxyribonuclease I; Genome, Human; Humans; Mice; Restriction Mapping
ABSTRACT
To complement the human Encyclopedia of DNA Elements (ENCODE) project and to enable a broad range of mouse genomics efforts, the Mouse ENCODE Consortium is applying the same experimental pipelines developed for human ENCODE to annotate the mouse genome.
Subject(s)
Databases, Nucleic Acid; Genomics; Mice/genetics; Molecular Sequence Annotation; Animals; Genome; Genome, Human; Humans; Internet
ABSTRACT
We have examined the relationship between nuclear localization and transcriptional activity of the endogenous murine beta-globin locus during erythroid differentiation. Murine fetal liver cells were separated into distinct erythroid maturation stages by fluorescence-activated cell sorting, and the nuclear position of the locus was determined at each stage. We find that the beta-globin locus progressively moves away from the nuclear periphery with increasing maturation. Contrary to the prevailing notion that the nuclear periphery is a repressive compartment in mammalian cells, beta(major)-globin expression begins at the nuclear periphery prior to relocalization. However, relocation of the locus to the nuclear interior with maturation is accompanied by an increase in beta(major)-globin transcription. The distribution of nuclear polymerase II (Pol II) foci also changes with erythroid differentiation: Transcription factories decrease in number and contract toward the nuclear interior. Moreover, both efficient relocalization of the beta-globin locus from the periphery and its association with hyperphosphorylated Pol II transcription factories require the locus control region (LCR). These results suggest that the LCR-dependent association of the beta-globin locus with transcriptionally engaged Pol II foci provides the driving force for relocalization of the locus toward the nuclear interior during erythroid maturation.
Subject(s)
Erythroid Cells/metabolism; Globins/genetics; Locus Control Region; Transcription Factors/metabolism; Animals; Cell Differentiation; Cell Nucleus/genetics; In Situ Hybridization, Fluorescence; Liver/cytology; Liver/embryology; Mice; RNA Polymerase II/metabolism
ABSTRACT
The locus control region (LCR) was thought to be necessary and sufficient for establishing and maintaining an open beta-globin locus chromatin domain in the repressive environment of the developing erythrocyte. However, deletion of the LCR from the endogenous locus had no significant effect on chromatin structure and did not silence transcription. Thus, the cis-regulatory elements that confer the open domain remain unidentified. The conserved DNaseI hypersensitivity sites (HSs) HS-62.5 and 3'HS1 that flank the locus, and the region upstream of the LCR, have been implicated in globin gene regulation. The flanking HSs bind CCCTC binding factor (CTCF) and are thought to interact with the LCR to form a "chromatin hub" involved in beta-globin gene activation. Hispanic thalassemia, a deletion of the LCR and 27 kb upstream, leads to heterochromatinization and silencing of the locus. Thus, the region upstream of the LCR deleted in Hispanic thalassemia (upstream Hispanic region [UHR]) may be required for expression. To determine the importance of the UHR and flanking HSs for beta-globin expression, we generated and analyzed mice with targeted deletions of these elements. We demonstrate that deletions of these regions, alone and in combination, do not affect transcription, bringing into question current models for the regulation of the beta-globin locus.