ABSTRACT
DNase I hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify ~2.9 million DHSs that encompass virtually all known experimentally validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation and regulatory factor occupancy patterns. We connect ~580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is organized with dozens to hundreds of co-activated elements, and the transcellular DNase I sensitivity pattern at a given region can predict cell-type-specific functional behaviours. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation.
Subject(s)
Chromatin/genetics , Chromatin/metabolism , DNA/genetics , Encyclopedias as Topic , Genome, Human/genetics , Molecular Sequence Annotation , Regulatory Sequences, Nucleic Acid/genetics , DNA Footprinting , DNA Methylation , DNA-Binding Proteins/metabolism , Deoxyribonuclease I/metabolism , Evolution, Molecular , Genomics , Humans , Mutation Rate , Promoter Regions, Genetic/genetics , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic
ABSTRACT
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Subject(s)
Genome, Human/genetics , Genomics , Regulatory Sequences, Nucleic Acid/genetics , Transcription, Genetic/genetics , Chromatin/genetics , Chromatin/metabolism , Chromatin Immunoprecipitation , Conserved Sequence/genetics , DNA Replication , Evolution, Molecular , Exons/genetics , Genetic Variation/genetics , Heterozygote , Histones/metabolism , Humans , Pilot Projects , Protein Binding , RNA, Messenger/genetics , RNA, Untranslated/genetics , Transcription Factors/metabolism , Transcription Initiation Site
ABSTRACT
Localized accessibility of critical DNA sequences to the regulatory machinery is a key requirement for regulation of human genes. Here we describe a high-resolution, genome-scale approach for quantifying chromatin accessibility by measuring DNase I sensitivity as a continuous function of genome position using tiling DNA microarrays (DNase-array). We demonstrate this approach across 1% (approximately 30 Mb) of the human genome, wherein we localized 2,690 classical DNase I hypersensitive sites with high sensitivity and specificity, and also mapped larger-scale patterns of chromatin architecture. DNase I hypersensitive sites exhibit marked aggregation around transcriptional start sites (TSSs), though the majority mark nonpromoter functional elements. We also developed a computational approach for visualizing higher-order features of chromatin structure. This revealed that human chromatin organization is dominated by large (100-500 kb) 'superclusters' of DNase I hypersensitive sites, which encompass both gene-rich and gene-poor regions. DNase-array is a powerful and straightforward approach for systematic exposition of the cis-regulatory architecture of complex genomes.