2.
Nature ; 584(7820): 244-251, 2020 08.
Article in English | MEDLINE | ID: mdl-32728217

ABSTRACT

DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA [1-5] and contain genetic variations associated with diseases and phenotypic traits [6-8]. We created high-resolution maps of DHSs from 733 human biosamples encompassing 438 cell and tissue types and states, and integrated these to delineate and numerically index approximately 3.6 million DHSs within the human genome sequence, providing a common coordinate system for regulatory DNA. Here we show that these maps highly resolve the cis-regulatory compartment of the human genome, which encodes unexpectedly diverse cell- and tissue-selective regulatory programs at very high density. These programs can be captured comprehensively by a simple vocabulary that enables the assignment to each DHS of a regulatory barcode that encapsulates its tissue manifestations, and global annotation of protein-coding and non-coding RNA genes in a manner orthogonal to gene expression. Finally, we show that sharply resolved DHSs markedly enhance the genetic association and heritability signals of diseases and traits. Rather than being confined to a small number of distal elements or promoters, we find that genetic signals converge on congruently regulated sets of DHSs that decorate entire gene bodies. Together, our results create a universal, extensible coordinate system and vocabulary for human regulatory DNA marked by DHSs, and provide a new global perspective on the architecture of human gene regulation.


Subject(s)
Chromatin/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Molecular Sequence Annotation , Chromatin/chemistry , Chromatin/metabolism , DNA/chemistry , DNA/genetics , Gene Expression Regulation , Genes/genetics , Genome, Human/genetics , Humans , Promoter Regions, Genetic/genetics , Regulatory Sequences, Nucleic Acid/genetics
3.
Nature ; 583(7818): 729-736, 2020 07.
Article in English | MEDLINE | ID: mdl-32728250

ABSTRACT

Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits [1], but it remains challenging to distinguish variants that affect regulatory function [2]. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin [3-6]. However, only a small fraction of such sites have been precisely resolved on the human genome sequence [6]. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor-DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions [1,7] is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.


Subject(s)
DNA Footprinting/standards , Genome, Human/genetics , Transcription Factors/metabolism , Consensus Sequence , DNA/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Genetics, Population , Genome-Wide Association Study , Humans , Models, Molecular , Polymorphism, Single Nucleotide , Regulatory Sequences, Nucleic Acid/genetics
4.
Nature ; 583(7818): 699-710, 2020 07.
Article in English | MEDLINE | ID: mdl-32728249

ABSTRACT

The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE [1] and Roadmap Epigenomics [2] data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.


Subject(s)
DNA/genetics , Databases, Genetic , Genome/genetics , Genomics , Molecular Sequence Annotation , Registries , Regulatory Sequences, Nucleic Acid/genetics , Animals , Chromatin/genetics , Chromatin/metabolism , DNA/chemistry , DNA Footprinting , DNA Methylation/genetics , DNA Replication Timing , Deoxyribonuclease I/metabolism , Genome, Human , Histones/metabolism , Humans , Mice , Mice, Transgenic , RNA-Binding Proteins/genetics , Transcription, Genetic/genetics , Transposases/metabolism
5.
EBioMedicine ; 41: 427-442, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30827930

ABSTRACT

BACKGROUND: Transcriptional dysregulation drives cancer formation but the underlying mechanisms are still poorly understood. Renal cell carcinoma (RCC) is the most common malignant kidney tumor which canonically activates the hypoxia-inducible transcription factor (HIF) pathway. Despite intensive study, novel therapeutic strategies to target RCC have been difficult to develop. Since the RCC epigenome is relatively understudied, we sought to elucidate key mechanisms underpinning the tumor phenotype and its clinical behavior. METHODS: We performed genome-wide chromatin accessibility (DNase-seq) and transcriptome profiling (RNA-seq) on paired tumor/normal samples from 3 patients undergoing nephrectomy for removal of RCC. We incorporated publicly available data on HIF binding (ChIP-seq) in a RCC cell line. We performed integrated analyses of these high-resolution, genome-scale datasets together with larger transcriptomic data available through The Cancer Genome Atlas (TCGA). FINDINGS: Though HIF transcription factors play a cardinal role in RCC oncogenesis, we found that numerous transcription factors with a RCC-selective expression pattern also demonstrated evidence of HIF binding near their gene body. Examination of chromatin accessibility profiles revealed that some of these transcription factors influenced the tumor's regulatory landscape, notably the stem cell transcription factor POU5F1 (OCT4). Elevated POU5F1 transcript levels were correlated with advanced tumor stage and poorer overall survival in RCC patients. Unexpectedly, we discovered a HIF-pathway-responsive promoter embedded within an endogenous retroviral long terminal repeat (LTR) element at the transcriptional start site of the PSOR1C3 long non-coding RNA gene upstream of POU5F1. RNA transcripts are induced from this promoter and read through PSOR1C3 into POU5F1 producing a novel POU5F1 transcript isoform. Rather than being unique to the POU5F1 locus, we found that HIF binds to several other transcriptionally active LTR elements genome-wide correlating with broad gene expression changes in RCC. INTERPRETATION: Integrated transcriptomic and epigenomic analysis of matched tumor and normal tissues from even a small number of primary patient samples revealed remarkably convergent shared regulatory landscapes. Several transcription factors appear to act downstream of HIF including the potent stem cell transcription factor POU5F1. Dysregulated expression of POU5F1 is part of a larger pattern of gene expression changes in RCC that may be induced by HIF-dependent reactivation of dormant promoters embedded within endogenous retroviral LTRs.


Subject(s)
Endogenous Retroviruses/genetics , Epigenomics , Basic Helix-Loop-Helix Transcription Factors/genetics , Binding Sites , Carcinoma, Renal Cell/genetics , Carcinoma, Renal Cell/mortality , Carcinoma, Renal Cell/pathology , Cell Line, Tumor , Cytochrome Reductases/genetics , Endogenous Retroviruses/physiology , Gene Expression Regulation, Neoplastic , Humans , Hypoxia-Inducible Factor 1/genetics , Kidney Neoplasms/genetics , Kidney Neoplasms/mortality , Kidney Neoplasms/pathology , Octamer Transcription Factor-3/genetics , Octamer Transcription Factor-3/metabolism , Oxidoreductases Acting on Sulfur Group Donors , Phosphoric Diester Hydrolases/genetics , Promoter Regions, Genetic , Proteins/genetics , Pyrophosphatases/genetics , RNA, Long Noncoding , Survival Rate , Terminal Repeat Sequences/genetics , Ubiquitin-Conjugating Enzymes/genetics
6.
J Am Soc Nephrol ; 30(3): 421-441, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30760496

ABSTRACT

BACKGROUND: Linking genetic risk loci identified by genome-wide association studies (GWAS) to their causal genes remains a major challenge. Disease-associated genetic variants are concentrated in regions containing regulatory DNA elements, such as promoters and enhancers. Although researchers have previously published DNA maps of these regulatory regions for kidney tubule cells and glomerular endothelial cells, maps for podocytes and mesangial cells have not been available. METHODS: We generated regulatory DNA maps (DNase-seq) and paired gene expression profiles (RNA-seq) from primary outgrowth cultures of human glomeruli that were composed mainly of podocytes and mesangial cells. We generated similar datasets from renal cortex cultures, to compare with those of the glomerular cultures. Because regulatory DNA elements can act on target genes across large genomic distances, we also generated a chromatin conformation map from freshly isolated human glomeruli. RESULTS: We identified thousands of unique regulatory DNA elements, many located close to transcription factor genes, which the glomerular and cortex samples expressed at different levels. We found that genetic variants associated with kidney diseases (GWAS) and kidney expression quantitative trait loci were enriched in regulatory DNA regions. By combining GWAS, epigenomic, and chromatin conformation data, we functionally annotated 46 kidney disease genes. CONCLUSIONS: We demonstrate a powerful approach to functionally connect kidney disease-/trait-associated loci to their target genes by leveraging unique regulatory DNA maps and integrated epigenomic and genetic analysis. This process can be applied to other kidney cell types and will enhance our understanding of genome regulation and its effects on gene expression in kidney disease.

7.
Nat Genet ; 50(10): 1388-1398, 2018 10.
Article in English | MEDLINE | ID: mdl-30202056

ABSTRACT

Structural variants (SVs) can contribute to oncogenesis through a variety of mechanisms. Despite their importance, the identification of SVs in cancer genomes remains challenging. Here, we present a framework that integrates optical mapping, high-throughput chromosome conformation capture (Hi-C), and whole-genome sequencing to systematically detect SVs in a variety of normal or cancer samples and cell lines. We identify the unique strengths of each method and demonstrate that only integrative approaches can comprehensively identify SVs in the genome. By combining Hi-C and optical mapping, we resolve complex SVs and phase multiple SV events to a single haplotype. Furthermore, we observe widespread structural variation events affecting the functions of noncoding sequences, including the deletion of distal regulatory sequences, alteration of DNA replication timing, and the creation of novel three-dimensional chromatin structural domains. Our results indicate that noncoding SVs may be underappreciated mutational drivers in cancer genomes.


Subject(s)
Genome, Human , Genomic Structural Variation , Neoplasms/genetics , Systems Biology/methods , A549 Cells , Cell Line, Tumor , Chromosome Mapping , DNA, Neoplasm/analysis , DNA, Neoplasm/genetics , Genes, Neoplasm , Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Humans , K562 Cells , Linkage Disequilibrium , Sequence Analysis, DNA/methods , Systems Integration
9.
Nat Genet ; 47(12): 1393-401, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26502339

ABSTRACT

The function of human regulatory regions depends exquisitely on their local genomic environment and on cellular context, complicating experimental analysis of common disease- and trait-associated variants that localize within regulatory DNA. We use allelically resolved genomic DNase I footprinting data encompassing 166 individuals and 114 cell types to identify >60,000 common variants that directly influence transcription factor occupancy and regulatory DNA accessibility in vivo. The unprecedented scale of these data enables systematic analysis of the impact of sequence variation on transcription factor occupancy in vivo. We leverage this analysis to develop accurate models of variation affecting the recognition sites for diverse transcription factors and apply these models to discriminate nearly 500,000 common regulatory variants likely to affect transcription factor occupancy across the human genome. The approach and results provide a new foundation for the analysis and interpretation of noncoding variation in complete human genomes and for systems-level investigation of disease-associated variants.


Subject(s)
Chromatin/metabolism , Gene Expression Regulation , Genetic Variation/genetics , Polymorphism, Single Nucleotide/genetics , Promoter Regions, Genetic/genetics , Regulatory Elements, Transcriptional/genetics , Transcription Factors/metabolism , Genome, Human , Genomics/methods , Humans , Phenotype , Protein Binding , Transcription Factors/genetics
10.
Nature ; 518(7539): 317-30, 2015 Feb 19.
Article in English | MEDLINE | ID: mdl-25693563

ABSTRACT

The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.


Subject(s)
Epigenesis, Genetic/genetics , Epigenomics , Genome, Human/genetics , Base Sequence , Cell Lineage/genetics , Cells, Cultured , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Chromosomes, Human/chemistry , Chromosomes, Human/genetics , Chromosomes, Human/metabolism , DNA/chemistry , DNA/genetics , DNA/metabolism , DNA Methylation , Datasets as Topic , Enhancer Elements, Genetic/genetics , Genetic Variation/genetics , Genome-Wide Association Study , Histones/metabolism , Humans , Organ Specificity/genetics , RNA/genetics , Reference Values
11.
Nature ; 515(7527): 365-70, 2014 Nov 20.
Article in English | MEDLINE | ID: mdl-25409825

ABSTRACT

The basic body plan and major physiological axes have been highly conserved during mammalian evolution, yet only a small fraction of the human genome sequence appears to be subject to evolutionary constraint. To quantify cis- versus trans-acting contributions to mammalian regulatory evolution, we performed genomic DNase I footprinting of the mouse genome across 25 cell and tissue types, collectively defining ∼8.6 million transcription factor (TF) occupancy sites at nucleotide resolution. Here we show that mouse TF footprints conjointly encode a regulatory lexicon that is ∼95% similar with that derived from human TF footprints. However, only ∼20% of mouse TF footprints have human orthologues. Despite substantial turnover of the cis-regulatory landscape, nearly half of all pairwise regulatory interactions connecting mouse TF genes have been maintained in orthologous human cell types through evolutionary innovation of TF recognition sequences. Furthermore, the higher-level organization of mouse TF-to-TF connections into cellular network architectures is nearly identical with human. Our results indicate that evolutionary selection on mammalian gene regulation is targeted chiefly at the level of trans-regulatory circuitry, enabling and potentiating cis-regulatory plasticity.


Subject(s)
Conserved Sequence/genetics , Evolution, Molecular , Mammals/genetics , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Animals , DNA Footprinting , Gene Expression Regulation, Developmental/genetics , Gene Regulatory Networks/genetics , Humans , Mice
12.
Nature ; 515(7527): 355-64, 2014 Nov 20.
Article in English | MEDLINE | ID: mdl-25409824

ABSTRACT

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.


Subject(s)
Genome/genetics , Genomics , Mice/genetics , Molecular Sequence Annotation , Animals , Cell Lineage/genetics , Chromatin/genetics , Chromatin/metabolism , Conserved Sequence/genetics , DNA Replication/genetics , Deoxyribonuclease I/metabolism , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Genome-Wide Association Study , Humans , RNA/genetics , Regulatory Sequences, Nucleic Acid/genetics , Species Specificity , Transcription Factors/metabolism , Transcriptome/genetics
13.
Science ; 346(6212): 1007-12, 2014 Nov 21.
Article in English | MEDLINE | ID: mdl-25411453

ABSTRACT

To study the evolutionary dynamics of regulatory DNA, we mapped >1.3 million deoxyribonuclease I-hypersensitive sites (DHSs) in 45 mouse cell and tissue types, and systematically compared these with human DHS maps from orthologous compartments. We found that the mouse and human genomes have undergone extensive cis-regulatory rewiring that combines branch-specific evolutionary innovation and loss with widespread repurposing of conserved DHSs to alternative cell fates, and that this process is mediated by turnover of transcription factor (TF) recognition elements. Despite pervasive evolutionary remodeling of the location and content of individual cis-regulatory regions, within orthologous mouse and human cell types the global fraction of regulatory DNA bases encoding recognition sites for each TF has been strictly conserved. Our findings provide new insights into the evolutionary forces shaping mammalian regulatory DNA landscapes.


Subject(s)
Conserved Sequence , DNA/genetics , Evolution, Molecular , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/metabolism , Animals , Base Sequence , Deoxyribonuclease I , Genome, Human , Humans , Mice , Restriction Mapping
14.
Antimicrob Agents Chemother ; 57(5): 2204-15, 2013 May.
Article in English | MEDLINE | ID: mdl-23459479

ABSTRACT

Pseudomonas aeruginosa can develop resistance to polymyxin as a consequence of mutations in the PhoPQ regulatory system, mediated by covalent lipid A modification. Transposon mutagenesis of a polymyxin-resistant phoQ mutant defined 41 novel loci required for resistance, including two regulatory systems, ColRS and CprRS. Deletion of the colRS genes, individually or in tandem, abrogated the polymyxin resistance of a ΔphoQ mutant, as did individual or tandem deletion of cprRS. Individual deletion of colR or colS in a ΔphoQ mutant also suppressed 4-amino-L-arabinose addition to lipid A, consistent with the known role of this modification in polymyxin resistance. Surprisingly, tandem deletion of colRS or cprRS in the ΔphoQ mutant or individual deletion of cprR or cprS failed to suppress 4-amino-L-arabinose addition to lipid A, indicating that this modification alone is not sufficient for PhoPQ-mediated polymyxin resistance in P. aeruginosa. Episomal expression of colRS or cprRS in tandem or of cprR individually complemented the Pm resistance phenotype in the ΔphoQ mutant, while episomal expression of colR, colS, or cprS individually did not. Highly polymyxin-resistant phoQ mutants of P. aeruginosa isolated from polymyxin-treated cystic fibrosis patients harbored mutant alleles of colRS and cprS; when expressed in a ΔphoQ background, these mutant alleles enhanced polymyxin resistance. These results define ColRS and CprRS as two-component systems regulating polymyxin resistance in P. aeruginosa, indicate that addition of 4-amino-L-arabinose to lipid A is not the only PhoPQ-regulated biochemical mechanism required for resistance, and demonstrate that colRS and cprS mutations can contribute to high-level clinical resistance.


Subject(s)
Anti-Bacterial Agents/pharmacology , Bacterial Proteins/genetics , Drug Resistance, Bacterial/drug effects , Gene Expression Regulation, Bacterial/drug effects , Genes, Regulator/drug effects , Polymyxins/pharmacology , Pseudomonas aeruginosa/drug effects , Arabinose/analogs & derivatives , Arabinose/metabolism , Bacterial Proteins/metabolism , Cystic Fibrosis/drug therapy , Cystic Fibrosis/microbiology , DNA Transposable Elements , Drug Resistance, Bacterial/genetics , Gene Deletion , Genetic Complementation Test , Genetic Loci , Humans , Lipid A/metabolism , Mutation , Plasmids , Pseudomonas Infections/drug therapy , Pseudomonas Infections/microbiology , Pseudomonas aeruginosa/genetics , Pseudomonas aeruginosa/isolation & purification , Pseudomonas aeruginosa/metabolism
16.
J Bacteriol ; 194(24): 6965-6, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23209222

ABSTRACT

Here we report the complete, accurate 1.89-Mb genome sequence of Francisella tularensis subsp. holarctica strain FSC200, isolated in 1998 in the Swedish municipality Ljusdal, which is in an area where tularemia is highly endemic. This genome is important because strain FSC200 has been extensively used for functional and genetic studies of Francisella and is well-characterized.


Subject(s)
Francisella tularensis/genetics , Genome, Bacterial , Tularemia/microbiology , Bacterial Typing Techniques , Base Sequence , Child, Preschool , DNA, Bacterial/genetics , Female , Francisella tularensis/isolation & purification , Humans , Molecular Sequence Data , Sequence Analysis, DNA , Sweden
17.
Nature ; 489(7414): 83-90, 2012 Sep 06.
Article in English | MEDLINE | ID: mdl-22955618

ABSTRACT

Regulatory factor binding to genomic DNA protects the underlying sequence from cleavage by DNase I, leaving nucleotide-resolution footprints. Using genomic DNase I footprinting across 41 diverse cell and tissue types, we detected 45 million transcription factor occupancy events within regulatory regions, representing differential binding to 8.4 million distinct short sequence elements. Here we show that this small genomic sequence compartment, roughly twice the size of the exome, encodes an expansive repertoire of conserved recognition sequences for DNA-binding proteins that nearly doubles the size of the human cis-regulatory lexicon. We find that genetic variants affecting allelic chromatin states are concentrated in footprints, and that these elements are preferentially sheltered from DNA methylation. High-resolution DNase I cleavage patterns mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein-DNA interfaces, indicating that transcription factor structure has been evolutionarily imprinted on the human genome sequence. We identify a stereotyped 50-base-pair footprint that precisely defines the site of transcript origination within thousands of human promoters. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation and pluripotency.


Subject(s)
DNA Footprinting , DNA/genetics , Encyclopedias as Topic , Genome, Human/genetics , Molecular Sequence Annotation , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/metabolism , DNA Methylation , DNA-Binding Proteins/metabolism , Deoxyribonuclease I/metabolism , Genomic Imprinting , Genomics , Humans , Polymorphism, Single Nucleotide/genetics , Transcription Initiation Site
18.
Science ; 337(6099): 1190-5, 2012 Sep 07.
Article in English | MEDLINE | ID: mdl-22955828

ABSTRACT

Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure-related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn's disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.


Subject(s)
DNA/genetics , Disease/genetics , Genetic Variation , Polymorphism, Single Nucleotide , Regulatory Elements, Transcriptional , Regulatory Sequences, Nucleic Acid , Transcription Factors/metabolism , Alleles , Chromatin/metabolism , Chromatin/ultrastructure , Crohn Disease/genetics , Deoxyribonuclease I/metabolism , Electrocardiography , Fetal Development , Fetus/metabolism , Gene Regulatory Networks , Genome, Human , Genome-Wide Association Study , Humans , Multiple Sclerosis/genetics , Phenotype , Promoter Regions, Genetic , Transcription Factors/chemistry , Transcription Factors/genetics
19.
Genome Res ; 22(9): 1680-8, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22955980

ABSTRACT

CTCF is a ubiquitously expressed regulator of fundamental genomic processes including transcription, intra- and interchromosomal interactions, and chromatin structure. Because of its critical role in genome function, CTCF binding patterns have long been assumed to be largely invariant across different cellular environments. Here we analyze genome-wide occupancy patterns of CTCF by ChIP-seq in 19 diverse human cell types, including normal primary cells and immortal lines. We observed highly reproducible yet surprisingly plastic genomic binding landscapes, indicative of strong cell-selective regulation of CTCF occupancy. Comparison with massively parallel bisulfite sequencing data indicates that 41% of variable CTCF binding is linked to differential DNA methylation, concentrated at two critical positions within the CTCF recognition sequence. Unexpectedly, CTCF binding patterns were markedly different in normal versus immortal cells, with the latter showing widespread disruption of CTCF binding associated with increased methylation. Strikingly, this disruption is accompanied by up-regulation of CTCF expression, with the result that both normal and immortal cells maintain the same average number of CTCF occupancy sites genome-wide. These results reveal a tight linkage between DNA methylation and the global occupancy patterns of a major sequence-specific regulatory factor.


Subject(s)
DNA Methylation , Repressor Proteins/metabolism , Binding Sites/genetics , CCCTC-Binding Factor , Cell Line , Chromatin Immunoprecipitation , Cluster Analysis , CpG Islands , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Humans