ABSTRACT
Long noncoding RNAs (lncRNAs) can regulate the activity of target genes by participating in the organization of chromatin architecture. We have devised a "chromatin-RNA in situ reverse transcription sequencing" (CRIST-seq) approach to profile the lncRNA interaction network in gene regulatory elements by combining the simplicity of RNA biotin labeling with the specificity of the CRISPR/Cas9 system. Using gene-specific gRNAs, we describe a pluripotency-specific lncRNA interacting network in the promoters of Sox2 and Pou5f1, two critical stem cell factors that are required for the maintenance of pluripotency. The promoter-interacting lncRNAs were specifically activated during reprogramming into pluripotency. Knockdown of these lncRNAs caused the stem cells to exit from pluripotency. In contrast, overexpression of the pluripotency-associated lncRNA activated the promoters of core stem cell factor genes and enhanced fibroblast reprogramming into pluripotency. These CRIST-seq data suggest that the Sox2 and Pou5f1 promoters are organized within a unique lncRNA interaction network that determines the fate of pluripotency during reprogramming. This CRIST approach may be broadly used to map lncRNA interaction networks at target loci across the genome.
Subjects
Chromatin/genetics, Octamer Transcription Factor-3/genetics, Long Noncoding RNA/genetics, SOXB1 Transcription Factors/genetics, RNA Sequence Analysis/methods, Animals, CRISPR-Cas Systems, Cell Line, Cellular Reprogramming, Fibroblasts/cytology, Fibroblasts/metabolism, Mice, Pluripotent Stem Cells/cytology, Pluripotent Stem Cells/metabolism, Genetic Promoter Regions, Nucleic Acid Regulatory Sequences
ABSTRACT
Eukaryotic chromosomes replicate in a temporal order known as the replication-timing program. In mammals, replication timing is cell-type-specific with at least half the genome switching replication timing during development, primarily in units of 400-800 kilobases ('replication domains'), whose positions are preserved in different cell types, conserved between species, and appear to confine long-range effects of chromosome rearrangements. Early and late replication correlate, respectively, with open and closed three-dimensional chromatin compartments identified by high-resolution chromosome conformation capture (Hi-C), and, to a lesser extent, late replication correlates with lamina-associated domains (LADs). Recent Hi-C mapping has unveiled substructure within chromatin compartments called topologically associating domains (TADs) that are largely conserved in their positions between cell types and are similar in size to replication domains. However, TADs can be further sub-stratified into smaller domains, challenging the significance of structures at any particular scale. Moreover, attempts to reconcile TADs and LADs to replication-timing data have not revealed a common, underlying domain structure. Here we localize boundaries of replication domains to the early-replicating border of replication-timing transitions and map their positions in 18 human and 13 mouse cell types. We demonstrate that, collectively, replication domain boundaries share a near one-to-one correlation with TAD boundaries, whereas within a cell type, adjacent TADs that replicate at similar times obscure replication domain boundaries, largely accounting for the previously reported lack of alignment. 
Moreover, cell-type-specific replication timing of TADs partitions the genome into two large-scale sub-nuclear compartments, revealing that replication-timing transitions are indistinguishable from late-replicating regions in chromatin composition and lamina association, and accounting for the reduced correlation of replication timing with LADs and heterochromatin. Our results reconcile cell-type-specific sub-nuclear compartmentalization and replication timing with developmentally stable structural domains and offer a unified model for large-scale chromosome structure and function.
Subjects
Chromatin/chemistry, Chromatin/genetics, DNA Replication Timing, DNA/biosynthesis, Animals, Cell Compartmentation, Chromatin/metabolism, Chromatin Assembly and Disassembly, DNA/genetics, Genome/genetics, Heterochromatin/chemistry, Heterochromatin/genetics, Heterochromatin/metabolism, Humans, Mice, Organ Specificity, Time Factors
ABSTRACT
The folding and three-dimensional (3D) organization of chromatin in the nucleus critically impacts genome function. The past decade has witnessed rapid advances in genomic tools for delineating 3D genome architecture. Among them, chromosome conformation capture (3C)-based methods such as Hi-C are the most widely used techniques for mapping chromatin interactions. However, traditional Hi-C protocols rely on restriction enzymes (REs) to fragment chromatin and are therefore limited in resolution. We recently developed DNase Hi-C for mapping 3D genome organization, which uses DNase I for chromatin fragmentation. DNase Hi-C overcomes RE-related limitations associated with traditional Hi-C methods, leading to improved methodological resolution. Furthermore, combining this method with DNA capture technology provides a high-throughput approach (targeted DNase Hi-C) that allows for mapping fine-scale chromatin architecture at exceptionally high resolution. Hence, targeted DNase Hi-C will be valuable for delineating the physical landscapes of cis-regulatory networks that control gene expression and for characterizing phenotype-associated chromatin 3D signatures. Here, we provide a detailed description of method design and step-by-step working protocols for these two methods.
Subjects
Chromosome Mapping/methods, Deoxyribonuclease I/metabolism, High-Throughput Nucleotide Sequencing/methods, Three-Dimensional Imaging/methods, Molecular Imaging/methods, Cell Culture Techniques/instrumentation, Cell Culture Techniques/methods, Cell Nucleus/genetics, Cell Nucleus/metabolism, Chromatin/chemistry, Chromatin/genetics, Chromosome Mapping/instrumentation, Cross-Linking Reagents/chemistry, DNA Restriction Enzymes/chemistry, DNA Restriction Enzymes/metabolism, Deoxyribonuclease I/chemistry, Formaldehyde/chemistry, Gene Library, High-Throughput Nucleotide Sequencing/instrumentation, Three-Dimensional Imaging/instrumentation, Molecular Imaging/instrumentation, Tissue Culture Techniques/instrumentation, Tissue Culture Techniques/methods, Whole Genome Sequencing/instrumentation, Whole Genome Sequencing/methods
ABSTRACT
High-throughput methods based on chromosome conformation capture have greatly advanced our understanding of the three-dimensional (3D) organization of genomes but are limited in resolution by their reliance on restriction enzymes. Here we describe a method called DNase Hi-C for comprehensively mapping global chromatin contacts. DNase Hi-C uses DNase I for chromatin fragmentation, leading to greatly improved efficiency and resolution over that of Hi-C. Coupling this method with DNA-capture technology provides a high-throughput approach for targeted mapping of fine-scale chromatin architecture. We applied targeted DNase Hi-C to characterize the 3D organization of 998 large intergenic noncoding RNA (lincRNA) promoters in two human cell lines. Our results revealed that expression of lincRNAs is tightly controlled by complex mechanisms involving both super-enhancers and the Polycomb repressive complex. Our results provide the first glimpse of the cell type-specific 3D organization of lincRNA genes.
Subjects
Chromatin/physiology, Untranslated RNA/genetics, Chromatin/chemistry, Chromatin/ultrastructure, Chromosome Mapping, Deoxyribonuclease I/metabolism, Genome, Humans, K562 Cells, Protein Conformation, Transcriptional Regulatory Elements/genetics
ABSTRACT
BACKGROUND: Transcription factors regulate numerous cellular processes by controlling the rate at which each gene is expressed. These regulatory relations are modeled as transcriptional regulatory networks. Recent studies have shown that such networks have an underlying hierarchical organization. We consider the problem of discovering the underlying hierarchy in transcriptional regulatory networks. RESULTS: We first transform this problem into a mixed integer programming problem and then solve the resulting problem with existing tools. For larger networks this strategy does not work because running time and space usage increase rapidly, so we use a divide-and-conquer strategy for such networks. We use our method to analyze the transcriptional regulatory networks of E. coli, H. sapiens and S. cerevisiae. CONCLUSIONS: Our experiments demonstrate that (i) our method gives statistically better results than three existing state-of-the-art methods; (ii) our method is robust against errors in the data; and (iii) our method's performance is not affected by the different topologies in the data.
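To make the hierarchy-discovery problem concrete, the following is a minimal sketch under stated assumptions and not the authors' formulation. The abstract does not give the objective, so the sketch assumes a hierarchy is an assignment of nodes to integer levels and that the cost is the number of regulatory edges whose regulator is not placed strictly above its target. Exhaustive search stands in for a mixed integer programming solver, so it is only usable on tiny networks. The names `count_violations` and `best_hierarchy` are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def count_violations(edges: List[Edge], level: Dict[str, int]) -> int:
    """Count edges whose regulator is NOT strictly above its target.

    Assumed objective: the abstract does not define the real one.
    """
    return sum(1 for reg, tgt in edges if level[reg] <= level[tgt])


def best_hierarchy(edges: List[Edge], num_levels: int) -> Tuple[Dict[str, int], int]:
    """Try every level assignment and keep the first one with minimal cost.

    Exhaustive search replaces the MIP solver here; the number of
    assignments is num_levels ** number_of_nodes.
    """
    nodes = sorted({node for edge in edges for node in edge})
    best_level: Dict[str, int] = {}
    best_cost = len(edges) + 1  # larger than any achievable cost
    for assignment in product(range(num_levels), repeat=len(nodes)):
        level = dict(zip(nodes, assignment))
        cost = count_violations(edges, level)
        if cost < best_cost:
            best_level, best_cost = level, cost
    return best_level, best_cost


# Toy network: A regulates B and C, B regulates C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C")]
toy_level, toy_cost = best_hierarchy(toy_edges, num_levels=3)
```

The exponential growth of the search space in this toy version mirrors the scaling problem the abstract mentions for larger networks, which is where a divide-and-conquer strategy would come in.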
Subjects
Computational Biology/methods, Gene Expression Regulation, Gene Regulatory Networks, Software, Transcription Factors/metabolism, Escherichia coli/genetics, Humans, Saccharomyces cerevisiae/genetics
ABSTRACT
We consider the problem of similarity queries in biological network databases. Given a database of networks, a similarity query returns all the database networks whose similarity (i.e. alignment score) to a given query network is at least a specified similarity cutoff value. Alignment of two networks is a very costly operation, which makes exhaustive comparison of all the database networks with a query impractical. To tackle this problem, we develop a novel indexing method, named RINQ (Reference-based Indexing for Biological Network Queries). Our method uses a set of reference networks to eliminate a large portion of the database quickly for each query. A reference network is a small biological network. We precompute and store the alignments of all the references with all the database networks. When our database is queried, we align the query network with all the reference networks. Using these alignments, we calculate a lower bound and an approximate upper bound to the alignment score of each database network with the query network. With the help of the upper and lower bounds, we eliminate the majority of the database networks without aligning them to the query network. We also quickly identify a small portion of the database networks as guaranteed to be similar to the query. We perform pairwise alignment only for the remaining networks. We also propose a supervised method to pick references that have a large chance of filtering the unpromising database networks. Extensive experimental evaluation suggests that (i) our method reduced the running time of a single query on a database of around 300 networks from over 2 days to only 8 h; (ii) our method outperformed the state-of-the-art methods Closure Tree and SAGA by a factor of three or more; and (iii) our method successfully identified statistically and biologically significant relationships across networks and organisms.
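The filtering idea can be illustrated with a generic reference-based index. This is a sketch under stated assumptions, not RINQ itself: the abstract does not give its bounds, so the sketch swaps the alignment score for a distance that obeys the triangle inequality and derives the bounds from that. Networks are modeled as sets of edges, and the toy distance `edge_distance` (size of the symmetric difference) stands in for an expensive alignment. All function names are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

Network = FrozenSet[Tuple[int, int]]  # a network as a set of edges


def edge_distance(a: Network, b: Network) -> int:
    """Toy metric standing in for a costly alignment (assumption)."""
    return len(a ^ b)


def build_index(database: Dict[str, Network], references: Sequence[Network],
                dist: Callable[[Network, Network], int]) -> Dict[str, List[int]]:
    """Precompute the distance from every reference to every database network."""
    return {name: [dist(ref, net) for ref in references]
            for name, net in database.items()}


def range_query(query: Network, database: Dict[str, Network],
                references: Sequence[Network], index: Dict[str, List[int]],
                dist: Callable[[Network, Network], int],
                eps: int) -> Tuple[List[str], int]:
    """Return networks within distance eps and the number of full comparisons."""
    q_to_ref = [dist(query, ref) for ref in references]
    accepted: List[str] = []
    verified = 0
    for name, net in database.items():
        ref_to_net = index[name]
        # Triangle inequality: |d(q,r) - d(r,x)| <= d(q,x) <= d(q,r) + d(r,x)
        lower = max(abs(a - b) for a, b in zip(q_to_ref, ref_to_net))
        upper = min(a + b for a, b in zip(q_to_ref, ref_to_net))
        if lower > eps:
            continue                   # pruned without a full comparison
        if upper <= eps:
            accepted.append(name)      # guaranteed match, no comparison needed
            continue
        verified += 1                  # undecided: pay for the real comparison
        if dist(query, net) <= eps:
            accepted.append(name)
    return accepted, verified
```

In this sketch only the undecided networks incur a full comparison, so the savings depend on how well the chosen references separate the database, which is consistent with the abstract's emphasis on picking references carefully.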
Subjects
Factual Databases, Information Storage and Retrieval, Algorithms, Animals, Computational Biology, Humans, Signal Transduction, Software
ABSTRACT
BACKGROUND: A specific 3-dimensional intrachromosomal architecture of core stem cell factor genes is required to reprogram a somatic cell into pluripotency. As little is known about the epigenetic readers that orchestrate this architectural remodeling, we used a novel chromatin RNA in situ reverse transcription sequencing (CRIST-seq) approach to profile long noncoding RNAs (lncRNAs) in the Oct4 promoter. RESULTS: We identify Platr10 as an Oct4-Sox2-binding lncRNA that is activated in somatic cell reprogramming. Platr10 is essential for the maintenance of pluripotency, and lack of this lncRNA causes stem cells to exit from pluripotency. In fibroblasts, ectopically expressed Platr10 functions in trans to activate core stem cell factor genes and enhance pluripotent reprogramming. Using RNA reverse transcription-associated trap sequencing (RAT-seq), we show that Platr10 interacts with multiple pluripotency-associated genes, including Oct4, Sox2, Klf4, and c-Myc, which have been extensively used to reprogram somatic cells. Mechanistically, we demonstrate that Platr10 helps orchestrate intrachromosomal promoter-enhancer looping and recruits TET1, the enzyme that actively induces DNA demethylation for the initiation of pluripotency. We further show that Platr10 contains an Oct4-binding element that interacts with the Oct4 promoter and a TET1-binding element that recruits TET1. Mutation of either of these two elements abolishes Platr10 activity. CONCLUSION: These data suggest that Platr10 functions as a novel chromatin RNA molecule to control pluripotency in trans by modulating chromatin architecture and regulating DNA methylation in the core stem cell factor network.
Subjects
Cellular Reprogramming, Chromatin/metabolism, Pluripotent Stem Cells/metabolism, Long Noncoding RNA/metabolism, Animals, DNA Methylation, Fibroblasts/metabolism, Mice, Octamer Transcription Factor-3/genetics, Genetic Promoter Regions, Long Noncoding RNA/genetics, Nucleic Acid Regulatory Sequences, SOXB1 Transcription Factors/metabolism, RNA Sequence Analysis
ABSTRACT
We consider the problem of finding a subnetwork in a given biological network (i.e. target network) that is most similar to a given small query network. We aim to find the optimal solution (i.e. the subnetwork with the largest alignment score) with a provable confidence bound. There is no known polynomial-time solution to this problem in the literature. Alon et al. developed a state-of-the-art coloring method that reduces the cost of this problem. This method randomly colors the target network prior to alignment and repeats this for many iterations until a user-supplied confidence is reached. Here we develop a novel coloring method, named k-hop coloring (k is a positive integer), that achieves a provable confidence value in a small number of iterations without sacrificing optimality. While assigning a color to a node, our method considers the color assignments already made in the neighborhood of that target network node. This way, it preemptively avoids many color assignments that are guaranteed to fail to produce the optimal alignment. We also develop a filtering method that, after each coloring instance, eliminates the nodes that cannot be aligned without reducing the alignment score. We demonstrate both theoretically and experimentally that our coloring method outperforms that of Alon et al., which is also used by a number of network alignment methods, including QPath and QNet, by a factor of three without reducing the confidence in the optimality of the result. Our experiments also suggest that the resulting alignment method is capable of identifying functionally enriched regions in the target network successfully.
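The baseline random coloring scheme can be sketched as follows. This illustrates standard color coding on a simplified problem (detecting a simple path on k nodes), assuming k colors and uniformly random coloring; it is not the k-hop coloring method, whose details the abstract does not give, and it does not compute alignment scores. The function names are hypothetical.

```python
import math
import random
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, List[Hashable]]  # adjacency lists


def iterations_needed(k: int, confidence: float) -> int:
    """Random colorings needed so that a fixed k-node path is colorful
    (all colors distinct) at least once with the given probability."""
    p = math.factorial(k) / k ** k      # chance that one coloring is colorful
    if p >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


def has_colorful_path(graph: Graph, coloring: Dict[Hashable, int], k: int) -> bool:
    """Dynamic program over color subsets (bitmasks).

    masks[v] holds the color sets of colorful paths that end at v.
    Distinct colors imply distinct nodes, so every such path is simple.
    """
    masks: Dict[Hashable, Set[int]] = {v: {1 << coloring[v]} for v in graph}
    for _ in range(k - 1):
        extended: Dict[Hashable, Set[int]] = {v: set() for v in graph}
        for u in graph:
            for mask in masks[u]:
                for v in graph[u]:
                    bit = 1 << coloring[v]
                    if not mask & bit:          # color of v not used yet
                        extended[v].add(mask | bit)
        masks = extended
    return any(masks[v] for v in graph)


def find_path_of_size(graph: Graph, k: int, confidence: float, seed: int = 0) -> bool:
    """Repeat random colorings until a colorful k-node path is found or the
    iteration budget for the requested confidence is used up."""
    rng = random.Random(seed)
    for _ in range(iterations_needed(k, confidence)):
        coloring = {v: rng.randrange(k) for v in graph}
        if has_colorful_path(graph, coloring, k):
            return True
    return False   # may be a false negative with probability <= 1 - confidence
```

The iteration count grows quickly with k because the chance that a single random coloring makes a given path colorful is k!/k^k, which is the cost a neighborhood-aware coloring scheme would aim to reduce.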