ABSTRACT
Spatial genome organization and its effect on transcription remain a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights into the topological mechanism of human variations and diseases.
Subjects
Chromatin/chemistry , Genome, Human , Repressor Proteins/metabolism , Transcription, Genetic , Animals , CCCTC-Binding Factor , Cell Cycle Proteins/metabolism , Cell Line , Chromatin/genetics , Chromatin/metabolism , Chromosomal Proteins, Non-Histone/metabolism , Chromosomes/metabolism , DNA Packaging , Humans , RNA Polymerase II/metabolism , Salamandridae , Cohesins
ABSTRACT
Single-cell multiomics techniques have been widely applied to detect the key signature of cells. These methods have achieved a single-molecule resolution and can even reveal spatial localization. These emerging methods provide insights elucidating the features of genomic, epigenomic and transcriptomic heterogeneity in individual cells. However, they have given rise to new computational challenges in data processing. Here, we describe Single-cell Single-molecule multiple Omics Pipeline (ScSmOP), a universal pipeline for barcode-indexed single-cell single-molecule multiomics data analysis. Essentially, the C language is utilized in ScSmOP to set up spaced-seed hash table-based algorithms for barcode identification according to ligation-based barcoding data and synthesis-based barcoding data, followed by data mapping and deconvolution. We demonstrate high reproducibility of data processing between ScSmOP and published pipelines in comprehensive analyses of single-cell omics data (scRNA-seq, scATAC-seq, scARC-seq), single-molecule chromatin interaction data (ChIA-Drop, SPRITE, RD-SPRITE), single-cell single-molecule chromatin interaction data (scSPRITE) and spatial transcriptomic data from various cell types and species. Additionally, ScSmOP shows more rapid performance and is a versatile, efficient, easy-to-use and robust pipeline for single-cell single-molecule multiomics data analysis.
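The spaced-seed hash table idea mentioned above can be illustrated with a minimal Python sketch. It is not ScSmOP's C implementation: the function names, the two complementary masks, and the one-mismatch tolerance are assumptions chosen for illustration. Each whitelist barcode is indexed under the bases selected by each mask. A read barcode with at most one mismatch still matches exactly under at least one of the two masks, so candidates are found by hash lookup and then verified by Hamming distance.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def seed_key(seq: str, mask: str) -> str:
    """Keep only the bases at positions where the mask has '1'."""
    return "".join(base for base, m in zip(seq, mask) if m == "1")


def build_index(whitelist, masks):
    """One hash table per spaced seed: seed key -> whitelist barcodes."""
    index = [defaultdict(list) for _ in masks]
    for barcode in whitelist:
        for table, mask in zip(index, masks):
            table[seed_key(barcode, mask)].append(barcode)
    return index


def identify(read_barcode, index, masks, max_mismatch=1):
    """Return the unique whitelist barcode within max_mismatch, else None."""
    candidates = set()
    for table, mask in zip(index, masks):
        candidates.update(table.get(seed_key(read_barcode, mask), ()))
    hits = [
        bc for bc in candidates
        if len(bc) == len(read_barcode) and hamming(bc, read_barcode) <= max_mismatch
    ]
    # Ambiguous or missing matches are discarded rather than guessed.
    return hits[0] if len(hits) == 1 else None


# Hypothetical 8-nt whitelist and two complementary masks (assumptions).
WHITELIST = ["ACGTACGT", "TTGGCCAA", "GATTACAG"]
MASKS = ["10101010", "01010101"]
INDEX = build_index(WHITELIST, MASKS)
```

With these definitions, `identify("ACGAACGT", INDEX, MASKS)` recovers `"ACGTACGT"` despite the single mismatch, while a barcode with two mismatches is rejected at the verification step.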
Subjects
Genomics , Multiomics , Reproducibility of Results , Chromatin/genetics , Data Analysis
ABSTRACT
The genomes of multicellular organisms are extensively folded into 3D chromosome territories within the nucleus [1]. Advanced 3D genome-mapping methods that combine proximity ligation and high-throughput sequencing (such as chromosome conformation capture, Hi-C) [2], and chromatin immunoprecipitation techniques (such as chromatin interaction analysis by paired-end tag sequencing, ChIA-PET) [3], have revealed topologically associating domains [4] with frequent chromatin contacts, and have identified chromatin loops mediated by specific protein factors for insulation and regulation of transcription [5-7]. However, these methods rely on pairwise proximity ligation and reflect population-level views, and thus cannot reveal the detailed nature of chromatin interactions. Although single-cell Hi-C [8] potentially overcomes this issue, this method may be limited by the sparsity of data that is inherent to current single-cell assays. Recent advances in microfluidics have opened opportunities for droplet-based genomic analysis [9], but this approach has not yet been adapted for chromatin interaction analysis. Here we describe a strategy for multiplex chromatin-interaction analysis via droplet-based and barcode-linked sequencing, which we name ChIA-Drop. We demonstrate the robustness of ChIA-Drop in capturing complex chromatin interactions with single-molecule precision, which has not been possible using methods based on population-level pairwise contacts. By applying ChIA-Drop to Drosophila cells, we show that chromatin topological structures predominantly consist of multiplex chromatin interactions with high heterogeneity; ChIA-Drop also reveals promoter-centred multivalent interactions, which provide topological insights into transcription.
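The barcode-linked principle can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the ChIA-Drop processing pipeline: it assumes that reads sharing a droplet barcode come from one chromatin complex, and the `(barcode, chrom, start, end)` record layout, the function names, and the `min_fragments` threshold are hypothetical. Overlapping reads within a barcode are merged into fragments, and a complex counts as multiplex when it carries two or more fragments.

```python
from collections import defaultdict


def group_by_barcode(reads, merge_gap=0):
    """Group mapped reads by droplet barcode and merge overlapping reads.

    reads: iterable of (barcode, chrom, start, end) tuples (assumed layout).
    Returns {barcode: [(chrom, start, end), ...]} with sorted fragments.
    """
    by_barcode = defaultdict(list)
    for barcode, chrom, start, end in reads:
        by_barcode[barcode].append((chrom, start, end))

    complexes = {}
    for barcode, intervals in by_barcode.items():
        intervals.sort()
        fragments = []
        for chrom, start, end in intervals:
            last = fragments[-1] if fragments else None
            if last and last[0] == chrom and start <= last[2] + merge_gap:
                # Same chromosome and overlapping: extend the last fragment.
                fragments[-1] = (chrom, last[1], max(last[2], end))
            else:
                fragments.append((chrom, start, end))
        complexes[barcode] = fragments
    return complexes


def multiplex_complexes(complexes, min_fragments=2):
    """Keep complexes with at least min_fragments distinct fragments."""
    return {bc: f for bc, f in complexes.items() if len(f) >= min_fragments}


# Toy input: BC1 links three loci, BC2 is a singleton.
READS = [
    ("BC1", "chr2L", 100, 200),
    ("BC1", "chr2L", 150, 260),
    ("BC1", "chr2L", 5000, 5100),
    ("BC1", "chr2L", 9000, 9100),
    ("BC2", "chr3R", 10, 90),
]
COMPLEXES = group_by_barcode(READS)
MULTIPLEX = multiplex_complexes(COMPLEXES)
```

Here `BC1` yields a three-fragment complex, which a pairwise ligation method could only report as separate two-way contacts, while `BC2` is filtered out as a singleton.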
Subjects
Chromatin/genetics , Chromatin/metabolism , Microfluidics/methods , Sequence Analysis, DNA/methods , Single Molecule Imaging/methods , Single Molecule Imaging/standards , Animals , Binding Sites/genetics , Cell Line , Chromatin/chemistry , Drosophila melanogaster/cytology , Drosophila melanogaster/genetics , Microfluidics/standards , Nucleic Acid Conformation , Promoter Regions, Genetic/genetics , Protein Binding , RNA Polymerase II/chemistry , RNA Polymerase II/metabolism , Transcription, Genetic
ABSTRACT
The emerging ligation-free three-dimensional (3D) genome mapping technologies can identify multiplex chromatin interactions with single-molecule precision. These technologies not only offer new insight into high-dimensional chromatin organization and gene regulation, but also introduce new challenges in data visualization and analysis. To overcome these challenges, we developed MCIBox, a toolkit for multi-way chromatin interaction (MCI) analysis, including a visualization tool and a platform for identifying micro-domains with clustered single-molecule chromatin complexes. MCIBox is based on various clustering algorithms integrated with dimensionality reduction methods that can display multiplex chromatin interactions at the single-molecule level, allowing users to explore chromatin extrusion patterns and super-enhancer regulation modes in transcription, and to identify single-molecule chromatin complexes that are clustered into micro-domains. Furthermore, MCIBox incorporates a two-dimensional kernel density estimation algorithm to identify micro-domain boundaries automatically. These micro-domains were stratified with distinctive signatures of transcription activity and contained different cell-cycle-associated genes. Taken together, MCIBox represents an invaluable tool for the study of multiple chromatin interactions and inaugurates a previously unappreciated view of 3D genome structure.
Subjects
Chromatin , Regulatory Sequences, Nucleic Acid , Chromatin/genetics , Genome , Gene Expression Regulation
ABSTRACT
CRISPR/Cas-based transcriptional activators can be enhanced by intrinsically disordered regions (IDRs). However, the underlying mechanisms are still debated. Here, we examine 12 well-known IDRs by fusing them to the dCas9-VP64 activator, of which only seven can augment activation, albeit independently of their phase separation capabilities. Moreover, modular domains (MDs), another class of multivalent molecules, though ineffective in enhancing dCas9-VP64 activity on their own, show substantial enhancement in transcriptional activation when combined with dCas9-VP64-IDR. By varying the number of gRNA binding sites and fusing dCas9-VP64 with different IDRs/MDs, we uncover that optimal, rather than maximal, cis-trans cooperativity enables the most robust activation. Finally, targeting promoter-enhancer pairs yields synergistic effects, which can be further amplified by enhancing chromatin interactions. Overall, our study develops a versatile platform for efficient gene activation and provides important insights into CRISPR-based transcriptional activators enhanced with multivalent molecules.
Subjects
CRISPR-Cas Systems , Transcriptional Activation , Humans , Promoter Regions, Genetic , RNA, Guide, CRISPR-Cas Systems/genetics , RNA, Guide, CRISPR-Cas Systems/metabolism , HEK293 Cells , Binding Sites , Chromatin/metabolism , Clustered Regularly Interspaced Short Palindromic Repeats , Enhancer Elements, Genetic
ABSTRACT
The three-dimensional (3D) organization of chromatin within the nucleus is crucial for gene regulation. However, the 3D architectural features that coordinate the activation of an entire chromosome remain largely unknown. We introduce an omics method, RNA-associated chromatin DNA-DNA interactions, that integrates RNA polymerase II (RNAPII)-mediated regulome with stochastic optical reconstruction microscopy to investigate the landscape of noncoding RNA roX2-associated chromatin topology for gene equalization to achieve dosage compensation. Our findings reveal that roX2 anchors to the target gene transcription end sites (TESs) and spreads in a distinctive boot-shaped configuration, promoting a more open chromatin state for hyperactivation. Furthermore, roX2 arches TES to transcription start sites to enhance transcriptional loops, potentially facilitating RNAPII convoying and connecting proximal promoter-promoter transcriptional hubs for synergistic gene regulation. These TESs cluster as roX2 compartments, surrounded by inactive domains for coactivation of multiple genes within the roX2 territory. In addition, roX2 structures gradually form and scaffold for stepwise coactivation in dosage compensation.
Subjects
Chromatin , RNA Polymerase II , X Chromosome , Chromatin/metabolism , Chromatin/genetics , X Chromosome/genetics , RNA Polymerase II/metabolism , RNA Polymerase II/genetics , Animals , RNA, Untranslated/genetics , Gene Expression Regulation , Dosage Compensation, Genetic , Promoter Regions, Genetic , Transcription Initiation Site
ABSTRACT
Chromatin structural domains, or topologically associated domains (TADs), are a general organizing principle in chromatin biology. RNA polymerase II (RNAPII) mediates multiple chromatin interactive loops, tethering together as RNAPII-associated chromatin interaction domains (RAIDs) to offer a framework for gene regulation. RAID and TAD alterations have been found to be associated with diseases. They can be further dissected as micro-domains (micro-TADs and micro-RAIDs) by clustering single-molecule chromatin-interactive complexes from next-generation three-dimensional (3D) genome techniques, such as ChIA-Drop. Currently, there are few tools available for micro-domain boundary identification. In this work, we developed the MCI-frcnn deep learning method to train a Faster Region-based Convolutional Neural Network (Faster R-CNN) for micro-domain boundary detection. In the training phase of MCI-frcnn, 50 images of RAIDs from Drosophila RNAPII ChIA-Drop data, containing 261 micro-RAIDs with ground truth boundaries, were used to train the network for 7 days. Using this well-trained MCI-frcnn, we detected micro-RAID boundaries in new input images at high speed (5.26 fps), with high recognition accuracy (AUROC = 0.85, mAP = 0.69) and high boundary region quantification (genomic IoU = 76%). We further applied MCI-frcnn to detect human micro-TAD boundaries using human GM12878 SPRITE data and obtained a high region quantification score (mean gIoU = 85%). In all, the MCI-frcnn deep learning method developed in this work is a general tool for micro-domain boundary detection.
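The abstract does not define how the genomic IoU (gIoU) score is computed. A minimal sketch, assuming it is the standard intersection-over-union applied to one-dimensional genomic intervals on the same chromosome, could look like the following; the function names and the half-open `(start, end)` convention are illustrative assumptions rather than the MCI-frcnn code.

```python
def genomic_iou(pred, truth):
    """Intersection over union of two half-open genomic intervals.

    pred, truth: (start, end) in base pairs on the same chromosome.
    Assumed definition: overlap length divided by the length of the union.
    """
    p_start, p_end = pred
    t_start, t_end = truth
    intersection = max(0, min(p_end, t_end) - max(p_start, t_start))
    union = (p_end - p_start) + (t_end - t_start) - intersection
    return intersection / union if union > 0 else 0.0


def mean_giou(pairs):
    """Average gIoU over (predicted, ground-truth) interval pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(genomic_iou(p, t) for p, t in pairs) / len(pairs)
```

For example, a predicted micro-domain at `(100, 200)` and a ground-truth domain at `(150, 250)` overlap by 50 bp out of a 150 bp union, giving a gIoU of one third.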
ABSTRACT
Mechanical signals from the extracellular microenvironment have been implicated in tumor and metastatic progression. Here, we identify nucleoporin NUP210 as a metastasis susceptibility gene for human estrogen receptor positive (ER+) breast cancer and a cellular mechanosensor. Nup210 depletion suppresses lung metastasis in mouse models of breast cancer. Mechanistically, NUP210 interacts with the LINC complex protein SUN2, which connects the nucleus to the cytoskeleton. In addition, the NUP210/SUN2 complex interacts with chromatin via the short isoform of BRD4 and histone H3.1/H3.2 at the nuclear periphery. In Nup210 knockout cells, mechanosensitive genes accumulate the H3K27me3 heterochromatin modification, mediated by the polycomb repressive complex 2, and differentially reposition within the nucleus. Transcriptional repression in Nup210 knockout cells results in defective mechanotransduction and focal adhesion necessary for their metastatic capacity. Our study reveals an important role for a nuclear pore protein in cellular mechanosensation and metastasis.
Subjects
Breast Neoplasms/pathology , Heterochromatin/metabolism , Mechanotransduction, Cellular/genetics , Nuclear Pore Complex Proteins/metabolism , Animals , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , CCCTC-Binding Factor/metabolism , Cell Line, Tumor , Cell Movement/genetics , Cytoskeleton/metabolism , Enhancer of Zeste Homolog 2 Protein/metabolism , Focal Adhesions/genetics , Gene Expression Regulation, Neoplastic , Histones/metabolism , Humans , Methyltransferases/metabolism , Mice , Neoplasm Metastasis , Neoplastic Cells, Circulating/metabolism , Nuclear Envelope/metabolism , Nuclear Pore Complex Proteins/genetics , Nuclear Proteins/metabolism , Polymorphism, Genetic , Prognosis , Promoter Regions, Genetic , Protein Binding , Repressor Proteins/metabolism , Transcription Factors/metabolism , Tumor Microenvironment
ABSTRACT
BACKGROUND: Acute promyelocytic leukemia (APL) is characterized by the oncogenic fusion protein PML-RARα, a major etiological agent in APL. However, the molecular mechanisms underlying the role of PML-RARα in leukemogenesis remain largely unknown. RESULTS: Using an inducible system, we comprehensively analyze the 3D genome organization in myeloid cells and its reorganization after PML-RARα induction and perform additional analyses in patient-derived APL cells with native PML-RARα. We discover that PML-RARα mediates extensive chromatin interactions genome-wide. Globally, it redefines the chromatin topology of the myeloid genome toward a more condensed configuration in APL cells; locally, it intrudes into RNAPII-associated interaction domains, interrupts myeloid-specific transcription factor binding at enhancers and super-enhancers, and leads to transcriptional repression of genes critical for myeloid differentiation and maturation. CONCLUSIONS: Our results not only provide novel topological insights into the roles of PML-RARα in transforming myeloid cells into leukemia cells, but also uncover a topological framework of a molecular mechanism for oncogenic fusion proteins in cancers.
Subjects
Chromatin Assembly and Disassembly , Gene Expression Regulation, Neoplastic , Leukemia, Promyelocytic, Acute/metabolism , Oncogene Proteins, Fusion/metabolism , Cell Line, Tumor , Humans , Leukemia, Promyelocytic, Acute/etiology
ABSTRACT
Single-molecule multiplex chromatin interaction data are generated by emerging 3D genome mapping technologies such as GAM, SPRITE, and ChIA-Drop. These datasets provide insights into high-dimensional chromatin organization, yet introduce new computational challenges. Thus, we developed MIA-Sig, an algorithmic solution based on signal processing and information theory. We demonstrate its ability to de-noise the multiplex data, assess the statistical significance of chromatin complexes, and identify topological domains and frequent inter-domain contacts. On chromatin immunoprecipitation (ChIP)-enriched data, MIA-Sig can clearly distinguish the protein-associated interactions from the non-specific topological domains. Together, MIA-Sig represents a novel algorithmic framework for multiplex chromatin interaction analysis.
Subjects
Chromatin/metabolism , Signal Processing, Computer-Assisted , Software , Algorithms , Promoter Regions, Genetic
ABSTRACT
Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) is a robust method for capturing genome-wide chromatin interactions. Unlike other 3C-based methods, it includes a chromatin immunoprecipitation (ChIP) step that enriches for interactions mediated by specific target proteins. This unique feature allows ChIA-PET to provide the functional specificity and higher resolution needed to detect chromatin interactions, which chromosome conformation capture (3C)/Hi-C approaches have not achieved. The original ChIA-PET protocol generates short paired-end tags (2 × 20 base pairs (bp)) to detect two genomic loci that are far apart on linear chromosomes but are in spatial proximity in the folded genome. We have improved the original approach by developing long-read ChIA-PET, in which the length of the paired-end tags is increased (up to 2 × 250 bp). The longer PET reads not only improve the tag-mapping efficiency but also increase the probability of covering phased single-nucleotide polymorphisms (SNPs), which allows haplotype-specific chromatin interactions to be identified. Here, we provide the detailed protocol for long-read ChIA-PET that includes cell fixation and lysis, chromatin fragmentation by sonication, ChIP, proximity ligation with a bridge linker, Tn5 tagmentation, PCR amplification and high-throughput sequencing. For a well-trained molecular biologist, it typically takes 6 d from cell harvesting to the completion of library construction, up to a further 36 h for DNA sequencing and <20 h for processing of raw sequencing reads.
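The link between tag length and haplotype resolution can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions, not part of the published protocol or its processing software: reads are treated as gapless alignments on one chromosome with 0-based coordinates, and the `assign_haplotype` function and the `phased_snps` layout are hypothetical. A tag can only be assigned to a haplotype if it overlaps at least one phased SNP, which is more likely for a 250 bp tag than for a 20 bp tag.

```python
def assign_haplotype(read_start, read_seq, phased_snps):
    """Assign a mapped tag to a haplotype by voting over covered phased SNPs.

    read_start: 0-based start of a gapless alignment (assumption).
    read_seq: read bases in reference orientation.
    phased_snps: {position: (hap1_allele, hap2_allele)} on the same chromosome.
    Returns "hap1", "hap2", or None when no SNP is covered or votes tie.
    """
    votes = {"hap1": 0, "hap2": 0}
    read_end = read_start + len(read_seq)
    for pos, (hap1_allele, hap2_allele) in phased_snps.items():
        if read_start <= pos < read_end:
            base = read_seq[pos - read_start]
            if base == hap1_allele:
                votes["hap1"] += 1
            elif base == hap2_allele:
                votes["hap2"] += 1
    if votes["hap1"] > votes["hap2"]:
        return "hap1"
    if votes["hap2"] > votes["hap1"]:
        return "hap2"
    return None


# Toy phased SNPs (assumed values): position -> (hap1 allele, hap2 allele).
PHASED_SNPS = {105: ("A", "G"), 180: ("C", "T")}
```

In this toy setting, a 20 bp tag starting at position 120 covers neither SNP and stays unassigned, whereas a 100 bp tag starting at position 100 covers both and can be phased.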