ABSTRACT
Interphase chromosomes reside within distinct nuclear regions known as chromosome territories (CTs). Recent observations from Hi-C, a method for mapping chromosomal interactions, have revealed that contact probabilities decay differently among chromosomes. Our study explores the relationship between this contact decay and the particular shapes of the chromosome territories they occupy. For this, we employed molecular dynamics (MD) simulations to examine how confined polymers, resembling chromosomes, behave within different confinement geometries similar to chromosome territory boundaries. Our simulations reveal previously unreported relationships between contact probabilities and end-to-end distances that vary with confinement geometry. These findings highlight the crucial impact of chromosome territories on shaping the larger-scale properties of 3D genome organization. They emphasize the intrinsic connection between the shapes of these territories and the contact behaviors exhibited by chromosomes. Understanding these correlations is key to accurately interpreting Hi-C and microscopy data, and offers vital insights into the foundational principles governing genomic organization.
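The two quantities compared in this abstract, contact probability as a function of separation along the chain and end-to-end distance, can be computed from any bead-chain conformation. The following is a minimal sketch, not the authors' analysis code: the function names, the contact cutoff, and the unconfined random-walk test conformation are all assumptions made for illustration.

```python
import numpy as np


def contact_probability(coords, cutoff):
    """Return P(s) for s = 1 .. n-1: the fraction of bead pairs separated by
    s positions along the chain whose 3D distance is below `cutoff`."""
    n = len(coords)
    # Pairwise Euclidean distances between all beads.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    contacts = dist < cutoff
    # The s-th diagonal of the contact matrix holds all pairs at separation s.
    return np.array([np.diagonal(contacts, offset=s).mean() for s in range(1, n)])


def end_to_end_distance(coords):
    """Distance between the first and the last bead of the chain."""
    return float(np.linalg.norm(coords[-1] - coords[0]))


# Hypothetical test conformation: an unconfined random walk with unit steps.
rng = np.random.default_rng(0)
steps = rng.normal(size=(200, 3))
steps /= np.linalg.norm(steps, axis=1, keepdims=True)
walk = np.cumsum(steps, axis=0)

p_s = contact_probability(walk, cutoff=1.5)
r_ee = end_to_end_distance(walk)
```

In a study like the one described, the same two functions would be averaged over many simulated conformations per confinement geometry; a single conformation gives only a noisy estimate.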
Subjects
Chromosomes, Molecular Dynamics Simulation, Polymers/chemistry, Humans, Chromatin/genetics, Interphase
ABSTRACT
MOTIVATION: Hi-C is gaining prominence as a method for mapping genome organization. With declining sequencing costs and a growing demand for higher-resolution data, efficient tools for processing Hi-C datasets at different resolutions are crucial. Over the past decade, the .hic and Cooler file formats have become the de-facto standard to store interaction matrices produced by Hi-C experiments in binary format. Interoperability issues make it unnecessarily difficult to convert between the two formats and to develop applications that can process each format natively. RESULTS: We developed hictk, a toolkit that can transparently operate on .hic and .cool files with excellent performance. The toolkit is written in C++ and consists of a C++ library with Python and R bindings as well as CLI tools to perform common operations directly from the shell, including converting between .hic and .mcool formats. We benchmark the performance of hictk and compare it with other popular tools and libraries. We conclude that hictk significantly outperforms existing tools while providing the flexibility of natively working with both file formats without code duplication. AVAILABILITY AND IMPLEMENTATION: The hictk library, Python bindings and CLI tools are released under the MIT license as a multi-platform application available at github.com/paulsengroup/hictk. Pre-built binaries for Linux and macOS are available on bioconda. Python bindings for hictk are available on GitHub at github.com/paulsengroup/hictkpy, while R bindings are available on GitHub at github.com/paulsengroup/hictkR.
Subjects
Software, Genomics/methods
ABSTRACT
Breast cancer entails intricate alterations in genome organization and expression. However, how three-dimensional (3D) chromatin structure changes in the progression from a normal to a breast cancer malignant state remains unknown. To address this, we conducted an analysis combining Hi-C data with lamina-associated domains (LADs), epigenomic marks, and gene expression in an in vitro model of breast cancer progression. Our results reveal that while the fundamental properties of topologically associating domains (TADs) remain largely stable, significant changes occur in the organization of compartments and subcompartments. These changes are closely correlated with alterations in the expression of oncogenic genes. We also observe a restructuring of TAD-TAD interactions, coinciding with a loss of spatial compartmentalization and radial positioning of the 3D genome. Notably, we identify a previously unrecognized interchromosomal insertion event, wherein a locus on chromosome 8 housing the MYC oncogene is inserted into a highly active subcompartment on chromosome 10. This insertion leads to the formation of de novo enhancer contacts and activation of the oncogene, illustrating how structural variants can interact with the 3D genome to drive oncogenic states. In summary, our findings provide evidence for the degradation of genome organization at multiple scales during breast cancer progression revealing novel relationships between genome 3D structure and oncogenic processes.
ABSTRACT
The three-dimensional organization of chromatin plays a crucial role in gene regulation and cellular processes like deoxyribonucleic acid (DNA) transcription, replication and repair. Hi-C and related techniques provide detailed views of spatial proximities within the nucleus. However, data analysis is challenging partially due to a lack of well-defined, underpinning mathematical frameworks. Recently, recognizing and analyzing geometric patterns in Hi-C data has emerged as a powerful approach. This review provides a summary of algorithms for automatic recognition and analysis of geometric patterns in Hi-C data and their correspondence with chromatin structure. We classify existing algorithms on the basis of the data representation and pattern recognition paradigm they make use of. Finally, we outline some of the challenges ahead and promising future directions.
Subjects
Algorithms, Chromatin, Chromatin/genetics, Data Analysis
ABSTRACT
DNA loop extrusion is emerging as a key process establishing genome structure and function. We introduce MoDLE, a computational tool for fast, stochastic modeling of molecular contacts from DNA loop extrusion, capable of simulating realistic contact patterns genome-wide in a few minutes. MoDLE accurately simulates contact maps in concordance with existing molecular dynamics approaches and with Micro-C data, and does so orders of magnitude faster than existing approaches. MoDLE runs efficiently on machines ranging from laptops to high-performance computing clusters and enables exploratory and predictive modeling of 3D genome structure in a wide range of settings.
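To make the idea of stochastic loop-extrusion modeling concrete, here is a deliberately simplified one-dimensional toy. It is not MoDLE's model or API: the function names, the fully blocking barriers, the symmetric extrusion, and the uniform random loading are all assumptions chosen to keep the sketch short.

```python
import random
from collections import Counter


def extrude(load_site, barriers, n_bins, max_steps):
    """Grow a loop outward from `load_site`. Each leg moves one bin per step
    and halts at a barrier bin or at a chromosome end (toy assumption:
    barriers block with probability 1 and ignore orientation)."""
    left = right = load_site
    for _ in range(max_steps):
        can_left = left > 0 and left not in barriers
        can_right = right < n_bins - 1 and right not in barriers
        if not (can_left or can_right):
            break  # both legs are stalled
        if can_left:
            left -= 1
        if can_right:
            right += 1
    return left, right


def simulate_contacts(n_bins, barriers, n_extruders, max_steps, seed=0):
    """Load extruders at uniformly random bins and count the resulting
    loop-anchor pairs as contacts."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_extruders):
        anchors = extrude(rng.randrange(n_bins), barriers, n_bins, max_steps)
        if anchors[0] != anchors[1]:  # skip extruders that never moved
            counts[anchors] += 1
    return counts


contacts = simulate_contacts(n_bins=12, barriers={2, 8}, n_extruders=50, max_steps=100)
```

With enough steps, every loop in this toy ends up anchored at the nearest barriers or chromosome ends, which mimics the accumulation of contacts between barrier sites seen in contact maps.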
Subjects
DNA
ABSTRACT
The three-dimensional (3D) organization of the genome is shaped by interactions with multiple structures within the nucleus, affecting gene expression outcomes. Technological breakthroughs in recent years have generated vast data reflecting various aspects of nuclear genome architecture in space and time. Integrating these datasets into comprehensive 3D genome models can reveal new insights into genome structure and regulation in normal and disease states. In this chapter, we provide a step-by-step guide on how to generate publication-ready integrated 3D genome models from (raw) Hi-C data and from lamin-genome (LAD) contacts.
Subjects
Genome, Nuclear Lamina, Cell Nucleus/genetics, Chromatin, Lamins
ABSTRACT
BACKGROUND: Mechanisms underlying genome 3D organization and domain formation in the mammalian nucleus are not completely understood. Multiple processes such as transcriptional compartmentalization, DNA loop extrusion and interactions with the nuclear lamina dynamically act on chromatin at multiple levels. Here, we explore long-range interaction patterns between topologically associated domains (TADs) in several cell types. RESULTS: We find that TAD long-range interactions are connected to many key features of chromatin organization, including open and closed compartments, compaction and loop extrusion processes. Domains that form large TAD cliques tend to be repressive across cell types, when comparing gene expression, LINE/SINE repeat content and chromatin subcompartments. Further, TADs in large cliques are larger in genomic size, less dense and depleted of convergent CTCF motifs, in contrast to smaller and denser TADs formed by a loop extrusion process. CONCLUSIONS: Our results shed light on the organizational principles that govern repressive and active domains in the human genome.
Subjects
Chromatin Assembly and Disassembly, Chromatin, Animals, Chromosomes, Gene Expression, Human Genome, Humans
ABSTRACT
The intrinsic dynamic nature of chromosomes is emerging as a fundamental component in regulating DNA transcription, replication, and damage-repair among other nuclear functions. With this increased awareness, reinforced over the last ten years, many new experimental techniques, mainly based on microscopy and chromosome conformation capture, have been introduced to study the genome in space and time. Owing to the increasing complexity of these cutting-edge techniques, computational approaches have become of paramount importance to interpret, contextualize, and complement such experiments with new insights. Hence, it is becoming crucial for experimental biologists to have a clear understanding of the diverse theoretical modeling approaches available and the biological information each of them can provide.
Subjects
Chromosomes/ultrastructure, Theoretical Models, Nucleosomes/ultrastructure, Genetic Transcription, Chromosomes/genetics, DNA/genetics, DNA/ultrastructure, DNA Damage/genetics, DNA Repair/genetics, DNA Replication/genetics, Nucleosomes/genetics
ABSTRACT
Genomic information is selectively used to direct spatial and temporal gene expression during differentiation. Interactions between topologically associating domains (TADs) and between chromatin and the nuclear lamina organize and position chromosomes in the nucleus. However, how these genomic organizers together shape genome architecture is unclear. Here, using a dual-lineage differentiation system, we report long-range TAD-TAD interactions that form constitutive and variable TAD cliques. A differentiation-coupled relationship between TAD cliques and lamina-associated domains suggests that TAD cliques stabilize heterochromatin at the nuclear periphery. We also provide evidence of dynamic TAD cliques during mouse embryonic stem-cell differentiation and somatic cell reprogramming and of inter-TAD associations in single-cell high-resolution chromosome conformation capture (Hi-C) data. TAD cliques represent a level of four-dimensional genome conformation that reinforces the silencing of repressed developmental genes.
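If TADs are treated as nodes and significant long-range TAD-TAD interactions as edges, a clique in the graph-theoretic sense is a set of TADs in which every pair interacts. The sketch below enumerates maximal cliques with the Bron-Kerbosch algorithm using only the standard library. The TAD labels, the edge list, and the minimum clique size of three are hypothetical, and the abstract does not state which algorithm or threshold the authors used.

```python
def maximal_cliques(edges):
    """Enumerate maximal cliques of an undirected graph given as (a, b) pairs,
    using the basic Bron-Kerbosch recursion (no pivoting)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))  # nothing can extend this clique
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & adjacency[node],
                   excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return sorted(cliques)


# Hypothetical significant TAD-TAD interactions.
interactions = [
    ("TAD1", "TAD2"), ("TAD1", "TAD3"), ("TAD2", "TAD3"),
    ("TAD3", "TAD4"),
]
tad_cliques = [c for c in maximal_cliques(interactions) if len(c) >= 3]
```

Here `TAD3` and `TAD4` interact but form only a pair, so the size filter leaves the single three-member clique.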
Subjects
Cell Differentiation/genetics, Chromatin/genetics, Adipogenesis/genetics, Animals, Cell Lineage/genetics, Chromatin/ultrastructure, Chromatin Assembly and Disassembly, Gene Expression, Genome, Human Genome, Humans, Mice, Genetic Models, Mouse Embryonic Stem Cells/cytology, Neural Stem Cells/cytology, Neurogenesis/genetics, Nuclear Lamina/genetics, Stem Cells/cytology
ABSTRACT
At the nuclear periphery, the genome is anchored to A- and B-type nuclear lamins in the form of heterochromatic lamina-associated domains. A-type lamins also associate with chromatin in the nuclear interior, away from the peripheral nuclear lamina. This nucleoplasmic lamin A environment tends to be euchromatic, suggesting distinct roles of lamin A in the regulation of gene expression in peripheral and more central regions of the nucleus. The hot-spot lamin A R482W mutation causing familial partial lipodystrophy of Dunnigan-type (FPLD2), affects lamin A association with chromatin at the nuclear periphery and in the nuclear interior, and is associated with 3-dimensional (3D) rearrangements of chromatin. Here, we highlight features of nuclear lamin association with the genome at the nuclear periphery and in the nuclear interior. We address recent data showing a rewiring of such interactions in cells from FPLD2 patients, and in adipose progenitor and induced pluripotent stem cell models of FPLD2. We discuss associated epigenetic and genome conformation changes elicited by the lamin A R482W mutation at the gene level. The findings argue that the mutation adversely impacts both global and local genome architecture throughout the nucleus space. The results, together with emerging new computational modeling tools, mark the start of a new era in our understanding of the 3D genomics of laminopathies.
ABSTRACT
Chrom3D is a computational platform for 3D genome modeling that simulates the spatial positioning of chromosome domains relative to each other and relative to the nuclear periphery. In Chrom3D, chromosomes are modeled as chains of contiguous beads, in which each bead represents a genomic domain. In this protocol, a bead represents a topologically associated domain (TAD) mapped from ensemble Hi-C data. Chrom3D takes as input data significant pairwise TAD-TAD interactions determined from a Hi-C contact matrix, and TAD interactions with the nuclear periphery, determined by ChIP-sequencing of nuclear lamins to define lamina-associated domains (LADs). Chrom3D is based on Monte Carlo simulations initiated from a starting random bead configuration. During the optimization process, TAD-TAD interactions constrain bead positions relative to each other, whereas LAD information constrains the corresponding bead toward the nuclear periphery. Optimization can be repeated many times to generate an ensemble of 3D genome models. Analyses of the models enable estimations of the radial positioning of genomic sites in the nucleus across cells in a population. Chrom3D provides opportunities to reveal spatial relationships between TADs and LADs. More generally, predictions from Chrom3D models can be experimentally tested in the laboratory. We describe the entire Chrom3D protocol for modeling a 3D diploid human genome, from the creation of input files to the final rendering of 3D genome structures. The procedure takes ~18 h. Chrom3D is freely available on GitHub.
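The optimization described above (random start, interaction constraints pulling beads together, LAD constraints pulling beads to the periphery) can be illustrated with a toy Metropolis Monte Carlo loop. This is a sketch under strong simplifying assumptions and not Chrom3D's scoring function, move set, or interface: beads are points in a spherical nucleus, chain connectivity and bead volume are ignored, and the quadratic loss, step size, and temperature are all hypothetical choices.

```python
import math
import random


def loss(pos, interactions, lad_beads, radius):
    """Toy loss: squared distance between interacting beads plus squared
    distance of LAD beads from the nuclear periphery."""
    total = 0.0
    for i, j in interactions:
        total += math.dist(pos[i], pos[j]) ** 2
    for i in lad_beads:
        total += (radius - math.hypot(*pos[i])) ** 2
    return total


def optimize(n_beads, interactions, lad_beads, radius=1.0, steps=5000,
             step_size=0.1, temperature=0.01, seed=0):
    rng = random.Random(seed)
    # Random starting configuration inside the sphere (rejection sampling).
    pos = []
    while len(pos) < n_beads:
        p = [rng.uniform(-radius, radius) for _ in range(3)]
        if math.hypot(*p) <= radius:
            pos.append(p)
    current = initial = loss(pos, interactions, lad_beads, radius)
    for _ in range(steps):
        i = rng.randrange(n_beads)
        old = pos[i]
        new = [c + rng.uniform(-step_size, step_size) for c in old]
        if math.hypot(*new) > radius:
            continue  # reject moves that leave the nucleus
        pos[i] = new
        candidate = loss(pos, interactions, lad_beads, radius)
        delta = candidate - current
        # Metropolis criterion: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        else:
            pos[i] = old
    return pos, initial, current


# Hypothetical input: 6 beads, two interacting pairs, two LAD beads.
model, loss_start, loss_end = optimize(6, [(0, 1), (2, 3)], [4, 5])
```

Calling `optimize` repeatedly with different seeds would give an ensemble of toy models, analogous to the repeated optimization runs the protocol describes.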
Subjects
Chromatin/chemistry, Chromosomes/chemistry, Genome, Three-Dimensional Imaging/methods, Molecular Conformation, Cell Line, Fibroblasts, Humans, Software
ABSTRACT
The p.R482W hotspot mutation in A-type nuclear lamins causes familial partial lipodystrophy of Dunnigan-type (FPLD2), a lipodystrophic syndrome complicated by early onset atherosclerosis. Molecular mechanisms underlying endothelial cell dysfunction conferred by the lamin A mutation remain elusive. However, lamin A regulates epigenetic developmental pathways and mutations could perturb these functions. Here, we demonstrate that lamin A R482W elicits endothelial differentiation defects in a developmental model of FPLD2. Genome modeling in fibroblasts from patients with FPLD2 caused by the lamin A R482W mutation reveals repositioning of the mesodermal regulator T/Brachyury locus towards the nuclear center relative to normal fibroblasts, suggesting enhanced activation propensity of the locus in a developmental model of FPLD2. Addressing this issue, we report phenotypic and transcriptional alterations in mesodermal and endothelial differentiation of induced pluripotent stem cells we generated from a patient with R482W-associated FPLD2. Correction of the LMNA mutation ameliorates R482W-associated phenotypes and gene expression. Transcriptomics links endothelial differentiation defects to decreased Polycomb-mediated repression of the T/Brachyury locus and over-activation of T target genes. Binding of the Polycomb repressor complex 2 to T/Brachyury is impaired by the mutated lamin A network, which is unable to properly associate with the locus. This leads to a deregulation of vascular gene expression over time. By connecting a lipodystrophic hotspot lamin A mutation to a disruption of early mesodermal gene expression and defective endothelial differentiation, we propose that the mutation rewires the fate of several lineages, resulting in multi-tissue pathogenic phenotypes.
Subjects
Endothelial Cells/metabolism, Fetal Proteins/genetics, Developmental Gene Expression Regulation, Lamin Type A/genetics, Familial Partial Lipodystrophy/genetics, Polycomb-Group Proteins/genetics, T-Box Domain Proteins/genetics, Adolescent, Adult, Case-Control Studies, Cell Differentiation/genetics, Cell Lineage/genetics, Endothelial Cells/pathology, Female, Fetal Proteins/metabolism, Fibroblasts/metabolism, Fibroblasts/pathology, Gene Regulatory Networks, Humans, Induced Pluripotent Stem Cells/metabolism, Induced Pluripotent Stem Cells/pathology, Lamin Type A/metabolism, Familial Partial Lipodystrophy/metabolism, Familial Partial Lipodystrophy/pathology, Male, Mesoderm/metabolism, Mesoderm/pathology, Middle Aged, Mutation, Polycomb-Group Proteins/metabolism, Primary Cell Culture, Protein Binding, Signal Transduction, T-Box Domain Proteins/metabolism
ABSTRACT
The development of many sporadic cancers is directly initiated by carcinogen exposure. Carcinogens induce malignancies by creating DNA lesions (i.e., adducts) that can result in mutations if left unrepaired. Despite this knowledge, there has been remarkably little investigation into the regulation of susceptibility to acquire DNA lesions. In this study, we present the first quantitative human genome-wide map of DNA lesions induced by ultraviolet (UV) radiation, the ubiquitous carcinogen in sunlight that causes skin cancer. Remarkably, the pattern of carcinogen susceptibility across the genome of primary cells significantly reflects mutation frequency in malignant melanoma. Surprisingly, DNase-accessible euchromatin is protected from UV, while lamina-associated heterochromatin at the nuclear periphery is vulnerable. Many cancer driver genes have an intrinsic increase in carcinogen susceptibility, including the BRAF oncogene that has the highest mutation frequency in melanoma. These findings provide a genome-wide snapshot of DNA injuries at the earliest stage of carcinogenesis. Furthermore, they identify carcinogen susceptibility as an origin of genome instability that is regulated by nuclear architecture and mirrors mutagenesis in cancer.
Subjects
Carcinogens/toxicity, Neoplastic Cell Transformation, Drug Resistance/genetics, Genomic Instability/drug effects, Genomic Instability/genetics, Mutagenesis, Base Sequence/physiology, Neoplastic Cell Transformation/drug effects, Neoplastic Cell Transformation/genetics, Cultured Cells, DNA Damage, Drug Resistance/drug effects, Genetic Epigenesis/drug effects, Humans, Melanoma/etiology, Melanoma/genetics, Mutagenesis/drug effects, Mutagenesis/genetics, Skin Neoplasms/etiology, Skin Neoplasms/genetics, Ultraviolet Rays, Cutaneous Malignant Melanoma
ABSTRACT
Current three-dimensional (3D) genome modeling platforms are limited by their inability to account for radial placement of loci in the nucleus. We present Chrom3D, a user-friendly whole-genome 3D computational modeling framework that simulates positions of topologically-associated domains (TADs) relative to each other and to the nuclear periphery. Chrom3D integrates chromosome conformation capture (Hi-C) and lamin-associated domain (LAD) datasets to generate structure ensembles that recapitulate radial distributions of TADs detected in single cells. Chrom3D reveals unexpected spatial features of LAD regulation in cells from patients with a laminopathy-causing lamin mutation. Chrom3D is freely available on GitHub.
Subjects
Chromatin/genetics, Computational Biology/methods, Nuclear Lamina/genetics, Adult, Female, Genome, HeLa Cells, Humans, Male, Genetic Models, Young Adult
ABSTRACT
Combining genome-wide structural models with phenomenological data is at the forefront of efforts to understand the organizational principles regulating the human genome. Here, we use chromosome-chromosome contact data as knowledge-based constraints for large-scale three-dimensional models of the human diploid genome. The resulting models remain minimally entangled and acquire several functional features that are observed in vivo and that were never used as input for the model. We find, for instance, that gene-rich, active regions are drawn towards the nuclear center, while gene poor and lamina associated domains are pushed to the periphery. These and other properties persist upon adding local contact constraints, suggesting their compatibility with non-local constraints for the genome organization. The results show that suitable combinations of data analysis and physical modelling can expose the unexpectedly rich functionally-related properties implicit in chromosome-chromosome contact data. Specific directions are suggested for further developments based on combining experimental data analysis and genomic structural modelling.
Subjects
Human Chromosomes/genetics, Human Genome, Genetic Models, Cell Line, Chromosome Segregation/genetics, Human Chromosomes/ultrastructure, Genetic Databases, Diploidy, Fibroblasts/ultrastructure, Human Embryonic Stem Cells/ultrastructure, Humans, Three-Dimensional Imaging
ABSTRACT
Genome-wide sequencing technologies enable investigations of the structural properties of the genome in various spatial dimensions. Here, we review computational techniques developed to model the three-dimensional genome in single cells versus ensembles of cells and assess their underlying assumptions. We further address approaches to study the spatio-temporal aspects of genome organization from single-cell data.
Subjects
Computational Biology/methods, Nucleosomes/chemistry, Single-Cell Analysis/methods, Algorithms, Molecular Models, Nucleic Acid Conformation, Nucleosomes/genetics, Spatial Analysis
ABSTRACT
We present Galaxy Portal app, an open source interface to the Galaxy system through smart phones and tablets. The Galaxy Portal provides convenient and efficient monitoring of job completion, as well as opportunities for inspection of results and execution history. In addition to being useful to the Galaxy community, we believe that the app also exemplifies a useful way of exploiting mobile interfaces for research/high-performance computing resources in general. AVAILABILITY AND IMPLEMENTATION: The source is freely available under a GPL license on GitHub, along with user documentation and pre-compiled binaries and instructions for several platforms: https://github.com/Tarostar/QMLGalaxyPortal. It is available for iOS version 7 (and newer) through the Apple App Store, and for Android through Google Play for version 4.1 (API 16) or newer. CONTACT: geirksa@ifi.uio.no.
Subjects
Mobile Applications, Software
ABSTRACT
The three-dimensional (3D) structure of the genome is important for orchestration of gene expression and cell differentiation. While mapping genomes in 3D has long been elusive, recent adaptations of high-throughput sequencing to chromosome conformation capture (3C) techniques allow for genome-wide structural characterization for the first time. However, reconstruction of "consensus" 3D genomes from 3C-based data is a challenging problem, since the data are aggregated over millions of cells. Recent single-cell adaptations of the 3C technique allow for non-aggregated assessment of genome structure, but the data suffer from sparse and noisy interaction sampling. We present a manifold-based optimization (MBO) approach for the reconstruction of 3D genome structure from chromosomal contact data. We show that MBO is able to reconstruct 3D structures based on the chromosomal contacts, imposing fewer structural violations than comparable methods. Additionally, MBO is suitable for efficient high-throughput reconstruction of large systems, such as entire genomes, allowing for comparative studies of genomic structure across cell lines and different species.
Subjects
Chromosome Mapping/methods, Computational Biology/methods, Genome/genetics, Three-Dimensional Imaging/methods, Algorithms, Animals, Chromosomes/chemistry, Chromosomes/genetics, Mice
ABSTRACT
Identification of three-dimensional (3D) interactions between regulatory elements across the genome is crucial to unravel the complex regulatory machinery that orchestrates proliferation and differentiation of cells. ChIA-PET is a novel method to identify such interactions, where physical contacts between regions bound by a specific protein are quantified using next-generation sequencing. However, determining the significance of the observed interaction frequencies in such datasets is challenging, and few methods have been proposed. Despite the fact that regions that are close in linear genomic distance have a much higher tendency to interact by chance, no methods to date are capable of taking such dependency into account. Here, we propose a statistical model taking into account the genomic distance relationship, as well as the general propensity of anchors to be involved in contacts overall. Using both real and simulated data, we show that the previously proposed statistical test, based on Fisher's exact test, leads to invalid results when data are dependent on genomic distance. We also evaluate our method on previously validated cell-line specific and constitutive 3D interactions, and show that relevant interactions are significant, while avoiding over-estimating the significance of short nearby interactions.
Subjects
Chromatin/chemistry, Genomics/methods, Statistical Models, Core Binding Factor Alpha 2 Subunit/genetics, High-Throughput Nucleotide Sequencing, DNA Sequence Analysis
ABSTRACT
Differentiation of osteoblasts from mesenchymal stem cells (MSCs) is an integral part of bone development and homeostasis and may, when improperly regulated, cause diseases such as bone cancer or osteoporosis. Using unbiased high-throughput methods, we here characterize the landscape of global changes in gene expression, histone modifications, and DNA methylation upon differentiation of human MSCs to the osteogenic lineage. Furthermore, we provide a first genome-wide characterization of DNA binding sites of the bone master regulatory transcription factor Runt-related transcription factor 2 (RUNX2) in human osteoblasts, revealing target genes associated with regulation of proliferation, migration, and apoptosis, and a significant overlap with p53-regulated genes. These findings expand on emerging evidence of a role for RUNX2 in cancer, including bone metastases, and the p53 regulatory network. We further demonstrate that RUNX2 binds to distant regulatory elements, promoters, and with high frequency to gene 3' ends. Finally, we identify TEAD2 and GTF2I as novel regulators of osteogenesis.