Results 1 - 20 of 61
1.
bioRxiv ; 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-37066352

ABSTRACT

Knowledge of locations and activities of cis-regulatory elements (CREs) is needed to decipher basic mechanisms of gene regulation and to understand the impact of genetic variants on complex traits. Previous studies identified candidate CREs (cCREs) using epigenetic features in one species, making comparisons difficult between species. In contrast, we conducted an interspecies study defining epigenetic states and identifying cCREs in blood cell types to generate regulatory maps that are comparable between species, using integrative modeling of eight epigenetic features jointly in human and mouse in our ValIdated Systematic IntegratiON (VISION) Project. The resulting catalogs of cCREs are useful resources for further studies of gene regulation in blood cells, indicated by high overlap with known functional elements and strong enrichment for human genetic variants associated with blood cell phenotypes. The contribution of each epigenetic state in cCREs to gene regulation, inferred from a multivariate regression, was used to estimate epigenetic state Regulatory Potential (esRP) scores for each cCRE in each cell type, which were used to categorize dynamic changes in cCREs. Groups of cCREs displaying similar patterns of regulatory activity in human and mouse cell types, obtained by joint clustering on esRP scores, harbored distinctive transcription factor binding motifs that were similar between species. An interspecies comparison of cCREs revealed both conserved and species-specific patterns of epigenetic evolution. Finally, we showed that comparisons of the epigenetic landscape between species can reveal elements with similar roles in regulation, even in the absence of genomic sequence alignment.
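
The esRP idea in the abstract above (per-state regression coefficients combined into one score per cCRE) can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the abstract, that a score is the coverage-weighted sum of per-state coefficients; the state names and coefficient values are hypothetical.

```python
def esrp_score(state_coverage, state_coefficients):
    """Coverage-weighted sum of per-state regression coefficients.

    state_coverage: {state_name: fraction of the cCRE covered by that state}
    state_coefficients: {state_name: coefficient from a regression of
        expression on state coverage} (hypothetical values in this sketch)
    """
    return sum(
        fraction * state_coefficients[state]
        for state, fraction in state_coverage.items()
    )


# Hypothetical coefficients; a real analysis would estimate them from data.
EXAMPLE_COEFFICIENTS = {
    "active_enhancer": 2.0,
    "quiescent": -0.5,
    "repressed": -1.0,
}
```

Computing the score separately for each cell type gives a vector per cCRE, which is the kind of input a joint clustering step would consume.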

2.
bioRxiv ; 2023 Nov 11.
Article in English | MEDLINE | ID: mdl-37986839

ABSTRACT

Despite the unique ability of pioneer transcription factors (PFs) to target nucleosomal sites in closed chromatin, they only bind a small fraction of their genomic motifs. The underlying mechanism of this selectivity is not well understood. Here, we design a high-throughput assay called ChIP-ISO to systematically dissect sequence features affecting the binding specificity of a classic PF, FOXA1. Combining ChIP-ISO with in vitro and neural network analyses, we find that 1) FOXA1 binding is strongly affected by co-binding TFs AP-1 and CEBPB, 2) FOXA1 and AP-1 show binding cooperativity in vitro, 3) FOXA1's binding is determined more by local sequences than chromatin context, including eu-/heterochromatin, and 4) AP-1 is partially responsible for differential binding of FOXA1 in different cell types. Our study presents a framework for elucidating genetic rules underlying PF binding specificity and reveals a mechanism for context-specific regulation of its binding.

3.
bioRxiv ; 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37904910

ABSTRACT

Genome-wide nucleosome profiles are predominantly characterized using MNase-seq, which involves extensive MNase digestion and size selection to enrich for mono-nucleosome-sized fragments. Most available MNase-seq analysis packages assume that nucleosomes uniformly protect 147bp DNA fragments. However, some nucleosomes with atypical histone or chemical compositions protect shorter lengths of DNA. The rigid assumptions imposed by current nucleosome analysis packages ignore variation in nucleosome lengths, potentially blinding investigators to regulatory roles played by atypical nucleosomes. To enable the characterization of different nucleosome types from MNase-seq data, we introduce the Size-based Expectation Maximization (SEM) nucleosome calling package. SEM employs a hierarchical Gaussian mixture model to estimate the positions and subtype identity of nucleosomes from MNase-seq fragments. Nucleosome subtypes are automatically identified based on the distribution of protected DNA fragment lengths at nucleosome positions. Benchmark analysis indicates that SEM is on par with existing packages in terms of standard nucleosome-calling accuracy metrics, while uniquely providing the ability to characterize nucleosome subtype identities. Using SEM on a low-dose MNase H2B MNase-ChIP-seq dataset from mouse embryonic stem cells, we identified three nucleosome types: short-fragment nucleosomes, canonical nucleosomes, and di-nucleosomes. The short-fragment nucleosomes can be divided further into two subtypes based on their chromatin accessibility. Interestingly, the subset of short-fragment nucleosomes in accessible regions exhibits high MNase sensitivity and displays distribution patterns around transcription start sites (TSSs) and CTCF peaks, similar to the previously reported "fragile nucleosomes". These SEM-defined accessible short-fragment nucleosomes are found not just in promoters, but also in enhancers and other regulatory regions. Additional investigations reveal their co-localization with the chromatin remodelers Chd6, Chd8, and Ep400. In summary, SEM provides an effective platform for distinguishing various nucleosome subtypes, paving the way for future exploration of non-standard nucleosomes.
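
To make the mixture-model idea in the abstract above concrete, here is a minimal sketch of expectation maximization for a flat one-dimensional Gaussian mixture over fragment lengths. It is not the SEM package: SEM is described as hierarchical and also estimates nucleosome positions, which this sketch omits. The function names, starting values, and the 1 bp floor on the standard deviation are assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_length_mixture(lengths, means, sds, n_iter=50):
    """Fit a Gaussian mixture to fragment lengths with plain EM.

    means and sds are initial guesses, one per component.
    Returns (weights, means, sds) after n_iter iterations.
    """
    k = len(means)
    means, sds = list(means), list(sds)
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each fragment.
        resp = []
        for x in lengths:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([v / total for v in p])
        # M-step: re-estimate weight, mean, and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(lengths)
            means[j] = sum(r[j] * x for r, x in zip(resp, lengths)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, lengths)) / nj
            sds[j] = max(math.sqrt(var), 1.0)  # assumed floor of 1 bp
    return weights, means, sds
```

Each fitted component then corresponds to one candidate fragment-length class, for example a short-fragment class and a canonical class near 147 bp.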

4.
bioRxiv ; 2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37873361

ABSTRACT

The DNA-binding activities of transcription factors (TFs) are influenced by both intrinsic sequence preferences and extrinsic interactions with cell-specific chromatin landscapes and other regulatory proteins. Disentangling the roles of these binding determinants remains challenging. For example, the FoxA subfamily of Forkhead domain (Fox) TFs are known pioneer factors that can bind to relatively inaccessible sites during development. Yet FoxA TF binding also varies across cell types, pointing to a combination of intrinsic and extrinsic forces guiding their binding. While other Forkhead domain TFs are often assumed to have pioneering abilities, how sequence and chromatin features influence the binding of related Fox TFs has not been systematically characterized. Here, we present a principled approach to compare the relative contributions of intrinsic DNA sequence preference and cell-specific chromatin environments to a TF's DNA-binding activities. We apply our approach to investigate how a selection of Fox TFs (FoxA1, FoxC1, FoxG1, FoxL2, and FoxP3) vary in their binding specificity. We over-express the selected Fox TFs in mouse embryonic stem cells, which offer a platform to contrast each TF's binding activity within the same preexisting chromatin background. By applying a convolutional neural network to interpret the Fox TF binding patterns, we evaluate how sequence and preexisting chromatin features jointly contribute to induced TF binding. We demonstrate that Fox TFs bind different DNA targets, and drive differential gene expression patterns, even when induced in identical chromatin settings. Despite the association between Forkhead domains and pioneering activities, the selected Fox TFs display a wide range of affinities for preexisting chromatin states. Using sequence and chromatin feature attribution techniques to interpret the neural network predictions, we show that differential sequence preferences combined with differential abilities to engage relatively inaccessible chromatin together explain Fox TF binding patterns at individual sites and genome-wide.

5.
bioRxiv ; 2023 Sep 15.
Article in English | MEDLINE | ID: mdl-37745557

ABSTRACT

Transposable elements (TEs) and other repetitive regions have been shown to contain gene regulatory elements, including transcription factor binding sites. Unfortunately, regulatory elements harbored by repeats have proven difficult to characterize using short-read sequencing assays such as ChIP-seq or ATAC-seq. Most regulatory genomics analysis pipelines discard "multi-mapped" reads that align equally well to multiple genomic locations. Since multi-mapped reads arise predominantly from repeats, current analysis pipelines fail to detect a substantial portion of regulatory events that occur in repetitive regions. To address this shortcoming, we developed Allo, a new approach to allocate multi-mapped reads in an efficient, accurate, and user-friendly manner. Allo combines probabilistic mapping of multi-mapped reads with a convolutional neural network that recognizes the read distribution features of potential peaks, offering enhanced accuracy in multi-mapping read assignment. Allo also provides read-level output in the form of a corrected alignment file, making it compatible with existing regulatory genomics analysis pipelines and downstream peak-finders. In a demonstration application on CTCF ChIP-seq data, we show that Allo results in the discovery of thousands of new CTCF peaks. Many of these peaks contain the expected cognate motif and/or serve as TAD boundaries. We additionally apply Allo to a diverse collection of ENCODE ChIP-seq datasets, resulting in multiple previously unidentified interactions between transcription factors and repetitive element families. Finally, we show that Allo may be particularly effective in identifying ChIP-seq peaks in younger TEs, which hold evolutionary significance due to their emergence during human evolution from primates.
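
As a rough illustration of probabilistic allocation of a multi-mapped read, the sketch below weights each candidate location by the number of uniquely mapped reads nearby. This heuristic is an assumption chosen for illustration; it is not Allo's algorithm, which the abstract says also uses a convolutional neural network. The function names, the location keys, and the pseudocount are hypothetical.

```python
import random


def allocation_weights(candidates, unique_counts, pseudocount=1.0):
    """Return {location: probability} for one multi-mapped read.

    candidates: locations the read aligns to equally well.
    unique_counts: {location: number of uniquely mapped reads nearby}.
    The pseudocount keeps locations with no unique reads possible.
    """
    scores = {loc: unique_counts.get(loc, 0) + pseudocount for loc in candidates}
    total = sum(scores.values())
    return {loc: score / total for loc, score in scores.items()}


def sample_location(weights, rng):
    """Draw one location according to the allocation probabilities."""
    locations = list(weights)
    return rng.choices(locations, weights=[weights[l] for l in locations], k=1)[0]
```

Writing the sampled location back into the alignment record is what would make such an approach compatible with downstream peak-finders, since they then see an ordinary single-location read.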

6.
Genome Biol ; 24(1): 79, 2023 04 18.
Article in English | MEDLINE | ID: mdl-37072822

ABSTRACT

A promising alternative to comprehensively performing genomics experiments is to, instead, perform a subset of experiments and use computational methods to impute the remainder. However, identifying the best imputation methods and what measures meaningfully evaluate performance are open questions. We address these questions by comprehensively analyzing 23 methods from the ENCODE Imputation Challenge. We find that imputation evaluations are challenging and confounded by distributional shifts from differences in data collection and processing over time, the amount of available data, and redundancy among performance measures. Our analyses suggest simple steps for overcoming these issues and promising directions for more robust research.


Subject(s)
Algorithms, Epigenomics, Genomics/methods
7.
Front Neurosci ; 16: 903881, 2022.
Article in English | MEDLINE | ID: mdl-35801179

ABSTRACT

Neuronal programming by forced expression of transcription factors (TFs) holds promise for clinical applications of regenerative medicine. However, the mechanisms by which TFs coordinate their activities on the genome and control distinct neuronal fates remain obscure. Using direct neuronal programming of embryonic stem cells, we dissected the contribution of a series of TFs to specific neuronal regulatory programs. We deconstructed the Ascl1-Lmx1b-Foxa2-Pet1 TF combination that has been shown to generate serotonergic neurons and found that stepwise addition of TFs to Ascl1 canalizes the neuronal fate into a diffuse monoaminergic fate. The addition of pioneer factor Foxa2 represses Phox2b to induce serotonergic fate, similar to in vivo regulatory networks. Foxa2 and Pet1 appear to act synergistically to upregulate serotonergic fate. Foxa2 and Pet1 co-bind to a small fraction of genomic regions but mostly bind to different regulatory sites. In contrast to the combinatorial binding activities of other programming TFs, Pet1 does not strictly follow the Foxa2 pioneer. These findings highlight the challenges in formulating generalizable rules for describing the behavior of TF combinations that program distinct neuronal subtypes.

8.
Science ; 377(6601): eabk2820, 2022 07.
Article in English | MEDLINE | ID: mdl-35771912

ABSTRACT

Precise Hox gene expression is crucial for embryonic patterning. Intra-Hox transcription factor binding and distal enhancer elements have emerged as the major regulatory modules controlling Hox gene expression. However, quantifying their relative contributions has remained elusive. Here, we introduce "synthetic regulatory reconstitution," a conceptual framework for studying gene regulation, and apply it to the HoxA cluster. We synthesized and delivered variant rat HoxA clusters (130 to 170 kilobases) to an ectopic location in the mouse genome. We found that a minimal HoxA cluster recapitulated correct patterns of chromatin remodeling and transcription in response to patterning signals, whereas the addition of distal enhancers was needed for full transcriptional output. Synthetic regulatory reconstitution could provide a generalizable strategy for deciphering the regulatory logic of gene expression in complex genomes.


Subject(s)
Body Patterning, Developmental Gene Expression Regulation, Homeobox Genes, Homeodomain Proteins, Animals, Body Patterning/genetics, Genetic Enhancer Elements, Genome, Homeodomain Proteins/genetics, Mice, Rats, Genetic Transcription
9.
Genome Biol ; 23(1): 99, 2022 04 19.
Article in English | MEDLINE | ID: mdl-35440038

ABSTRACT

Reproducibility is a significant challenge in (epi)genomic research due to the complexity of experiments composed of traditional biochemistry and informatics. Recent advances have exacerbated this as high-throughput sequencing data is generated at an unprecedented pace. Here, we report the development of a Platform for Epi-Genomic Research (PEGR), a web-based project management platform that tracks and quality controls experiments from conception to publication-ready figures, compatible with multiple assays and bioinformatic pipelines. It supports rigor and reproducibility for biochemists working at the bench, while fully supporting reproducibility and reliability for bioinformaticians through integration with the Galaxy platform.


Subject(s)
Epigenomics, Genomics, Computational Biology, Genome, Reproducibility of Results, Software
10.
Cell Rep ; 38(11): 110524, 2022 03 15.
Article in English | MEDLINE | ID: mdl-35294876

ABSTRACT

In pluripotent cells, a delicate activation-repression balance maintains pro-differentiation genes ready for rapid activation. The identity of transcription factors (TFs) that specifically repress pro-differentiation genes remains obscure. By targeting ∼1,700 TFs with a CRISPR loss-of-function screen, we found that ZBTB11 and ZFP131 are required for embryonic stem cell (ESC) pluripotency. ESCs without ZBTB11 or ZFP131 lose colony morphology, reduce proliferation rate, and upregulate transcription of genes associated with three germ layers. ZBTB11 and ZFP131 bind proximally to pro-differentiation genes. ZBTB11 or ZFP131 loss leads to an increase in H3K4me3, negative elongation factor (NELF) complex release, and concomitant transcription at associated genes. Together, our results suggest that ZBTB11 and ZFP131 maintain pluripotency by preventing premature expression of pro-differentiation genes and present a generalizable framework to maintain cellular potency.


Subject(s)
Embryonic Stem Cells, Pluripotent Stem Cells, Animals, Humans, Mice, Cell Differentiation/genetics, CRISPR-Cas Systems, Embryonic Stem Cells/metabolism, Germ Layers/metabolism, Pluripotent Stem Cells/metabolism, Transcription Factors/genetics, Transcription Factors/metabolism
11.
Genome Res ; 32(3): 512-523, 2022 03.
Article in English | MEDLINE | ID: mdl-35042722

ABSTRACT

The intrinsic DNA sequence preferences and cell type-specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell type-specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species-specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results show that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.


Subject(s)
Computer Neural Networks, Transcription Factors, Binding Sites, Chromatin Immunoprecipitation Sequencing, Computational Biology/methods, Protein Binding, Transcription Factors/metabolism
12.
Bioinformatics ; 37(18): 3011-3013, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33681991

ABSTRACT

SUMMARY: Epigenetic modifications reflect key aspects of transcriptional regulation, and many epigenomic datasets have been generated under different biological contexts to provide insights into regulatory processes. However, the technical noise in epigenomic datasets and the many dimensions (features) examined make it challenging to effectively extract biologically meaningful inferences from these datasets. We developed a package that reduces noise while normalizing the epigenomic data by a novel normalization method, followed by integrative dimensional reduction by learning and assigning epigenetic states. This package, called S3V2-IDEAS, can be used to identify epigenetic states for multiple features, or identify discretized signal intensity levels and a master peak list across different cell types for a single feature. We illustrate the outputs and performance of S3V2-IDEAS using 137 epigenomics datasets from the VISION project that provides ValIdated Systematic IntegratiON of epigenomic data in hematopoiesis. AVAILABILITY AND IMPLEMENTATION: S3V2-IDEAS pipeline is freely available as open source software released under an MIT license at: https://github.com/guanjue/S3V2_IDEAS_ESMP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Epigenomics, Software, Epigenomics/methods, Genetic Epigenesis, Gene Expression Regulation, Hematopoiesis
13.
Nature ; 592(7853): 309-314, 2021 04.
Article in English | MEDLINE | ID: mdl-33692541

ABSTRACT

The genome-wide architecture of chromatin-associated proteins that maintains chromosome integrity and gene regulation is not well defined. Here we use chromatin immunoprecipitation, exonuclease digestion and DNA sequencing (ChIP-exo/seq)1,2 to define this architecture in Saccharomyces cerevisiae. We identify 21 meta-assemblages consisting of roughly 400 different proteins that are related to DNA replication, centromeres, subtelomeres, transposons and transcription by RNA polymerase (Pol) I, II and III. Replication proteins engulf a nucleosome, centromeres lack a nucleosome, and repressive proteins encompass three nucleosomes at subtelomeric X-elements. We find that most promoters associated with Pol II evolved to lack a regulatory region, having only a core promoter. These constitutive promoters comprise a short nucleosome-free region (NFR) adjacent to a +1 nucleosome, which together bind the transcription-initiation factor TFIID to form a preinitiation complex. Positioned insulators protect core promoters from upstream events. A small fraction of promoters evolved an architecture for inducibility, whereby sequence-specific transcription factors (ssTFs) create a nucleosome-depleted region (NDR) that is distinct from an NFR. We describe structural interactions among ssTFs, their cognate cofactors and the genome. These interactions include the nucleosomal and transcriptional regulators RPD3-L, SAGA, NuA4, Tup1, Mediator and SWI-SNF. Surprisingly, we do not detect interactions between ssTFs and TFIID, suggesting that such interactions do not stably occur. Our model for gene induction involves ssTFs, cofactors and general factors such as TBP and TFIIB, but not TFIID. By contrast, constitutive transcription involves TFIID but not ssTFs engaged with their cofactors. From this, we define a highly integrated network of gene regulation by ssTFs.


Subject(s)
Fungal Proteins/genetics, Fungal Proteins/metabolism, Fungal Genome/genetics, Multiprotein Complexes/genetics, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Transcription Factors/genetics, Coenzymes/metabolism, Multiprotein Complexes/metabolism, Genetic Promoter Regions, RNA Polymerase I/metabolism, RNA Polymerase II/metabolism, RNA Polymerase III/metabolism, TATA-Box Binding Protein/genetics, TATA-Box Binding Protein/metabolism, Transcription Factor TFIIB/genetics, Transcription Factor TFIIB/metabolism, Transcription Factor TFIID, Transcription Factors/metabolism
14.
Genome Biol ; 22(1): 20, 2021 01 07.
Article in English | MEDLINE | ID: mdl-33413545

ABSTRACT

BACKGROUND: Transcription factor (TF) binding specificity is determined via a complex interplay between the transcription factor's DNA binding preference and cell type-specific chromatin environments. The chromatin features that correlate with transcription factor binding in a given cell type have been well characterized. For instance, the binding sites for a majority of transcription factors display concurrent chromatin accessibility. However, concurrent chromatin features reflect the binding activities of the transcription factor itself and thus provide limited insight into how genome-wide TF-DNA binding patterns became established in the first place. To understand the determinants of transcription factor binding specificity, we therefore need to examine how newly activated transcription factors interact with sequence and preexisting chromatin landscapes. RESULTS: Here, we investigate the sequence and preexisting chromatin predictors of TF-DNA binding by examining the genome-wide occupancy of transcription factors that have been induced in well-characterized chromatin environments. We develop Bichrom, a bimodal neural network that jointly models sequence and preexisting chromatin data to interpret the genome-wide binding patterns of induced transcription factors. We find that the preexisting chromatin landscape is a differential global predictor of TF-DNA binding; incorporating preexisting chromatin features improves our ability to explain the binding specificity of some transcription factors substantially, but not others. Furthermore, by analyzing site-level predictors, we show that transcription factor binding in previously inaccessible chromatin tends to correspond to the presence of more favorable cognate DNA sequences. CONCLUSIONS: Bichrom thus provides a framework for modeling, interpreting, and visualizing the joint sequence and chromatin landscapes that determine TF-DNA binding dynamics.


Subject(s)
Chromatin, Computer Neural Networks, Protein Binding/genetics, Transcription Factors/metabolism, Basic Helix-Loop-Helix Transcription Factors/metabolism, Binding Sites/genetics, DNA-Binding Proteins/metabolism, Gene Expression Regulation, Genome, Histones/metabolism, Humans
15.
Methods ; 189: 12-21, 2021 05.
Article in English | MEDLINE | ID: mdl-32652235

ABSTRACT

Few existing methods enable the visualization of relationships between regulatory genomic activities and genome organization as captured by Hi-C experimental data. Genome-wide Hi-C datasets are often displayed using "heatmap" matrices, but it is difficult to intuit from these heatmaps which biochemical activities are compartmentalized together. High-dimensional Hi-C data vectors can alternatively be projected onto three-dimensional space using dimensionality reduction techniques. The resulting three-dimensional structures can serve as scaffolds for projecting other forms of genomic information, thereby enabling the exploration of relationships between genome organization and various genome annotations. However, while three-dimensional models are contextually appropriate for chromatin interaction data, some analyses and visualizations may be more intuitively and conveniently performed in two-dimensional space. We present a novel approach to the visualization and analysis of chromatin organization based on the Self-Organizing Map (SOM). The SOM algorithm provides a two-dimensional manifold which adapts to represent the high dimensional chromatin interaction space. The resulting data structure can then be used to assess relationships between regulatory genomic activities and chromatin interactions. For example, given a set of genomic coordinates corresponding to a given biochemical activity, the degree to which this activity is segregated or compartmentalized in chromatin interaction space can be intuitively visualized on the 2D SOM grid and quantified using Lorenz curve analysis. We demonstrate our approach for exploratory analysis of genome compartmentalization in a high-resolution Hi-C dataset from the human GM12878 cell line. Our SOM-based approach provides an intuitive visualization of the large-scale structure of Hi-C data and serves as a platform for integrative analyses of the relationships between various genomic activities and genome organization.
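
The Lorenz curve analysis mentioned in the abstract above can be sketched as follows. The input is assumed to be the number of regions carrying a given activity at each SOM node; the function names are hypothetical, and summarizing the curve with a Gini coefficient is a common choice that the abstract does not explicitly name.

```python
def lorenz_curve(node_counts):
    """Return Lorenz curve points for per-node activity counts.

    Nodes are sorted from least to most occupied; each point is
    (cumulative fraction of nodes, cumulative fraction of activity).
    """
    values = sorted(node_counts)
    total = sum(values)
    if total == 0:
        raise ValueError("no activity assigned to any node")
    points = [(0.0, 0.0)]
    running = 0.0
    for i, value in enumerate(values, start=1):
        running += value
        points.append((i / len(values), running / total))
    return points


def gini_coefficient(node_counts):
    """Gini coefficient: 0 for an even spread, near 1 when concentrated."""
    points = lorenz_curve(node_counts)
    # Trapezoidal area under the Lorenz curve.
    area = sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return 1.0 - 2.0 * area
```

An activity spread evenly over the grid gives a coefficient near 0, while an activity confined to a few nodes gives a value closer to 1, which is one way to quantify how compartmentalized it is.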


Subject(s)
Algorithms, Chromatin/metabolism, Epigenomics/methods, Gene Regulatory Networks, Cell Line, Chromatin Immunoprecipitation Sequencing, Chromosome Mapping, Humans, Software
16.
Mol Microbiol ; 115(5): 1005-1024, 2021 05.
Article in English | MEDLINE | ID: mdl-33368818

ABSTRACT

Differentiation from asexual blood stages to mature sexual gametocytes is required for the transmission of malaria parasites. Here, we report that the ApiAP2 transcription factor, PfAP2-G2 (PF3D7_1408200) plays a critical role in the maturation of Plasmodium falciparum gametocytes. PfAP2-G2 binds to the promoters of a wide array of genes that are expressed at many stages of the parasite life cycle. Interestingly, we also find binding of PfAP2-G2 within the gene body of almost 3,000 genes, which strongly correlates with the location of H3K36me3 and several other histone modifications as well as Heterochromatin Protein 1 (HP1), suggesting that occupancy of PfAP2-G2 in gene bodies may serve as an alternative regulatory mechanism. Disruption of pfap2-g2 does not impact asexual development, but the majority of sexual parasites are unable to mature beyond stage III gametocytes. The absence of pfap2-g2 leads to overexpression of 28% of the genes bound by PfAP2-G2 and none of the PfAP2-G2 bound genes are downregulated, suggesting that it is a repressor. We also find that PfAP2-G2 interacts with chromatin remodeling proteins, a microrchidia (MORC) protein, and another ApiAP2 protein (PF3D7_1139300). Overall our data demonstrate that PfAP2-G2 establishes an essential gametocyte maturation program in association with other chromatin-related proteins.


Subject(s)
Germ Cells/growth & development, Falciparum Malaria/parasitology, Plasmodium falciparum/growth & development, Plasmodium falciparum/metabolism, Protozoan Proteins/metabolism, Transcription Factors/metabolism, Gametogenesis, Developmental Gene Expression Regulation, Germ Cells/metabolism, Humans, Life Cycle Stages, Plasmodium falciparum/genetics, Protozoan Proteins/genetics, Transcription Factors/genetics
17.
Development ; 147(22)2020 11 23.
Article in English | MEDLINE | ID: mdl-33028607

ABSTRACT

Although Hox genes encode conserved transcription factors (TFs), they are further divided into anterior, central and posterior groups based on their DNA-binding domain similarity. The posterior Hox group expanded in the deuterostome clade and patterns caudal and distal structures. We aimed to address how similar Hox TFs diverge to induce different positional identities. We studied Hox TF DNA-binding and regulatory activity during an in vitro motor neuron differentiation system that recapitulates embryonic development. We found diversity in the genomic binding profiles of different Hox TFs, even among the posterior group paralogs that share similar DNA-binding domains. These differences in genomic binding were explained by differing abilities to bind to previously inaccessible sites. For example, the posterior group HOXC9 had a greater ability to bind occluded sites than the posterior HOXC10, producing different binding patterns and driving differential gene expression programs. From these results, we propose that the differential abilities of posterior Hox TFs to bind to previously inaccessible chromatin drive patterning diversification. This article has an associated 'The people behind the papers' interview.


Subject(s)
Cell Differentiation, Chromatin/metabolism, Embryonic Development, Developmental Gene Expression Regulation, Homeodomain Proteins/metabolism, Motor Neurons/metabolism, Transcription Factors/metabolism, Animals, Cell Line, Chromatin/genetics, Homeodomain Proteins/genetics, Mice, Motor Neurons/cytology, Transcription Factors/genetics
18.
Nucleic Acids Res ; 48(20): 11215-11226, 2020 11 18.
Article in English | MEDLINE | ID: mdl-32747934

ABSTRACT

The ChIP-exo assay precisely delineates protein-DNA crosslinking patterns by combining chromatin immunoprecipitation with 5' to 3' exonuclease digestion. Within a regulatory complex, the physical distance of a regulatory protein to DNA affects crosslinking efficiencies. Therefore, the spatial organization of a protein-DNA complex could potentially be inferred by analyzing how crosslinking signatures vary between its subunits. Here, we present a computational framework that aligns ChIP-exo crosslinking patterns from multiple proteins across a set of coordinately bound regulatory regions, and which detects and quantifies protein-DNA crosslinking events within the aligned profiles. By producing consistent measurements of protein-DNA crosslinking strengths across multiple proteins, our approach enables characterization of relative spatial organization within a regulatory complex. Applying our approach to collections of ChIP-exo data, we demonstrate that it can recover aspects of regulatory complex spatial organization at yeast ribosomal protein genes and yeast tRNA genes. We also demonstrate the ability to quantify changes in protein-DNA complex organization across conditions by applying our approach to analyze Drosophila Pol II transcriptional components. Our results suggest that principled analyses of ChIP-exo crosslinking patterns enable inference of spatial organization within protein-DNA complexes.


Subject(s)
Chromatin Immunoprecipitation/methods, DNA-Binding Proteins/metabolism, Exonucleases/chemistry, Transfer RNA/genetics, Ribosomal Proteins/genetics, Sequence Alignment/methods, Transcription Factors/metabolism, Algorithms, Animals, Binding Sites, Computer Simulation, DNA-Binding Proteins/chemistry, Genetic Databases, Drosophila/chemistry, Drosophila/genetics, Drosophila/metabolism, Genetic Promoter Regions, Protein Binding, RNA Polymerase II/chemistry, RNA Polymerase II/genetics, RNA Polymerase II/metabolism, RNA Polymerase III/chemistry, RNA Polymerase III/genetics, RNA Polymerase III/metabolism, Transfer RNA/chemistry, Transfer RNA/metabolism, Ribosomal Proteins/chemistry, Ribosomal Proteins/metabolism, Saccharomyces cerevisiae/chemistry, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae Proteins/chemistry, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae Proteins/metabolism, DNA Sequence Analysis/methods, Transcription Factor TFIIIB/chemistry, Transcription Factor TFIIIB/genetics, Transcription Factor TFIIIB/metabolism, Transcription Factors/chemistry, Transcription Factors/genetics, TFIII Transcription Factors/chemistry, TFIII Transcription Factors/genetics, TFIII Transcription Factors/metabolism, Transcription Initiation Site
19.
Genome Res ; 30(3): 472-484, 2020 03.
Article in English | MEDLINE | ID: mdl-32132109

ABSTRACT

Thousands of epigenomic data sets have been generated in the past decade, but it is difficult for researchers to effectively use all the data relevant to their projects. Systematic integrative analysis can help meet this need, and the VISION project was established for validated systematic integration of epigenomic data in hematopoiesis. Here, we systematically integrated extensive data recording epigenetic features and transcriptomes from many sources, including individual laboratories and consortia, to produce a comprehensive view of the regulatory landscape of differentiating hematopoietic cell types in mouse. By using IDEAS as our integrative and discriminative epigenome annotation system, we identified and assigned epigenetic states simultaneously along chromosomes and across cell types, precisely and comprehensively. Combining nuclease accessibility and epigenetic states produced a set of more than 200,000 candidate cis-regulatory elements (cCREs) that efficiently capture enhancers and promoters. The transitions in epigenetic states of these cCREs across cell types provided insights into mechanisms of regulation, including decreases in numbers of active cCREs during differentiation of most lineages, transitions from poised to active or inactive states, and shifts in nuclease accessibility of CTCF-bound elements. Regression modeling of epigenetic states at cCREs and gene expression produced a versatile resource to improve selection of cCREs potentially regulating target genes. These resources are available from our VISION website to aid research in genomics and hematopoiesis.


Subject(s)
Genetic Epigenesis, Hematopoiesis/genetics, Hematopoietic Stem Cells/metabolism, Animals, Mice, Transcriptional Regulatory Elements, Transcriptome
20.
J Comput Biol ; 27(3): 429-435, 2020 03.
Article in English | MEDLINE | ID: mdl-32023130

ABSTRACT

Regulatory proteins can employ multiple direct and indirect modes of interaction with the genome. The ChIP-exo mixture model (ChExMix) provides a principled approach to detecting multiple protein-DNA interaction modes in a single ChIP-exo experiment. ChExMix discovers and characterizes binding event subtypes in ChIP-exo data by leveraging both protein-DNA cross-linking signatures and DNA motifs. In this study, we present a summary of the major features and applications of ChExMix. We demonstrate that ChExMix does not require high-resolution protein-DNA binding assay data to detect binding event subtypes. Specifically, we apply ChExMix to analyze 393 ChIP-seq data profiles in K562 cells. Similar binding event subtypes are discovered across multiple proteins, suggesting the existence of colocalized regulatory protein modules that are recruited to DNA through a particular sequence-specific transcription factor. Our results thus suggest that ChExMix can characterize protein-DNA binding interaction modes using data from multiple types of protein-DNA interaction assays.


Subject(s)
Computational Biology/methods, DNA-Binding Proteins/metabolism, DNA/metabolism, Algorithms, Chromatin Immunoprecipitation, DNA/chemistry, DNA-Binding Proteins/chemistry, Genetic Databases, Humans, K562 Cells, Nucleotide Motifs, Protein Binding