Results 1 - 20 of 73
1.
Genome Biol ; 23(1): 119, 2022 05 24.
Article in English | MEDLINE | ID: mdl-35606795

ABSTRACT

BACKGROUND: The analysis of chromatin binding patterns of proteins in different biological states is a main application of chromatin immunoprecipitation followed by sequencing (ChIP-seq). A large number of algorithms and computational tools for quantitative comparison of ChIP-seq datasets exist, but their performance is strongly dependent on the parameters of the biological system under investigation. Thus, a systematic assessment of available computational tools for differential ChIP-seq analysis is required to guide the optimal selection of analysis tools based on the present biological scenario. RESULTS: We created standardized reference datasets by in silico simulation and sub-sampling of genuine ChIP-seq data to represent different biological scenarios and binding profiles. Using these data, we evaluated the performance of 33 computational tools and approaches for differential ChIP-seq analysis. Tool performance was strongly dependent on peak size and shape as well as on the scenario of biological regulation. CONCLUSIONS: Our analysis provides unbiased guidelines for the optimized choice of software tools in differential ChIP-seq analysis.


Subjects
Algorithms , Chromatin Immunoprecipitation Sequencing , Chromatin Immunoprecipitation , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Software
2.
Pediatr Res ; 92(5): 1332-1340, 2022 11.
Article in English | MEDLINE | ID: mdl-35173300

ABSTRACT

BACKGROUND: Identification and functional annotations of regulatory sequences play a pivotal role in heart development and function. METHODS: To generate a map of human heart-specific enhancers, we performed an integrative analysis of 148 chromatin immunoprecipitation coupled to massively parallel sequencing (ChIP-seq) samples with enhancer-associated epigenetic marks from the heart, liver, brain, and kidney. Functional validation of heart-specific enhancer activity was then performed using cultured cells. RESULTS: A 144.6-Mb candidate heart-specific enhancer compendium was generated by integrating the analysis of 148 epigenomic data sets from human and mouse hearts and control tissues. To validate in vivo enhancer activity, we tested 12 of these sequences around 45 CHD-related genes in cultured cells and found that 8 (67%) have reproducible heart-specific enhancer activity. A functional analysis demonstrated that the identified human heart-specific enhancer wf1 regulates the FBN1 gene which is involved in heart disease. CONCLUSIONS: Our study provides an integrative analysis pipeline for ChIP-seq data and identified a comprehensive catalog of human heart-specific enhancers for clinical CHD-related studies. IMPACT: Establishing an efficient way to analyze regulatory regions in CHD is very important. A highly qualified heart-specific enhancer compendium was generated by integrating 148 online ChIP-seq samples. Sixty-seven percent of predicted regulatory sequences have reproducible heart-specific enhancer activity in vivo. Human heart-specific enhancer wf1 regulates the CHD-related FBN1 gene.


Subjects
Chromatin Immunoprecipitation Sequencing , Enhancer Elements, Genetic , Mice , Animals , Humans , Chromatin Immunoprecipitation , Heart , High-Throughput Nucleotide Sequencing
3.
PLoS Comput Biol ; 16(2): e1007644, 2020 02.
Article in English | MEDLINE | ID: mdl-32069291

ABSTRACT

Methods for the analysis of time series single cell expression data (scRNA-Seq) either do not utilize information about transcription factors (TFs) and their targets or only study these as a post-processing step. Using such information can both improve the accuracy of the reconstructed model and cell assignments and provide information on how and when the process is regulated. We developed the Continuous-State Hidden Markov Models TF (CSHMM-TF) method, which integrates probabilistic modeling of scRNA-Seq data with the ability to assign TFs to specific activation points in the model. TFs are assumed to influence the emission probabilities for cells assigned to later time points, allowing us to identify not just the TFs controlling each path but also their order of activation. We tested CSHMM-TF on several mouse and human datasets. As we show, the method was able to identify known and novel TFs for all processes, the assigned time of activation agrees with both expression information and prior knowledge, and combinatorial predictions are supported by known interactions. We also show that CSHMM-TF improves upon prior methods that do not utilize TF-gene interaction.


Subjects
RNA, Small Cytoplasmic/metabolism , RNA-Seq , Single-Cell Analysis , Transcription Factors/metabolism , Algorithms , Animals , Chromatin Immunoprecipitation , Computational Biology , Databases, Factual , Gene Expression Profiling , Humans , Liver/metabolism , Lung/metabolism , Markov Chains , Mice , Models, Statistical , Probability , Transcription, Genetic
4.
J Biosci ; 45, 2020.
Article in English | MEDLINE | ID: mdl-31965989

ABSTRACT

Malaria is a deadly, infectious disease caused by the parasite Plasmodium, leading to millions of deaths worldwide. Plasmodium requires a coordinated pattern of sequential gene expression for surviving in both invertebrate and vertebrate host environments. As parasites largely depend on host resources, they also develop efficient mechanisms to sense and adapt to variable nutrient conditions in the environment and modulate their virulence. Earlier we have shown that PfGCN5, a histone acetyltransferase, binds to the stress-responsive and virulence-related genes in a poised state and regulates their expression under temperature and artemisinin treatment conditions in P. falciparum. In this study, we show upregulation of PfGCN5 under nutrient stress. With the help of chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-seq) and transcriptomic (RNA-sequencing) analyses, we show that PfGCN5 is associated with the genes that are important for the maintenance of parasite cellular homeostasis under nutrient stress. Furthermore, we identified various metabolic enzymes as interacting partners of PfGCN5 by immunoprecipitation coupled with mass spectroscopy, possibly acting as a sensor of nutrient conditions in the environment. We also demonstrated that PfGCN5 interacts with and acetylates PfGAPDH in vitro. Collectively, our data provide important insights into transcriptional deregulation under nutrient stress and elucidate the role of PfGCN5 during nutrient stress.


Subjects
Histone Acetyltransferases/genetics , Malaria, Falciparum/genetics , Plasmodium falciparum/genetics , Protozoan Proteins/genetics , Acetylation , Chromatin Immunoprecipitation , Gene Expression Regulation , Host-Parasite Interactions/genetics , Humans , Malaria, Falciparum/parasitology , Metabolic Networks and Pathways/genetics , Nutrients/genetics , Nutrients/metabolism , Plasmodium falciparum/pathogenicity , RNA/economics , RNA-Seq , Stress, Physiological/genetics
5.
Bioinformatics ; 34(14): 2356-2363, 2018 07 15.
Article in English | MEDLINE | ID: mdl-29528371

ABSTRACT

Motivation: Chromatin immunoprecipitation followed by sequencing (ChIP-seq) can detect read-enriched DNA loci for point-source (e.g. transcription factor binding) and broad-source factors (e.g. various histone modifications). Although numerous quality metrics for ChIP-seq data have been developed, the 'peaks' thus obtained are still difficult to assess with respect to signal-to-noise ratio (S/N) and the percentage of false positives. Results: We developed a quality-assessment tool for ChIP-seq data, strand-shift profile (SSP), which quantifies S/N and peak reliability without peak calling. We validated SSP in-depth using ≥ 1000 publicly available ChIP-seq datasets along with virtual data to demonstrate that SSP provides a quantifiable and sensitive score to different S/Ns for both point- and broad-source factors, which can be standardized across diverse cell types and read depths. SSP also provides an effective criterion to judge whether a specific normalization or a rejection is required for each sample, which cannot be estimated by quality metrics currently available. Finally, we show that 'hidden-duplicate reads' cause aberrantly high S/Ns, and SSP provides an additional metric to avoid them, which can also contribute to estimation of peak mode (point- or broad-source) of samples. Availability and implementation: SSP is open source software written in C++ and can be downloaded at https://github.com/rnakato/SSP. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Chromatin Immunoprecipitation/methods , Sequence Analysis, DNA/methods , Software , Transcription Factors/metabolism , Animals , DNA/metabolism , High-Throughput Nucleotide Sequencing/methods , Humans , Protein Binding , Reproducibility of Results , Yeasts
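
The strand-shift idea behind entry 5 can be illustrated with a minimal sketch. This is not the SSP implementation (SSP itself is C++ software available at the URL given in the abstract). The function name, the input layout (lists of 5' read positions per strand) and the use of a Jaccard index as the per-shift score are all assumptions made for illustration. The point is only that forward- and reverse-strand reads flanking a bound fragment agree best when one strand is shifted by roughly the fragment length.

```python
def strand_shift_profile(forward, reverse, max_shift):
    """Score every shift d in 0..max_shift by the Jaccard index between
    forward-strand positions moved d bases downstream and reverse-strand
    positions. Hypothetical helper, not part of SSP."""
    fwd = set(forward)
    rev = set(reverse)
    profile = []
    for d in range(max_shift + 1):
        shifted = {p + d for p in fwd}
        union = len(shifted | rev)
        overlap = len(shifted & rev)
        profile.append(overlap / union if union else 0.0)
    return profile


# Toy data: every reverse read sits 150 bp downstream of a forward read.
forward_reads = [100, 200, 300]
reverse_reads = [250, 350, 450]
profile = strand_shift_profile(forward_reads, reverse_reads, max_shift=200)
best_shift = profile.index(max(profile))  # shift with the strongest agreement
```

On real data the height of the profile peak relative to its background at large shifts is the kind of quantity a signal-to-noise score could be built on; how SSP defines its scores is described in the paper, not here.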
6.
Nat Struct Mol Biol ; 25(1): 73-82, 2018 01.
Article in English | MEDLINE | ID: mdl-29323282

ABSTRACT

Histone 3 K4 trimethylation (depositing H3K4me3 marks) is typically associated with active promoters yet paradoxically occurs at untranscribed domains. Research to delineate the mechanisms of targeting H3K4 methyltransferases is ongoing. The oocyte provides an attractive system to investigate these mechanisms, because extensive H3K4me3 acquisition occurs in nondividing cells. We developed low-input chromatin immunoprecipitation to interrogate H3K4me3, H3K27ac and H3K27me3 marks throughout oogenesis. In nongrowing oocytes, H3K4me3 was restricted to active promoters, but as oogenesis progressed, H3K4me3 accumulated in a transcription-independent manner and was targeted to intergenic regions, putative enhancers and silent H3K27me3-marked promoters. Ablation of the H3K4 methyltransferase gene Mll2 resulted in loss of transcription-independent H3K4 trimethylation but had limited effects on transcription-coupled H3K4 trimethylation or gene expression. Deletion of Dnmt3a and Dnmt3b showed that DNA methylation protects regions from acquiring H3K4me3. Our findings reveal two independent mechanisms of targeting H3K4me3 to genomic elements, with MLL2 recruited to unmethylated CpG-rich regions independently of transcription.


Subjects
DNA Methylation , Histone-Lysine N-Methyltransferase/chemistry , Histones/chemistry , Myeloid-Lymphoid Leukemia Protein/chemistry , Animals , Chromatin Immunoprecipitation , CpG Islands , Female , Markov Chains , Mice , Mice, Inbred C57BL , Mice, Knockout , Mouse Embryonic Stem Cells/cytology , Multivariate Analysis , Oocytes/cytology , Oogenesis , Promoter Regions, Genetic , Sequence Analysis, RNA , Transcription, Genetic
7.
Methods Mol Biol ; 1672: 631-643, 2018.
Article in English | MEDLINE | ID: mdl-29043652

ABSTRACT

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis can detect protein/DNA-binding and histone-modification sites across an entire genome. As there are various factors during sample preparation that affect the obtained results, multilateral quality assessments are essential. Here, we describe a step-by-step protocol using DROMPA, a program for user-friendly ChIP-seq pipelining. DROMPA can be used for quality assessment, data normalization, visualization, peak calling, and multiple statistical analyses.


Subjects
Chromatin Immunoprecipitation , Data Interpretation, Statistical , High-Throughput Nucleotide Sequencing , Software , Chromatin Immunoprecipitation/methods , Chromatin Immunoprecipitation/standards , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Quality Control , User-Computer Interface , Workflow
8.
BMC Bioinformatics ; 18(1): 530, 2017 Nov 29.
Article in English | MEDLINE | ID: mdl-29187152

ABSTRACT

BACKGROUND: Transcription factors (TFs) form a complex regulatory network within the cell that is crucial to cell functioning and human health. While methods to establish where a TF binds to DNA are well established, these methods provide no information describing how TFs interact with one another when they do bind. TFs tend to bind the genome in clusters, and current methods to identify these clusters are either limited in scope, unable to detect relationships beyond motif similarity, or not applied to TF-TF interactions. METHODS: Here, we present a proximity-based graph clustering approach to identify TF clusters using either ChIP-seq or motif search data. We use TF co-occurrence to construct a filtered, normalized adjacency matrix and use the Markov Clustering Algorithm to partition the graph while maintaining TF-cluster and cluster-cluster interactions. We then apply our graph structure beyond clustering, using it to increase the accuracy of motif-based TFBS searching for an example TF. RESULTS: We show that our method produces small, manageable clusters that encapsulate many known, experimentally validated transcription factor interactions and that our method is capable of capturing interactions that motif similarity methods might miss. Our graph structure is able to significantly increase the accuracy of motif TFBS searching, demonstrating that the TF-TF connections within the graph correlate with biological TF-TF interactions. CONCLUSION: The interactions identified by our method correspond to biological reality and allow for fast exploration of TF clustering and regulatory dynamics.


Subjects
Algorithms , Transcription Factors/metabolism , Chromatin Immunoprecipitation , Cluster Analysis , DNA/chemistry , DNA/isolation & purification , DNA/metabolism , Gene Regulatory Networks , Humans , K562 Cells , Markov Chains , Protein Interaction Maps/genetics , Sequence Analysis, DNA , Transcription Factors/genetics
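
Entry 8 partitions a TF co-occurrence graph with the Markov Clustering Algorithm. The sketch below shows the generic algorithm only (self-loops, column normalization, then alternating expansion and inflation until the matrix stops changing). It is not the authors' code: the function names, the inflation value of 2.0 and the toy adjacency matrix are assumptions, and the filtering and normalization of the co-occurrence matrix described in the abstract are not reproduced.

```python
def normalize_columns(matrix):
    """Return a copy in which every non-empty column sums to 1."""
    n = len(matrix)
    out = [row[:] for row in matrix]
    for j in range(n):
        col_sum = sum(matrix[i][j] for i in range(n))
        if col_sum > 0:
            for i in range(n):
                out[i][j] = matrix[i][j] / col_sum
    return out


def matmul(a, b):
    """Plain square matrix product (fine for small illustrative graphs)."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=50, tol=1e-9):
    """Generic Markov clustering on a symmetric, non-negative adjacency matrix."""
    n = len(adjacency)
    # Self-loops are a common MCL convention; they avoid oscillation.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: flow spreads along paths
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded]
        )  # inflation: strong flow is boosted, weak flow decays
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each non-empty row lists the nodes attracted to that row's node.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6)
                for i in range(n)}
    return sorted(sorted(c) for c in clusters if c)


# Toy graph: two groups of three TFs that only co-occur within their group.
toy = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_clusters = markov_cluster(toy)
```

A higher inflation value generally yields smaller, tighter clusters. For graphs of realistic size a sparse-matrix implementation would be needed instead of the nested lists used here.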
9.
Genome Biol ; 18(1): 219, 2017 Nov 20.
Article in English | MEDLINE | ID: mdl-29151363

ABSTRACT

BACKGROUND: Genome-wide quantification of enhancer activity in the human genome has proven to be a challenging problem. Recent efforts have led to the development of powerful tools for enhancer quantification. However, because of genome size and complexity, these tools have yet to be applied to the whole human genome. RESULTS: In the current study, we use a human prostate cancer cell line, LNCaP, as a model to perform whole human genome STARR-seq (WHG-STARR-seq) to reliably obtain an assessment of enhancer activity. This approach builds upon previously developed STARR-seq in the fly genome and CapSTARR-seq techniques in targeted human genomic regions. With an improved library preparation strategy, our approach greatly increases the library complexity per unit of starting material, which makes it feasible and cost-effective to explore the landscape of regulatory activity in the much larger human genome. In addition to our ability to identify active, accessible enhancers located in open chromatin regions, we can also detect sequences with the potential for enhancer activity that are located in inaccessible, closed chromatin regions. When treated with the histone deacetylase inhibitor Trichostatin A, genes near this latter class of enhancers are up-regulated, demonstrating the potential for endogenous functionality of these regulatory elements. CONCLUSION: WHG-STARR-seq provides an improved approach to current pipelines for analysis of high complexity genomes to gain a better understanding of the intricacies of transcriptional regulation.


Subjects
Enhancer Elements, Genetic , Genome, Human , Genomics , Whole Genome Sequencing , Cell Line , Chromatin , Chromatin Immunoprecipitation , Genomic Library , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans
10.
PLoS One ; 12(3): e0172725, 2017.
Article in English | MEDLINE | ID: mdl-28282436

ABSTRACT

It is now well established that eukaryote genomes have a common architectural organization into topologically associated domains (TADs) and evidence is accumulating that this organization plays an important role in gene regulation. However, the mechanisms that partition the genome into TADs and the nature of domain boundaries are still poorly understood. We have investigated boundary regions in the Drosophila genome and find that they can be identified as domains of very low H3K27me3. The genome-wide H3K27me3 profile partitions into two states; very low H3K27me3 identifies Depleted (D) domains that contain housekeeping genes and their regulators such as the histone acetyltransferase-containing NSL complex, whereas domains containing moderate-to-high levels of H3K27me3 (Enriched or E domains) are associated with regulated genes, irrespective of whether they are active or inactive. The D domains correlate with the boundaries of TADs and are enriched in a subset of architectural proteins, particularly Chromator, BEAF-32, and Z4/Putzig. However, rather than being clustered at the borders of these domains, these proteins bind throughout the H3K27me3-depleted regions and are much more strongly associated with the transcription start sites of housekeeping genes than with the H3K27me3 domain boundaries. While we have not demonstrated causality, we suggest that the D domain chromatin state, characterised by very low or absent H3K27me3 and established by housekeeping gene regulators, acts to separate topological domains thereby setting up the domain architecture of the genome.


Subjects
Drosophila Proteins/metabolism , Drosophila/genetics , Histones/metabolism , Animals , Cells, Cultured , Chromatin/chemistry , Chromatin/metabolism , Chromatin Immunoprecipitation , Drosophila/metabolism , Drosophila Proteins/chemistry , Drosophila Proteins/genetics , Embryo, Nonmammalian/metabolism , Genome, Insect , Histones/chemistry , Histones/genetics , Male , Markov Chains , Polycomb-Group Proteins/genetics , Polycomb-Group Proteins/metabolism , Protein Binding , Protein Domains , Spermatocytes/cytology , Spermatocytes/metabolism , Transcription Initiation Site , Transcriptome
11.
Methods Mol Biol ; 1552: 115-122, 2017.
Article in English | MEDLINE | ID: mdl-28224494

ABSTRACT

Chromatin ImmunoPrecipitation-sequencing (ChIP-seq) experiments have now become routine in biology for the detection of protein binding sites. In this chapter, we show how hidden Markov models can be used for the analysis of data generated by ChIP-seq experiments. We show how a hidden Markov model can naturally account for spatial dependencies in the ChIP-seq data, how it can be used in the presence of data from multiple ChIP-seq experiments under the same biological condition, and how it naturally accounts for the different IP efficiencies of individual ChIP-seq experiments.


Subjects
Chromatin Immunoprecipitation/methods , Markov Chains , Models, Statistical , Sequence Analysis, DNA , Transcription Factors/metabolism , Humans , Protein Binding
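
The way a hidden Markov model captures spatial dependence in ChIP-seq signal, as described in entry 11, can be sketched with a two-state model over binned read counts. Everything specific below is an assumption for illustration and is not taken from the chapter: the two states (0 = background, 1 = enriched), Poisson emissions, the fixed rates and the single "stay" probability. In practice these parameters would be estimated from the data rather than fixed by hand.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing count k under a Poisson(lam) model."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_two_state(counts, rates=(1.0, 10.0), stay=0.9, start=(0.5, 0.5)):
    """Most likely state path (0 = background, 1 = enriched) for binned counts."""
    if not counts:
        return []
    log_trans = [[math.log(stay), math.log(1 - stay)],
                 [math.log(1 - stay), math.log(stay)]]
    score = [math.log(start[s]) + poisson_logpmf(counts[0], rates[s])
             for s in range(2)]
    backpointers = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in range(2):
            # Best previous state for arriving in state s at this bin.
            prev = max(range(2), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(prev)
            new_score.append(score[prev] + log_trans[prev][s]
                             + poisson_logpmf(k, rates[s]))
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(2), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


binned_counts = [0, 1, 0, 12, 9, 11, 1, 0]
states = viterbi_two_state(binned_counts)
```

Because leaving a state is penalized by the transition probabilities, neighbouring bins are called jointly rather than one at a time, which is what lets the model account for spatial dependence. Handling several experiments with different IP efficiencies would require experiment-specific emission parameters, which this sketch omits.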
12.
Methods Mol Biol ; 1552: 135-148, 2017.
Article in English | MEDLINE | ID: mdl-28224496

ABSTRACT

The hidden Markov model (HMM) is widely used for modeling spatially correlated genomic data (series data). In genomics, datasets of this kind are generated from genome-wide mapping studies through high-throughput methods such as chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq). When multiple regulatory protein binding sites or related epigenetic modifications are mapped simultaneously, the correlation between data series can be incorporated into the latent variable inference in a multivariate form of HMM, potentially increasing the statistical power of signal detection. In this chapter, we review the challenges of multivariate HMMs and propose a computationally tractable method called sparsely correlated HMMs (scHMM). We illustrate the method and the scHMM package using an example mouse ChIP-seq dataset.


Subjects
Chromatin Immunoprecipitation/methods , Chromosome Mapping/methods , Computational Biology/methods , Genome , Genomics/methods , Markov Chains , Algorithms , Animals , Binding Sites , Epigenesis, Genetic , Mice , Regulatory Sequences, Nucleic Acid , Transcription Factors/metabolism
13.
Nucleic Acids Res ; 45(8): e58, 2017 05 05.
Article in English | MEDLINE | ID: mdl-28053124

ABSTRACT

Comparing histone modification profiles between cancer and normal states, or across different tumor samples, can provide insights into understanding cancer initiation, progression and response to therapy. ChIP-seq histone modification data of cancer samples are distorted by copy number variation innate to any cancer cell. We present HMCan-diff, the first method designed to analyze ChIP-seq data to detect changes in histone modifications between two cancer samples of different genetic backgrounds, or between a cancer sample and a normal control. HMCan-diff explicitly corrects for copy number bias, and for other biases in the ChIP-seq data, which significantly improves prediction accuracy compared to methods that do not consider such corrections. On in silico simulated ChIP-seq data generated using genomes with differences in copy number profiles, HMCan-diff shows a much better performance compared to other methods that have no correction for copy number bias. Additionally, we benchmarked HMCan-diff on four experimental datasets, characterizing two histone marks in two different scenarios. We correlated changes in histone modifications between a cancer and a normal control sample with changes in gene expression. On all experimental datasets, HMCan-diff demonstrated better performance compared to the other methods.


Subjects
Gene Expression Regulation, Neoplastic , Histone Code , Histones/genetics , Neoplasms/genetics , Software , Algorithms , Chromatin Immunoprecipitation , Datasets as Topic , Disease Progression , Gene Dosage , Histones/metabolism , Humans , Markov Chains , Neoplasms/metabolism , Neoplasms/pathology
14.
Brief Bioinform ; 18(3): 367-381, 2017 05 01.
Article in English | MEDLINE | ID: mdl-27013647

ABSTRACT

Enriched region (ER) identification is a fundamental step in several next-generation sequencing (NGS) experiment types. Yet, although NGS experimental protocols recommend producing replicate samples for each evaluated condition and their consistency is usually assessed, typical pipelines for ER identification do not consider available NGS replicates. This may alter genome-wide descriptions of ERs, hinder significance of subsequent analyses on detected ERs and eventually preclude biological discoveries that evidence in replicates could support. MuSERA is a broadly useful stand-alone tool for both interactive and batch analysis of combined evidence from ERs in multiple ChIP-seq or DNase-seq replicates. Besides rigorously combining sample replicates to increase statistical significance of detected ERs, it also provides quantitative evaluations and graphical features to assess the biological relevance of each determined ER set within its genomic context; they include genomic annotation of determined ERs, nearest ER distance distribution, global correlation assessment of ERs and an integrated genome browser. We review MuSERA rationale and implementation, and illustrate how sets of significant ERs are expanded by applying MuSERA on replicates for several types of NGS data, including ChIP-seq of transcription factors or histone marks and DNase-seq hypersensitive sites. We show that MuSERA can determine a new, enhanced set of ERs for each sample by locally combining evidence on replicates, and show how the easy-to-use interactive graphical displays and quantitative evaluations that MuSERA provides effectively support thorough inspection of obtained results and evaluation of their biological content, facilitating their understanding and biological interpretation. MuSERA is freely available at http://www.bioinformatics.deib.polimi.it/MuSERA/.


Subjects
High-Throughput Nucleotide Sequencing , Chromatin Immunoprecipitation , Genome , Genomics , Software
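
Entry 14 describes combining evidence for overlapping enriched regions across replicates to increase statistical significance. One standard way to combine independent p-values is Fisher's method, sketched below. The abstract does not state which combination rule MuSERA applies, so treating it as Fisher's method is an assumption here, and the function name is hypothetical. The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with 2k degrees of freedom under the null, and for an even number of degrees of freedom the tail probability has the closed form used in the code.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    Uses the closed-form chi-square survival function for 2k degrees of
    freedom: exp(-x/2) * sum_{i<k} (x/2)^i / i!, with x = -2 * sum(ln p).
    """
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is x / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Two replicates, each with a borderline region-level p-value of 0.05.
combined = fisher_combined_pvalue([0.05, 0.05])
```

Two individually borderline regions yield a combined p-value of roughly 0.017, which illustrates how a region that is weak in each replicate can become significant once the replicates are considered together. Fisher's method assumes the p-values are independent.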
15.
Oncotarget ; 8(8): 13015-13029, 2017 Feb 21.
Article in English | MEDLINE | ID: mdl-28035064

ABSTRACT

Von Willebrand factor (VWF) is a highly adhesive procoagulant molecule that mediates platelet adhesion to endothelial and subendothelial surfaces. Normally it is expressed exclusively in endothelial cells (ECs) and megakaryocytes. However, a few studies have reported VWF detection in cancer cells of non-endothelial origin, including osteosarcoma. A role for VWF in cancer metastasis has long been postulated but evidence supporting both pro- and anti-metastatic roles for VWF has been presented. We hypothesized that the role of VWF in cancer metastasis is influenced by its cellular origin and that cancer cell acquisition of VWF expression may contribute to enhanced metastatic potential. We demonstrated de novo expression of VWF in glioma as well as osteosarcoma cells. Endothelial monolayer adhesion, transmigration and extravasation capacities of VWF expressing cancer cells were shown to be enhanced compared to non-VWF expressing cells, and were significantly reduced as a result of VWF knock down. VWF expressing cancer cells were also detected in patient tumor samples of varying histologies. Analyses of the mechanism of transcriptional activation of the VWF in cancer cells demonstrated a pattern of trans-activating factor binding and epigenetic modifications consistent overall with that observed in ECs. These results demonstrate that cancer cells of non-endothelial origin can acquire de novo expression of VWF, which can enhance processes, including endothelial and platelet adhesion and extravasation, that contribute to cancer metastasis.


Subjects
Glioma/pathology , Neoplasm Invasiveness/pathology , Neoplasms/pathology , Osteosarcoma/pathology , von Willebrand Factor/biosynthesis , Animals , Chick Embryo , Chromatin Immunoprecipitation , DNA Methylation , Fluorescent Antibody Technique , Gene Knockdown Techniques , Humans , Immunohistochemistry , Mice
16.
Nucleic Acids Res ; 44(20): e153, 2016 Nov 16.
Article in English | MEDLINE | ID: mdl-27484474

ABSTRACT

The study of changes in protein-DNA interactions measured by ChIP-seq on dynamic systems, such as cell differentiation, response to treatments or the comparison of healthy and diseased individuals, is still an open challenge. There are few computational methods comparing changes in ChIP-seq signals with replicates. Moreover, none of these previous approaches addresses ChIP-seq specific experimental artefacts arising from studies with biological replicates. We propose THOR, a Hidden Markov Model-based approach, to detect differential peaks between pairs of biological conditions with replicates. THOR provides all pre- and post-processing steps required in ChIP-seq analyses. Moreover, we propose a novel normalization approach based on housekeeping genes to deal with cases where replicates have distinct signal-to-noise ratios. To evaluate differential peak calling methods, we delineate a methodology using both biological and simulated data. This includes an evaluation procedure that associates differential peaks with changes in gene expression as well as histone modifications close to these peaks. We evaluate THOR and seven competing methods on data sets with distinct characteristics from in vitro studies with technical replicates to clinical studies of cancer patients. Our evaluation analysis comprises 13 comparisons between pairs of biological conditions. We show that THOR performs best in all scenarios.


Subjects
Chromatin Immunoprecipitation , Computational Biology/methods , High-Throughput Nucleotide Sequencing , Markov Chains , Sequence Analysis, DNA , Algorithms , Cell Differentiation/genetics , Datasets as Topic , Dendritic Cells/immunology , Dendritic Cells/metabolism , Epigenesis, Genetic , Histones/metabolism , Humans , Lymphoma, B-Cell/genetics , Workflow
17.
BMC Bioinformatics ; 17: 144, 2016 Mar 24.
Article in English | MEDLINE | ID: mdl-27009150

ABSTRACT

BACKGROUND: Correctly identifying genomic regions enriched with histone modifications and transcription factors is key to understanding their regulatory and developmental roles. Conceptually, these regions are divided into two categories, narrow peaks and broad domains, and different algorithms are used to identify each one. Datasets that span these two categories are often analyzed with a single program for peak calling combined with an ad hoc method for domains. RESULTS: We developed hiddenDomains, which identifies both peaks and domains, and compared it to the leading algorithms using H3K27me3, H3K36me3, GABP, ESR1 and FOXA ChIP-seq datasets. The output from the programs was compared to qPCR-validated enriched and depleted sites, predicted transcription factor binding sites, and highly-transcribed gene bodies. With every method, hiddenDomains performed as well as, if not better than, algorithms dedicated to a specific type of analysis. CONCLUSIONS: hiddenDomains performs as well as the best domain and peak calling algorithms, making it ideal for analyzing ChIP-seq datasets, especially those that contain a mixture of peaks and domains.


Subjects
Algorithms , Chromatin Immunoprecipitation , Estrogen Receptor alpha/metabolism , GA-Binding Protein Transcription Factor/metabolism , Histones/metabolism , Humans , Markov Chains
18.
Nat Methods ; 12(10): 963-965, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26280331

ABSTRACT

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is widely used to map histone marks and transcription factor binding throughout the genome. Here we present ChIPmentation, a method that combines chromatin immunoprecipitation with sequencing library preparation by Tn5 transposase ('tagmentation'). ChIPmentation introduces sequencing-compatible adaptors in a single-step reaction directly on bead-bound chromatin, which reduces time, cost and input requirements, thus providing a convenient and broadly useful alternative to existing ChIP-seq protocols.


Subjects
Chromatin Immunoprecipitation/methods , Histones/metabolism , Transcription Factors/metabolism , Chromatin Immunoprecipitation/economics , Chromatin Immunoprecipitation/instrumentation , Genome, Human , Humans , K562 Cells , Transcription Factors/analysis
19.
BMC Bioinformatics ; 16: 60, 2015 Feb 22.
Article in English | MEDLINE | ID: mdl-25884684

ABSTRACT

BACKGROUND: ChIP-seq has become a routine method for interrogating the genome-wide distribution of various histone modifications. An important experimental goal is to compare the ChIP-seq profiles between an experimental sample and a reference sample, and to identify regions that show differential enrichment. However, comparative analysis of samples remains challenging for histone modifications with broad domains, such as heterochromatin-associated H3K27me3, as most ChIP-seq algorithms are designed to detect well defined peak-like features. RESULTS: To address this limitation we introduce histoneHMM, a powerful bivariate Hidden Markov Model for the differential analysis of histone modifications with broad genomic footprints. histoneHMM aggregates short-reads over larger regions and takes the resulting bivariate read counts as inputs for an unsupervised classification procedure, requiring no further tuning parameters. histoneHMM outputs probabilistic classifications of genomic regions as being either modified in both samples, unmodified in both samples or differentially modified between samples. We extensively tested histoneHMM in the context of two broad repressive marks, H3K27me3 and H3K9me3, and evaluated region calls with follow up qPCR as well as RNA-seq data. Our results show that histoneHMM outperforms competing methods in detecting functionally relevant differentially modified regions. CONCLUSION: histoneHMM is a fast algorithm written in C++ and compiled as an R package. It runs in the popular R computing environment and thus seamlessly integrates with the extensive bioinformatic tool sets available through Bioconductor. This makes histoneHMM an attractive choice for the differential analysis of ChIP-seq data. Software is available from http://histonehmm.molgen.mpg.de.


Subjects
Algorithms , Computational Biology/methods , Genomics/methods , Histones/metabolism , Protein Processing, Post-Translational , Software , Animals , Chromatin Immunoprecipitation , Female , High-Throughput Nucleotide Sequencing/methods , Histones/chemistry , Histones/genetics , Humans , Male , Markov Chains , Mice , Rats , Real-Time Polymerase Chain Reaction
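
The input step described in entry 19, aggregating short reads over larger regions into bivariate read counts, can be sketched as follows. This is not histoneHMM code (histoneHMM is a C++/R package). The function name, the representation of reads as single genomic positions on one chromosome and the 1000 bp bin size are assumptions chosen for illustration.

```python
from collections import Counter


def bivariate_bin_counts(reads_a, reads_b, bin_size, n_bins):
    """Count reads per fixed-width bin for two samples and pair the counts.

    reads_a and reads_b are read positions (0-based) on the same chromosome.
    Returns one (count_a, count_b) tuple per bin.
    """
    counts_a = Counter(pos // bin_size for pos in reads_a)
    counts_b = Counter(pos // bin_size for pos in reads_b)
    return [(counts_a.get(i, 0), counts_b.get(i, 0)) for i in range(n_bins)]


# Toy positions for an experimental sample and a reference sample.
sample_reads = [10, 950, 1200, 1500]
reference_reads = [1100, 2500]
pairs = bivariate_bin_counts(sample_reads, reference_reads,
                             bin_size=1000, n_bins=3)
```

Each pair is one observation for a bivariate model. Per the abstract, the classification step then labels regions as modified in both samples, unmodified in both, or differentially modified; that step is not shown here.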
20.
Nat Commun ; 6: 6905, 2015 Apr 15.
Article in English | MEDLINE | ID: mdl-25872643

ABSTRACT

Cell-type specific regulation of gene expression requires the activation of promoters by distal genomic elements defined as enhancers. The identification and the characterization of enhancers are challenging in mammals due to their genome complexity. Here we develop CapStarr-Seq, a novel high-throughput strategy to quantitatively assess enhancer activity in mammals. This approach couples capture of regions of interest to the previously developed Starr-seq technique. Extensive assessment of CapStarr-seq demonstrates accurate quantification of enhancer activity. Furthermore, we find that enhancer strength is associated with binding complexity of tissue-specific transcription factors and super-enhancers, while additive enhancer activity isolates key genes involved in cell identity and function. CapStarr-Seq thus provides a fast and cost-effective approach to assess the activity of potential enhancers for a given cell type and will be helpful in decrypting transcription regulation mechanisms.


Subjects
Enhancer Elements, Genetic/genetics , Gene Expression Regulation/genetics , Gene Expression/genetics , High-Throughput Nucleotide Sequencing/methods , Transcription Factors/genetics , Animals , Chromatin Immunoprecipitation , Male , Mice , NIH 3T3 Cells , Promoter Regions, Genetic/genetics , Sequence Analysis, DNA/methods