1.
Cell ; 165(6): 1375-1388, 2016 Jun 02.
Article in English | MEDLINE | ID: mdl-27259149

ABSTRACT

How the chromatin regulatory landscape in the inner cell mass cells is established from differentially packaged sperm and egg genomes during preimplantation development is unknown. Here, we develop a low-input DNase I sequencing (liDNase-seq) method that allows us to generate maps of DNase I-hypersensitive sites (DHSs) in mouse preimplantation embryos from the 1-cell to the morula stage. The DHS landscape is progressively established, with a drastic increase at the 8-cell stage. Paternal chromatin accessibility is quickly reprogrammed after fertilization to a level similar to that of maternal chromatin, while imprinted genes exhibit allelic accessibility bias. We demonstrate that transcription factor Nfya contributes to zygotic genome activation and DHS formation at the 2-cell stage and that Oct4 contributes to the DHSs gained at the 8-cell stage. Our study reveals the dynamic chromatin regulatory landscape during early development and identifies key transcription factors important for DHS establishment in mammalian embryos.


Subjects
Blastocyst , Chromatin/metabolism , Animals , Binding Sites , Blastocyst/cytology , Blastocyst Inner Cell Mass/metabolism , CCAAT-Binding Factor/metabolism , Chromosome Mapping , DNA/metabolism , Deoxyribonuclease I/metabolism , Embryonic Development , Female , Gene Expression Regulation, Developmental , Male , Mice , Octamer Transcription Factor-3/metabolism , Promoter Regions, Genetic
2.
BMC Biol ; 21(1): 165, 2023 07 31.
Article in English | MEDLINE | ID: mdl-37525156

ABSTRACT

BACKGROUND: The development of cotton fiber is regulated by the orchestrated binding of regulatory proteins to cis-regulatory elements associated with developmental genes. The cis-trans regulatory dynamics that occur throughout the course of cotton fiber development are elusive. Here we generated genome-wide high-resolution DNase I hypersensitive site (DHS) maps to understand the regulatory mechanisms of cotton ovule and fiber development. RESULTS: We generated DHS profiles from cotton ovules at 0 and 3 days post anthesis (DPA) and fibers at 8, 12, 15, and 18 DPA. We obtained a total of 1185 million reads and identified a total of 199,351 DHSs using the ~30% of reads that mapped uniquely. It should be noted that more than half of the DNase-seq reads mapped to multiple genome locations and were not analyzed, in order to achieve a high specificity of peak profile and to avoid bias from repetitive genomic regions. Distinct chromatin accessibilities were observed in the ovules (0 and 3 DPA) compared to the fiber elongation stages (8, 12, 15, and 18 DPA). In addition, chromatin accessibility in ovules was particularly elevated in genomic regions enriched with transposable elements (TEs), and genes in TE-enriched regions were involved in ovule cell division. We analyzed cis-regulatory modules and revealed the influence of hormones on fiber development from the regulatory divergence of transcription factor (TF) motifs. Finally, we constructed a reliable regulatory network of TFs related to ovule and fiber development based on chromatin accessibility and a gene co-expression network. From this network, we discovered a novel TF, WRKY46, which may shape fiber development by regulating the lignin content. CONCLUSIONS: Our results not only reveal the contribution of TEs in fiber development, but also predict and validate the TFs related to fiber development, which will benefit research on molecular breeding of cotton fiber.


Subjects
Chromatin , Transcription Factors , Chromatin/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Ovule/genetics , Ovule/metabolism , Gene Regulatory Networks , Deoxyribonuclease I/genetics
3.
BMC Bioinformatics ; 22(1): 35, 2021 Jan 30.
Article in English | MEDLINE | ID: mdl-33516170

ABSTRACT

BACKGROUND: Assigning chromatin states genome-wide (e.g. promoters, enhancers, etc.) is commonly performed to improve functional interpretation of these states. However, computational methods to assign chromatin state suffer from the following drawbacks: they typically require data from multiple assays, which may not be practically feasible to obtain, and they depend on peak calling algorithms, which require careful parameterization and often exclude the majority of the genome. To address these drawbacks, we propose a novel learning technique built upon the Self-Organizing Map (SOM), Self-Organizing Map with Variable Neighborhoods (SOM-VN), to learn a set of representative shapes from a single, genome-wide, chromatin accessibility dataset to associate with a chromatin state assignment in which a particular regulatory element (RE) is prevalent. These shapes can then be used to assign chromatin state using our workflow. RESULTS: We validate the performance of the SOM-VN workflow on 14 different samples of varying quality, namely one assay each of A549 and GM12878 cell lines and two each of H1 and HeLa cell lines, primary B-cells, and brain, heart, and stomach tissue. We show that SOM-VN learns shapes that are (1) non-random, (2) associated with known chromatin states, (3) generalizable across sets of chromosomes, and (4) associated with magnitude and multimodality. We compare the accuracy of SOM-VN chromatin states against the Clustering Aggregation Tool (CAGT), an unsupervised method that learns chromatin accessibility signal shapes but does not associate these shapes with REs, and we show that overall precision and recall are increased when learning shapes using SOM-VN as compared to CAGT. We further compare enhancer state assignments from SOM-VN in signals above a set threshold to enhancer state assignments from Predicting Enhancers from ATAC-seq Data (PEAS), a deep learning method that assigns enhancer chromatin states to peaks. We show that the precision-recall area under the curve for the assignment of enhancer states is comparable to PEAS. CONCLUSIONS: Our work shows that the SOM-VN workflow can learn relationships between REs and chromatin accessibility signal shape, which is an important step toward the goal of assigning and comparing enhancer state across multiple experiments and phenotypic states.


Subjects
Chromatin , Enhancer Elements, Genetic , Promoter Regions, Genetic , Adult , Algorithms , Child, Preschool , Chromatin/genetics , HeLa Cells , Humans , Young Adult
4.
Brief Bioinform ; 20(5): 1865-1877, 2019 09 27.
Article in English | MEDLINE | ID: mdl-30010713

ABSTRACT

Deoxyribonuclease I (DNase I)-hypersensitive site sequencing (DNase-seq) has been widely used to determine chromatin accessibility and its underlying regulatory lexicon. However, exploring DNase-seq data requires sophisticated downstream bioinformatics analyses. In this study, we first review computational methods for all of the major steps in DNase-seq data analysis, including experimental design, quality control, read alignment, peak calling, annotation of cis-regulatory elements, genomic footprinting and visualization. The challenges associated with each step are highlighted. Next, we provide a practical guideline and a computational pipeline for DNase-seq data analysis by integrating some of these tools. We also discuss the competing techniques and the potential applications of this pipeline for the analysis of analogous experimental data. Finally, we discuss the integration of DNase-seq with other functional genomics techniques.


Subjects
Computational Biology/methods , Data Management/methods , Deoxyribonuclease I/metabolism , Sequence Analysis, DNA/methods , DNA Footprinting , Quality Control
5.
Mol Ther ; 28(1): 19-28, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31672284

ABSTRACT

Defining the variables that impact the specificity of CRISPR/Cas9 has been a major research focus. Whereas sequence complementarity between guide RNA and target DNA substantially dictates cleavage efficiency, DNA accessibility of the targeted loci has also been hypothesized to be an important factor. In this study, functional data from two genome-wide assays, genome-wide, unbiased identification of DSBs enabled by sequencing (GUIDE-seq) and circularization for in vitro reporting of cleavage effects by sequencing (CIRCLE-seq), have been computationally analyzed in conjunction with DNA accessibility determined via DNase I-hypersensitive sequencing from the Encyclopedia of DNA Elements (ENCODE) Database and transcriptome from the Sequence Read Archive to determine whether cellular factors influence CRISPR-induced cleavage efficiency. CIRCLE-seq and GUIDE-seq datasets were selected to represent the absence and presence of cellular factors, respectively. Data analysis revealed that correlations between sequence similarity and CRISPR-induced cleavage frequency were altered by the presence of cellular factors that modulated the level of DNA accessibility. The above-mentioned correlation was abolished when cleavage sites were located in less accessible regions. Furthermore, CRISPR-mediated edits were permissive even at regions that were insufficient for most endogenous genes to be expressed. These results provide a strong basis to dissect the contribution of local chromatin modulation markers on CRISPR-induced cleavage efficiency.


Subjects
CRISPR-Associated Protein 9/genetics , CRISPR-Cas Systems/genetics , Computational Biology/methods , DNA/genetics , Gene Editing/methods , Base Sequence/genetics , Cell Line, Tumor , Chromatin/genetics , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Databases, Genetic , Deoxyribonuclease I/genetics , Genome, Human , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , RNA, Guide, Kinetoplastida/genetics , RNA-Seq , Transcription, Genetic , Transcriptome
6.
J Exp Bot ; 71(17): 5280-5293, 2020 08 17.
Article in English | MEDLINE | ID: mdl-32526034

ABSTRACT

Limited information is available on abiotic stress-mediated alterations of chromatin conformation influencing gene expression in plants. In order to characterize the effect of abiotic stresses on changes in chromatin conformation, we employed FAIRE-seq (formaldehyde-assisted isolation of regulatory element sequencing) and DNase-seq to isolate accessible regions of chromatin from Arabidopsis thaliana seedlings exposed to either heat, cold, salt, or drought stress. Approximately 25% of regions in the Arabidopsis genome were captured as open chromatin, the majority of which included promoters and exons. A large proportion of chromatin regions apparently did not change their conformation in response to any of the four stresses. Digital footprints present within these regions had differential enrichment of motifs for binding of 43 different transcription factors. Further, in contrast to drought and salt stress, both high and low temperature treatments resulted in increased accessibility of the chromatin. Also, pseudogenes attained increased chromatin accessibility in response to cold and drought stresses. The highly accessible and inaccessible chromatin regions of seedlings exposed to drought stress correlated with the Ser/Thr protein kinases (MLK1 and MLK2)-mediated reduction and increase in H3 phosphorylation (H3T3Ph), respectively. The presented results provide a deeper understanding of abiotic stress-mediated chromatin modulation in plants.


Subjects
Arabidopsis Proteins , Arabidopsis , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/metabolism , Chromatin , Droughts , Gene Expression Regulation, Plant , Plants, Genetically Modified/metabolism , Stress, Physiological
7.
EMBO Rep ; 19(12)2018 12.
Article in English | MEDLINE | ID: mdl-30413482

ABSTRACT

We have fully integrated public chromatin immunoprecipitation sequencing (ChIP-seq) and DNase-seq data (n > 70,000) derived from six representative model organisms (human, mouse, rat, fruit fly, nematode, and budding yeast), and have devised a data-mining platform, designated ChIP-Atlas (http://chip-atlas.org). ChIP-Atlas is able to show alignment and peak-call results for all public ChIP-seq and DNase-seq data archived in the NCBI Sequence Read Archive (SRA), which encompasses data derived from GEO, ArrayExpress, DDBJ, ENCODE, Roadmap Epigenomics, and the scientific literature. All peak-call data are integrated to visualize multiple histone modifications and binding sites of transcriptional regulators (TRs) at given genomic loci. The integrated data can be further analyzed to show TR-gene and TR-TR interactions, as well as to examine enrichment of protein binding for given multiple genomic coordinates or gene names. ChIP-Atlas is superior to other platforms in terms of data volume and functionality for data mining across thousands of ChIP-seq experiments, and it provides insight into gene regulatory networks and epigenetic mechanisms.


Subjects
Chromatin Immunoprecipitation , Data Mining , Sequence Analysis, DNA , Animals , Enhancer Elements, Genetic/genetics , Genetic Loci , Humans , Internet , Transcription Factors/metabolism
8.
Glia ; 67(12): 2312-2328, 2019 12.
Article in English | MEDLINE | ID: mdl-31339627

ABSTRACT

Microglia are brain-resident, myeloid cells that play important roles in health and brain pathologies. Herein, we report a comprehensive, replicated, false discovery rate-controlled dataset of DNase-hypersensitive (DHS) open chromatin regions for rat microglia. We compared the open chromatin landscapes in untreated primary microglial cultures and cultures stimulated for 6 hr with either glioma-conditioned medium (GCM) or lipopolysaccharide (LPS). Glioma-secreted factors induce proinvasive and immunosuppressive activation of microglia, and these cells then promote tumor growth. The open chromatin landscape of the rat microglia consisted of 126,640 reproducible DHS regions, among which 2,303 and 12,357 showed a significant change in openness following stimulation with GCM or LPS, respectively. Active genes exhibited constitutively open promoters, but there was no direct dependence between the aggregated openness of DHS regions near a gene and its expression. Individual regions mapped to the same gene often presented different patterns of openness changes. GCM-regulated DHS regions were more frequent in areas away from gene bodies, while LPS-regulated regions were more frequent in introns. GCM and LPS differentially affected the openness of regions mapped to immune checkpoint genes. The two treatments differentially affected the aggregated openness of regions mapped to genes in the Toll-like receptor signaling and axon guidance pathways, suggesting that the molecular machinery used by migrating microglia is similar to that of growing axons and that modulation of these pathways is instrumental in the induction of proinvasive polarization of microglia by glioma. Our dataset of open chromatin regions paves the way for studies of gene regulation in rat microglia.


Subjects
Cell Polarity/physiology , Chromatin/genetics , Chromatin/metabolism , Microglia/metabolism , Animals , Animals, Newborn , Cell Polarity/drug effects , Cells, Cultured , Culture Media, Conditioned/toxicity , Inflammation/chemically induced , Inflammation/genetics , Inflammation/metabolism , Lipopolysaccharides/toxicity , Microglia/drug effects , Rats , Rats, Wistar , Sequence Analysis, DNA/methods
9.
Trends Genet ; 32(4): 238-249, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26962025

ABSTRACT

The ENCODE project represents a major leap from merely describing and comparing genomic sequences to surveying them for direct indicators of function. The astounding quantity of data produced by the ENCODE consortium can serve as a map to locate specific landmarks, guide hypothesis generation, and lead us to principles and mechanisms underlying genome biology. Despite its broad appeal, the size and complexity of the repository can be intimidating to prospective users. We present here some background about the ENCODE data, survey the resources available for accessing them, and describe a few simple principles to help prospective users choose the data type(s) that best suit their needs, where to get them, and how to use them to their best advantage.


Subjects
Genomics , Databases, Genetic , Humans , Internet , Polymorphism, Single Nucleotide
10.
Brief Bioinform ; 18(3): 367-381, 2017 05 01.
Article in English | MEDLINE | ID: mdl-27013647

ABSTRACT

Enriched region (ER) identification is a fundamental step in several next-generation sequencing (NGS) experiment types. Yet, although NGS experimental protocols recommend producing replicate samples for each evaluated condition and their consistency is usually assessed, pipelines for ER identification typically do not consider available NGS replicates. This may alter genome-wide descriptions of ERs, hinder the significance of subsequent analyses on detected ERs and eventually preclude biological discoveries that evidence in replicates could support. MuSERA is a broadly useful stand-alone tool for both interactive and batch analysis of combined evidence from ERs in multiple ChIP-seq or DNase-seq replicates. Besides rigorously combining sample replicates to increase the statistical significance of detected ERs, it also provides quantitative evaluations and graphical features to assess the biological relevance of each determined ER set within its genomic context; these include genomic annotation of determined ERs, nearest ER distance distribution, global correlation assessment of ERs and an integrated genome browser. We review the MuSERA rationale and implementation, and illustrate how sets of significant ERs are expanded by applying MuSERA to replicates for several types of NGS data, including ChIP-seq of transcription factors or histone marks and DNase-seq hypersensitive sites. We show that MuSERA can determine a new, enhanced set of ERs for each sample by locally combining evidence on replicates, and demonstrate how the easy-to-use interactive graphical displays and quantitative evaluations that MuSERA provides effectively support thorough inspection of obtained results and evaluation of their biological content, facilitating their understanding and biological interpretation. MuSERA is freely available at http://www.bioinformatics.deib.polimi.it/MuSERA/.


Subjects
High-Throughput Nucleotide Sequencing , Chromatin Immunoprecipitation , Genome , Genomics , Software
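
The MuSERA abstract above describes combining evidence from replicates to raise the statistical significance of enriched regions. The abstract does not say which statistical test is used, so the following is only an illustrative sketch under the assumption that overlapping regions carry independent p-values that are combined with Fisher's method. The function names `chi2_sf_even_dof` and `fisher_combine` are hypothetical and are not part of MuSERA.

```python
import math


def chi2_sf_even_dof(x, dof):
    """Survival function P(X >= x) of a chi-square variable with even dof.

    For dof = 2k there is a closed form:
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (assumed, not MuSERA's API)."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    # Under the null, -2 * sum(ln p) follows chi-square with 2n degrees of freedom.
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_dof(statistic, 2 * len(p_values))


if __name__ == "__main__":
    # Two replicates each report a weakly significant overlapping region.
    print(fisher_combine([0.01, 0.01]))
```

With two replicates the combined value reduces to p1*p2*(1 - ln(p1*p2)), so two regions at p = 0.01 combine to a smaller p-value than either alone, which is the sense in which replicate evidence can rescue regions that a single sample would miss.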
11.
New Phytol ; 223(4): 1937-1951, 2019 09.
Article in English | MEDLINE | ID: mdl-31063599

ABSTRACT

Accessible chromatin changes dynamically during development and harbours functional regulatory regions which are poorly understood in the context of wood development. We explored the importance of accessible chromatin in Eucalyptus grandis in immature xylem generally, and MYB transcription factor-mediated transcriptional programmes specifically. We identified biologically reproducible DNase I Hypersensitive Sites (DHSs) and assessed their functional significance in immature xylem through their associations with gene expression, epigenomic data and DNA sequence conservation. We identified in vitro DNA binding sites for six secondary cell wall-associated Eucalyptus MYB (EgrMYB) transcription factors using DAP-seq, reconstructed protein-DNA networks of predicted targets based on binding sites within or outside DHSs and assessed biological enrichment of these networks with published datasets. 25 319 identified immature xylem DHSs were associated with increased transcription and significantly enriched for various epigenetic signatures (H3K4me3, H3K27me3, RNA pol II), conserved noncoding sequences and depleted single nucleotide variants. Predicted networks built from EgrMYB binding sites located in accessible chromatin were significantly enriched for systems biology datasets relevant to wood formation, whereas those occurring in inaccessible chromatin were not. Our study demonstrates that DHSs in E. grandis immature xylem, most of which are intergenic, are of functional significance to gene regulation in this tissue.


Subjects
Chromatin/genetics , Eucalyptus/growth & development , Wood/growth & development , Base Sequence , Biomass , Cell Wall/metabolism , Deoxyribonuclease I/metabolism , Eucalyptus/genetics , Gene Regulatory Networks , Histones/metabolism , Molecular Sequence Annotation , Transcription Factors/metabolism , Transcription Initiation Site , Wood/genetics , Xylem/metabolism
12.
BMC Genomics ; 19(1): 206, 2018 03 20.
Article in English | MEDLINE | ID: mdl-29558892

ABSTRACT

BACKGROUND: The developmental gene regulatory network (GRN) that underlies skeletogenesis in sea urchins and other echinoderms is a paradigm of GRN structure, function, and evolution. This transcriptional network is deployed selectively in skeleton-forming primary mesenchyme cells (PMCs) of the early embryo. To advance our understanding of this model developmental GRN, we used genome-wide chromatin accessibility profiling to identify and characterize PMC cis-regulatory modules (CRMs). RESULTS: ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) analysis of purified PMCs provided a global picture of chromatin accessibility in these cells. We used both ATAC-seq and DNase-seq (DNase I hypersensitive site sequencing) to identify >3000 sites that exhibited increased accessibility in PMCs relative to other embryonic cell lineages, and provide both computational and experimental evidence that a large fraction of these sites represent bona fide skeletogenic CRMs. Putative PMC CRMs were preferentially located near genes differentially expressed by PMCs and consensus binding sites for two key transcription factors in the PMC GRN, Alx1 and Ets1, were enriched in these CRMs. Moreover, a high proportion of candidate CRMs drove reporter gene expression specifically in PMCs in transgenic embryos. Surprisingly, we found that PMC CRMs were partially open in other embryonic lineages and exhibited hyperaccessibility as early as the 128-cell stage. CONCLUSIONS: Our work provides a comprehensive picture of chromatin accessibility in an early embryonic cell lineage. By identifying thousands of candidate PMC CRMs, we significantly enhance the utility of the sea urchin skeletogenic network as a general model of GRN architecture and evolution. Our work also shows that differential chromatin accessibility, which has been used for the high-throughput identification of enhancers in differentiated cell types, is a powerful approach for the identification of CRMs in early embryonic cells. Lastly, we conclude that in the sea urchin embryo, CRMs that control the cell type-specific expression of effector genes are hyperaccessible several hours in advance of gene activation.


Subjects
Chromatin/genetics , Embryo, Nonmammalian/metabolism , Mesenchymal Stem Cells/metabolism , Regulatory Sequences, Nucleic Acid , Strongylocentrotus purpuratus/genetics , Animals , Embryo, Nonmammalian/cytology , Gene Expression Regulation, Developmental , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing , Mesenchymal Stem Cells/cytology , Strongylocentrotus purpuratus/cytology , Strongylocentrotus purpuratus/growth & development , Transcription Factors/metabolism
13.
Crit Rev Biochem Mol Biol ; 50(4): 269-83, 2015.
Article in English | MEDLINE | ID: mdl-26038153

ABSTRACT

Recent advances in experimental and computational methodologies are enabling ultra-high resolution genome-wide profiles of protein-DNA binding events. For example, the ChIP-exo protocol precisely characterizes protein-DNA cross-linking patterns by combining chromatin immunoprecipitation (ChIP) with 5' → 3' exonuclease digestion. Similarly, deeply sequenced chromatin accessibility assays (e.g. DNase-seq and ATAC-seq) enable the detection of protected footprints at protein-DNA binding sites. With these techniques and others, we have the potential to characterize the individual nucleotides that interact with transcription factors, nucleosomes, RNA polymerases and other regulatory proteins in a particular cellular context. In this review, we explain the experimental assays and computational analysis methods that enable high-resolution profiling of protein-DNA binding events. We discuss the challenges and opportunities associated with such approaches.


Subjects
Chromatin/metabolism , DNA-Binding Proteins/metabolism , DNA/metabolism , Models, Molecular , Animals , Chromatin/chemistry , Chromatin Immunoprecipitation/trends , Computational Biology/trends , Computer Simulation/trends , DNA/chemistry , DNA Footprinting/trends , DNA-Binding Proteins/chemistry , Datasets as Topic , Exodeoxyribonucleases/metabolism , Expert Systems , Genomics/methods , Genomics/trends , Humans , Hydrolysis , Nucleic Acid Conformation , Nucleosomes/chemistry , Nucleosomes/metabolism , Protein Conformation , Protein Footprinting/trends
14.
BMC Bioinformatics ; 18(1): 357, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28764645

ABSTRACT

BACKGROUND: High-throughput sequence (HTS) data exhibit position-specific nucleotide biases that obscure the intended signal and reduce the effectiveness of these data for downstream analyses. These biases are particularly evident in HTS assays for identifying regulatory regions in DNA (DNase-seq, ChIP-seq, FAIRE-seq, ATAC-seq). Biases may result from many experiment-specific factors, including selectivity of DNA restriction enzymes and fragmentation method, as well as sequencing technology-specific factors, such as choice of adapters/primers and sample amplification methods. RESULTS: We present a novel method to detect and correct position-specific nucleotide biases in HTS short read data. Our method calculates read-specific weights based on aligned reads to correct the over- or underrepresentation of position-specific nucleotide subsequences, both within and adjacent to the aligned read, relative to a baseline calculated in assay-specific enriched regions. Using HTS data from a variety of ChIP-seq, DNase-seq, FAIRE-seq, and ATAC-seq experiments, we show that our weight-adjusted reads reduce the position-specific nucleotide imbalance across reads and improve the utility of these data for downstream analyses, including identification and characterization of open chromatin peaks and transcription-factor binding sites. CONCLUSIONS: A general-purpose method to characterize and correct position-specific nucleotide sequence biases fills the need to recognize and deal with, in a systematic manner, binding-site preference for the growing number of HTS-based epigenetic assays. As the breadth and impact of these biases are better understood, the availability of a standard toolkit to correct them will be important.


Subjects
High-Throughput Nucleotide Sequencing/methods , Nucleotides/genetics , Base Sequence , Bias , Binding Sites , Computational Biology , DNA/metabolism , Deoxyribonucleases/metabolism , Principal Component Analysis , Protein Binding , Sequence Analysis, DNA
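
The abstract above describes read-specific weights that correct the over- or underrepresentation of position-specific nucleotide subsequences relative to a baseline. The published weighting scheme is not given in the abstract, so the sketch below is a deliberately simplified illustration: it assumes a single fixed position (the first `k` bases of each read) and sets each weight to the ratio of baseline to observed k-mer frequency. The function names and the toy baseline are hypothetical.

```python
from collections import Counter


def kmer_frequencies(sequences, k):
    """Relative frequency of the k-mer found at the start of each sequence."""
    counts = Counter(seq[:k] for seq in sequences if len(seq) >= k)
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def read_weights(reads, baseline_freq, k):
    """Weight each read by baseline / observed frequency of its leading k-mer.

    Over-represented k-mers get weights below 1, under-represented ones
    get weights above 1. Reads that are too short, or whose k-mer is
    missing from the baseline, keep a neutral weight of 1.0.
    """
    observed = kmer_frequencies(reads, k)
    weights = []
    for read in reads:
        kmer = read[:k]
        if len(read) < k or kmer not in baseline_freq:
            weights.append(1.0)
        else:
            weights.append(baseline_freq[kmer] / observed[kmer])
    return weights


if __name__ == "__main__":
    reads = ["ACGT", "ACGG", "ACTT", "TTGA"]   # "AC" starts 3 of 4 reads
    baseline = {"AC": 0.5, "TT": 0.5}          # assumed unbiased baseline
    print(read_weights(reads, baseline, k=2))
```

In this toy case the three "AC" reads are down-weighted and the single "TT" read is up-weighted, so the weighted k-mer composition matches the baseline while the total weight still equals the number of reads.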
15.
BMC Bioinformatics ; 18(1): 363, 2017 Aug 08.
Article in English | MEDLINE | ID: mdl-28789639

ABSTRACT

BACKGROUND: Next-generation sequencing (NGS) approaches are commonly used to identify key regulatory networks that drive transcriptional programs. Although these technologies are frequently used in biological studies, NGS data analysis remains a challenging, time-consuming, and often irreproducible process. Therefore, there is a need for a comprehensive and flexible workflow platform that can accelerate data processing and analysis so more time can be spent on functional studies. RESULTS: We have developed an integrative, stand-alone workflow platform, named CIPHER, for the systematic analysis of several commonly used NGS datasets including ChIP-seq, RNA-seq, MNase-seq, DNase-seq, GRO-seq, and ATAC-seq data. CIPHER implements various open source software packages, in-house scripts, and Docker containers to analyze and process single-end and paired-end datasets. CIPHER's pipelines conduct extensive quality and contamination control checks, as well as comprehensive downstream analysis. A typical CIPHER workflow includes: (1) raw sequence evaluation, (2) read trimming and adapter removal, (3) read mapping and quality filtering, (4) visualization track generation, and (5) extensive quality control assessment. Furthermore, CIPHER conducts downstream analysis such as: narrow and broad peak calling, peak annotation, and motif identification for ChIP-seq, differential gene expression analysis for RNA-seq, nucleosome positioning for MNase-seq, DNase hypersensitive site mapping, site annotation and motif identification for DNase-seq, analysis of nascent transcription from Global-Run On (GRO-seq) data, and characterization of chromatin accessibility from ATAC-seq datasets. In addition, CIPHER contains an "analysis" mode that completes complex bioinformatics tasks such as enhancer discovery and provides functions to integrate various datasets together. CONCLUSIONS: Using public and simulated data, we demonstrate that CIPHER is an efficient and comprehensive workflow platform that can analyze several NGS datasets commonly used in genome biology studies. Additionally, CIPHER's integrative "analysis" mode allows researchers to elicit important biological information from the combined dataset analysis.


Subjects
High-Throughput Nucleotide Sequencing , Regulatory Sequences, Nucleic Acid/genetics , Software , Chromatin Immunoprecipitation , Chromosome Mapping , Databases, Genetic , Enhancer Elements, Genetic/genetics , Gene Expression Regulation , Sequence Analysis, RNA
16.
BMC Genomics ; 18(1): 68, 2017 01 11.
Article in English | MEDLINE | ID: mdl-28077088

ABSTRACT

BACKGROUND: Bone morphogenetic protein 4 (BMP4) plays an important role in cancer pathogenesis. In breast cancer, it reduces proliferation and increases migration in a cell line-dependent manner. To characterize the transcriptional mediators of these phenotypes, we performed RNA-seq and DNase-seq analyses after BMP4 treatment in MDA-MB-231 and T-47D breast cancer cells that respond to BMP4 with enhanced migration and decreased cell growth, respectively. RESULTS: The RNA-seq data revealed gene expression changes that were consistent with the in vitro phenotypes of the cell lines, particularly in MDA-MB-231, where migration-related processes were enriched. These results were confirmed when enrichment of BMP4-induced open chromatin regions was analyzed. Interestingly, the chromatin in transcription start sites of differentially expressed genes was already open in unstimulated cells, thus enabling rapid recruitment of transcription factors to the promoters as a response to stimulation. Further analysis and functional validation identified MBD2, CBFB, and HIF1A as downstream regulators of BMP4 signaling. Silencing of these transcription factors revealed that MBD2 was a consistent activator of target genes in both cell lines, CBFB an activator in cells with reduced proliferation phenotype, and HIF1A a repressor in cells with induced migration phenotype. CONCLUSIONS: Integrating RNA-seq and DNase-seq data showed that the phenotypic responses to BMP4 in breast cancer cell lines are reflected in transcriptomic and chromatin levels. We identified and experimentally validated downstream regulators of BMP4 signaling that relate to the different in vitro phenotypes and thus demonstrate that the downstream BMP4 response is regulated in a cell type-specific manner.


Subjects
Bone Morphogenetic Protein 4/metabolism , Breast Neoplasms/pathology , Deoxyribonucleases/metabolism , Phenotype , Sequence Analysis, RNA , Signal Transduction , Bone Morphogenetic Protein 4/pharmacology , Cell Line, Tumor , Cell Proliferation/drug effects , Chromatin/drug effects , Chromatin/metabolism , Humans , Signal Transduction/drug effects , Transcription, Genetic/drug effects
17.
BMC Bioinformatics ; 17(1): 404, 2016 Oct 03.
Article in English | MEDLINE | ID: mdl-27716038

ABSTRACT

BACKGROUND: Transcription factor binding, histone modification, and chromatin accessibility studies are important approaches to understanding the biology of gene regulation. ChIP-seq and DNase-seq have become the standard techniques for studying protein-DNA interactions and chromatin accessibility respectively, and comprehensive quality control (QC) and analysis tools are critical to extracting the most value from these assay types. Although many analysis and QC tools have been reported, few combine ChIP-seq and DNase-seq data analysis and quality control in a unified framework with a comprehensive and unbiased reference of data quality metrics. RESULTS: ChiLin is a computational pipeline that automates the quality control and data analyses of ChIP-seq and DNase-seq data. It is developed using a flexible and modular software framework that can be easily extended and modified. ChiLin is ideal for batch processing of many datasets and is well suited for large collaborative projects involving ChIP-seq and DNase-seq from different designs. ChiLin generates comprehensive quality control reports that include comparisons with historical data derived from over 23,677 public ChIP-seq and DNase-seq samples (11,265 datasets) from eight literature-based classified categories. To the best of our knowledge, this atlas represents the most comprehensive ChIP-seq and DNase-seq related quality metric resource currently available. These historical metrics provide useful heuristic quality references for experiments across all commonly used assay types. Using representative datasets, we demonstrate the versatility of the pipeline by applying it to different assay types of ChIP-seq data. The pipeline software is available open source at https://github.com/cfce/chilin . CONCLUSION: ChiLin is a scalable and powerful tool to process large batches of ChIP-seq and DNase-seq datasets. The analysis output and quality metrics have been structured into user-friendly directories and reports. We have successfully compiled 23,677 profiles into a comprehensive quality atlas with fine classification for users.


Subjects
Chromatin Immunoprecipitation/methods , Deoxyribonucleases/genetics , Gene Expression Regulation , High-Throughput Nucleotide Sequencing/methods , Quality Control , Sequence Analysis, DNA/methods , Software , Chromosome Mapping , Data Interpretation, Statistical , Databases, Genetic , Deoxyribonucleases/metabolism , Humans
18.
BMC Bioinformatics ; 17 Suppl 5: 206, 2016 Jun 06.
Article in English | MEDLINE | ID: mdl-27295177

ABSTRACT

BACKGROUND: Peak calling is a fundamental step in the analysis of data generated by ChIP-seq or similar techniques to acquire epigenetics information. Current peak callers are often hard to parameterise and may therefore be difficult to use for non-bioinformaticians. In this paper, we present the ChIP-seq analysis tool available in CLC Genomics Workbench and CLC Genomics Server (version 7.5 and up), a user-friendly peak-caller designed to be not specific to a particular *-seq protocol. RESULTS: We illustrate the advantages of a shape-based approach and describe the algorithmic principles underlying the implementation. Thanks to the generality of the idea and the fact the algorithm is able to learn the peak shape from the data, the implementation requires only minimal user input, while still being applicable to a range of *-seq protocols. Using independently validated benchmark datasets, we compare our implementation to other state-of-the-art algorithms explicitly designed to analyse ChIP-seq data and provide an evaluation in terms of receiver-operator characteristic (ROC) plots. In order to show the applicability of the method to similar *-seq protocols, we also investigate algorithmic performances on DNase-seq data. CONCLUSIONS: The results show that CLC shape-based peak caller ranks well among popular state-of-the-art peak callers while providing flexibility and ease-of-use.


Subjects
Algorithms , Genomics/methods , Area Under Curve , Chromatin Immunoprecipitation , Databases, Genetic , Humans , Internet , ROC Curve , Sequence Analysis, DNA , Transcription Factors/genetics , Transcription Factors/metabolism , User-Computer Interface
19.
Biochim Biophys Acta ; 1852(11): 2432-41, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26319416

ABSTRACT

Osteoclast differentiation is associated with both normal bone homeostasis and pathological bone diseases such as osteoporosis. Several transcription factors can regulate osteoclast differentiation, including c-fos and Nfatc1. Using genome-wide DNase-seq analysis, we found a novel transcription factor, SREBP2, that participates in osteoclast differentiation in vitro. Here, we asked whether SREBP2 actually plays a role in controlling bone metabolism in vivo. To answer this question, RAW264 cells, primary cultured osteoclasts and the mouse RANKL-induced bone loss model were treated with fatostatin, a small molecule inhibitor specific for the activation of SREBP. When cells were treated with fatostatin, osteoclast differentiation was impaired. Similar results were obtained following treatment with siRNA for Srebf2, the gene coding for SREBP2. In vivo, µCT analyses showed that fatostatin treatment preserved bone mass and structure in the proximal tibial trabecular bone in the mouse RANKL-induced bone loss model. In addition, bone histomorphometric analysis revealed that the protection of bone mass by fatostatin might have been achieved by suppression of RANKL-mediated osteoclast differentiation. These results indicated that the novel transcription factor SREBP2 physiologically functions in osteoclast differentiation in vivo and might be a possible therapeutic target for bone diseases.

20.
Genomics ; 106(3): 140-144, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26079656

ABSTRACT

Enhancers work with promoters to refine the timing, location, and level of gene expression. As they perform these functions, active enhancers generate a chromatin environment that is distinct from other areas of the genome. Therefore, profiling enhancer-associated chromatin features can produce genome-wide maps of potential regulatory elements. This review focuses on current technologies used to produce maps of potential tissue-specific enhancers by profiling chromatin from primary tissue. First, cells are separated from whole organisms either by affinity purification, automated cell sorting, or microdissection. Isolating the tissue prior to analysis ensures that the molecular signature of active enhancers will not become lost in an averaged signal from unrelated cell types. After cell isolation, the molecular feature that is profiled will depend on the abundance and quality of the harvested material. The combination of tissue isolation plus genome-wide chromatin profiling has successfully identified enhancers in several pioneering studies. In the future, the regulatory apparatus of healthy and diseased tissues will be explored in this manner, as researchers use the combined techniques to gain insight into how active enhancers may influence disease progression.


Subjects
Chromatin/genetics , Enhancer Elements, Genetic , Genome, Human , Chromosome Mapping , Humans , Organ Specificity/genetics