Results 1 - 20 of 81
1.
Genome Biol ; 25(1): 11, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38191487

ABSTRACT

BACKGROUND: Transcription factors bind DNA in specific sequence contexts. In addition to distinguishing one nucleobase from another, some transcription factors can distinguish between unmodified and modified bases. Current models of transcription factor binding tend not to take DNA modifications into account, while the recent few that do often have limitations. This makes a comprehensive and accurate profiling of transcription factor affinities difficult. RESULTS: Here, we develop methods to identify transcription factor binding sites in modified DNA. Our models expand the standard A/C/G/T DNA alphabet to include cytosine modifications. We develop Cytomod to create modified genomic sequences and we also enhance the MEME Suite, adding the capacity to handle custom alphabets. We adapt the well-established position weight matrix (PWM) model of transcription factor binding affinity to this expanded DNA alphabet. Using these methods, we identify modification-sensitive transcription factor binding motifs. We confirm established binding preferences, such as the preference of ZFP57 and C/EBPβ for methylated motifs and the preference of c-Myc for unmethylated E-box motifs. CONCLUSIONS: Using known binding preferences to tune model parameters, we discover novel modified motifs for a wide array of transcription factors. Finally, we validate our binding preference predictions for OCT4 using cleavage under targets and release using nuclease (CUT&RUN) experiments across conventional, methylation-, and hydroxymethylation-enriched sequences. Our approach readily extends to other DNA modifications. As more genome-wide single-base resolution modification data becomes available, we expect that our method will yield insights into altered transcription factor binding affinities across many different modifications.


Subjects
Gene Expression Regulation , Transcription Factors , Epigenomics , DNA , Epigenesis, Genetic
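
As an illustration of the expanded-alphabet PWM idea in the abstract above, the following minimal Python sketch scores a site against a log-odds matrix built over six symbols. The symbols `m` and `h` (standing for 5-methylcytosine and 5-hydroxymethylcytosine), the toy counts, the uniform background, and the function names are assumptions made for this example; they are not taken from Cytomod or the MEME Suite.

```python
import math

# Hypothetical expanded alphabet: the four standard bases plus
# 'm' (5-methylcytosine) and 'h' (5-hydroxymethylcytosine).
ALPHABET = "ACGTmh"


def log_odds_pwm(counts, background, pseudocount=1.0):
    """Convert per-position symbol counts into a log2-odds matrix."""
    pwm = []
    for column in counts:
        total = sum(column.get(s, 0) for s in ALPHABET) + pseudocount * len(ALPHABET)
        pwm.append({
            s: math.log2(((column.get(s, 0) + pseudocount) / total) / background[s])
            for s in ALPHABET
        })
    return pwm


def score(pwm, site):
    """Sum the log-odds of each symbol of `site` under the matrix."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif width")
    return sum(column[symbol] for column, symbol in zip(pwm, site))


# Toy counts for a 3-position motif that prefers a methylated centre.
counts = [{"T": 9, "A": 1}, {"m": 8, "C": 2}, {"G": 10}]
background = {s: 1 / len(ALPHABET) for s in ALPHABET}
pwm = log_odds_pwm(counts, background)

print(score(pwm, "TmG") > score(pwm, "TCG"))
```

With these toy counts the methylated site `TmG` scores higher than its unmodified counterpart `TCG`, which is the kind of modification-sensitive preference the abstract describes.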
3.
Bioinformatics ; 37(18): 2834-2840, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33760053

ABSTRACT

MOTIVATION: Sequence motif discovery algorithms can identify novel sequence patterns that perform biological functions in DNA, RNA and protein sequences (for example, the binding site motifs of DNA- and RNA-binding proteins). RESULTS: The STREME algorithm presented here advances the state-of-the-art in ab initio motif discovery in terms of both accuracy and versatility. Using in vivo DNA (ChIP-seq) and RNA (CLIP-seq) data, and validating motifs with reference motifs derived from in vitro data, we show that STREME is more accurate, sensitive and thorough than several widely used algorithms (DREME, HOMER, MEME, Peak-motifs) and two other representative algorithms (ProSampler and Weeder). STREME's capabilities include the ability to find motifs in datasets with hundreds of thousands of sequences, to find both short and long motifs (from 3 to 30 positions), to perform differential motif discovery in pairs of sequence datasets, and to find motifs in sequences over virtually any alphabet (DNA, RNA, protein and user-defined alphabets). Unlike most motif discovery algorithms, STREME reports a useful estimate of the statistical significance of each motif it discovers. STREME is easy to use individually via its web server or via the command line, and is completely integrated with the widely used MEME Suite of sequence analysis tools. The name STREME stands for 'Simple, Thorough, Rapid, Enriched Motif Elicitation'. AVAILABILITY AND IMPLEMENTATION: The STREME web server and source code are provided freely for non-commercial use at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Software , Chromatin Immunoprecipitation Sequencing , Binding Sites , Sequence Analysis, DNA , DNA , Nucleotide Motifs
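
The notion of differential motif discovery in a pair of sequence datasets, mentioned in the abstract above, can be illustrated with a deliberately simple word-counting sketch. This is not the STREME algorithm: it only ranks fixed-length words by how much more often they occur in a primary set than in a control set. The function names and the toy sequences are assumptions for illustration.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Count how many sequences contain each k-mer at least once."""
    counts = Counter()
    for seq in seqs:
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def differential_kmers(primary, control, k):
    """Rank k-mers by (fraction of primary) minus (fraction of control) containing them."""
    p = kmer_presence(primary, k)
    c = kmer_presence(control, k)
    diff = {w: p[w] / len(primary) - c[w] / len(control) for w in p}
    # Highest difference first; ties broken alphabetically for determinism.
    return sorted(diff.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: every primary sequence carries the E-box word CACGTG.
primary = ["TTCACGTGAA", "GGCACGTGCC", "ATCACGTGTA"]
control = ["TTTTAAAACC", "GGGATCCAAA", "ACACACACAC"]
ranked = differential_kmers(primary, control, 6)

print(ranked[0])
```

A real tool would add a statistical test and allow degenerate positions; the sketch only shows why a control set helps separate enriched words from background composition.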
4.
Bioinformatics ; 36(12): 3902-3904, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32246829

ABSTRACT

MOTIVATION: Identifying the genes regulated by a given transcription factor (TF) (its 'target genes') is a key step in developing a comprehensive understanding of gene regulation. Previously, we developed a method (CisMapper) for predicting the target genes of a TF based solely on the correlation between a histone modification at the TF's binding site and the expression of the gene across a set of tissues or cell lines. That approach is limited to organisms for which extensive histone and expression data are available, and does not explicitly incorporate the genomic distance between the TF and the gene. RESULTS: We present the T-Gene algorithm, which overcomes these limitations. It can be used to predict which genes are most likely to be regulated by a TF, and which of the TF's binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene's promoter, achieving median precision above 60%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median precision above 40%) based on distance alone when extensive histone/expression data is not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions. AVAILABILITY AND IMPLEMENTATION: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Software , Binding Sites , Chromatin Immunoprecipitation , Gene Expression Regulation , Transcription Factors/genetics , Transcription Factors/metabolism
5.
Bioinformatics ; 35(16): 2774-2782, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30596994

ABSTRACT

MOTIVATION: Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called 'motifs' that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation. RESULTS: We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate P-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo re-implements the two most widely used PTM motif discovery algorithms (motif-x and MoDL) while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing 'background' peptides from an unshuffled proteome database. Our results thus suggest that many of the papers that use motif-x to find motifs may be reporting results that lack statistical support. AVAILABILITY AND IMPLEMENTATION: The MoMo web server and source code are provided at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Protein Processing, Post-Translational , Software , Algorithms , Amino Acid Motifs , Proteome , Tandem Mass Spectrometry
6.
Nucleic Acids Res ; 46(21): 11381-11395, 2018 11 30.
Article in English | MEDLINE | ID: mdl-30335167

ABSTRACT

During embryogenesis, vascular development relies on a handful of transcription factors that instruct cell fate in a distinct sub-population of the endothelium (1). The SOXF proteins that comprise SOX7, 17 and 18, are molecular switches modulating arterio-venous and lymphatic endothelial differentiation (2,3). Here, we show that, in the SOX-F family, only SOX18 has the ability to switch between a monomeric and a dimeric form. We characterized the SOX18 dimer in binding assays in vitro, and using a split-GFP reporter assay in a zebrafish model system in vivo. We show that SOX18 dimerization is driven by a novel motif located in the vicinity of the C-terminus of the DNA binding region. Insertion of this motif in a SOX7 monomer forced its assembly into a dimer. Genome-wide analysis of SOX18 binding locations on the chromatin revealed enrichment for a SOX dimer binding motif, correlating with genes with a strong endothelial signature. Using a SOX18 small molecule inhibitor that disrupts dimerization, we revealed that dimerization is important for transcription. Overall, we show that dimerization is a specific feature of SOX18 that enables the recruitment of key endothelial transcription factors, and refines the selectivity of the binding to discrete genomic locations assigned to endothelial specific genes.


Subjects
SOXF Transcription Factors/chemistry , Amino Acid Motifs , Animals , Biosensing Techniques , DNA-Binding Proteins/chemistry , Endothelial Cells/metabolism , Endothelium/metabolism , Gene Expression Regulation, Developmental , Green Fluorescent Proteins/chemistry , Humans , Mice , Mutation , Open Reading Frames , Protein Domains , Protein Multimerization , Zebrafish , Zebrafish Proteins/chemistry
7.
Nucleic Acids Res ; 45(11): 6572-6588, 2017 Jun 20.
Article in English | MEDLINE | ID: mdl-28541545

ABSTRACT

Krüppel-like factors (KLFs) are a family of 17 transcription factors characterized by a conserved DNA-binding domain of three zinc fingers and a variable N-terminal domain responsible for recruiting cofactors. KLFs have diverse functions in stem cell biology, embryo patterning, and tissue homoeostasis. KLF1 and related family members function as transcriptional activators via recruitment of co-activators such as EP300, whereas KLF3 and related members act as transcriptional repressors via recruitment of C-terminal Binding Proteins. KLF1 directly activates the Klf3 gene via an erythroid-specific promoter. Herein, we show KLF1 and KLF3 bind common as well as unique sites within the erythroid cell genome by ChIP-seq. We show KLF3 can displace KLF1 from key erythroid gene promoters and enhancers in vivo. Using 4sU RNA labelling and RNA-seq, we show this competition results in reciprocal transcriptional outputs for >50 important genes. Furthermore, Klf3-/- mice displayed exaggerated recovery from anemic stress and persistent cell cycling consistent with a role for KLF3 in dampening KLF1-driven proliferation. We suggest this study provides a paradigm for how KLFs work in incoherent feed-forward loops or networks to fine-tune transcription and thereby control diverse biological processes such as cell proliferation.


Subjects
Enhancer Elements, Genetic , Kruppel-Like Transcription Factors/metabolism , Promoter Regions, Genetic , Transcriptional Activation , Animals , Cell Line , Coculture Techniques , Erythroid Cells/metabolism , Erythropoiesis , Mice , Transcription, Genetic
8.
Nucleic Acids Res ; 45(4): e19, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28204599

ABSTRACT

Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF.


Subjects
Chromatin Immunoprecipitation/methods , Regulatory Elements, Transcriptional , Sequence Analysis, DNA/methods , Software , Transcription Factors/metabolism , Algorithms , Binding Sites , Histone Code , Promoter Regions, Genetic , Transcription Initiation Site
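
The core idea in the abstract above, correlating a histone mark at a bound site with each gene's expression across a panel of tissues, can be sketched as follows. The abstract does not specify the correlation measure or any thresholds, so this example assumes a plain Pearson correlation and ranks genes by its absolute value; the gene names and numbers are invented for illustration and this is not the CisMapper implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sx * sy)


def rank_candidate_targets(histone_signal, expression_by_gene):
    """Rank genes by |correlation| between site histone signal and expression."""
    scores = {g: pearson(histone_signal, e) for g, e in expression_by_gene.items()}
    return sorted(scores.items(), key=lambda kv: -abs(kv[1]))


# One value per tissue (five hypothetical tissues).
histone_signal = [1, 2, 3, 4, 5]
expression_by_gene = {
    "geneA": [2, 4, 6, 8, 10],  # tracks the histone mark
    "geneB": [5, 3, 4, 1, 2],   # anti-correlated
    "geneC": [3, 3, 3, 3, 3],   # constant expression
}
ranked = rank_candidate_targets(histone_signal, expression_by_gene)

print([gene for gene, _ in ranked])
```

Because the ranking ignores genomic distance entirely, a gene far from the bound site can still rank first, which is the property the abstract contrasts with distance-based linking.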
9.
Elife ; 6, 2017 01 31.
Article in English | MEDLINE | ID: mdl-28137359

ABSTRACT

Pharmacological targeting of transcription factors holds great promise for the development of new therapeutics, but strategies based on blockade of DNA binding, nuclear shuttling, or individual protein partner recruitment have yielded limited success to date. Transcription factors typically engage in complex interaction networks, likely masking the effects of specifically inhibiting single protein-protein interactions. Here, we used a combination of genomic, proteomic and biophysical methods to discover a suite of protein-protein interactions involving the SOX18 transcription factor, a known regulator of vascular development and disease. We describe a small-molecule that is able to disrupt a discrete subset of SOX18-dependent interactions. This compound selectively suppressed SOX18 transcriptional outputs in vitro and interfered with vascular development in zebrafish larvae. In a mouse pre-clinical model of breast cancer, treatment with this inhibitor significantly improved survival by reducing tumour vascular density and metastatic spread. Our studies validate an interactome-based molecular strategy to interfere with transcription factor activity, for the development of novel disease therapeutics.


Subjects
Antineoplastic Agents/metabolism , Breast Neoplasms/prevention & control , SOXF Transcription Factors/antagonists & inhibitors , Transcription, Genetic/drug effects , Animals , Biophysical Phenomena , Blood Vessels/embryology , Disease Models, Animal , Genomics , Mice , Proteomics , Treatment Outcome , Zebrafish/embryology , Zebrafish Proteins/antagonists & inhibitors
10.
Bioinformatics ; 32(8): 1217-9, 2016 04 15.
Article in English | MEDLINE | ID: mdl-26704599

ABSTRACT

UNLABELLED: Precise regulatory control of genes, particularly in eukaryotes, frequently requires the joint action of multiple sequence-specific transcription factors. A cis-regulatory module (CRM) is a genomic locus that is responsible for gene regulation and that contains multiple transcription factor binding sites in close proximity. Given a collection of known transcription factor binding motifs, many bioinformatics methods have been proposed over the past 15 years for identifying within a genomic sequence candidate CRMs consisting of clusters of those motifs. RESULTS: The MCAST algorithm uses a hidden Markov model with a P-value-based scoring scheme to identify candidate CRMs. Here, we introduce a new version of MCAST that offers improved graphical output, a dynamic background model, statistical confidence estimates based on false discovery rate estimation and, most significantly, the ability to predict CRMs while taking into account epigenomic data such as DNase I sensitivity or histone modification data. We demonstrate the validity of MCAST's statistical confidence estimates and the utility of epigenomic priors in identifying CRMs. AVAILABILITY AND IMPLEMENTATION: MCAST is part of the MEME Suite software toolkit. A web server and source code are available at http://meme-suite.org and http://alternate.meme-suite.org. CONTACT: t.bailey@imb.uq.edu.au or william-noble@uw.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Binding Sites , Sequence Analysis, DNA , Genome , Humans , Regulatory Elements, Transcriptional , Software , Transcription Factors
11.
Development ; 142(21): 3746-57, 2015 Nov 01.
Article in English | MEDLINE | ID: mdl-26534986

ABSTRACT

Transcription factors act during cortical development as master regulatory genes that specify cortical arealization and cellular identities. Although numerous transcription factors have been identified as being crucial for cortical development, little is known about their downstream targets and how they mediate the emergence of specific neuronal connections via selective axon guidance. The EMX transcription factors are essential for early patterning of the cerebral cortex, but whether EMX1 mediates interhemispheric connectivity by controlling corpus callosum formation remains unclear. Here, we demonstrate that in mice on the C57Bl/6 background EMX1 plays an essential role in the midline crossing of an axonal subpopulation of the corpus callosum derived from the anterior cingulate cortex. In the absence of EMX1, cingulate axons display reduced expression of the axon guidance receptor NRP1 and form aberrant axonal bundles within the rostral corpus callosum. EMX1 also functions as a transcriptional activator of Nrp1 expression in vitro, and overexpression of this protein in Emx1 knockout mice rescues the midline-crossing phenotype. These findings reveal a novel role for the EMX1 transcription factor in establishing cortical connectivity by regulating the interhemispheric wiring of a subpopulation of neurons within the mouse anterior cingulate cortex.


Subjects
Gyrus Cinguli/metabolism , Homeodomain Proteins/metabolism , Neuropilin-1/metabolism , Transcription Factors/metabolism , Agenesis of Corpus Callosum/embryology , Agenesis of Corpus Callosum/genetics , Animals , Axons/metabolism , Mice, Inbred C57BL , Mice, Knockout , Semaphorins/metabolism
12.
BMC Dev Biol ; 15: 34, 2015 Oct 06.
Article in English | MEDLINE | ID: mdl-26444262

ABSTRACT

BACKGROUND: Sex determination in mammals requires expression of the Y-linked gene Sry in the bipotential genital ridges of the XY embryo. Even minor delay of the onset of Sry expression can result in XY sex reversal, highlighting the need for accurate gene regulation during sex determination. However, the location of critical regulatory elements remains unknown. Here, we analysed Sry flanking sequences across many species, using newly available genome sequences and computational tools, to better understand Sry's genomic context and to identify conserved regions predictive of functional roles. METHODS: Flanking sequences from 17 species were analysed using both global and local sequence alignment methods. Multiple motif searches were employed to characterise common motifs in otherwise unconserved sequence. RESULTS: We identified position-specific conservation of binding motifs for multiple transcription factor families, including GATA binding factors and Oct/Sox dimers. In contrast with the landscape of extremely low sequence conservation around the Sry coding region, our analysis highlighted a strongly conserved interval of ~106 bp within the Sry promoter (which we term the Sry Proximal Conserved Interval, SPCI). We further report that inverted repeats flanking murine Sry are much larger than previously recognised. CONCLUSIONS: The unusually fast pace of sequence drift on the Y chromosome sharpens the likely functional significance of both the SPCI and the identified binding motifs, providing a basis for future studies of the role(s) of these elements in Sry regulation.


Subjects
Mammals/genetics , Sex-Determining Region Y Protein/genetics , Animals , Base Sequence , Conserved Sequence , Evolution, Molecular , Humans , Mammals/classification , Molecular Sequence Data , Sequence Alignment , Transcription Factors/metabolism
13.
Brain Res ; 1616: 71-87, 2015 Aug 07.
Article in English | MEDLINE | ID: mdl-25960350

ABSTRACT

Nuclear factor one X (NFIX) has been shown to play a pivotal role during the development of many regions of the brain, including the neocortex, the hippocampus and the cerebellum. Mechanistically, NFIX has been shown to promote neural stem cell differentiation through the activation of astrocyte-specific genes and via the repression of genes central to progenitor cell self-renewal. Interestingly, mice lacking Nfix also exhibit other phenotypes with respect to development of the central nervous system, and whose underlying causes have yet to be determined. Here we examine one of the phenotypes displayed by Nfix(-/-) mice, namely hydrocephalus. Through the examination of embryonic and postnatal Nfix(-/-) mice we reveal that hydrocephalus is first seen at around postnatal day (P) 10 in mice lacking Nfix, and is fully penetrant by P20. Furthermore, we examined the subcommissural organ (SCO), the Sylvian aqueduct and the ependymal layer of the lateral ventricles, regions that when malformed and functionally perturbed have previously been implicated in the development of hydrocephalus. SOX3 is a factor known to regulate SCO development. Although we revealed that NFIX could repress Sox3-promoter-driven transcriptional activity in vitro, SOX3 expression within the SCO was normal within Nfix(-/-) mice, and Nfix mutant mice showed no abnormalities in the structure or function of the SCO. Moreover, these mutant mice exhibited no overt blockage of the Sylvian aqueduct. However, the ependymal layer of the lateral ventricles was frequently absent in Nfix(-/-) mice, suggesting that this phenotype may underlie the development of hydrocephalus within these knockout mice.


Subjects
Ependyma/pathology , Gene Expression Regulation, Developmental/genetics , Hydrocephalus/pathology , Lateral Ventricles/pathology , NFI Transcription Factors/deficiency , Age Factors , Animals , Animals, Newborn , Computational Biology , Disease Models, Animal , Embryo, Mammalian , Ependyma/embryology , Ependyma/growth & development , Hydrocephalus/genetics , Lateral Ventricles/embryology , Lateral Ventricles/growth & development , Mice , Mice, Inbred C57BL , Mice, Transgenic , NFI Transcription Factors/genetics , Nerve Tissue Proteins/genetics , Nerve Tissue Proteins/metabolism , SOXB1 Transcription Factors/genetics , SOXB1 Transcription Factors/metabolism
14.
Nucleic Acids Res ; 43(W1): W39-49, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25953851

ABSTRACT

The MEME Suite is a powerful, integrated set of web-based tools for studying sequence motifs in proteins, DNA and RNA. Such motifs encode many biological functions, and their detection and characterization is important in the study of molecular interactions in the cell, including the regulation of gene expression. Since the previous description of the MEME Suite in the 2009 Nucleic Acids Research Web Server Issue, we have added six new tools. Here we describe the capabilities of all the tools within the suite, give advice on their best use and provide several case studies to illustrate how to combine the results of various MEME Suite tools for successful motif-based analyses. The MEME Suite is freely available for academic use at http://meme-suite.org, and source code is also available for download and local installation.


Subjects
Amino Acid Motifs , Nucleotide Motifs , Software , DNA/chemistry , Internet , Plasmodium falciparum , Protein Interaction Domains and Motifs , Protein Sorting Signals , Protozoan Proteins/chemistry , Receptors, Calcitriol/chemistry , Sequence Analysis, DNA , Sequence Analysis, Protein , Sequence Analysis, RNA
15.
Cereb Cortex ; 25(10): 3758-78, 2015 Oct.
Article in English | MEDLINE | ID: mdl-25331604

ABSTRACT

Transcription factors of the nuclear factor one (NFI) family play a pivotal role in the development of the nervous system. One member, NFIX, regulates the development of the neocortex, hippocampus, and cerebellum. Postnatal Nfix(-/-) mice also display abnormalities within the subventricular zone (SVZ) lining the lateral ventricles, a region of the brain comprising a neurogenic niche that provides ongoing neurogenesis throughout life. Specifically, Nfix(-/-) mice exhibit more PAX6-expressing progenitor cells within the SVZ. However, the mechanism underlying the development of this phenotype remains undefined. Here, we reveal that NFIX contributes to multiple facets of SVZ development. Postnatal Nfix(-/-) mice exhibit increased levels of proliferation within the SVZ, both in vivo and in vitro as assessed by a neurosphere assay. Furthermore, we show that the migration of SVZ-derived neuroblasts to the olfactory bulb is impaired, and that the olfactory bulbs of postnatal Nfix(-/-) mice are smaller. We also demonstrate that gliogenesis within the rostral migratory stream is delayed in the absence of Nfix, and reveal that Gdnf (glial-derived neurotrophic factor), a known attractant for SVZ-derived neuroblasts, is a target for transcriptional activation by NFIX. Collectively, these findings suggest that NFIX regulates both proliferation and migration during the development of the SVZ neurogenic niche.


Subjects
Cell Movement , Cell Proliferation , Lateral Ventricles/embryology , NFI Transcription Factors/physiology , Neural Stem Cells/physiology , Neurogenesis , Animals , Female , Glial Cell Line-Derived Neurotrophic Factor/metabolism , Interneurons/physiology , Lateral Ventricles/metabolism , Male , Mice , Mice, Inbred C57BL , Mice, Knockout , NFI Transcription Factors/genetics , NFI Transcription Factors/metabolism , Neuroglia/physiology , Olfactory Bulb/embryology , Olfactory Bulb/metabolism , Stem Cell Niche
16.
BMC Genomics ; 15: 752, 2014 Sep 02.
Article in English | MEDLINE | ID: mdl-25179504

ABSTRACT

BACKGROUND: Motif enrichment analysis of transcription factor ChIP-seq data can help identify transcription factors that cooperate or compete. Previously, little attention has been given to comparative motif enrichment analysis of pairs of ChIP-seq experiments, where the binding of the same transcription factor is assayed under different conditions. Such comparative analysis could potentially identify the distinct regulatory partners/competitors of the assayed transcription factor under different conditions or at different stages of development. RESULTS: We describe a new methodology for identifying sequence motifs that are differentially enriched in one set of DNA or RNA sequences relative to another set, and apply it to paired ChIP-seq experiments. We show that, using paired ChIP-seq data for a single transcription factor, differential motif enrichment analysis identifies all the known key transcription factors involved in the transformation of non-cancerous immortalized breast cells (MCF10A-ER-Src cells) into cancer stem cells whereas non-differential motif enrichment analysis does not. We also show that differential motif enrichment analysis identifies regulatory motifs that are significantly enriched at constrained locations within the bound promoters, and that these motifs are not identified by non-differential motif enrichment analysis. Our methodology differs from other approaches in that it leverages both comparative enrichment and positional enrichment of motifs in ChIP-seq peak regions or in the promoters of genes bound by the transcription factor. CONCLUSIONS: We show that differential motif enrichment analysis of paired ChIP-seq experiments offers biological insights not available from non-differential analysis. In contrast to previous approaches, our method detects motifs that are enriched in a constrained region in one set of sequences, but not enriched in the same region in the comparative set. 
We have enhanced the web-based CentriMo algorithm to allow it to perform the constrained differential motif enrichment analysis described in this paper, and CentriMo's on-line interface (http://meme.ebi.edu.au) provides dozens of databases of DNA- and RNA-binding motifs from a full range of organisms. All data and output files presented here are available at http://research.imb.uq.edu.au/t.bailey/supplementary_data/Lesluyes2014.


Subjects
Chromatin Immunoprecipitation , Computational Biology/methods , High-Throughput Nucleotide Sequencing , Nucleotide Motifs , Binding Sites , Cell Line , Humans , Position-Specific Scoring Matrices , Promoter Regions, Genetic , Protein Binding , Tamoxifen/pharmacology , Time Factors , Transcription Factors/metabolism
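
To make the idea of differential enrichment in the abstract above concrete, the sketch below tests whether a motif occurs in more sequences of a primary set than of a comparative set. The abstract does not name the statistical test; a one-sided Fisher's exact test is used here as a standard, assumed choice, and the positional constraint described in the abstract is not modelled. The function name and the counts are illustrative only.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] under the hypergeometric null.

    a: primary sequences with the motif     b: primary sequences without it
    c: comparative sequences with the motif d: comparative sequences without it
    """
    n_primary = a + b
    n_control = c + d
    with_motif = a + c
    total = comb(n_primary + n_control, with_motif)
    upper = min(n_primary, with_motif)
    # comb() returns 0 when more motif-bearing control sequences are
    # requested than exist, so impossible tables contribute nothing.
    tail = sum(
        comb(n_primary, x) * comb(n_control, with_motif - x)
        for x in range(a, upper + 1)
    )
    return tail / total


# Toy counts: the motif is in all 3 primary and none of 3 comparative sequences.
p_enriched = fisher_one_sided(3, 0, 0, 3)
# Reversed table: no evidence of enrichment in the primary set.
p_not_enriched = fisher_one_sided(0, 3, 3, 0)

print(p_enriched, p_not_enriched)
```

In practice one such test would be run per motif, so the resulting P-values would need a multiple-testing correction before interpretation.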
17.
Nucleic Acids Res ; 42(17): 11000-10, 2014.
Article in English | MEDLINE | ID: mdl-25200088

ABSTRACT

Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules, CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for 'other' tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a 'nearest neighbor' heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps.


Subjects
Gene Expression Regulation , Models, Genetic , Cell Line , Genomics/methods , Histones/analysis , Humans , Linear Models , Organ Specificity , Transcription Factors/metabolism , Transcription Initiation Site
18.
Bioinformatics ; 30(18): 2673-5, 2014 Sep 15.
Article in English | MEDLINE | ID: mdl-24860161

ABSTRACT

UNLABELLED: A number of technologies, including CRISPR/Cas, transcription activator-like effector nucleases and zinc-finger nucleases, allow the user to target a chosen locus for genome editing or regulatory interference. Specificity, however, is a major problem, and the targeted locus must be chosen with care to avoid inadvertently affecting other loci ('off-targets') in the genome. To address this we have created 'Genome Target Scan' (GT-Scan), a flexible web-based tool that ranks all potential targets in a user-selected region of a genome in terms of how many off-targets they have. GT-Scan gives the user flexibility to define the desired characteristics of targets and off-targets via a simple 'target rule', and its interactive output allows detailed inspection of each of the most promising candidate targets. GT-Scan can be used to identify optimal targets for CRISPR/Cas systems, but its flexibility gives it potential to be adapted to other genome-targeting technologies as well. AVAILABILITY AND IMPLEMENTATION: GT-Scan can be run via the web at: http://gt-scan.braembl.org.au.


Subjects
Computational Biology/methods , Genetic Engineering/methods , Genome, Human/genetics , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Endonucleases/metabolism , Humans , Internet
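
The ranking step described in the abstract above, ordering candidate targets in a region by how many off-targets they have, can be sketched with a brute-force scan. This is not GT-Scan's implementation: the 'target rule' is reduced here to a fixed target length and a maximum number of mismatches, only one strand is scanned, and the function names and toy genome are assumptions for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def rank_targets(genome, region_start, region_end, k, max_mismatches):
    """Rank every k-long window in [region_start, region_end) by off-target count.

    An off-target is any other genome window within `max_mismatches`
    of the candidate. Returns (off_target_count, position, sequence)
    tuples, fewest off-targets first.
    """
    results = []
    for pos in range(region_start, region_end - k + 1):
        target = genome[pos:pos + k]
        off_targets = sum(
            1
            for i in range(len(genome) - k + 1)
            if i != pos and hamming(target, genome[i:i + k]) <= max_mismatches
        )
        results.append((off_targets, pos, target))
    return sorted(results)


# Toy genome in which ACGT occurs twice, so each copy has one off-target.
genome = "ACGTACGTGGCC"
ranked = rank_targets(genome, 0, len(genome), k=4, max_mismatches=0)

print(ranked[0], ranked[-1])
```

The scan is quadratic in genome length and is only meant to show the scoring logic; a genome-scale tool would need an index and a reverse-strand search.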
19.
Development ; 141(11): 2195-205, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24866114

ABSTRACT

Mammalian sex determination hinges on the development of ovaries or testes, with testis fate being triggered by the expression of the transcription factor sex-determining region Y (Sry). Reduced or delayed Sry expression impairs testis development, highlighting the importance of its accurate spatiotemporal regulation and implying a potential role for SRY dysregulation in human intersex disorders. Several epigenetic modifiers, transcription factors and kinases are implicated in regulating Sry transcription, but it remains unclear whether or how this farrago of factors acts co-ordinately. Here we review our current understanding of Sry regulation and provide a model that assembles all known regulators into three modules, each converging on a single transcription factor that binds to the Sry promoter. We also discuss potential future avenues for discovering the cis-elements and trans-factors required for Sry regulation.


Subjects
Gene Expression Regulation, Developmental , Ovary/embryology , Sex-Determining Region Y Protein/physiology , Testis/embryology , Animals , Cell Lineage , Epigenesis, Genetic , Female , GATA4 Transcription Factor/metabolism , Humans , Male , Mice , Promoter Regions, Genetic , Steroidogenic Factor 1/metabolism , Transcription, Genetic , WT1 Proteins/metabolism , Y Chromosome
20.
Nat Protoc ; 9(6): 1428-50, 2014.
Article in English | MEDLINE | ID: mdl-24853928

ABSTRACT

MEME-ChIP is a web-based tool for analyzing motifs in large DNA or RNA data sets. It can analyze peak regions identified by ChIP-seq, cross-linking sites identified by CLIP-seq and related assays, as well as sets of genomic regions selected using other criteria. MEME-ChIP performs de novo motif discovery, motif enrichment analysis, motif location analysis and motif clustering, providing a comprehensive picture of the DNA or RNA motifs that are enriched in the input sequences. MEME-ChIP performs two complementary types of de novo motif discovery: weight matrix-based discovery for high accuracy; and word-based discovery for high sensitivity. Motif enrichment analysis using DNA or RNA motifs from human, mouse, worm, fly and other model organisms provides even greater sensitivity. MEME-ChIP's interactive HTML output groups and aligns significant motifs to ease interpretation. This protocol takes less than 3 h, and it provides motif discovery approaches that are distinct and complementary to other online methods.


Subjects
Algorithms , Binding Sites/genetics , Chromatin Immunoprecipitation/methods , High-Throughput Nucleotide Sequencing/methods , Nucleotide Motifs/genetics , Software , Cluster Analysis