Results 1 - 20 of 81
1.
Bioinformatics ; 37(18): 2834-2840, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33760053

ABSTRACT

MOTIVATION: Sequence motif discovery algorithms can identify novel sequence patterns that perform biological functions in DNA, RNA and protein sequences, for example the binding site motifs of DNA- and RNA-binding proteins. RESULTS: The STREME algorithm presented here advances the state-of-the-art in ab initio motif discovery in terms of both accuracy and versatility. Using in vivo DNA (ChIP-seq) and RNA (CLIP-seq) data, and validating motifs with reference motifs derived from in vitro data, we show that STREME is more accurate, sensitive and thorough than several widely used algorithms (DREME, HOMER, MEME, Peak-motifs) and two other representative algorithms (ProSampler and Weeder). STREME's capabilities include the ability to find motifs in datasets with hundreds of thousands of sequences, to find both short and long motifs (from 3 to 30 positions), to perform differential motif discovery in pairs of sequence datasets, and to find motifs in sequences over virtually any alphabet (DNA, RNA, protein and user-defined alphabets). Unlike most motif discovery algorithms, STREME reports a useful estimate of the statistical significance of each motif it discovers. STREME is easy to use individually via its web server or via the command line, and is completely integrated with the widely used MEME Suite of sequence analysis tools. The name STREME stands for 'Simple, Thorough, Rapid, Enriched Motif Elicitation'. AVAILABILITY AND IMPLEMENTATION: The STREME web server and source code are provided freely for non-commercial use at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
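
The idea of differential motif discovery between a primary and a control sequence set, mentioned in the abstract above, can be illustrated with a small enrichment test. This is a sketch under stated assumptions and not STREME's implementation: exact-word matching, the one-sided Fisher exact test, and the names `count_with_site` and `fisher_enrichment_p` are all chosen here for illustration only.

```python
from math import comb


def count_with_site(seqs, word):
    """Count the sequences that contain `word` at least once (exact match)."""
    return sum(1 for s in seqs if word in s)


def fisher_enrichment_p(pos_hits, pos_total, neg_hits, neg_total):
    """One-sided Fisher exact p-value for enrichment in the primary set.

    Returns the hypergeometric probability of observing at least
    `pos_hits` matching sequences among the `pos_total` primary sequences,
    given the total number of matches across both sets.
    """
    hits = pos_hits + neg_hits
    total = pos_total + neg_total
    denom = comb(total, pos_total)
    upper = min(hits, pos_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, pos_total - k)
        for k in range(pos_hits, upper + 1)
    )
    return tail / denom


# Hypothetical toy data: the word ACGT occurs in all four primary
# sequences but in only one of the four control sequences.
primary = ["ACGTGA", "TTACGT", "ACGTAC", "GGACGT"]
control = ["TTTTTT", "GGGGGG", "ACGTTT", "CCCCCC"]

p_value = fisher_enrichment_p(
    count_with_site(primary, "ACGT"), len(primary),
    count_with_site(control, "ACGT"), len(control),
)
```

A real motif finder scores position-specific models rather than exact words and corrects for the number of candidates tested; the sketch only shows how a primary-versus-control count becomes a p-value.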


Subjects
Algorithms, Software, Chromatin Immunoprecipitation Sequencing, Binding Sites, DNA Sequence Analysis, DNA, Nucleotide Motifs
2.
Bioinformatics ; 36(12): 3902-3904, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32246829

ABSTRACT

MOTIVATION: Identifying the genes regulated by a given transcription factor (TF) (its 'target genes') is a key step in developing a comprehensive understanding of gene regulation. Previously, we developed a method (CisMapper) for predicting the target genes of a TF based solely on the correlation between a histone modification at the TF's binding site and the expression of the gene across a set of tissues or cell lines. That approach is limited to organisms for which extensive histone and expression data are available, and does not explicitly incorporate the genomic distance between the TF and the gene. RESULTS: We present the T-Gene algorithm, which overcomes these limitations. It can be used to predict which genes are most likely to be regulated by a TF, and which of the TF's binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene's promoter, achieving median precision above 60%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median precision above 40%) based on distance alone when extensive histone/expression data is not available for the organism. T-Gene provides an estimate of the statistical significance of each of its predictions. AVAILABILITY AND IMPLEMENTATION: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Software, Binding Sites, Chromatin Immunoprecipitation, Gene Expression Regulation, Transcription Factors/genetics, Transcription Factors/metabolism
3.
Mol Cell ; 50(5): 613-23, 2013 Jun 06.
Article in English | MEDLINE | ID: mdl-23746349

ABSTRACT

Motifs rich in arginines and glycines were recognized several decades ago to play functional roles and were termed glycine-arginine-rich (GAR) domains and/or RGG boxes. We review here the evolving functions of the RGG box along with several sequence variations that we collectively term the RGG/RG motif. Greater than 1,000 human proteins harbor the RGG/RG motif, and these proteins influence numerous physiological processes such as transcription, pre-mRNA splicing, DNA damage signaling, mRNA translation, and the regulation of apoptosis. In particular, we discuss the role of the RGG/RG motif in mediating nucleic acid and protein interactions, a function that is often regulated by arginine methylation and partner-binding proteins. The physiological relevance of the RGG/RG motif is highlighted by its association with several diseases including neurological and neuromuscular diseases and cancer. Herein, we discuss the evidence for the emerging diverse functionality of this important motif.


Subjects
Protein Interaction Domains and Motifs, Proteins/chemistry, Proteins/metabolism, Alternative Splicing, Amino Acid Motifs, Amino Acid Sequence, Amyotrophic Lateral Sclerosis/metabolism, Apoptosis/physiology, Arginine/metabolism, DNA Damage, Fragile X Syndrome/metabolism, Humans, Methylation, Molecular Sequence Data, Neoplasms/metabolism, Neuromuscular Diseases/metabolism, Protein Biosynthesis
4.
Bioinformatics ; 35(16): 2774-2782, 2019 08 15.
Article in English | MEDLINE | ID: mdl-30596994

ABSTRACT

MOTIVATION: Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called 'motifs' that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation. RESULTS: We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate P-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo re-implements the two most widely used PTM motif discovery algorithms, motif-x and MoDL, while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing 'background' peptides from an unshuffled proteome database. Our results thus suggest that many of the papers that use motif-x to find motifs may be reporting results that lack statistical support. AVAILABILITY AND IMPLEMENTATION: The MoMo web server and source code are provided at http://meme-suite.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Post-Translational Protein Processing, Software, Algorithms, Amino Acid Motifs, Proteome, Tandem Mass Spectrometry
5.
Nucleic Acids Res ; 46(21): 11381-11395, 2018 11 30.
Article in English | MEDLINE | ID: mdl-30335167

ABSTRACT

During embryogenesis, vascular development relies on a handful of transcription factors that instruct cell fate in a distinct sub-population of the endothelium (1). The SOXF proteins that comprise SOX7, 17 and 18, are molecular switches modulating arterio-venous and lymphatic endothelial differentiation (2,3). Here, we show that, in the SOX-F family, only SOX18 has the ability to switch between a monomeric and a dimeric form. We characterized the SOX18 dimer in binding assays in vitro, and using a split-GFP reporter assay in a zebrafish model system in vivo. We show that SOX18 dimerization is driven by a novel motif located in the vicinity of the C-terminus of the DNA binding region. Insertion of this motif in a SOX7 monomer forced its assembly into a dimer. Genome-wide analysis of SOX18 binding locations on the chromatin revealed enrichment for a SOX dimer binding motif, correlating with genes with a strong endothelial signature. Using a SOX18 small molecule inhibitor that disrupts dimerization, we revealed that dimerization is important for transcription. Overall, we show that dimerization is a specific feature of SOX18 that enables the recruitment of key endothelial transcription factors, and refines the selectivity of the binding to discrete genomic locations assigned to endothelial specific genes.


Subjects
SOXF Transcription Factors/chemistry, Amino Acid Motifs, Animals, Biosensing Techniques, DNA-Binding Proteins/chemistry, Endothelial Cells/metabolism, Endothelium/metabolism, Developmental Gene Expression Regulation, Green Fluorescent Proteins/chemistry, Humans, Mice, Mutation, Open Reading Frames, Protein Domains, Protein Multimerization, Zebrafish, Zebrafish Proteins/chemistry
6.
Genes Dev ; 26(24): 2802-16, 2012 Dec 15.
Article in English | MEDLINE | ID: mdl-23249739

ABSTRACT

In the vertebrate neural tube, regional Sonic hedgehog (Shh) signaling invokes a time- and concentration-dependent induction of six different cell populations mediated through Gli transcriptional regulators. Elsewhere in the embryo, Shh/Gli responses invoke different tissue-appropriate regulatory programs. A genome-scale analysis of DNA binding by Gli1 and Sox2, a pan-neural determinant, identified a set of shared regulatory regions associated with key factors central to cell fate determination and neural tube patterning. Functional analysis in transgenic mice validates core enhancers for each of these factors and demonstrates the dual requirement for Gli1 and Sox2 inputs for neural enhancer activity. Furthermore, through an unbiased determination of Gli-binding site preferences and analysis of binding site variants in the developing mammalian CNS, we demonstrate that differential Gli-binding affinity underlies threshold-level activator responses to Shh input. In summary, our results highlight Sox2 input as a context-specific determinant of the neural-specific Shh response and differential Gli-binding site affinity as an important cis-regulatory property critical for interpreting Shh morphogen action in the mammalian neural tube.


Subjects
Body Patterning/physiology, Hedgehog Proteins/metabolism, Kruppel-Like Transcription Factors/metabolism, SOXB1 Transcription Factors/metabolism, Animals, Body Patterning/genetics, Mice, Transgenic Mice, Neural Tube/embryology, Neural Tube/metabolism, Protein Binding, Zinc Finger Protein GLI1
7.
Nucleic Acids Res ; 45(4): e19, 2017 02 28.
Article in English | MEDLINE | ID: mdl-28204599

ABSTRACT

Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF.
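
The correlation-based linking described above can be sketched in a few lines. This is an illustration under stated assumptions, not CisMapper's implementation: the abstract does not specify the correlation measure, so Pearson correlation is assumed here, and the function names and gene labels are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_target_genes(histone_signal, expression_by_gene):
    """Rank candidate genes by how well their expression across tissues
    tracks the histone signal at one TF-bound site (highest first)."""
    scores = {
        gene: pearson(histone_signal, expression)
        for gene, expression in expression_by_gene.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: one value per tissue in a four-tissue panel.
histone_signal = [1.0, 2.0, 3.0, 4.0]
expression_by_gene = {
    "geneA": [2.0, 4.0, 6.0, 8.0],   # rises with the histone mark
    "geneB": [8.0, 6.0, 4.0, 2.0],   # falls as the mark rises
    "geneC": [5.0, 5.0, 5.0, 5.0],   # constant expression
}

ranked = rank_target_genes(histone_signal, expression_by_gene)
```

Because the score ignores genomic distance, a distant gene whose expression tracks the histone mark can outrank the nearest gene, which is the contrast with distance-based linking that the abstract draws.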


Subjects
Chromatin Immunoprecipitation/methods, Transcriptional Regulatory Elements, DNA Sequence Analysis/methods, Software, Transcription Factors/metabolism, Algorithms, Binding Sites, Histone Code, Genetic Promoter Regions, Transcription Initiation Site
8.
Nucleic Acids Res ; 45(11): 6572-6588, 2017 Jun 20.
Article in English | MEDLINE | ID: mdl-28541545

ABSTRACT

Krüppel-like factors (KLFs) are a family of 17 transcription factors characterized by a conserved DNA-binding domain of three zinc fingers and a variable N-terminal domain responsible for recruiting cofactors. KLFs have diverse functions in stem cell biology, embryo patterning, and tissue homoeostasis. KLF1 and related family members function as transcriptional activators via recruitment of co-activators such as EP300, whereas KLF3 and related members act as transcriptional repressors via recruitment of C-terminal Binding Proteins. KLF1 directly activates the Klf3 gene via an erythroid-specific promoter. Herein, we show KLF1 and KLF3 bind common as well as unique sites within the erythroid cell genome by ChIP-seq. We show KLF3 can displace KLF1 from key erythroid gene promoters and enhancers in vivo. Using 4sU RNA labelling and RNA-seq, we show this competition results in reciprocal transcriptional outputs for >50 important genes. Furthermore, Klf3-/- mice displayed exaggerated recovery from anemic stress and persistent cell cycling consistent with a role for KLF3 in dampening KLF1-driven proliferation. We suggest this study provides a paradigm for how KLFs work in incoherent feed-forward loops or networks to fine-tune transcription and thereby control diverse biological processes such as cell proliferation.


Subjects
Genetic Enhancer Elements, Kruppel-Like Transcription Factors/metabolism, Genetic Promoter Regions, Transcriptional Activation, Animals, Cell Line, Coculture Techniques, Erythroid Cells/metabolism, Erythropoiesis, Mice, Genetic Transcription
9.
Development ; 142(21): 3746-57, 2015 Nov 01.
Article in English | MEDLINE | ID: mdl-26534986

ABSTRACT

Transcription factors act during cortical development as master regulatory genes that specify cortical arealization and cellular identities. Although numerous transcription factors have been identified as being crucial for cortical development, little is known about their downstream targets and how they mediate the emergence of specific neuronal connections via selective axon guidance. The EMX transcription factors are essential for early patterning of the cerebral cortex, but whether EMX1 mediates interhemispheric connectivity by controlling corpus callosum formation remains unclear. Here, we demonstrate that in mice on the C57Bl/6 background EMX1 plays an essential role in the midline crossing of an axonal subpopulation of the corpus callosum derived from the anterior cingulate cortex. In the absence of EMX1, cingulate axons display reduced expression of the axon guidance receptor NRP1 and form aberrant axonal bundles within the rostral corpus callosum. EMX1 also functions as a transcriptional activator of Nrp1 expression in vitro, and overexpression of this protein in Emx1 knockout mice rescues the midline-crossing phenotype. These findings reveal a novel role for the EMX1 transcription factor in establishing cortical connectivity by regulating the interhemispheric wiring of a subpopulation of neurons within the mouse anterior cingulate cortex.


Subjects
Gyrus Cinguli/metabolism, Homeodomain Proteins/metabolism, Neuropilin-1/metabolism, Transcription Factors/metabolism, Agenesis of Corpus Callosum/embryology, Agenesis of Corpus Callosum/genetics, Animals, Axons/metabolism, Inbred C57BL Mice, Knockout Mice, Semaphorins/metabolism
10.
Genome Res ; 24(6): 999-1011, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24501021

ABSTRACT

Our current understanding of how DNA is packed in the nucleus is most accurate at the fine scale of individual nucleosomes and at the large scale of chromosome territories. However, accurate modeling of DNA architecture at the intermediate scale of ∼50 kb-10 Mb is crucial for identifying functional interactions among regulatory elements and their target promoters. We describe a method, Fit-Hi-C, that assigns statistical confidence estimates to mid-range intra-chromosomal contacts by jointly modeling the random polymer looping effect and previously observed technical biases in Hi-C data sets. We demonstrate that our proposed approach computes accurate empirical null models of contact probability without any distribution assumption, corrects for binning artifacts, and provides improved statistical power relative to a previously described method. High-confidence contacts identified by Fit-Hi-C preferentially link expressed gene promoters to active enhancers identified by chromatin signatures in human embryonic stem cells (ESCs), capture 77% of RNA polymerase II-mediated enhancer-promoter interactions identified using ChIA-PET in mouse ESCs, and confirm previously validated, cell line-specific interactions in mouse cortex cells. We observe that insulators and heterochromatin regions are hubs for high-confidence contacts, while promoters and strong enhancers are involved in fewer contacts. We also observe that binding peaks of master pluripotency factors such as NANOG and POU5F1 are highly enriched in high-confidence contacts for human ESCs. Furthermore, we show that pairs of loci linked by high-confidence contacts exhibit similar replication timing in human and mouse ESCs and preferentially lie within the boundaries of topological domains for human and mouse cell lines.


Subjects
Chromatin Assembly and Disassembly, Chromatin/genetics, Genetic Models, Nucleic Acid Regulatory Sequences, Animals, Chromatin/chemistry, Confidence Intervals, Embryonic Stem Cells/metabolism, Histone Code, Homeodomain Proteins/genetics, Homeodomain Proteins/metabolism, Humans, Mice, Nanog Homeobox Protein, Neurons/metabolism, Octamer Transcription Factor-3/genetics, Octamer Transcription Factor-3/metabolism, Protein Binding, Species Specificity, Yeasts/genetics
11.
Development ; 141(11): 2195-205, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24866114

ABSTRACT

Mammalian sex determination hinges on the development of ovaries or testes, with testis fate being triggered by the expression of the transcription factor sex-determining region Y (Sry). Reduced or delayed Sry expression impairs testis development, highlighting the importance of its accurate spatiotemporal regulation and implying a potential role for SRY dysregulation in human intersex disorders. Several epigenetic modifiers, transcription factors and kinases are implicated in regulating Sry transcription, but it remains unclear whether or how this farrago of factors acts co-ordinately. Here we review our current understanding of Sry regulation and provide a model that assembles all known regulators into three modules, each converging on a single transcription factor that binds to the Sry promoter. We also discuss potential future avenues for discovering the cis-elements and trans-factors required for Sry regulation.


Subjects
Developmental Gene Expression Regulation, Ovary/embryology, Sex-Determining Region Y Protein/physiology, Testis/embryology, Animals, Cell Lineage, Genetic Epigenesis, Female, GATA4 Transcription Factor/metabolism, Humans, Male, Mice, Genetic Promoter Regions, Steroidogenic Factor 1/metabolism, Genetic Transcription, WT1 Proteins/metabolism, Y Chromosome
12.
Bioinformatics ; 32(8): 1217-9, 2016 04 15.
Article in English | MEDLINE | ID: mdl-26704599

ABSTRACT

Precise regulatory control of genes, particularly in eukaryotes, frequently requires the joint action of multiple sequence-specific transcription factors. A cis-regulatory module (CRM) is a genomic locus that is responsible for gene regulation and that contains multiple transcription factor binding sites in close proximity. Given a collection of known transcription factor binding motifs, many bioinformatics methods have been proposed over the past 15 years for identifying within a genomic sequence candidate CRMs consisting of clusters of those motifs. RESULTS: The MCAST algorithm uses a hidden Markov model with a P-value-based scoring scheme to identify candidate CRMs. Here, we introduce a new version of MCAST that offers improved graphical output, a dynamic background model, statistical confidence estimates based on false discovery rate estimation and, most significantly, the ability to predict CRMs while taking into account epigenomic data such as DNase I sensitivity or histone modification data. We demonstrate the validity of MCAST's statistical confidence estimates and the utility of epigenomic priors in identifying CRMs. AVAILABILITY AND IMPLEMENTATION: MCAST is part of the MEME Suite software toolkit. A web server and source code are available at http://meme-suite.org and http://alternate.meme-suite.org. CONTACT: t.bailey@imb.uq.edu.au or william-noble@uw.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Binding Sites, DNA Sequence Analysis, Genome, Humans, Transcriptional Regulatory Elements, Software, Transcription Factors
13.
Nucleic Acids Res ; 43(W1): W39-49, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25953851

ABSTRACT

The MEME Suite is a powerful, integrated set of web-based tools for studying sequence motifs in proteins, DNA and RNA. Such motifs encode many biological functions, and their detection and characterization is important in the study of molecular interactions in the cell, including the regulation of gene expression. Since the previous description of the MEME Suite in the 2009 Nucleic Acids Research Web Server Issue, we have added six new tools. Here we describe the capabilities of all the tools within the suite, give advice on their best use and provide several case studies to illustrate how to combine the results of various MEME Suite tools for successful motif-based analyses. The MEME Suite is freely available for academic use at http://meme-suite.org, and source code is also available for download and local installation.


Subjects
Amino Acid Motifs, Nucleotide Motifs, Software, DNA/chemistry, Internet, Plasmodium falciparum, Protein Interaction Domains and Motifs, Protein Sorting Signals, Protozoan Proteins/chemistry, Calcitriol Receptors/chemistry, DNA Sequence Analysis, Protein Sequence Analysis, RNA Sequence Analysis
14.
Cereb Cortex ; 25(10): 3758-78, 2015 Oct.
Article in English | MEDLINE | ID: mdl-25331604

ABSTRACT

Transcription factors of the nuclear factor one (NFI) family play a pivotal role in the development of the nervous system. One member, NFIX, regulates the development of the neocortex, hippocampus, and cerebellum. Postnatal Nfix(-/-) mice also display abnormalities within the subventricular zone (SVZ) lining the lateral ventricles, a region of the brain comprising a neurogenic niche that provides ongoing neurogenesis throughout life. Specifically, Nfix(-/-) mice exhibit more PAX6-expressing progenitor cells within the SVZ. However, the mechanism underlying the development of this phenotype remains undefined. Here, we reveal that NFIX contributes to multiple facets of SVZ development. Postnatal Nfix(-/-) mice exhibit increased levels of proliferation within the SVZ, both in vivo and in vitro as assessed by a neurosphere assay. Furthermore, we show that the migration of SVZ-derived neuroblasts to the olfactory bulb is impaired, and that the olfactory bulbs of postnatal Nfix(-/-) mice are smaller. We also demonstrate that gliogenesis within the rostral migratory stream is delayed in the absence of Nfix, and reveal that Gdnf (glial-derived neurotrophic factor), a known attractant for SVZ-derived neuroblasts, is a target for transcriptional activation by NFIX. Collectively, these findings suggest that NFIX regulates both proliferation and migration during the development of the SVZ neurogenic niche.


Subjects
Cell Movement, Cell Proliferation, Lateral Ventricles/embryology, NFI Transcription Factors/physiology, Neural Stem Cells/physiology, Neurogenesis, Animals, Female, Glial Cell Line-Derived Neurotrophic Factor/metabolism, Interneurons/physiology, Lateral Ventricles/metabolism, Male, Mice, Inbred C57BL Mice, Knockout Mice, NFI Transcription Factors/genetics, NFI Transcription Factors/metabolism, Neuroglia/physiology, Olfactory Bulb/embryology, Olfactory Bulb/metabolism, Stem Cell Niche
15.
Mol Cell Proteomics ; 13(5): 1330-40, 2014 May.
Article in English | MEDLINE | ID: mdl-24532840

ABSTRACT

Protein synthesis is finely regulated across all organisms, from bacteria to humans, and its integrity underpins many important processes. Emerging evidence suggests that the dynamic range of protein abundance is greater than that observed at the transcript level. Technological breakthroughs now mean that sequencing-based measurement of mRNA levels is routine, but protocols for measuring protein abundance remain both complex and expensive. This paper introduces a Bayesian network that integrates transcriptomic and proteomic data to predict protein abundance and to model the effects of its determinants. We aim to use this model to follow a molecular response over time, from condition-specific data, in order to understand adaptation during processes such as the cell cycle. With microarray data now available for many conditions, the general utility of a protein abundance predictor is broad. Whereas most quantitative proteomics studies have focused on higher organisms, we developed a predictive model of protein abundance for both Saccharomyces cerevisiae and Schizosaccharomyces pombe to explore the latitude at the protein level. Our predictor primarily relies on mRNA level, mRNA-protein interaction, mRNA folding energy and half-life, and tRNA adaptation. The combination of key features, allowing for the low certainty and uneven coverage of experimental observations, gives comparatively minor but robust prediction accuracy. The model substantially improved the analysis of protein regulation during the cell cycle: predicted protein abundance identified twice as many cell-cycle-associated proteins as experimental mRNA levels. Predicted protein abundance was more dynamic than observed mRNA expression, agreeing with experimental protein abundance from a human cell line. We illustrate how the same model can be used to predict the folding energy of mRNA when protein abundance is available, lending credence to the emerging view that mRNA folding affects translation efficiency. 
The software and data used in this research are available at http://bioinf.scmb.uq.edu.au/proteinabundance/.


Subjects
Bayes Theorem, Cell Cycle Proteins/metabolism, Messenger RNA/chemistry, Messenger RNA/metabolism, Software, Transcriptome, Cell Cycle Proteins/genetics, Humans, Molecular Models, Proteomics, RNA Folding, Saccharomyces cerevisiae Proteins/genetics, Saccharomyces cerevisiae Proteins/metabolism, Schizosaccharomyces pombe Proteins/genetics, Schizosaccharomyces pombe Proteins/metabolism
16.
Nucleic Acids Res ; 42(17): 11000-10, 2014.
Article in English | MEDLINE | ID: mdl-25200088

ABSTRACT

Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules, CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for 'other' tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a 'nearest neighbor' heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps.


Subjects
Gene Expression Regulation, Genetic Models, Cell Line, Genomics/methods, Histones/analysis, Humans, Linear Models, Organ Specificity, Transcription Factors/metabolism, Transcription Initiation Site
17.
J Neurosci ; 34(8): 2921-30, 2014 Feb 19.
Article in English | MEDLINE | ID: mdl-24553933

ABSTRACT

Epigenetic mechanisms are essential in regulating neural progenitor cell self-renewal, with the chromatin-modifying protein Enhancer of zeste homolog 2 (EZH2) emerging as a central player in promoting progenitor cell self-renewal during cortical development. Despite this, how Ezh2 is itself regulated remains unclear. Here, we demonstrate that the transcription factor nuclear factor IB (NFIB) plays a key role in this process. Nfib(-/-) mice exhibit an increased number of proliferative ventricular zone cells that express progenitor cell markers and upregulation of EZH2 expression within the neocortex and hippocampus. NFIB binds to the Ezh2 promoter and overexpression of NFIB represses Ezh2 transcription. Finally, key downstream targets of EZH2-mediated epigenetic repression are misregulated in Nfib(-/-) mice. Collectively, these results suggest that the downregulation of Ezh2 transcription by NFIB is an important component of the process of neural progenitor cell differentiation during cortical development.


Subjects
Cerebral Cortex/growth & development, Genetic Epigenesis/physiology, NFI Transcription Factors/genetics, NFI Transcription Factors/physiology, Polycomb Repressive Complex 2/genetics, Polycomb Repressive Complex 2/physiology, Animals, Cell Count, Cerebral Cortex/cytology, Cerebral Cortex/physiology, Electrophoretic Mobility Shift Assay, Enhancer of Zeste Homolog 2 Protein, Female, Hippocampus/cytology, Hippocampus/growth & development, Immunohistochemistry, Male, Mice, Knockout Mice, Microarray Analysis, Mutation/genetics, Mutation/physiology, Neural Stem Cells/physiology, Primary Cell Culture, Genetic Promoter Regions/genetics, Real-Time Polymerase Chain Reaction
18.
BMC Dev Biol ; 15: 34, 2015 Oct 06.
Article in English | MEDLINE | ID: mdl-26444262

ABSTRACT

BACKGROUND: Sex determination in mammals requires expression of the Y-linked gene Sry in the bipotential genital ridges of the XY embryo. Even minor delay of the onset of Sry expression can result in XY sex reversal, highlighting the need for accurate gene regulation during sex determination. However, the location of critical regulatory elements remains unknown. Here, we analysed Sry flanking sequences across many species, using newly available genome sequences and computational tools, to better understand Sry's genomic context and to identify conserved regions predictive of functional roles. METHODS: Flanking sequences from 17 species were analysed using both global and local sequence alignment methods. Multiple motif searches were employed to characterise common motifs in otherwise unconserved sequence. RESULTS: We identified position-specific conservation of binding motifs for multiple transcription factor families, including GATA binding factors and Oct/Sox dimers. In contrast with the landscape of extremely low sequence conservation around the Sry coding region, our analysis highlighted a strongly conserved interval of ~106 bp within the Sry promoter (which we term the Sry Proximal Conserved Interval, SPCI). We further report that inverted repeats flanking murine Sry are much larger than previously recognised. CONCLUSIONS: The unusually fast pace of sequence drift on the Y chromosome sharpens the likely functional significance of both the SPCI and the identified binding motifs, providing a basis for future studies of the role(s) of these elements in Sry regulation.


Subjects
Mammals/genetics; Sex-Determining Region Y Protein/genetics; Animals; Base Sequence; Conserved Sequence; Evolution, Molecular; Humans; Mammals/classification; Molecular Sequence Data; Sequence Alignment; Transcription Factors/metabolism
19.
Genome Res ; 22(7): 1372-81, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22550012

ABSTRACT

Double-stranded DNA is able to form triple-helical structures by accommodating a third nucleotide strand in its major groove. This sequence-specific process offers a potent mechanism for targeting genomic loci of interest that is of great value for biotechnological and gene-therapeutic applications. It is likely that nature has leveraged this addressing system for gene regulation, because computational studies have uncovered an abundance of putative triplex target sites in various genomes, with enrichment particularly in gene promoters. However, to draw a more complete picture of the in vivo role of triplexes, not only the putative targets but also the sequences acting as the third strand and their capability to pair with the predicted target sites need to be studied. Here we present Triplexator, the first computational framework that integrates all aspects of triplex formation, and showcase its potential by discussing research examples for which the different aspects of triplex formation are important. We find that chromatin-associated RNAs have a significantly higher fraction of sequence features able to form triplexes than expected at random, suggesting their involvement in gene regulation. We furthermore identify hundreds of human genes that contain sequence features in their promoter predicted to be able to form a triplex with a target within the same promoter, suggesting the involvement of triplexes in feedback-based gene regulation. With focus on biotechnological applications, we screen mammalian genomes for high-affinity triplex target sites that can be used to target genomic loci specifically and find that triplex formation offers a resolution of ~1300 nt.


Subjects
Algorithms; Gene Expression Profiling/methods; Genomics/methods; Oligonucleotides/chemistry; RNA-Binding Proteins/chemistry; Animals; Chromatin/chemistry; Chromatin/genetics; Circular Dichroism; Computational Biology/methods; DNA/chemistry; DNA/genetics; Genetic Loci; Genome, Human; Humans; Hydrogen Bonding; Nucleic Acid Conformation; Oligonucleotides/genetics; Promoter Regions, Genetic; RNA Stability; RNA-Binding Proteins/genetics; Time Factors
20.
Genome Res ; 22(12): 2385-98, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22835905

ABSTRACT

KLF1 (formerly known as EKLF) regulates the development of erythroid cells from bi-potent progenitor cells via the transcriptional activation of a diverse set of genes. Mice lacking Klf1 die in utero prior to E15 from severe anemia due to the inadequate expression of genes controlling hemoglobin production, cell membrane and cytoskeletal integrity, and the cell cycle. We have recently described the full repertoire of KLF1 binding sites in vivo by performing KLF1 ChIP-seq in primary erythroid tissue (E14.5 fetal liver). Here we describe the KLF1-dependent erythroid transcriptome by comparing mRNA-seq from Klf1(+/+) and Klf1(-/-) erythroid tissue. This has revealed novel target genes not previously obtainable by traditional microarray technology, and provided novel insights into the function of KLF1 as a transcriptional activator. We define a cis-regulatory module bound by KLF1, GATA1, TAL1, and EP300 that coordinates a core set of erythroid genes. We also describe a novel set of erythroid-specific promoters that drive high-level expression of otherwise ubiquitously expressed genes in erythroid cells. Our study has identified two novel lncRNAs that are dynamically expressed during erythroid differentiation, and discovered a role for KLF1 in directing apoptotic gene expression to drive the terminal stages of erythroid maturation.


Subjects
Erythropoiesis/genetics; Gene Expression Regulation, Developmental; Kruppel-Like Transcription Factors/genetics; RNA, Messenger/genetics; Transcriptome; Animals; Apoptosis; Basic Helix-Loop-Helix Transcription Factors/genetics; Basic Helix-Loop-Helix Transcription Factors/metabolism; Blotting, Western; Cell Differentiation; Chromosome Mapping; E1A-Associated p300 Protein/genetics; E1A-Associated p300 Protein/metabolism; Erythroid Cells/cytology; Erythroid Cells/metabolism; GATA1 Transcription Factor/genetics; GATA1 Transcription Factor/metabolism; Gene Expression Profiling; In Situ Nick-End Labeling; Kruppel-Like Transcription Factors/metabolism; Liver/metabolism; Mice; Mice, Inbred BALB C; Promoter Regions, Genetic; Proto-Oncogene Proteins/genetics; Proto-Oncogene Proteins/metabolism; RNA, Messenger/metabolism; Sequence Analysis, RNA/methods; T-Cell Acute Lymphocytic Leukemia Protein 1