Results 1 - 13 of 13
1.
EcoSal Plus ; 6(1), 2014 May.
Article in English | MEDLINE | ID: mdl-26442933

ABSTRACT

EcoCyc is a bioinformatics database available at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists and for biologists who work with related microorganisms. The database includes information pages on each E. coli gene, metabolite, reaction, operon, and metabolic pathway. The database also includes information on E. coli gene essentiality and on nutrient conditions that do or do not support the growth of E. coli. The website and downloadable software contain tools for analysis of high-throughput data sets. In addition, a steady-state metabolic flux model is generated from each new version of EcoCyc. The model can predict metabolic flux rates, nutrient uptake rates, and growth rates for different gene knockouts and nutrient conditions. This review provides a detailed description of the data content of EcoCyc and of the procedures by which this content is generated.

2.
Database (Oxford) ; 2013: bas059, 2013.
Article in English | MEDLINE | ID: mdl-23327937

ABSTRACT

RegulonDB provides curated information on the transcriptional regulatory network of Escherichia coli and contains both experimental data and computationally predicted objects. To account for the heterogeneity of these data, we introduced in version 6.0 a two-tier rating system for the strength of evidence, classifying evidence as either 'weak' or 'strong' (Gama-Castro,S., Jimenez-Jacinto,V., Peralta-Gil,M. et al. RegulonDB (Version 6.0): gene regulation model of Escherichia Coli K-12 beyond transcription, active (experimental) annotated promoters and textpresso navigation. Nucleic Acids Res., 2008;36:D120-D124.). We now add to our classification scheme the classification of high-throughput evidence, including chromatin immunoprecipitation (ChIP) and RNA-seq technologies. To integrate these data into RegulonDB, we present two strategies for the evaluation of confidence: statistical validation and independent cross-validation. Statistical validation involves verification of ChIP data for transcription factor-binding sites, using tools for motif discovery and quality assessment of the discovered matrices. Independent cross-validation combines independent evidence with the intention to mutually exclude false positives. Both statistical validation and cross-validation make it possible to upgrade subsets of data that are supported by weak evidence to a higher confidence level. Likewise, cross-validation of strong confidence data extends our two-tier rating system to a three-tier system by introducing a third confidence score, 'confirmed'. Database URL: http://regulondb.ccg.unam.mx/
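
The three-tier scheme described in this abstract can be illustrated with a small sketch. The upgrade rule used here (two or more independent sources at the same level raise the confidence by one tier) is an assumption made for illustration only; the abstract does not spell out the exact rules, and the names `Evidence` and `combined_confidence` are hypothetical.

```python
from dataclasses import dataclass

# Confidence tiers named in the abstract, ordered from lowest to highest.
TIERS = ["weak", "strong", "confirmed"]


@dataclass(frozen=True)
class Evidence:
    """One piece of evidence for a regulatory object (hypothetical structure)."""
    method: str  # illustrative label, e.g. "ChIP" or "motif discovery"
    level: str   # "weak" or "strong"
    group: str   # evidence from different groups is treated as independent


def combined_confidence(evidence):
    """Return the confidence tier for an object given its evidence list.

    Assumed rule: the base tier is the best single level; if two or more
    mutually independent sources support that level, upgrade by one tier.
    """
    if not evidence:
        raise ValueError("at least one piece of evidence is required")
    best = max(evidence, key=lambda e: TIERS.index(e.level)).level
    independent_groups = {e.group for e in evidence if e.level == best}
    rank = TIERS.index(best)
    if len(independent_groups) >= 2:
        rank = min(rank + 1, len(TIERS) - 1)
    return TIERS[rank]
```

Under this assumed rule, two independent weak sources yield 'strong', and two independent strong sources yield 'confirmed'; a single source keeps its original tier.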


Subjects
Computational Biology/methods , Databases, Genetic , Escherichia coli/genetics , Regulon/genetics , Statistics as Topic , Biosynthetic Pathways/genetics , Chromatin Immunoprecipitation , Gene Expression Regulation, Bacterial , Gene Regulatory Networks , Position-Specific Scoring Matrices , Reproducibility of Results , Transcription Initiation Site
3.
Nucleic Acids Res ; 41(Database issue): D203-13, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203884

ABSTRACT

This article summarizes our progress with RegulonDB (http://regulondb.ccg.unam.mx/) during the past 2 years. We have kept the knowledge from the published literature regarding transcriptional regulation in Escherichia coli K-12 up to date. We have maintained and expanded our curation efforts to improve the breadth and quality of the encoded experimental knowledge, and we have implemented criteria for the quality of our computational predictions. Regulatory phrases now provide high-level descriptions of regulatory regions. We expanded the assignment of quality to various sources of evidence, particularly for knowledge generated through high-throughput (HT) technology. Based on our analysis of the most relevant methods, we defined rules for determining the quality of evidence when multiple independent sources support an entry. With this latest release of RegulonDB, we present a new, larger and highly reliable collection of transcription start sites, a result of our experimental HT genome-wide efforts. These improvements, together with several novel enhancements (the tracks display, uploading format and curational guidelines), address the challenges of incorporating HT-generated knowledge into RegulonDB. Information on the evolutionary conservation of regulatory elements is also available now. Altogether, RegulonDB version 8.0 is a much better home for integrating knowledge on gene regulation from the sources of information currently available.


Subjects
Databases, Genetic , Escherichia coli K12/genetics , Gene Expression Regulation, Bacterial , Regulatory Elements, Transcriptional , Transcription, Genetic , Bacterial Proteins/metabolism , Databases, Genetic/standards , Evolution, Molecular , Genomics , Internet , Promoter Regions, Genetic , Regulon , Repressor Proteins/metabolism , Sequence Analysis, RNA , Transcription Factors/metabolism , Transcription Initiation Site
4.
Nucleic Acids Res ; 41(Database issue): D605-12, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23143106

ABSTRACT

EcoCyc (http://EcoCyc.org) is a model organism database built on the genome sequence of Escherichia coli K-12 MG1655. Expert manual curation of the functions of individual E. coli gene products in EcoCyc has been based on information found in the experimental literature for E. coli K-12-derived strains. Updates to EcoCyc content continue to improve the comprehensive picture of E. coli biology. The utility of EcoCyc is enhanced by new tools available on the EcoCyc web site, and the development of EcoCyc as a teaching tool is increasing the impact of the knowledge collected in EcoCyc.


Subjects
Databases, Genetic , Escherichia coli K12/genetics , Binding Sites , Escherichia coli K12/metabolism , Escherichia coli Proteins/classification , Escherichia coli Proteins/metabolism , Gene Expression Regulation, Bacterial , Internet , Membrane Transport Proteins/classification , Membrane Transport Proteins/metabolism , Models, Genetic , Molecular Sequence Annotation , Phenotype , Position-Specific Scoring Matrices , Promoter Regions, Genetic , Systems Biology , Transcription Factors/metabolism , Transcription, Genetic
5.
Nucleic Acids Res ; 39(Database issue): D98-105, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21051347

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx/) is the primary reference database of the best-known regulatory network of any free-living organism, that of Escherichia coli K-12. The major conceptual change since 3 years ago is an expanded biological context so that transcriptional regulation is now part of a unit that initiates with the signal and continues with the signal transduction to the core of regulation, modifying expression of the affected target genes responsible for the response. We call these genetic sensory response units, or Gensor Units. We have initiated their high-level curation, with graphic maps and superreactions with links to other databases. Additional connectivity uses expandable submaps. RegulonDB has summaries for every transcription factor (TF) and TF-binding sites with internal symmetry. Several DNA-binding motifs and their sizes have been redefined and relocated. In addition to data from the literature, we have incorporated our own information on transcription start sites (TSSs) and transcriptional units (TUs), obtained by using high-throughput whole-genome sequencing technologies. A new portable drawing tool for genomic features is also now available, as well as new ways to download the data, including web services, files for several relational database manager systems and text files including BioPAX format.


Subjects
Databases, Genetic , Escherichia coli K12/genetics , Gene Expression Regulation, Bacterial , Gene Regulatory Networks , Transcription Factors/metabolism , Binding Sites , Escherichia coli K12/metabolism , Signal Transduction , Systems Integration , Transcription Initiation Site , Transcription, Genetic
6.
PLoS One ; 4(10): e7526, 2009 Oct 19.
Article in English | MEDLINE | ID: mdl-19838305

ABSTRACT

Despite almost 40 years of molecular genetics research in Escherichia coli, a major fraction of its Transcription Start Sites (TSSs) are still unknown, which limits our understanding of the regulatory circuits that control gene expression in this model organism. RegulonDB (http://regulondb.ccg.unam.mx/) is aimed at integrating the genetic regulatory network of E. coli K12 and has until now been an entirely bioinformatic project. In this work, we extended its aims by generating experimental data at a genome scale on TSSs, promoters and regulatory regions. We implemented a modified 5' RACE protocol and an unbiased High Throughput Pyrosequencing Strategy (HTPS) that allowed us to map more than 1700 TSSs with high precision. From this collection, about 230 corresponded to previously reported TSSs, which helped us to benchmark both our methodologies and the accuracy of the previous mapping experiments. The other ca. 1500 TSSs mapped belong to about 1000 different genes, many of them with no assigned function. We identified promoter sequences and type of sigma factors that control the expression of about 80% of these genes. As expected, the housekeeping sigma(70) was the most common type of promoter, followed by sigma(38). The majority of the putative TSSs were located between 20 and 40 nucleotides from the translational start site. Putative regulatory binding sites for transcription factors were detected upstream of many TSSs. For a few transcripts, riboswitches and small RNAs were found. Several genes also had additional TSSs within the coding region. Unexpectedly, the HTPS experiments revealed extensive antisense transcription, probably for regulatory functions.
The new information in RegulonDB, now with more than 2400 experimentally determined TSSs, strengthens the accuracy of promoter prediction, operon structure, and regulatory networks and provides valuable new information that will facilitate a global understanding of the complex and intricate regulatory network that operates in E. coli.


Subjects
Escherichia coli/genetics , Genes, Bacterial/genetics , Genome, Bacterial , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic , Base Sequence , Binding Sites , Chromosome Mapping , Computational Biology/methods , Gene Regulatory Networks , Models, Genetic , Molecular Sequence Data , Promoter Regions, Genetic , Sequence Homology, Nucleic Acid
7.
Nucleic Acids Res ; 36(Database issue): D120-4, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18158297

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx/) is the primary reference database offering curated knowledge of the transcriptional regulatory network of Escherichia coli K12, currently the best-known electronically encoded database of the genetic regulatory network of any free-living organism. This paper summarizes the improvements, new biology and new features available in version 6.0. Curation of original literature is, from now on, up to date for every new release. All the objects are supported by their corresponding evidence, now classified as strong or weak. Transcription factors are classified by origin of their effectors and by gene ontology class. We now have computational predictions for sigma(54) and five different promoter types of the sigma(70) family, as well as their corresponding -10 and -35 boxes. In addition to those curated from the literature, we added about 300 experimentally mapped promoters coming from our own high-throughput mapping efforts. RegulonDB v.6.0 now expands beyond transcription initiation, including RNA regulatory elements, specifically riboswitches, attenuators and small RNAs, with their known associated targets. The data can be accessed through overviews of correlations about gene regulation. The original literature associated with RegulonDB, together with more than 4000 curation notes, can now be searched with the Textpresso text mining engine.


Subjects
Databases, Genetic , Escherichia coli K12/genetics , Gene Expression Regulation, Bacterial , Gene Regulatory Networks , Computational Biology , Internet , Models, Genetic , Promoter Regions, Genetic , Regulatory Sequences, Ribonucleic Acid , Regulon , Sigma Factor/metabolism , Software , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic
8.
PLoS Genet ; 2(11): e185, 2006 Nov 10.
Article in English | MEDLINE | ID: mdl-17096598

ABSTRACT

The evolutionary processes operating in the DNA regions that participate in the regulation of gene expression are poorly understood. In Escherichia coli, we have established a sequence pattern that distinguishes regulatory from nonregulatory regions. The density of promoter-like sequences, which could be recognized by RNA polymerase and may function as potential promoters, is high within regulatory regions, in contrast to coding regions and regions located between convergently transcribed genes. Moreover, functional promoter sites identified experimentally are often found in the subregions of highest density of promoter-like signals, even when individual sites with higher binding affinity for RNA polymerase exist elsewhere within the regulatory region. To assess the generality of this pattern, we have analyzed 43 additional genomes belonging to most established bacterial phyla. Differential densities between regulatory and nonregulatory regions are detectable in most of the analyzed genomes, with the exception of those that have evolved toward extreme genome reduction. Thus, the presence of this pattern follows that of genes and other genomic features that require weak selection to be effective in order to persist. On this basis, we suggest that the loss of differential densities in the reduced genomes of host-restricted pathogens and symbionts is an outcome of the process of genome degradation resulting from the decreased efficiency of purifying selection in highly structured small populations. This implies that the differential distribution of promoter-like signals between regulatory and nonregulatory regions detected in large bacterial genomes confers a significant, although small, fitness advantage.
This study paves the way for further identification of the specific types of selective constraints that affect the organization of regulatory regions and the overall distribution of promoter-like signals through more detailed comparative analyses among closely related bacterial genomes.


Subjects
DNA-Directed RNA Polymerases/metabolism , Genome, Bacterial/genetics , Promoter Regions, Genetic/genetics , Regulatory Sequences, Nucleic Acid/genetics , Selection, Genetic , Sigma Factor/metabolism , Amino Acid Motifs , Base Sequence , Consensus Sequence , DNA, Bacterial/genetics , Escherichia coli/genetics , Molecular Sequence Data , Mycobacterium leprae/genetics , Mycobacterium tuberculosis/genetics , Sequence Alignment
9.
Nucleic Acids Res ; 34(14): 3980-7, 2006.
Article in English | MEDLINE | ID: mdl-16914446

ABSTRACT

Here we show that regions upstream of first transcribed genes have oligonucleotide signatures that distinguish them from regions upstream of genes in the middle of operons. Databases of experimentally confirmed transcription units do not exist for most genomes. Thus, to expand the analyses into genomes with no experimentally confirmed data, we used genes whose adjacency is conserved across evolutionarily distant genomes as representatives of genes inside operons. Likewise, we used divergently transcribed genes as representative examples of first transcribed genes. In model organisms, the trinucleotide signatures of regions upstream of these representative genes allow for operon predictions with accuracies close to those obtained with known operon data (0.8). Signature-based operon predictions have more similar phylogenetic profiles and higher proportions of genes in the same pathways than predicted transcription unit boundaries (TUBs). These results confirm that we are separating genes with related functions, as expected for operons, from genes not necessarily related, as expected for genes in different transcription units. We also test the quality of the predictions using microarray data in six genomes and show that the signature-predicted operons tend to have high correlations of expression. Oligonucleotide signatures should expand the number of tools available to identify operons even in poorly characterized genomes.
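
As a rough illustration of what a trinucleotide signature of an upstream region might look like, the sketch below computes the relative frequency of every overlapping trinucleotide in a sequence. The abstract does not state how signatures are computed or compared, so the frequency-vector representation, the Manhattan distance and the function names are all assumptions.

```python
from collections import Counter
from itertools import product

# All 64 trinucleotides over the DNA alphabet, in a fixed order.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def trinucleotide_signature(sequence):
    """Return the relative frequency of each overlapping trinucleotide."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    # Windows containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[t] for t in TRINUCLEOTIDES)
    if total == 0:
        return {t: 0.0 for t in TRINUCLEOTIDES}
    return {t: counts[t] / total for t in TRINUCLEOTIDES}


def signature_distance(sig_a, sig_b):
    """Manhattan distance between two signatures (an illustrative choice)."""
    return sum(abs(sig_a[t] - sig_b[t]) for t in TRINUCLEOTIDES)
```

A classifier along these lines would compare the signature of a query upstream region with reference signatures built from the two representative gene sets; that step is not shown here.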


Subjects
Genome, Bacterial , Genomics/methods , Operon , Promoter Regions, Genetic , Bacillus subtilis/genetics , Bacteria/genetics , Computational Biology/methods , DNA-Directed RNA Polymerases/metabolism , Escherichia coli/genetics , Gene Expression , Genes, Bacterial , Genome, Archaeal , Phylogeny , Sigma Factor/metabolism
10.
Mol Biol Evol ; 23(5): 997-1010, 2006 May.
Article in English | MEDLINE | ID: mdl-16547149

ABSTRACT

The selective mechanisms operating in regulatory regions of bacterial genomes are poorly understood. We have previously shown that, in most bacterial genomes, regulatory regions contain high densities of sigma70 promoter-like signals that are significantly above the densities detected in nonregulatory genomic regions. In order to investigate the molecular evolutionary forces that operate in bacterial regulatory regions and how they affect the observed redundancy of promoter-like signals, we have undertaken a comparative analysis across the completely sequenced genomes of enteric gamma-proteobacteria. This analysis detects significant positional conservation of promoter-like signal clusters across enterics, sometimes in spite of strong primary sequence divergence. This suggests that the conservation of the nature and exact position of specific nucleotides is not necessarily the priority of selection for maintaining the transcriptional function in these bacteria. We have further characterized the structural conservation of the regulatory regions of dnaQ and crp across all enterics. These two regions differ in essentiality and mode of regulation, the regulation of crp being more complex and involving interactions with several transcription factors. This results in substantially different modes of evolution, with the dnaQ region appearing to evolve under stronger purifying selection and the crp region showing the likely effects of stabilizing selection for a complex pattern of gene expression. The higher flexibility of the crp region is consistent with the lower conservation of global regulators observed in evolution. Patterns of regulatory evolution are also found to be markedly different in endosymbiotic bacteria, in a manner consistent with regulatory regions suffering some level of degradation, as has been observed for many other characters in these genomes.
Therefore, the mode of evolution of bacterial regulatory regions appears to be highly dependent on both the lifestyle of the bacterium and the specific regulatory requirements of different genes. In fact, in many bacteria, the mode of evolution of genes requiring significant physiological adaptability in expression levels may follow patterns similar to those operating in the more complex regulatory regions of eukaryotic genomes.


Subjects
Enterobacteriaceae/genetics , Genome , Promoter Regions, Genetic , Amino Acid Sequence , Biological Evolution , Cluster Analysis , Evolution, Molecular , Genes, Bacterial , Genome, Bacterial , Models, Genetic , Models, Statistical , Molecular Sequence Data , Sequence Homology, Amino Acid
11.
J Mol Biol ; 354(1): 184-99, 2005 Nov 18.
Article in English | MEDLINE | ID: mdl-16236313

ABSTRACT

Experimental data on Escherichia coli transcriptional regulation have enabled the construction of statistical models to predict new regulatory elements within its genome. Far less is known about the transcriptional regulatory elements in other gamma-proteobacteria with sequenced genomes, so it is of great interest to conduct comparative genomic studies aimed at extracting biologically relevant information about transcriptional regulation in these less studied organisms using the knowledge from E. coli. In this work, we use the information stored in the TRACTOR_DB database to conduct a comparative study on the mechanisms of transcriptional regulation in eight gamma-proteobacteria and 38 regulons. We assess the conservation of transcription factor binding specificity across all eight genomes and show a correlation between the conservation of a regulatory site and the structure of the transcription unit it regulates. We also find a marked conservation of site-promoter distances across the eight organisms and a correspondence of the statistical significance of co-occurrence of pairs of transcription factor binding sites in the regulatory regions, which is probably related to a conserved architecture of higher-order regulatory complexes in the organisms studied. The results obtained in this study using the information on transcriptional regulation in E. coli enable us to conclude that not only are transcription factor-binding sites conserved across related species, but so are several of the transcriptional regulatory mechanisms previously identified in E. coli.


Subjects
Computational Biology , Gammaproteobacteria/genetics , Gene Expression Regulation, Bacterial , Genome, Bacterial , Transcription, Genetic , Binding Sites/genetics , Promoter Regions, Genetic , Regulon , Synteny , Transcription Factors/genetics
12.
Genome Res ; 13(11): 2435-43, 2003 Nov.
Article in English | MEDLINE | ID: mdl-14597655

ABSTRACT

The transcriptional network of Escherichia coli may well be the most complete experimentally characterized network of a single cell. A rule-based approach was built to assess the degree of consistency between whole-genome microarray experiments in different experimental conditions and the accumulated knowledge in the literature compiled in RegulonDB, a database of transcriptional regulation and operon organization in E. coli. We observed a high and statistically significant level of consistency, ranging from 70% to 87%. When effector metabolites of regulatory proteins are not considered in the prediction of the active or inactive state of the regulators, consistency falls by up to 40%. Similarly, consistency decreases when rules for multiple regulatory interactions are altered or when "on" and "off" entries are assigned randomly. We modified the initial state of regulators and evaluated the propagation of errors in the network, which does not correlate linearly with the connectivity of regulators. We interpret this deviation mainly as a result of the existence of redundant regulatory interactions. Consistency evaluation opens a new space of dialogue between theory and experiment, as the consequences of different assumptions can be evaluated and compared.
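
The idea of scoring consistency between rule-based predictions and observed expression can be sketched as follows. This is a minimal illustration that handles only single regulator-target interactions; the rules for multiple regulatory interactions and for effector metabolites mentioned in the abstract are not reproduced, and all names and data structures here are hypothetical.

```python
def predict_state(regulator_active, effect):
    """Predict the state of a target gene from one regulatory interaction.

    effect is "+" for activation or "-" for repression.
    """
    if effect == "+":
        return "on" if regulator_active else "off"
    if effect == "-":
        return "off" if regulator_active else "on"
    raise ValueError(f"unknown effect: {effect!r}")


def consistency(interactions, regulator_states, observed):
    """Fraction of predictions that match the observed gene states.

    interactions: list of (regulator, target, effect) tuples
    regulator_states: dict mapping regulator -> bool (active or not)
    observed: dict mapping target -> "on" or "off" (e.g. microarray calls)
    """
    compared = 0
    matches = 0
    for regulator, target, effect in interactions:
        if regulator not in regulator_states or target not in observed:
            continue  # skip pairs for which no data are available
        compared += 1
        if predict_state(regulator_states[regulator], effect) == observed[target]:
            matches += 1
    return matches / compared if compared else 0.0
```

For example, with placeholder identifiers, `consistency([("R1", "g1", "+"), ("R2", "g2", "-")], {"R1": True, "R2": True}, {"g1": "on", "g2": "on"})` compares two predictions, of which only the first matches.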


Subjects
Escherichia coli/genetics , Gene Expression Profiling , Gene Expression Regulation, Bacterial/genetics , Oligonucleotide Array Sequence Analysis , Research Design , Databases, Genetic , Gene Expression Profiling/statistics & numerical data , Models, Genetic , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Operon/genetics , Predictive Value of Tests , Regulon/genetics
13.
J Mol Biol ; 333(2): 261-78, 2003 Oct 17.
Article in English | MEDLINE | ID: mdl-14529615

ABSTRACT

We present here a computational analysis showing that sigma70 house-keeping promoters are located within zones with high densities of promoter-like signals in Escherichia coli, and we introduce strategies that allow for the correct computer prediction of sigma70 promoters. Based on 599 experimentally verified promoters of E. coli K-12, we generated and evaluated more than 200 weight matrices optimizing different criteria to obtain the best recognition matrices. The alignments generating the best statistical models did not fully correspond with the canonical sigma70 model. However, matrices that correspond to such a canonical model performed better as tools for prediction. We tested the predictive capacity of these matrices on 250 bp long regions upstream of gene starts, where 90% of the known promoters occur. The computational matrix models generated an average of 38 promoter-like signals within each 250 bp region. In more than 50% of the cases, the true promoter does not have the best score within the region. We observed, in fact, that real promoters occur mostly within regions with high densities of overlapping putative promoters. We evaluated several strategies to identify promoters. The best one uses an intrinsic score of the -10 and -35 hexamers that form the promoter as well as an extrinsic score that uses the distribution of promoters from the start of the gene. We were able to identify 86% of true promoters correctly, generating an average of 4.7 putative promoters per region as output, of which 3.7, on average, exist in clusters, as a series of overlapping potentially competing RNA polymerase-binding sites. As far as we know, this is the highest predictive capability reported so far. This high signal density is found mainly within regions upstream of genes, contrasting with coding regions and regions located between convergently transcribed genes.
These results are consistent with experimental evidence that shows the existence of multiple overlapping promoter sites that become functional under particular conditions. This density is probably the consequence of a rich number of vestiges of promoters in evolution. We suggest that transcriptional regulators as well as other functional promoters play an important role in keeping these latent signals suppressed.
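
Scanning a region with a weight matrix, as described in this abstract, can be sketched for a single hexamer. The sketch assumes a log-odds matrix with a uniform background and a pseudocount, and takes toy aligned sites as input; it does not reproduce the authors' matrices, their combined -10/-35 intrinsic score, or the extrinsic positional score. All function names are hypothetical.

```python
import math


def build_log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    matrix = []
    for pos in range(length):
        column = [s[pos].upper() for s in sites]
        matrix.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def score(matrix, window):
    """Sum the per-position log-odds weights for one window."""
    return sum(col[base] for col, base in zip(matrix, window.upper()))


def scan(matrix, sequence, threshold):
    """Return (position, score) for every window scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if set(window.upper()) <= set("ACGT"):  # skip ambiguous bases
            s = score(matrix, window)
            if s >= threshold:
                hits.append((i, s))
    return hits
```

With a permissive threshold, a scan of this kind returns many overlapping hits per region, which is the kind of signal density the abstract reports; choosing among those hits is where the additional scores come in.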


Subjects
Bacterial Proteins/genetics , DNA-Directed RNA Polymerases/genetics , Escherichia coli/enzymology , Promoter Regions, Genetic , Sigma Factor/genetics , Transcription, Genetic , Bacterial Proteins/metabolism , Conserved Sequence , DNA-Directed RNA Polymerases/metabolism , Gene Expression Regulation, Bacterial , Genes, Bacterial , Gene Homology , Genetic Variation , Sigma Factor/metabolism