Search | BVS Search Portal

1.

RegulonDB v12.0: a comprehensive resource of transcriptional regulation in E. coli K-12.

Salgado, Heladia; Gama-Castro, Socorro; Lara, Paloma; Mejia-Almonte, Citlalli; Alarcón-Carranza, Gabriel; López-Almazo, Andrés G; Betancourt-Figueroa, Felipe; Peña-Loredo, Pablo; Alquicira-Hernández, Shirley; Ledezma-Tejeida, Daniela; Arizmendi-Zagal, Lizeth; Mendez-Hernandez, Francisco; Diaz-Gomez, Ana K; Ochoa-Praxedis, Elizabeth; Muñiz-Rascado, Luis J; García-Sotelo, Jair S; Flores-Gallegos, Fanny A; Gómez, Laura; Bonavides-Martínez, César; Del Moral-Chávez, Víctor M; Hernández-Alvarez, Alfredo J; Santos-Zavaleta, Alberto; Capella-Gutierrez, Salvador; Gelpi, Josep Lluis; Collado-Vides, Julio.

Nucleic Acids Res ; 52(D1): D255-D264, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37971353

ABSTRACT

RegulonDB is a database that contains the most comprehensive corpus of knowledge of the regulation of transcription initiation of Escherichia coli K-12, including data from both classical molecular biology and high-throughput methodologies. Here, we describe biological advances since our last NAR paper of 2019. We explain the changes to satisfy FAIR requirements. We also present a full reconstruction of the RegulonDB computational infrastructure, which has significantly improved data storage, retrieval and accessibility and thus supports a more intuitive and user-friendly experience. The integration of graphical tools provides clear visual representations of genetic regulation data, facilitating data interpretation and knowledge integration. RegulonDB version 12.0 can be accessed at https://regulondb.ccg.unam.mx.

Subjects

Databases, Genetic; Escherichia coli K12; Gene Expression Regulation, Bacterial; Computational Biology/methods; Escherichia coli K12/genetics; Internet; Transcription, Genetic

2.

Programmatic access to bacterial regulatory networks with regutools.

Chávez, Joselyn; Barberena-Jonas, Carmina; Sotelo-Fonseca, Jesus E; Alquicira-Hernández, José; Salgado, Heladia; Collado-Torres, Leonardo; Reyes, Alejandro.

Bioinformatics ; 36(16): 4532-4534, 2020 Aug 15.

Article in English | MEDLINE | ID: mdl-32573705

ABSTRACT

SUMMARY: RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools gives researchers the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks. AVAILABILITY AND IMPLEMENTATION: regutools is an R package available through Bioconductor at bioconductor.org/packages/regutools.

Subjects

Ecosystem; Escherichia coli K12; Computational Biology; Escherichia coli K12/genetics; Gene Regulatory Networks; Software

3.

RegulonDB v 10.5: tackling challenges to unify classic and high throughput knowledge of gene regulation in E. coli K-12.

Santos-Zavaleta, Alberto; Salgado, Heladia; Gama-Castro, Socorro; Sánchez-Pérez, Mishael; Gómez-Romero, Laura; Ledezma-Tejeida, Daniela; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Muñiz-Rascado, Luis José; Peña-Loredo, Pablo; Ishida-Gutiérrez, Cecilia; Velázquez-Ramírez, David A; Del Moral-Chávez, Víctor; Bonavides-Martínez, César; Méndez-Cruz, Carlos-Francisco; Galagan, James; Collado-Vides, Julio.

Nucleic Acids Res ; 47(D1): D212-D220, 2019 Jan 08.

Article in English | MEDLINE | ID: mdl-30395280

ABSTRACT

RegulonDB, first published 20 years ago, is a comprehensive electronic resource about regulation of transcription initiation of Escherichia coli K-12 with decades of knowledge from classic molecular biology experiments, and recently also from high-throughput genomic methodologies. We curated the literature to keep RegulonDB up to date, and initiated curation of ChIP and gSELEX experiments. We estimate that current knowledge describes between 10% and 30% of the expected total number of transcription factor-gene regulatory interactions in E. coli. RegulonDB provides datasets for interactions for which there is no evidence that they affect expression, as well as expression datasets. We developed a proof-of-concept pipeline to merge binding and expression evidence to identify regulatory interactions. These datasets can be visualized in the RegulonDB JBrowse. We developed the Microbial Conditions Ontology with a controlled vocabulary for the minimal properties to reproduce an experiment, which helps integrate data from high-throughput and classic literature. At a higher level of integration, we report Genetic Sensory-Response Units for 200 transcription factors, including their regulation at the metabolic level, and include summaries for 70 of them. Finally, we summarize our research with natural language processing strategies to enhance our biocuration work.

Subjects

Computational Biology/methods; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Genomics; Gene Ontology; Gene Regulatory Networks; Genomics/methods; High-Throughput Nucleotide Sequencing

4.

The transcriptional regulator SsrB is involved in a molecular switch controlling virulence lifestyles of Salmonella.

Pérez-Morales, Deyanira; Banda, María M; Chau, N Y Elizabeth; Salgado, Heladia; Martínez-Flores, Irma; Ibarra, J Antonio; Ilyas, Bushra; Coombes, Brian K; Bustamante, Víctor H.

PLoS Pathog ; 13(7): e1006497, 2017 Jul.

Article in English | MEDLINE | ID: mdl-28704543

ABSTRACT

The evolution of bacterial pathogenicity, heavily influenced by horizontal gene transfer, provides new virulence factors and regulatory connections that alter bacterial phenotypes. Salmonella pathogenicity islands 1 and 2 (SPI-1 and SPI-2) are chromosomal regions that were acquired at different evolutionary times and are essential for Salmonella virulence. In the intestine of mammalian hosts, Salmonella expresses the SPI-1 genes that mediate its invasion of the gut epithelium. Once inside the cells, Salmonella down-regulates the SPI-1 genes and induces the expression of the SPI-2 genes, which favor its intracellular replication. The mechanism by which the invasion machinery is deactivated following successful invasion of host cells is not known. Here, we show that the SPI-2 encoded transcriptional regulator SsrB, which positively controls SPI-2, acts as a dual regulator that represses expression of SPI-1 during intracellular stages of infection. The mechanism of this SPI-1 repression by SsrB is direct and acts upon the hilD and hilA regulatory genes. The phenotypic effect of this molecular switch activity was a significant reduction in invasion ability of S. enterica serovar Typhimurium while promoting the expression of genes required for intracellular survival. During mouse infections, Salmonella mutants lacking SsrB had high levels of hilA (SPI-1) transcriptional activity whereas introducing a constitutively active SsrB led to significant hilA repression. Thus, our results reveal a novel SsrB-mediated mechanism of transcriptional crosstalk between SPI-1 and SPI-2 that helps Salmonella transition to the intracellular lifestyle.

Subjects

Bacterial Proteins/metabolism; Gene Expression Regulation, Bacterial; Salmonella typhimurium/metabolism; Salmonella typhimurium/pathogenicity; Transcription Factors/metabolism; Animals; Bacterial Proteins/genetics; Genomic Islands; Humans; Mice; Salmonella typhimurium/genetics; Transcription Factors/genetics; Virulence

5.

A unified resource for transcriptional regulation in Escherichia coli K-12 incorporating high-throughput-generated binding data into RegulonDB version 10.0.

Santos-Zavaleta, Alberto; Sánchez-Pérez, Mishael; Salgado, Heladia; Velázquez-Ramírez, David A; Gama-Castro, Socorro; Tierrafría, Víctor H; Busby, Stephen J W; Aquino, Patricia; Fang, Xin; Palsson, Bernhard O; Galagan, James E; Collado-Vides, Julio.

BMC Biol ; 16(1): 91, 2018 Aug 16.

Article in English | MEDLINE | ID: mdl-30115066

ABSTRACT

BACKGROUND: Our understanding of the regulation of gene expression has benefited from the availability of high-throughput technologies that interrogate the whole genome for the binding of specific transcription factors and gene expression profiles. In the case of widely used model organisms, such as Escherichia coli K-12, the new knowledge gained from these approaches needs to be integrated with the legacy of accumulated knowledge from genetic and molecular biology experiments conducted in the pre-genomic era in order to attain the deepest level of understanding possible based on the available data. RESULTS: In this paper, we describe an expansion of RegulonDB, the database containing the rich legacy of decades of classic molecular biology experiments supporting what we know about gene regulation and operon organization in E. coli K-12, to include the genome-wide dataset collections from 32 ChIP and 19 gSELEX publications, in addition to around 60 genome-wide expression profiles relevant to the functional significance of these datasets and used in their curation. Three essential features for the integration of this information coming from different methodological approaches are: first, a controlled vocabulary within an ontology for precisely defining growth conditions; second, the criteria to separate elements with enough evidence to consider them involved in gene regulation from isolated transcription factor binding sites without such support; and third, an expanded computational model supporting this knowledge. Altogether, this constitutes the basis for adequately gathering and enabling the comparisons and integration needed to manage and access such wealth of knowledge. CONCLUSIONS: This version 10.0 of RegulonDB is a first step toward what should become the unifying access point for current and future knowledge on gene regulation in E. coli K-12. 
Furthermore, this model platform and associated methodologies and criteria can be emulated for gathering knowledge on other microbial organisms.

Subjects

Databases as Topic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Transcription, Genetic

6.

RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond.

Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Del Moral-Chávez, Víctor; Rinaldi, Fabio; Collado-Vides, Julio.

Nucleic Acids Res ; 44(D1): D133-43, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26527724

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to keeping curation up to date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for 'neighborhood' genes to known operons and regulons, and computational developments.

Subjects

Databases, Genetic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Regulon; Cluster Analysis; Escherichia coli K12/metabolism; Gene Regulatory Networks; Operon; Position-Specific Scoring Matrices; RNA, Small Untranslated/metabolism; Transcription Factors/classification

7.

RegulonDB v8.0: omics data sets, evolutionary conservation, regulatory phrases, cross-validated gold standards and more.

Salgado, Heladia; Peralta-Gil, Martin; Gama-Castro, Socorro; Santos-Zavaleta, Alberto; Muñiz-Rascado, Luis; García-Sotelo, Jair S; Weiss, Verena; Solano-Lira, Hilda; Martínez-Flores, Irma; Medina-Rivera, Alejandra; Salgado-Osorio, Gerardo; Alquicira-Hernández, Shirley; Alquicira-Hernández, Kevin; López-Fuentes, Alejandra; Porrón-Sotelo, Liliana; Huerta, Araceli M; Bonavides-Martínez, César; Balderas-Martínez, Yalbi I; Pannier, Lucia; Olvera, Maricela; Labastida, Aurora; Jiménez-Jacinto, Verónica; Vega-Alvarado, Leticia; Del Moral-Chávez, Victor; Hernández-Alvarez, Alfredo; Morett, Enrique; Collado-Vides, Julio.

Nucleic Acids Res ; 41(Database issue): D203-13, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23203884

ABSTRACT

This article summarizes our progress with RegulonDB (http://regulondb.ccg.unam.mx/) during the past 2 years. We have kept up-to-date the knowledge from the published literature regarding transcriptional regulation in Escherichia coli K-12. We have maintained and expanded our curation efforts to improve the breadth and quality of the encoded experimental knowledge, and we have implemented criteria for the quality of our computational predictions. Regulatory phrases now provide high-level descriptions of regulatory regions. We expanded the assignment of quality to various sources of evidence, particularly for knowledge generated through high-throughput (HT) technology. Based on our analysis of most relevant methods, we defined rules for determining the quality of evidence when multiple independent sources support an entry. With this latest release of RegulonDB, we present a new highly reliable larger collection of transcription start sites, a result of our experimental HT genome-wide efforts. These improvements, together with several novel enhancements (the tracks display, uploading format and curational guidelines), address the challenges of incorporating HT-generated knowledge into RegulonDB. Information on the evolutionary conservation of regulatory elements is also available now. Altogether, RegulonDB version 8.0 is a much better home for integrating knowledge on gene regulation from the sources of information currently available.

Subjects

Databases, Genetic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Regulatory Elements, Transcriptional; Transcription, Genetic; Bacterial Proteins/metabolism; Databases, Genetic/standards; Evolution, Molecular; Genomics; Internet; Promoter Regions, Genetic; Regulon; Repressor Proteins/metabolism; Sequence Analysis, RNA; Transcription Factors/metabolism; Transcription Initiation Site

8.

In silico identification and experimental characterization of regulatory elements controlling the expression of the Salmonella csrB and csrC genes.

Martínez, Luary C; Martínez-Flores, Irma; Salgado, Heladia; Fernández-Mora, Marcos; Medina-Rivera, Alejandra; Puente, José L; Collado-Vides, Julio; Bustamante, Víctor H.

J Bacteriol ; 196(2): 325-36, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24187088

ABSTRACT

The small RNAs CsrB and CsrC of Salmonella indirectly control the expression of numerous genes encoding widespread cellular functions, including virulence. The expression of csrB and csrC genes, which are located in different chromosomal regions, is coordinated by positive transcriptional control mediated by the two-component regulatory system BarA/SirA. Here, we identified by computational analysis an 18-bp inverted repeat (IR) sequence located far upstream from the promoter of Salmonella enterica serovar Typhimurium csrB and csrC genes. Deletion analysis and site-directed mutagenesis of the csrB and csrC regulatory regions revealed that this IR sequence is required for transcriptional activation of both genes. Protein-DNA and protein-protein interaction assays showed that the response regulator SirA specifically binds to the IR sequence and provide evidence that SirA acts as a dimer. Interestingly, whereas the IR sequence was essential for the SirA-mediated expression of csrB, our results revealed that SirA controls the expression of csrC not only by binding to the IR sequence but also by an indirect mode involving the Csr system. Additional computational, biochemical, and genetic analyses demonstrated that the integration host factor (IHF) global regulator positively controls the expression of csrB, but not of csrC, by interacting with a sequence located between the promoter and the SirA-binding site. These findings contribute to the better understanding of the regulatory mechanism controlling the expression of CsrB and CsrC.

Subjects

Gene Expression Regulation, Bacterial; Genes, Bacterial; RNA, Small Untranslated/biosynthesis; Regulatory Elements, Transcriptional; Salmonella typhimurium/genetics; Bacterial Proteins/metabolism; Computational Biology; DNA Mutational Analysis; DNA, Bacterial/genetics; DNA, Bacterial/metabolism; Mutagenesis, Site-Directed; Protein Binding; Protein Multimerization; RNA, Small Untranslated/genetics; Sequence Deletion; Trans-Activators/metabolism

9.

Flexible gold standards for transcription factor regulatory interactions in Escherichia coli K-12: architecture of evidence types.

Lara, Paloma; Gama-Castro, Socorro; Salgado, Heladia; Rioualen, Claire; Tierrafría, Víctor H; Muñiz-Rascado, Luis J; Bonavides-Martínez, César; Collado-Vides, Julio.

Front Genet ; 15: 1353553, 2024.

Article in English | MEDLINE | ID: mdl-38505828

ABSTRACT

Post-genomic implementations have expanded the experimental strategies to identify elements involved in the regulation of transcription initiation. Here, we present for the first time a detailed analysis of the sources of knowledge supporting the collection of transcriptional regulatory interactions (RIs) of Escherichia coli K-12. An RI groups the transcription factor, its effect (positive or negative) and the regulated target (a promoter, a gene or a transcription unit). We improved the evidence codes so that specific methods are incorporated and classified into independent groups. On this basis we updated the computation of confidence levels (weak, strong, or confirmed) for the collection of RIs. These updates enabled us to map the RI set to the current collection of HT TF-binding datasets from ChIP-seq, ChIP-exo, gSELEX and DAP-seq in RegulonDB, enriching in this way the evidence of close to one-quarter (1329) of RIs from the current total 5446 RIs. Based on the new computational capabilities of our improved annotation of evidence sources, we can now analyze the internal architecture of evidence, their categories (experimental, classical, HT, computational), and confidence levels. This is how we know that the joint contribution of HT and computational methods increases the overall fraction of reliable RIs (the sum of confirmed and strong evidence) from 49% to 71%. Thus, the current collection has 3912 reliable RIs, with 2718 or 70% of them with classical evidence which can be used to benchmark novel HT methods. Users can selectively exclude the method they want to benchmark, or keep for instance only the confirmed interactions. The recovery of regulatory sites in RegulonDB by the different HT methods ranges from 33% for ChIP-exo to 76% for ChIP-seq, although, as discussed, many potential confounding factors limit their interpretation.
The collection of improvements reported here provides a solid foundation to incorporate new methods and data, and to further integrate the diverse sources of knowledge of the different components of the transcriptional regulatory network. There is no other genomic database that offers this comprehensive high-quality architecture of knowledge supporting a corpus of transcriptional regulatory interactions.
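
The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the variable and function names are assumptions introduced here, and the only inputs are the four counts stated in the abstract.

```python
# Counts quoted in the abstract (names are illustrative, not RegulonDB fields).
TOTAL_RIS = 5446                # current total of regulatory interactions (RIs)
HT_ENRICHED_RIS = 1329          # RIs whose evidence was enriched by HT datasets
RELIABLE_RIS = 3912             # RIs with confirmed or strong evidence
RELIABLE_WITH_CLASSICAL = 2718  # reliable RIs that have classical evidence


def fraction(part: int, whole: int) -> float:
    """Return part/whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# Share of all RIs enriched by HT evidence ("close to one-quarter").
ht_share = fraction(HT_ENRICHED_RIS, TOTAL_RIS)

# Share of all RIs that count as reliable (reported as 71%).
reliable_share = fraction(RELIABLE_RIS, TOTAL_RIS)

# Share of reliable RIs backed by classical evidence (reported as 70%).
classical_share = fraction(RELIABLE_WITH_CLASSICAL, RELIABLE_RIS)

for label, value in [
    ("HT-enriched / total", ht_share),
    ("reliable / total", reliable_share),
    ("classical / reliable", classical_share),
]:
    print(f"{label}: {value:.1%}")
```

The three shares agree with the abstract's "close to one-quarter", "71%" and "70%" to within about one percentage point, which is what rounding or truncation of the reported figures would produce.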

10.

Theoretical and empirical quality assessment of transcription factor-binding motifs.

Medina-Rivera, Alejandra; Abreu-Goodger, Cei; Thomas-Chollier, Morgane; Salgado, Heladia; Collado-Vides, Julio; van Helden, Jacques.

Nucleic Acids Res ; 39(3): 808-24, 2011 Feb.

Article in English | MEDLINE | ID: mdl-20923783

ABSTRACT

Position-specific scoring matrices (PSSMs) are routinely used to predict transcription factor (TF)-binding sites in genome sequences. However, their reliability to predict novel binding sites can be far from optimum, due to the use of a small number of training sites or the inappropriate choice of parameters when building the matrix or when scanning sequences with it. Measures of matrix quality such as E-value and information content rely on theoretical models, and may fail in the context of full genome sequences. We propose a method, implemented in the program 'matrix-quality', that combines theoretical and empirical score distributions to assess reliability of PSSMs for predicting TF-binding sites. We applied 'matrix-quality' to estimate the predictive capacity of matrices for bacterial, yeast and mouse TFs. The evaluation of matrices from RegulonDB revealed some poorly predictive motifs, and allowed us to quantify the improvements obtained by applying multi-genome motif discovery. Interestingly, the method reveals differences between global and specific regulators. It also highlights the enrichment of binding sites in sequence sets obtained from high-throughput ChIP-chip (bacterial and yeast TFs) and ChIP-seq experiments (mouse TFs). The method presented here has many applications, including: selecting reliable motifs before scanning sequences; improving motif collections in TF databases; evaluating motifs discovered using high-throughput data sets.
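
To make the idea of scanning a sequence with a PSSM and inspecting the resulting empirical score distribution concrete, here is a minimal Python sketch. It is not the 'matrix-quality' program and does not reproduce its theoretical distribution; the log-odds construction, the pseudocount, the uniform background, the function names and the toy sites are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def build_pssm(sites, background=None, pseudocount=1.0):
    """Build a log-odds matrix (natural log) from equal-length aligned sites."""
    if background is None:
        background = {b: 0.25 for b in BASES}  # assumed uniform background
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    pssm = []
    for pos in range(width):
        column = [s[pos] for s in sites]
        row = {}
        for b in BASES:
            # Pseudocount is distributed according to the background model.
            freq = (column.count(b) + pseudocount * background[b]) / (
                len(sites) + pseudocount
            )
            row[b] = math.log(freq / background[b])
        pssm.append(row)
    return pssm


def score(pssm, segment):
    """Sum the per-position log-odds weights for one segment."""
    return sum(row[base] for row, base in zip(pssm, segment))


def scan(pssm, sequence):
    """Score every window of the matrix width along the sequence."""
    width = len(pssm)
    return [
        score(pssm, sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    ]


def empirical_tail(scores, threshold):
    """Fraction of scanned windows scoring at or above the threshold."""
    if not scores:
        raise ValueError("no scores to summarize")
    return sum(s >= threshold for s in scores) / len(scores)


# Toy, made-up training sites and sequence (not taken from RegulonDB).
toy_sites = ["TGACA", "TGACA", "TGATA", "TGACG"]
toy_pssm = build_pssm(toy_sites)
toy_scores = scan(toy_pssm, "AAATGACATTT")
best_position = max(range(len(toy_scores)), key=toy_scores.__getitem__)
```

In this toy run the best-scoring window is the one that matches the consensus of the training sites. On real data, the empirical tail fractions computed over a genome would be compared against the distribution expected under a background model, which is the comparison the abstract describes.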

Subjects

Position-Specific Scoring Matrices; Promoter Regions, Genetic; Sequence Analysis, DNA; Transcription Factors/metabolism; Animals; Bacterial Proteins/metabolism; Binding Sites; Chromatin Immunoprecipitation; Genomics; Mice; Oligonucleotide Array Sequence Analysis; ROC Curve; Repressor Proteins/metabolism; Serine Endopeptidases/metabolism; Software

11.

RegulonDB version 7.0: transcriptional regulation of Escherichia coli K-12 integrated within genetic sensory response units (Gensor Units).

Gama-Castro, Socorro; Salgado, Heladia; Peralta-Gil, Martin; Santos-Zavaleta, Alberto; Muñiz-Rascado, Luis; Solano-Lira, Hilda; Jimenez-Jacinto, Verónica; Weiss, Verena; García-Sotelo, Jair S; López-Fuentes, Alejandra; Porrón-Sotelo, Liliana; Alquicira-Hernández, Shirley; Medina-Rivera, Alejandra; Martínez-Flores, Irma; Alquicira-Hernández, Kevin; Martínez-Adame, Ruth; Bonavides-Martínez, César; Miranda-Ríos, Juan; Huerta, Araceli M; Mendoza-Vargas, Alfredo; Collado-Torres, Leonardo; Taboada, Blanca; Vega-Alvarado, Leticia; Olvera, Maricela; Olvera, Leticia; Grande, Ricardo; Morett, Enrique; Collado-Vides, Julio.

Nucleic Acids Res ; 39(Database issue): D98-105, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21051347

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx/) is the primary reference database of the best-known regulatory network of any free-living organism, that of Escherichia coli K-12. The major conceptual change since 3 years ago is an expanded biological context so that transcriptional regulation is now part of a unit that initiates with the signal and continues with the signal transduction to the core of regulation, modifying expression of the affected target genes responsible for the response. We call these genetic sensory response units, or Gensor Units. We have initiated their high-level curation, with graphic maps and superreactions with links to other databases. Additional connectivity uses expandable submaps. RegulonDB has summaries for every transcription factor (TF) and TF-binding sites with internal symmetry. Several DNA-binding motifs and their sizes have been redefined and relocated. In addition to data from the literature, we have incorporated our own information on transcription start sites (TSSs) and transcriptional units (TUs), obtained by using high-throughput whole-genome sequencing technologies. A new portable drawing tool for genomic features is also now available, as well as new ways to download the data, including web services, files for several relational database manager systems and text files including BioPAX format.

Subjects

Databases, Genetic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Gene Regulatory Networks; Transcription Factors/metabolism; Binding Sites; Escherichia coli K12/metabolism; Signal Transduction; Systems Integration; Transcription Initiation Site; Transcription, Genetic

12.

A Gold Standard for Transcription Factor Regulatory Interactions in Escherichia coli K-12: Architecture of Evidence Types.

Lara, Paloma; Gama-Castro, Socorro; Salgado, Heladia; Rioualen, Claire; Tierrafría, Víctor H; Muñiz-Rascado, Luis J; Bonavides-Martínez, César; Collado-Vides, Julio.

bioRxiv ; 2023 Dec 11.

Article in English | MEDLINE | ID: mdl-37163020

ABSTRACT

Post-genomic implementations have expanded the experimental strategies to identify elements involved in the regulation of transcription initiation. As new methodologies emerge, a natural step is to compare their results with those from established methodologies, such as the classic methods of molecular biology used to characterize transcription factor binding sites, promoters, or transcription units. In the case of Escherichia coli K-12, the best-studied microorganism, for the last 30 years we have continuously gathered such knowledge from original scientific publications, and have organized it in two databases, RegulonDB and EcoCyc. Furthermore, since RegulonDB version 11.0 (1), we offer comprehensive datasets of binding sites from chromatin immunoprecipitation combined with sequencing (ChIP-seq), ChIP combined with exonuclease digestion and next-generation sequencing (ChIP-exo), genomic SELEX screening (gSELEX), and DNA affinity purification sequencing (DAP-seq) HT technologies, as well as additional datasets for transcription start sites, transcription units and RNA sequencing (RNA-seq) expression profiles. Here, we present for the first time an analysis of the sources of knowledge supporting the collection of transcriptional regulatory interactions (RIs) of E. coli K-12. An RI is formed by the transcription factor, its positive or negative effect on a promoter, a gene or transcription unit. We improved the evidence codes so that the specific methods are described, and we classified them into seven independent groups. This is the basis for our updated computation of confidence levels, weak, strong, or confirmed, for the collection of RIs. We compare the confidence levels of the RI collection before and after adding HT evidence illustrating how knowledge will change as more HT data and methods appear in the future. 
Users can generate subsets filtering out the method they want to benchmark and avoid circularity, or keep for instance only the confirmed interactions. The comparison of different HT methods with the available datasets indicates that ChIP-seq recovers the highest fraction (>70%) of binding sites present in RegulonDB followed by gSELEX, DAP-seq and ChIP-exo. There is no other genomic database that offers this comprehensive high-quality anatomy of evidence supporting a corpus of transcriptional regulatory interactions.
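
The subset generation described above can be sketched as a simple filter over RI records. The Python example below is a hypothetical illustration: the record layout, the method labels, the function names and the toy records are assumptions made here and do not reflect RegulonDB's actual schema or download format. The sketch also leaves the stored confidence level untouched, whereas a faithful implementation would presumably recompute it once a method's evidence is removed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RegulatoryInteraction:
    """Hypothetical RI record: TF, target, effect, evidence methods, confidence."""
    tf: str
    target: str
    effect: str          # "+" or "-"
    methods: frozenset   # evidence method labels (illustrative)
    confidence: str      # "weak", "strong" or "confirmed"


def without_method(ris, method):
    """Remove one method's evidence; drop RIs that are left with no evidence."""
    kept = []
    for ri in ris:
        remaining = ri.methods - {method}
        if remaining:
            # Confidence is NOT recomputed in this sketch.
            kept.append(replace(ri, methods=remaining))
    return kept


def only_confirmed(ris):
    """Keep only the RIs labelled as confirmed."""
    return [ri for ri in ris if ri.confidence == "confirmed"]


# Toy, made-up records with placeholder names.
toy_ris = [
    RegulatoryInteraction("TF_A", "gene1", "+",
                          frozenset({"ChIP-seq", "footprinting"}), "confirmed"),
    RegulatoryInteraction("TF_A", "gene2", "-",
                          frozenset({"ChIP-seq"}), "weak"),
    RegulatoryInteraction("TF_B", "gene3", "+",
                          frozenset({"gSELEX", "footprinting"}), "strong"),
]

# Benchmark set for ChIP-seq: no RI in it relies on ChIP-seq evidence.
benchmark_set = without_method(toy_ris, "ChIP-seq")
confirmed_set = only_confirmed(toy_ris)
```

Removing the evidence of the method under evaluation before building the reference set is what avoids circularity: the method is then never scored against interactions that it contributed itself.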

13.

RegulonDB 11.0: Comprehensive high-throughput datasets on transcriptional regulation in Escherichia coli K-12.

Tierrafría, Víctor H; Rioualen, Claire; Salgado, Heladia; Lara, Paloma; Gama-Castro, Socorro; Lally, Patrick; Gómez-Romero, Laura; Peña-Loredo, Pablo; López-Almazo, Andrés G; Alarcón-Carranza, Gabriel; Betancourt-Figueroa, Felipe; Alquicira-Hernández, Shirley; Polanco-Morelos, J Enrique; García-Sotelo, Jair; Gaytan-Nuñez, Estefani; Méndez-Cruz, Carlos-Francisco; Muñiz, Luis J; Bonavides-Martínez, César; Moreno-Hagelsieb, Gabriel; Galagan, James E; Wade, Joseph T; Collado-Vides, Julio.

Microb Genom ; 8(5), 2022 May.

Article in English | MEDLINE | ID: mdl-35584008

ABSTRACT

Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus, or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high and low-throughput data on transcriptional regulation in Escherichia coli K-12 to easily use both as whole datasets, or as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites, transcription units, as well as transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, besides expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors, as well as data uniformly processed in-house, enhancing their comparability, as well as the traceability of the methods and reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata. 
A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy applying natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of this wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12.

Subjects

Escherichia coli K12; Escherichia coli; Escherichia coli/genetics; Escherichia coli K12/genetics; Escherichia coli K12/metabolism; Gene Expression Regulation, Bacterial; Operon/genetics; Reproducibility of Results

14.

Transcriptional regulation shapes the organization of genes on bacterial chromosomes.

Janga, Sarath Chandra; Salgado, Heladia; Martínez-Antonio, Agustino.

Nucleic Acids Res ; 37(11): 3680-8, 2009 Jun.

Article in English | MEDLINE | ID: mdl-19372274

ABSTRACT

Transcription factors (TFs) are the key elements responsible for controlling the expression of genes in bacterial genomes and when visualized on a genomic scale form a dense network of transcriptional interactions among themselves and with other protein-coding genes. Although the structure of transcriptional regulatory networks (TRNs) is well understood, it is not clear what constraints govern them. Here, we explore this question using the TRNs of model prokaryotes and provide a link between the transcriptional hierarchy of regulons and their genome organization. We show that, to drive the kinetics and concentration gradients, TFs belonging to big and small regulons, depending on the number of genes they regulate, organize themselves differently on the genome with respect to their targets. We then propose a conceptual model that can explain how the hierarchical structure of TRNs might be ultimately governed by the dynamic biophysical requirements for targeting DNA-binding sites by TFs. Our results suggest that the main parameters defining the position of a TF in the network hierarchy are the number and chromosomal distances of the genes they regulate and their protein concentration gradients. These observations give insights into how the hierarchical structure of transcriptional networks can be encoded on the chromosome to drive the kinetics and concentration gradients of TFs depending on the number of genes they regulate and could be a common theme valid for other prokaryotes, proposing the role of transcriptional regulation in shaping the organization of genes on a chromosome.

Subjects

Bacterial Chromosomes , Bacterial Gene Expression Regulation , Gene Regulatory Networks , Genetic Models , Genetic Transcription , Binding Sites , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Order , Bacterial Genes , Bacterial Genome , Messenger RNA/metabolism , Regulon , Transcription Factors/genetics , Transcription Factors/metabolism

15.

Lisen&Curate: A platform to facilitate gathering textual evidence for curation of regulation of transcription initiation in bacteria.

Díaz-Rodríguez, Martín; Lithgow-Serrano, Oscar; Guadarrama-García, Francisco; Tierrafría, Víctor H; Gama-Castro, Socorro; Solano-Lira, Hilda; Salgado, Heladia; Rinaldi, Fabio; Méndez-Cruz, Carlos-Francisco; Collado-Vides, Julio.

Biochim Biophys Acta Gene Regul Mech ; 1864(11-12): 194753, 2021.

Article in English | MEDLINE | ID: mdl-34461312

ABSTRACT

The number of published papers in biomedical research makes it nearly impossible for a researcher to keep up to date. This is where manually curated databases contribute by facilitating access to knowledge. However, the structure required by databases strongly limits the type of valuable information that can be incorporated. Here, we present Lisen&Curate, a curation system that facilitates linking sentences or parts of sentences (both considered sources) in articles with their corresponding curated objects, so that rich additional information on these objects is easily available to users. These sources will be offered both within RegulonDB and in a new database, L-Regulon. To show the relevance of our work, two senior curators performed a curation of 31 articles on the regulation of transcription initiation of E. coli using Lisen&Curate. As a result, 194 objects were curated and 781 sources were recorded. We also found that these sources are useful to develop automatic approaches to detect objects in articles by observing word frequency patterns and by carrying out an open information extraction task. Sources may help to elaborate a controlled vocabulary of experimental methods. Finally, we discuss our ecosystem of interconnected applications, RegulonDB, L-Regulon, and Lisen&Curate, to facilitate access to knowledge on regulation of transcription initiation in bacteria. We see our proposal as the starting point to change the way experimentalists connect a piece of knowledge with its evidence using RegulonDB.

Subjects

Data Curation/methods , Genetic Databases , Bacterial Gene Expression Regulation , Genetic Transcription Initiation , Escherichia coli/genetics

16.

RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation.

Gama-Castro, Socorro; Jiménez-Jacinto, Verónica; Peralta-Gil, Martín; Santos-Zavaleta, Alberto; Peñaloza-Spinola, Mónica I; Contreras-Moreira, Bruno; Segura-Salazar, Juan; Muñiz-Rascado, Luis; Martínez-Flores, Irma; Salgado, Heladia; Bonavides-Martínez, César; Abreu-Goodger, Cei; Rodríguez-Penagos, Carlos; Miranda-Ríos, Juan; Morett, Enrique; Merino, Enrique; Huerta, Araceli M; Treviño-Quintanilla, Luis; Collado-Vides, Julio.

Nucleic Acids Res ; 36(Database issue): D120-4, 2008 Jan.

Article in English | MEDLINE | ID: mdl-18158297

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx/) is the primary reference database offering curated knowledge of the transcriptional regulatory network of Escherichia coli K12, currently the best-known electronically encoded database of the genetic regulatory network of any free-living organism. This paper summarizes the improvements, new biology and new features available in version 6.0. Curation of original literature is, from now on, up to date for every new release. All the objects are supported by their corresponding evidence, now classified as strong or weak. Transcription factors are classified by origin of their effectors and by gene ontology class. We now have computational predictions for sigma(54) and five different promoter types of the sigma(70) family, as well as their corresponding -10 and -35 boxes. In addition to those curated from the literature, we added about 300 experimentally mapped promoters coming from our own high-throughput mapping efforts. RegulonDB v.6.0 now expands beyond transcription initiation, including RNA regulatory elements, specifically riboswitches, attenuators and small RNAs, with their known associated targets. The data can be accessed through overviews of correlations about gene regulation. The original literature associated with RegulonDB, together with more than 4000 curation notes, can now be searched with the Textpresso text mining engine.

Subjects

Genetic Databases , Escherichia coli K12/genetics , Bacterial Gene Expression Regulation , Gene Regulatory Networks , Computational Biology , Internet , Genetic Models , Genetic Promoter Regions , Ribonucleic Acid Regulatory Sequences , Regulon , Sigma Factor/metabolism , Software , Transcription Factors/metabolism , Transcription Initiation Site , Genetic Transcription

17.

Coordination logic of the sensing machinery in the transcriptional regulatory network of Escherichia coli.

Janga, Sarath Chandra; Salgado, Heladia; Martínez-Antonio, Agustino; Collado-Vides, Julio.

Nucleic Acids Res ; 35(20): 6963-72, 2007.

Article in English | MEDLINE | ID: mdl-17933780

ABSTRACT

The active and inactive states of transcription factors in growing cells are usually directed by allosteric physicochemical signals or metabolites, which are in turn either produced in the cell or obtained from the environment by the activity of the products of effector genes. To understand the regulatory dynamics and to improve our knowledge about how transcription factors (TFs) respond to endogenous and exogenous signals in the bacterial model, Escherichia coli, we previously proposed to classify TFs into external, internal and hybrid sensing classes depending on the source of their allosteric or equivalent metabolite. Here we analyze how a cell uses its topological structures in the context of sensing machinery and show that, while feed forward loops (FFLs) tightly integrate internal and external sensing TFs connecting TFs from different layers of the hierarchical transcriptional regulatory network (TRN), bifan motifs frequently connect TFs belonging to the same sensing class and could act as a bridge between TFs originating from the same level in the hierarchy. We observe that modules identified in the regulatory network of E. coli are heterogeneous in sensing context with a clear combination of internal and external sensing categories depending on the physiological role played by the module. We also note that the propensity of two-component response regulators increases at promoters as the number of TFs regulating a target operon increases. Finally, we show that evolutionary families of TFs do not show a tendency to preserve their sensing abilities. Our results provide a detailed panorama of the topological structures of the E. coli TRN and of the way the TFs that compose them sense their surroundings by coordinating responses.

Subjects

Escherichia coli/genetics , Bacterial Gene Expression Regulation , Gene Regulatory Networks , Escherichia coli Proteins/genetics , Molecular Evolution , Operon , Transcription Factors/genetics

18.

Immunity related genes in dipterans share common enrichment of AT-rich motifs in their 5' regulatory regions that are potentially involved in nucleosome formation.

Hernandez-Romano, Jesus; Carlos-Rivera, Francisco J; Salgado, Heladia; Lamadrid-Figueroa, Hector; Valverde-Garduño, Veronica; Rodriguez, Mario H; Martinez-Barnetche, Jesus.

BMC Genomics ; 9: 326, 2008 Jul 09.

Article in English | MEDLINE | ID: mdl-18613977

ABSTRACT

BACKGROUND: Understanding the transcriptional regulation mechanisms in response to environmental challenges is of fundamental importance in biology. Transcription factors associated with response elements and the chromatin structure have proven to play important roles in gene expression regulation. We have analyzed promoter regions of dipteran genes induced in response to immune challenge, in search of particular sequence patterns involved in their transcriptional regulation. RESULTS: 5' upstream regions of D. melanogaster and A. gambiae immunity-induced genes and their corresponding orthologous genes in 11 non-melanogaster drosophilid species and Ae. aegypti share enrichment in AT-rich short motifs. AT-rich motifs are associated with nucleosome formation as predicted by two different algorithms. In A. gambiae and D. melanogaster, the 5' upstream sequences of many immunity genes also showed NFkappaB response elements, located within 500 bp from the transcription start site. In A. gambiae, the frequency of the ATAA motif near the NFkappaB response elements was increased, suggesting a functional link between nucleosome formation/remodelling and NFkappaB regulation of transcription. CONCLUSION: AT-rich motif enrichment in 5' upstream sequences in A. gambiae, Ae. aegypti and the Drosophila genus immunity genes suggests a particular pattern of nucleosome formation/chromatin organization. The co-occurrence of such motifs with the NFkappaB response elements suggests that these sequence signatures may be functionally involved in transcriptional activation during dipteran immune response. AT-rich motif enrichment in regulatory regions in this group of co-regulated genes could represent an evolutionarily constrained signature in dipterans and perhaps other distantly related species.

Subjects

AT Rich Sequence , Aedes/genetics , Anopheles/genetics , Drosophila melanogaster/genetics , Insect Genes , Immunity/genetics , Nucleosomes/genetics , 5' Flanking Region/genetics , Aedes/immunology , Animals , Anopheles/immunology , Nucleic Acid Databases , Drosophila melanogaster/immunology , Gene Expression Profiling , Gene Expression Regulation , NF-kappa B/genetics , Nucleosomes/immunology , Oligonucleotide Array Sequence Analysis , Response Elements/genetics , DNA Sequence Analysis , Genetic Transcription

19.

Internal versus external effector and transcription factor gene pairs differ in their relative chromosomal position in Escherichia coli.

Janga, Sarath Chandra; Salgado, Heladia; Collado-Vides, Julio; Martínez-Antonio, Agustino.

J Mol Biol ; 368(1): 263-72, 2007 Apr 20.

Article in English | MEDLINE | ID: mdl-17321548

ABSTRACT

Transcription factors (TFs) play an important role in the genetic regulation of transcription in response to internal and external cellular stimuli. However, little is known about their functional and dynamic aspects on a large scale, even in a well-studied bacterium like Escherichia coli. To understand the regulatory dynamics and to improve our knowledge about how TFs respond to endogenous and exogenous signals in this simple bacterium model, we previously proposed that TFs can be classified into three classes, depending on how they sense their allosteric or equivalent metabolite: external class, internal class, and hybrid sensing class. Classification of these groups was done without considering the relative chromosomal positions of the TFs and their corresponding effector genes. Here, we analyze the genome organization of the genetic components of these sensing systems, using the classification described earlier. We report the chromosomal proximity of transcription factors and their effector genes to sense periplasmic signals or transported metabolites (i.e. transcriptional sensing systems from the external class) in contrast to the components for sensing internally synthesized metabolites, which tend to be distant on the chromosome. We strengthen our finding that external sensing genetic machinery behaves like chromosomal modules of regulation to respond rapidly to variations in external conditions through co-expression of their genetic components, which is corroborated with microarray data for E. coli. Furthermore, we show several lines of evidence supporting the need for the coordinated activity of external sensing systems in contrast to that of internal sensing machinery, which can explain their close chromosomal organization. 
The observed functional correlation between the chromosomal organization and the genetic machinery for environmental sensing should contribute to our understanding of the logical functioning and evolution of the transcriptional regulatory networks in bacteria.

Subjects

Bacterial Chromosomes , Escherichia coli/genetics , Bacterial Gene Expression Regulation , Gene Regulatory Networks/physiology , Bacterial Genes , Transcription Factors/genetics , Chromosome Mapping , Escherichia coli/enzymology , Escherichia coli Proteins/genetics , Gene Order , Biological Models , Sensation/genetics , Genetic Transcription

20.

RegulonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions.

Salgado, Heladia; Gama-Castro, Socorro; Peralta-Gil, Martín; Díaz-Peredo, Edgar; Sánchez-Solano, Fabiola; Santos-Zavaleta, Alberto; Martínez-Flores, Irma; Jiménez-Jacinto, Verónica; Bonavides-Martínez, César; Segura-Salazar, Juan; Martínez-Antonio, Agustino; Collado-Vides, Julio.

Nucleic Acids Res ; 34(Database issue): D394-7, 2006 Jan 01.

Article in English | MEDLINE | ID: mdl-16381895

ABSTRACT

RegulonDB is the internationally recognized reference database of Escherichia coli K-12 offering curated knowledge of the regulatory network and operon organization. It is currently the largest electronically encoded database of the regulatory network of any free-living organism. We present here the recently launched RegulonDB version 5.0, which is radically different in content, interface design and capabilities. Continuous curation of original scientific literature provides the evidence behind every single object and feature. This knowledge is complemented with comprehensive computational predictions across the complete genome. Literature-based and predicted data are clearly distinguished in the database. Starting with this version, RegulonDB public releases are synchronized with those of EcoCyc since our curation supports both databases. The complex biology of regulation is simplified in a navigation scheme based on three major streams: genes, operons and regulons. Regulatory knowledge is directly available in every navigation step. Displays combine graphic and textual information and are organized to allow different levels of detail and biological context. This knowledge is the backbone of an integrated system for the graphic display of the network, graphic and tabular microarray comparisons with curated and predicted objects, as well as predictions across bacterial genomes, and predicted networks of functionally related gene products. Access RegulonDB at http://regulondb.ccg.unam.mx.

Subjects

Genetic Databases , Escherichia coli K12/genetics , Bacterial Gene Expression Regulation , Operon , Regulon , Escherichia coli K12/growth & development , Bacterial Genome , Internet , Software , Genetic Transcription , User-Computer Interface
