1.
bioRxiv ; 2024 May 24.
Article in English | MEDLINE | ID: mdl-38826350

ABSTRACT

The DNA binding of most Escherichia coli Transcription Factors (TFs) has not been comprehensively mapped, and few have models that can quantitatively predict binding affinity. We report the global mapping of in vivo DNA binding for 139 E. coli TFs using ChIP-Seq. We used these data to train BoltzNet, a novel neural network that predicts TF binding energy from DNA sequence. BoltzNet mirrors a quantitative biophysical model and provides directly interpretable predictions genome-wide at nucleotide resolution. We used BoltzNet to quantitatively design novel binding sites, which we validated with biophysical experiments on purified protein. We have generated models for 125 TFs that provide insight into global features of TF binding, including clustering of sites, the role of accessory bases, the relevance of weak sites, and the background affinity of the genome. Our paper provides new paradigms for studying TF-DNA binding and for the development of biophysically motivated neural networks.
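
The idea of predicting TF binding energy from DNA sequence can be illustrated with a textbook two-state Boltzmann model. The sketch below is not BoltzNet and does not reproduce its architecture or parameters: the energy matrix, the 4-bp site width, and the function names `binding_energy`, `occupancy`, and `scan` are all assumptions made up for illustration. It only shows how an additive per-base energy can be turned into a binding probability and scanned along a sequence.

```python
import math

# Hypothetical per-position binding energies in units of kT for a 4-bp site.
# These numbers are made up for illustration; they are not BoltzNet parameters.
ENERGY_MATRIX = [
    {"A": 0.0, "C": 2.0, "G": 2.0, "T": 1.5},
    {"A": 1.0, "C": 2.5, "G": 2.5, "T": 0.0},
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
    {"A": 1.5, "C": 0.0, "G": 2.0, "T": 1.0},
]


def binding_energy(site):
    """Sum independent per-base energies; lower means tighter binding."""
    if len(site) != len(ENERGY_MATRIX):
        raise ValueError("site length must match the energy matrix width")
    return sum(column[base] for column, base in zip(ENERGY_MATRIX, site))


def occupancy(energy, chemical_potential=0.0):
    """Equilibrium probability that a site is bound (two-state Boltzmann model)."""
    return 1.0 / (1.0 + math.exp(energy - chemical_potential))


def scan(sequence):
    """Return (offset, energy) for every window of the sequence."""
    width = len(ENERGY_MATRIX)
    return [
        (i, binding_energy(sequence[i:i + width]))
        for i in range(len(sequence) - width + 1)
    ]


if __name__ == "__main__":
    for offset, energy in scan("AATGCA"):
        print(offset, energy, round(occupancy(energy), 3))
```

In this toy model the chemical potential stands in for TF concentration: raising it increases occupancy at every site, which is one way to see why weak sites can become relevant at high TF levels.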

2.
Front Genet ; 15: 1353553, 2024.
Article in English | MEDLINE | ID: mdl-38505828

ABSTRACT

Post-genomic implementations have expanded the experimental strategies to identify elements involved in the regulation of transcription initiation. Here, we present for the first time a detailed analysis of the sources of knowledge supporting the collection of transcriptional regulatory interactions (RIs) of Escherichia coli K-12. An RI groups the transcription factor, its effect (positive or negative), and the regulated target: a promoter, a gene, or a transcription unit. We improved the evidence codes so that specific methods are incorporated and classified into independent groups. On this basis we updated the computation of confidence levels (weak, strong, or confirmed) for the collection of RIs. These updates enabled us to map the RI set to the current collection of high-throughput (HT) TF-binding datasets from ChIP-seq, ChIP-exo, gSELEX and DAP-seq in RegulonDB, thereby enriching the evidence of close to one-quarter (1329) of the current total of 5446 RIs. Based on the new computational capabilities of our improved annotation of evidence sources, we can now analyze the internal architecture of evidence, its categories (experimental, classical, HT, computational), and confidence levels. This analysis shows that the joint contribution of HT and computational methods increases the overall fraction of reliable RIs (the sum of confirmed and strong evidence) from 49% to 71%. Thus, the current collection has 3912 reliable RIs, 2718 (70%) of which have classical evidence and can be used to benchmark novel HT methods. Users can selectively exclude the method they want to benchmark, or keep, for instance, only the confirmed interactions. The recovery of regulatory sites in RegulonDB by the different HT methods ranges from 33% for ChIP-exo to 76% for ChIP-seq, although, as discussed, many potential confounding factors limit the interpretation of these figures. The collection of improvements reported here provides a solid foundation to incorporate new methods and data, and to further integrate the diverse sources of knowledge of the different components of the transcriptional regulatory network. There is no other genomic database that offers this comprehensive high-quality architecture of knowledge supporting a corpus of transcriptional regulatory interactions.
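
The proportions quoted in this abstract can be re-derived from the counts it gives. The short check below uses only the four numbers stated above; the variable and function names are illustrative. The computed shares are about 24.4%, 71.8%, and 69.5%, which suggests the abstract's "71%" is truncated rather than rounded, while "70%" and "close to one-quarter" match after rounding.

```python
# Counts quoted in the abstract above.
TOTAL_RIS = 5446               # all regulatory interactions (RIs)
HT_ENRICHED_RIS = 1329         # RIs with added high-throughput evidence
RELIABLE_RIS = 3912            # RIs with confirmed or strong evidence
RELIABLE_CLASSICAL_RIS = 2718  # reliable RIs that have classical evidence


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


ht_share = percent(HT_ENRICHED_RIS, TOTAL_RIS)
reliable_share = percent(RELIABLE_RIS, TOTAL_RIS)
classical_share = percent(RELIABLE_CLASSICAL_RIS, RELIABLE_RIS)

if __name__ == "__main__":
    print("HT-enriched share of all RIs:", ht_share)
    print("Reliable share of all RIs:", reliable_share)
    print("Classical share of reliable RIs:", classical_share)
```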

3.
EcoSal Plus ; 11(1): eesp00022023, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-37220074

ABSTRACT

EcoCyc is a bioinformatics database available online at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists and for biologists who work with related microorganisms. The database includes information pages on each E. coli gene product, metabolite, reaction, operon, and metabolic pathway. The database also includes information on the regulation of gene expression, E. coli gene essentiality, and nutrient conditions that do or do not support the growth of E. coli. The website and downloadable software contain tools for the analysis of high-throughput data sets. In addition, a steady-state metabolic flux model is generated from each new version of EcoCyc and can be executed online. The model can predict metabolic flux rates, nutrient uptake rates, and growth rates for different gene knockouts and nutrient conditions. Data generated from a whole-cell model that is parameterized from the latest data on EcoCyc are also available. This review outlines the data content of EcoCyc and of the procedures by which this content is generated.


Subject(s)
Escherichia coli K12 ; Escherichia coli Proteins ; Escherichia coli/genetics ; Escherichia coli/metabolism ; Escherichia coli K12/genetics ; Databases, Genetic ; Software ; Computational Biology ; Escherichia coli Proteins/metabolism
4.
bioRxiv ; 2023 Dec 11.
Article in English | MEDLINE | ID: mdl-37163020

ABSTRACT

Post-genomic implementations have expanded the experimental strategies to identify elements involved in the regulation of transcription initiation. As new methodologies emerge, a natural step is to compare their results with those from established methodologies, such as the classic methods of molecular biology used to characterize transcription factor binding sites, promoters, or transcription units. In the case of Escherichia coli K-12, the best-studied microorganism, for the last 30 years we have continuously gathered such knowledge from original scientific publications and have organized it in two databases, RegulonDB and EcoCyc. Furthermore, since RegulonDB version 11.0 (1), we offer comprehensive datasets of binding sites from high-throughput (HT) technologies: chromatin immunoprecipitation combined with sequencing (ChIP-seq), ChIP combined with exonuclease digestion and next-generation sequencing (ChIP-exo), genomic SELEX screening (gSELEX), and DNA affinity purification sequencing (DAP-seq), as well as additional datasets for transcription start sites, transcription units and RNA sequencing (RNA-seq) expression profiles. Here, we present for the first time an analysis of the sources of knowledge supporting the collection of transcriptional regulatory interactions (RIs) of E. coli K-12. An RI is formed by the transcription factor, its effect (positive or negative), and its target: a promoter, a gene, or a transcription unit. We improved the evidence codes so that the specific methods are described, and we classified them into seven independent groups. This is the basis for our updated computation of confidence levels (weak, strong, or confirmed) for the collection of RIs. We compare the confidence levels of the RI collection before and after adding HT evidence, illustrating how knowledge will change as more HT data and methods appear in the future. Users can generate subsets by filtering out the method they want to benchmark, thereby avoiding circularity, or keep, for instance, only the confirmed interactions. The comparison of different HT methods with the available datasets indicates that ChIP-seq recovers the highest fraction (>70%) of binding sites present in RegulonDB, followed by gSELEX, DAP-seq and ChIP-exo. There is no other genomic database that offers this comprehensive high-quality anatomy of evidence supporting a corpus of transcriptional regulatory interactions.

5.
Microb Genom ; 8(5), 2022 05.
Article in English | MEDLINE | ID: mdl-35584008

ABSTRACT

Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high- and low-throughput data on transcriptional regulation in Escherichia coli K-12 in a form that is easy to use both as whole datasets and as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites and transcription units, transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, and expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors and data uniformly processed in-house, enhancing their comparability as well as the traceability of the methods and the reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata. A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy applying natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of such a wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12.


Subject(s)
Escherichia coli K12 ; Escherichia coli ; Escherichia coli/genetics ; Escherichia coli K12/genetics ; Escherichia coli K12/metabolism ; Gene Expression Regulation, Bacterial ; Operon/genetics ; Reproducibility of Results
6.
Front Bioeng Biotechnol ; 10: 823240, 2022.
Article in English | MEDLINE | ID: mdl-35237580

ABSTRACT

In free-living bacteria, the ability to regulate gene expression is at the core of adapting to and interacting with the environment. For these systems to have a logic, a signal must trigger a genetic change that helps the cell deal with what the signal's presence in the environment implies; briefly, the response is expected to include feedback on the signal. Thus, it makes sense to think of genetic sensory mechanisms of gene regulation. Escherichia coli K-12 is the model bacterium for which the largest number of regulatory systems and their sensing capabilities have been studied in detail at the molecular level. In this special issue focused on biomolecular sensing systems, we offer an overview of the transcriptional regulatory corpus of knowledge for E. coli that has been gathered in our database, RegulonDB, from the perspective of sensing regulatory systems. Thus, we start with the beginning of the information flux, which is the signal: chemical or physical elements detected by the cell as changes in the environment; these signals are internally transduced to transcription factors and alter their conformation. Signals transduced to effectors bind allosterically to transcription factors, and this defines the dominant sensing mechanism in E. coli. We offer an updated list of the repertoire of known allosteric effectors, as well as a list of the currently known different mechanisms of this sensing capability. Our previous definition of elementary genetic sensory-response units, GENSOR units for short, which integrate signals, transport, gene regulation, and the biochemical response of the regulated gene products of a given transcription factor, fits perfectly with the purpose of this overview. We summarize the functional heterogeneity of their response, based on our updated collection of GENSORs, and we use them to identify the expected feedback as part of their response. Finally, we address the question of multiple sensing in the regulatory network of E. coli. This overview introduces the architecture of sensing and regulation of native components in E. coli K-12, which might be a source of inspiration for bioengineering applications.

7.
Biochim Biophys Acta Gene Regul Mech ; 1864(11-12): 194753, 2021.
Article in English | MEDLINE | ID: mdl-34461312

ABSTRACT

The number of published papers in biomedical research makes it nearly impossible for a researcher to keep up to date. This is where manually curated databases contribute by facilitating access to knowledge. However, the structure required by databases strongly limits the type of valuable information that can be incorporated. Here, we present Lisen&Curate, a curation system that facilitates linking sentences or parts of sentences (both considered sources) in articles with their corresponding curated objects, so that rich additional information on these objects is easily available to users. These sources will be offered both within RegulonDB and in a new database, L-Regulon. To show the relevance of our work, two senior curators curated 31 articles on the regulation of transcription initiation of E. coli using Lisen&Curate. As a result, 194 objects were curated and 781 sources were recorded. We also found that these sources are useful for developing automatic approaches to detect objects in articles by observing word frequency patterns and by carrying out an open information extraction task. Sources may help to elaborate a controlled vocabulary of experimental methods. Finally, we discuss our ecosystem of interconnected applications, RegulonDB, L-Regulon, and Lisen&Curate, which facilitates access to knowledge on regulation of transcription initiation in bacteria. We see our proposal as the starting point to change the way experimentalists connect a piece of knowledge with its evidence using RegulonDB.


Subject(s)
Data Curation/methods ; Databases, Genetic ; Gene Expression Regulation, Bacterial ; Transcription Initiation, Genetic ; Escherichia coli/genetics
8.
Front Microbiol ; 12: 711077, 2021.
Article in English | MEDLINE | ID: mdl-34394059

ABSTRACT

The EcoCyc model-organism database collects and summarizes experimental data for Escherichia coli K-12. EcoCyc is regularly updated by the manual curation of individual database entries, such as genes, proteins, and metabolic pathways, and by the programmatic addition of results from select high-throughput analyses. Updates to the Pathway Tools software that supports EcoCyc and to the web interface that enables user access have continuously improved its usability and expanded its functionality. This article highlights recent improvements to the curated data in the areas of metabolism, transport, DNA repair, and regulation of gene expression. New and revised data analysis and visualization tools include an interactive metabolic network explorer, a circular genome viewer, and various improvements to the speed and usability of existing tools.

10.
J Biomed Semantics ; 10(1): 8, 2019 05 22.
Article in English | MEDLINE | ID: mdl-31118102

ABSTRACT

BACKGROUND: The ability to express the same meaning in different ways is a well-known property of natural language. This amazing property is the source of major difficulties in natural language processing. Given the constant increase in published literature, its curation and information extraction would strongly benefit from efficient automatic processes, for which corpora of sentences evaluated by experts are a valuable resource. RESULTS: Given our interest in applying such approaches to the curation of the biomedical literature, specifically the literature on gene regulation in microbial organisms, we decided to build a corpus with graded textual similarity evaluated by curators and designed specifically for our purposes. Based on the predefined statistical power of future analyses, we defined features of the design, including sampling, selection criteria, balance, and size, among others. A non-fully crossed study design was applied. Each pair of sentences was evaluated by 3 annotators from a total of 7; the scale used in the semantic similarity assessment task within the Semantic Evaluation workshop (SEMEVAL) was adapted to our goals in four successive iterative sessions, with clear improvements in the agreed guidelines and interrater reliability results. Alternatives for such a corpus evaluation have been widely discussed. CONCLUSIONS: To the best of our knowledge, this is the first similarity corpus (a dataset of pairs of sentences for which human experts rate the semantic similarity of each pair) in this domain of knowledge. We have initiated its incorporation in our research towards high-throughput curation strategies based on natural language processing.


Subject(s)
Gene Expression Regulation ; Microbiology ; Natural Language Processing ; Transcription, Genetic/genetics
11.
J Struct Biol ; 207(1): 29-39, 2019 07 01.
Article in English | MEDLINE | ID: mdl-30981884

ABSTRACT

The labdane-related diterpenoids (LRDs) are a large group of natural products with a broad range of biological activities. They are synthesized through two consecutive reactions catalyzed by class II and class I diterpene synthases (DTSs). The structural complexity of LRDs mainly depends on the catalytic activity of class I DTSs, which catalyze the formation of bicyclic to pentacyclic LRDs, using as a substrate the catalytic product of class II DTSs. To date, the structural and mechanistic details for the biosynthesis of bicyclic LRD skeletons catalyzed by class I DTSs remain unclear. This work presents the first X-ray crystal structure of an (E)-biformene synthase, LrdC, from the soil bacterium Streptomyces sp. strain K155. LrdC was identified as part of an LRD cluster of five genes and was found to be a class I DTS that catalyzes the Mg2+-dependent synthesis of the bicyclic LRD (E)-biformene by the dephosphorylation and rearrangement of normal copalyl pyrophosphate (CPP). Structural analysis of LrdC coupled with docking studies suggests that Phe189 prevents cyclization beyond the bicyclic LRD product through a strong stabilization of the allylic carbocation intermediate, while Tyr317 functions as a general base catalyst to deprotonate the CPP substrate. Structural comparisons of LrdC with homology models of bacterial bicyclic LRD-forming enzymes (CldD, RmnD and SclSS), as well as with the crystallographic structure of the bacterial tetracyclic LRD ent-kaurene synthase (BjKS), provide further structural insights into the biosynthesis of bacterial LRD natural products.


Subject(s)
Bacteria/chemistry ; Diterpenes/metabolism ; Streptomyces/enzymology ; Alkyl and Aryl Transferases/chemistry ; Bacteria/enzymology ; Bacterial Proteins/chemistry ; Crystallography, X-Ray ; Molecular Structure ; Organophosphates/chemistry
12.
BMC Biol ; 16(1): 91, 2018 08 16.
Article in English | MEDLINE | ID: mdl-30115066

ABSTRACT

BACKGROUND: Our understanding of the regulation of gene expression has benefited from the availability of high-throughput technologies that interrogate the whole genome for the binding of specific transcription factors and gene expression profiles. In the case of widely used model organisms, such as Escherichia coli K-12, the new knowledge gained from these approaches needs to be integrated with the legacy of accumulated knowledge from genetic and molecular biology experiments conducted in the pre-genomic era in order to attain the deepest level of understanding possible based on the available data. RESULTS: In this paper, we describe an expansion of RegulonDB, the database containing the rich legacy of decades of classic molecular biology experiments supporting what we know about gene regulation and operon organization in E. coli K-12, to include the genome-wide dataset collections from 32 ChIP and 19 gSELEX publications, in addition to around 60 genome-wide expression profiles relevant to the functional significance of these datasets and used in their curation. Three essential features for the integration of this information coming from different methodological approaches are: first, a controlled vocabulary within an ontology for precisely defining growth conditions; second, the criteria to separate elements with enough evidence to consider them involved in gene regulation from isolated transcription factor binding sites without such support; and third, an expanded computational model supporting this knowledge. Altogether, this constitutes the basis for adequately gathering and enabling the comparisons and integration needed to manage and access such a wealth of knowledge. CONCLUSIONS: This version 10.0 of RegulonDB is a first step toward what should become the unifying access point for current and future knowledge on gene regulation in E. coli K-12. Furthermore, this model platform and associated methodologies and criteria can be emulated for gathering knowledge on other microbial organisms.


Subject(s)
Databases as Topic ; Escherichia coli K12/genetics ; Gene Expression Regulation, Bacterial ; Transcription, Genetic
13.
Appl Microbiol Biotechnol ; 100(21): 9229-9237, 2016 11.
Article in English | MEDLINE | ID: mdl-27604626

ABSTRACT

Although the specific function of SCO2127 remains elusive, it has been assumed that this hypothetical protein plays an important role in carbon catabolite regulation and therefore in antibiotic biosynthesis in Streptomyces coelicolor. To shed light on the functional relationship of SCO2127 to the biosynthesis of actinorhodin, a detailed analysis of the proteins differentially produced between the strain M145 and the Δsco2127 mutant of S. coelicolor was performed. The delayed morphological differentiation and impaired production of actinorhodin shown by the deletion strain were accompanied by increased abundance of gluconeogenic enzymes, as well as downregulation of both glycolysis and acetyl-CoA carboxylase. Repression of mycothiol biosynthetic enzymes was further observed in the absence of SCO2127, in addition to upregulation of hydroxyectoine biosynthetic enzymes and SCO0204, which controls nitrite formation. The data generated in this study reveal that the response regulator SCO0204 greatly contributes to preventing the formation of actinorhodin in the Δsco2127 mutant, likely through the activation of some proteins associated with oxidative stress, including the nitrite producer SCO0216.


Subject(s)
Anti-Bacterial Agents/metabolism ; Bacterial Proteins/genetics ; Gene Deletion ; Gene Expression Regulation, Bacterial ; Streptomyces coelicolor/genetics ; Streptomyces coelicolor/metabolism ; Anthraquinones/metabolism
14.
BMC Microbiol ; 16: 77, 2016 Apr 27.
Article in English | MEDLINE | ID: mdl-27121083

ABSTRACT

BACKGROUND: In the genus Streptomyces, one of the most remarkable control mechanisms of physiological processes is carbon catabolite repression (CCR). This mechanism regulates the expression of genes involved in the uptake and utilization of alternative carbon sources. CCR also affects the synthesis of secondary metabolites and morphological differentiation. Even though the outcome of CCR in different bacteria is the same, the underlying mechanisms can be quite different. In several streptomycetes glucose kinase (Glk) represents the main glucose phosphorylating enzyme and has been regarded as a regulatory protein in CCR. To evaluate the paradigmatic model proposed for CCR in Streptomyces, a high-density microarray approach was applied to Streptomyces coelicolor M145, under repressed and non-repressed conditions. The transcriptomic study was extended to assess the ScGlk role in this model by comparing the transcriptomic profile of S. coelicolor M145 with that of a Δglk mutant derived from the wild-type strain, complemented with a heterologous glk gene from Zymomonas mobilis (Zmglk), insensitive to CCR but able to grow in glucose (ScoZm strain). RESULTS: Microarray experiments revealed that glucose influenced the expression of 651 genes. Interestingly, even though the ScGlk protein does not have DNA binding domains and the glycolytic flux was restored by a heterologous glucokinase, the ScGlk replacement modified the expression of 134 genes. Of these, 91 were also affected by glucose while 43 appeared to be under the control of ScGlk. This work identified the expression of S. coelicolor genes involved in primary metabolism that were influenced by glucose and/or ScGlk. Aside from describing the metabolic pathways influenced by glucose and/or ScGlk, several unexplored transcriptional regulators involved in the CCR mechanism were disclosed. CONCLUSIONS: The transcriptome of a classical model of CCR was studied in S. coelicolor to differentiate between the effects due to glucose or ScGlk in this regulatory mechanism. Glucose elicited important metabolic and transcriptional changes in this microorganism. While its entry and flow through glycolysis and the pentose phosphate pathway were stimulated, gluconeogenesis was inhibited. Glucose also triggered CCR by repressing transporter systems and the transcription of enzymes required for the utilization of secondary carbon sources. Our results confirm and update the agar model of the CCR in Streptomyces and its dependence on the ScGlk per se. Surprisingly, the expected regulatory function of ScGlk was not found to be as global as previously thought (only 43 out of 779 genes were affected), although it may be accompanied or coordinated by other transcriptional regulators. Aside from describing the metabolic pathways influenced by glucose and/or ScGlk, several unexplored transcriptional regulators involved in the CCR mechanism were disclosed. These findings offer new opportunities to study and understand the CCR in S. coelicolor by increasing the number of known glucose- and ScGlk-regulated pathways and by providing a new set of putative regulatory proteins possibly involved in or controlling the CCR.


Subject(s)
Catabolite Repression ; Gene Expression Profiling/methods ; Oligonucleotide Array Sequence Analysis/methods ; Streptomyces coelicolor/growth & development ; Bacterial Proteins/genetics ; Carbon/metabolism ; Gene Expression Regulation, Bacterial ; Glucokinase/genetics ; Models, Biological ; Mutation ; Secondary Metabolism ; Streptomyces coelicolor/genetics
15.
Microb Biotechnol ; 4(2): 275-85, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21342472

ABSTRACT

Ferrioxamine-mediated iron acquisition by Streptomyces coelicolor A3(2) has recently received increased attention. In addition to the biological role of desferrioxamines (dFOs) as hydroxamate siderophores, and the pharmaceutical application of dFO-B as an iron chelator, the ferrioxamines have been shown to mediate microbial interactions. In S. coelicolor the siderophore-binding receptors DesE (Sco2780) and CdtB (Sco7399) have been postulated to specifically recognize and take up FO-E (cyclic) and FO-B (linear), respectively. Here, disruption of the desE gene in S. coelicolor, and subsequent phenotypic analysis, is used to demonstrate a link between iron metabolism and physiological and morphological development. Streptomyces coelicolor desE mutants were isolated both in the wild-type (M145) and in a background deficient in coelichelin biosynthesis and transport (mutant W3); coelichelin is a second hydroxamate siderophore system found only in S. coelicolor and related species. These mutants showed impaired growth and lack of sporulation. This phenotype could only be partially rescued by expression in trans of either the desE or the cdtB gene, which contrasted with the ability of FO-E, and to a lesser extent of FO-B, to fully restore growth at µM concentrations, with a concomitant induction of a marked phenotypic response involving precocious synthesis of actinorhodin and sporulation. Moreover, growth restoration of the desE mutant by complementation with desE and cdtB showed that DesE, which is universally conserved in Streptomyces, and CdtB, only present in certain streptomycetes, have partially equivalent functional roles under laboratory conditions, implying overlapping ferrioxamine specificities. The biotechnological and ecological implications of these observations are discussed.


Subject(s)
Bacterial Outer Membrane Proteins/genetics ; Bacterial Outer Membrane Proteins/metabolism ; Bacterial Proteins/genetics ; Bacterial Proteins/metabolism ; Gene Silencing ; Iron/metabolism ; Receptors, Cell Surface/genetics ; Receptors, Cell Surface/metabolism ; Siderophores/metabolism ; Streptomyces coelicolor/growth & development ; Biological Transport ; Protein Binding ; Streptomyces coelicolor/genetics ; Streptomyces coelicolor/metabolism