Results 1 - 20 of 57
1.
Nucleic Acids Res ; 51(19): 10176-10193, 2023 10 27.
Article in English | MEDLINE | ID: mdl-37713610

ABSTRACT

Transcriptomic data is accumulating rapidly; thus, scalable methods for extracting knowledge from this data are critical. Here, we assembled a top-down expression and regulation knowledge base for Escherichia coli. The expression component is a 1035-sample, high-quality RNA-seq compendium consisting of data generated in our lab using a single experimental protocol. The compendium contains diverse growth conditions, including: 9 media; 39 supplements, including antibiotics; 42 heterologous proteins; and 76 gene knockouts. Using this resource, we elucidated global expression patterns. We used machine learning to extract 201 modules that account for 86% of known regulatory interactions, creating the regulatory component. With these modules, we identified two novel regulons and quantified systems-level regulatory responses. We also integrated 1675 curated, publicly-available transcriptomes into the resource. We demonstrated workflows for analyzing new data against this knowledge base via deconstruction of regulation during aerobic transition. This resource illuminates the E. coli transcriptome at scale and provides a blueprint for top-down transcriptomic analysis of non-model organisms.


Subject(s)
Escherichia coli; Knowledge Bases; Escherichia coli/genetics; Escherichia coli/metabolism; Escherichia coli Proteins/genetics; Escherichia coli Proteins/metabolism; Gene Expression Profiling; Gene Expression Regulation, Bacterial; Transcriptome
2.
Nucleic Acids Res ; 50(7): 3658-3672, 2022 04 22.
Article in English | MEDLINE | ID: mdl-35357493

ABSTRACT

The transcriptional regulatory network (TRN) of Pseudomonas aeruginosa coordinates cellular processes in response to stimuli. We used 364 transcriptomes (281 publicly available + 83 in-house generated) to reconstruct the TRN of P. aeruginosa using independent component analysis. We identified 104 independently modulated sets of genes (iModulons) among which 81 reflect the effects of known transcriptional regulators. We identified iModulons that (i) play an important role in defining the genomic boundaries of biosynthetic gene clusters (BGCs), (ii) show increased expression of the BGCs and associated secretion systems in nutrient conditions that are important in cystic fibrosis, (iii) show the presence of a novel ribosomally synthesized and post-translationally modified peptide (RiPP) BGC which might have a role in P. aeruginosa virulence, (iv) exhibit interplay of amino acid metabolism regulation and central metabolism across different carbon sources and (v) clustered according to their activity changes to define iron and sulfur stimulons. Finally, we compared the identified iModulons of P. aeruginosa with those previously described in Escherichia coli to observe conserved regulons across two Gram-negative species. This comprehensive TRN framework encompasses the majority of the transcriptional regulatory machinery in P. aeruginosa, and thus should prove foundational for future research into its physiological functions.


Subject(s)
Pseudomonas aeruginosa; Transcriptome; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Escherichia coli/genetics; Escherichia coli/metabolism; Gene Expression Regulation, Bacterial; Machine Learning; Pseudomonas aeruginosa/metabolism; Transcription Factors/genetics; Transcription Factors/metabolism; Transcriptome/genetics
3.
Nucleic Acids Res ; 50(17): 9675-9688, 2022 09 23.
Article in English | MEDLINE | ID: mdl-36095122

ABSTRACT

Pseudomonas aeruginosa is an opportunistic pathogen and major cause of hospital-acquired infections. The virulence of P. aeruginosa is largely determined by its transcriptional regulatory network (TRN). We used 411 transcription profiles of P. aeruginosa from diverse growth conditions to construct a quantitative TRN by identifying independently modulated sets of genes (called iModulons) and their condition-specific activity levels. The current study focused on the use of iModulons to analyze the biofilm production and antibiotic resistance of P. aeruginosa. Our analysis revealed: (i) 116 iModulons, 81 of which show strong association with known regulators; (ii) novel roles of regulators in modulating antibiotics efflux pumps; (iii) substrate-efflux pump associations; (iv) differential iModulon activity in response to beta-lactam antibiotics in bacteriological and physiological media; (v) differential activation of 'Cell Division' iModulon resulting from exposure to different beta-lactam antibiotics and (vi) a role of the PprB iModulon in the stress-induced transition from planktonic to biofilm lifestyle. In light of these results, the construction of an iModulon-based TRN provides a transcriptional regulatory basis for key aspects of P. aeruginosa infection, such as antibiotic stress responses and biofilm formation. Taken together, our results offer a novel mechanistic understanding of P. aeruginosa virulence.


Subject(s)
Pseudomonas aeruginosa; Anti-Bacterial Agents/pharmacology; Bacterial Proteins/metabolism; Biofilms; Gene Expression Profiling; Humans; Pseudomonas Infections; Pseudomonas aeruginosa/drug effects; Pseudomonas aeruginosa/metabolism; beta-Lactams
4.
PLoS Genet ; 17(9): e1009821, 2021 09.
Article in English | MEDLINE | ID: mdl-34570751

ABSTRACT

RNA sequencing techniques have enabled the systematic elucidation of gene expression (RNA-Seq), transcription start sites (differential RNA-Seq), transcript 3' ends (Term-Seq), and post-transcriptional processes (ribosome profiling). The main challenge of transcriptomic studies is to remove ribosomal RNAs (rRNAs), which comprise more than 90% of the total RNA in a cell. Here, we report a low-cost and robust bacterial rRNA depletion method, RiboRid, based on the enzymatic degradation of rRNA by thermostable RNase H. This method implemented experimental considerations to minimize nonspecific degradation of mRNA and is capable of depleting pre-rRNAs that often comprise a large portion of RNA, even after rRNA depletion. We demonstrated the highly efficient removal of rRNA up to a removal efficiency of 99.99% for various transcriptome studies, including RNA-Seq, Term-Seq, and ribosome profiling, with a cost of approximately $10 per sample. This method is expected to be a robust method for large-scale high-throughput bacterial transcriptomic studies.


Subject(s)
Bacteria/genetics; Costs and Cost Analysis; RNA, Bacterial/isolation & purification; RNA, Ribosomal/isolation & purification; Transcriptome; RNA, Bacterial/genetics; RNA, Ribosomal/genetics; Sequence Analysis, RNA/methods
5.
Nucleic Acids Res ; 49(D1): D112-D120, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33045728

ABSTRACT

Independent component analysis (ICA) of bacterial transcriptomes has emerged as a powerful tool for obtaining co-regulated, independently-modulated gene sets (iModulons), inferring their activities across a range of conditions, and enabling their association to known genetic regulators. By grouping and analyzing genes based on observations from big data alone, iModulons can provide a novel perspective into how the composition of the transcriptome adapts to environmental conditions. Here, we present iModulonDB (imodulondb.org), a knowledgebase of prokaryotic transcriptional regulation computed from high-quality transcriptomic datasets using ICA. Users select an organism from the home page and then search or browse the curated iModulons that make up its transcriptome. Each iModulon and gene has its own interactive dashboard, featuring plots and tables with clickable, hoverable, and downloadable features. This site enhances research by presenting scientists of all backgrounds with co-expressed gene sets and their activity levels, which lead to improved understanding of regulator-gene relationships, discovery of transcription factors, and the elucidation of unexpected relationships between conditions and genetic regulatory activity. The current release of iModulonDB covers three organisms (Escherichia coli, Staphylococcus aureus and Bacillus subtilis) with 204 iModulons, and can be expanded to cover many additional organisms.
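
The decomposition described in this abstract can be illustrated with a small, self-contained sketch. It is not the iModulonDB pipeline: the toy data, the mixing matrix `A_true`, and the helper names `fast_ica` and `sym_decorrelate` are assumptions made for illustration, and the algorithm is a plain NumPy version of symmetric FastICA with a tanh nonlinearity. Each row of `X` stands for one expression profile measured over all genes, and the recovered rows of `S_est` play the role of gene-weight vectors (the independent source signals).

```python
import numpy as np


def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W, which makes the rows of W orthonormal."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W


def fast_ica(X, n_components, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA (tanh nonlinearity).

    X has shape (n_profiles, n_genes); the return value has shape
    (n_components, n_genes) and holds the estimated source signals.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=1, keepdims=True)

    # PCA whitening: keep the top components and scale them to unit variance.
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = np.argsort(eigvals)[::-1][:n_components]
    K = (eigvecs[:, top] / np.sqrt(eigvals[top])).T
    Z = K @ Xc

    # Fixed-point iterations on the unmixing matrix W.
    W = sym_decorrelate(rng.normal(size=(n_components, n_components)))
    for _ in range(n_iter):
        Y = np.tanh(W @ Z)
        W_new = (Y @ Z.T) / Z.shape[1] - np.diag((1.0 - Y**2).mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    return W @ Z


# Toy data (hypothetical): two sparse-ish gene-weight signals mixed into six profiles.
rng = np.random.default_rng(42)
n_genes = 2000
S_true = rng.laplace(size=(2, n_genes))
A_true = np.array([[1.0, 0.0], [0.8, 0.2], [0.5, 0.5],
                   [0.2, 0.9], [0.0, 1.0], [1.0, 1.0]])
X = A_true @ S_true + 0.05 * rng.normal(size=(6, n_genes))

S_est = fast_ica(X, n_components=2)

# ICA recovers sources only up to order and sign, so compare by absolute correlation.
corr = np.corrcoef(np.vstack([S_true, S_est]))[:2, 2:]
best_match = np.abs(corr).max(axis=1)
print(best_match)
```

In practice a maintained implementation would usually be preferable to a hand-rolled one; the point here is only to show how mixed profiles are separated into independent gene-weight signals.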


Subject(s)
Bacterial Proteins/genetics; Databases, Genetic; Gene Expression Regulation, Bacterial; Knowledge Bases; Machine Learning; Transcriptome; Bacillus subtilis/genetics; Bacillus subtilis/metabolism; Bacterial Proteins/metabolism; Escherichia coli/genetics; Escherichia coli/metabolism; Gene Expression Profiling; Gene Regulatory Networks; Internet; Software; Staphylococcus aureus/genetics; Staphylococcus aureus/metabolism; Transcription Factors/genetics; Transcription Factors/metabolism; Transcription, Genetic
6.
Proc Natl Acad Sci U S A ; 117(29): 17228-17239, 2020 07 21.
Article in English | MEDLINE | ID: mdl-32616573

ABSTRACT

The ability of Staphylococcus aureus to infect many different tissue sites is enabled, in part, by its transcriptional regulatory network (TRN) that coordinates its gene expression to respond to different environments. We elucidated the organization and activity of this TRN by applying independent component analysis (ICA) to a compendium of 108 RNA-sequencing expression profiles from two S. aureus clinical strains (TCH1516 and LAC). ICA decomposed the S. aureus transcriptome into 29 independently modulated sets of genes (i-modulons) that revealed: 1) high-confidence associations between 21 i-modulons and known regulators; 2) an association between an i-modulon and σS, whose regulatory role was previously undefined; 3) the regulatory organization of 65 virulence factors in the form of three i-modulons associated with AgrR, SaeR, and Vim-3; 4) the roles of three key transcription factors (CodY, Fur, and CcpA) in coordinating the metabolic and regulatory networks; and 5) a low-dimensional representation, involving the function of a few transcription factors, of changes in gene expression between two laboratory media (RPMI and cation-adjusted Mueller-Hinton broth) and two physiological media (blood and serum). This representation of the TRN covers 842 genes representing 76% of the variance in gene expression, providing a quantitative reconstruction of transcriptional modules in S. aureus and a platform enabling its full elucidation.


Subject(s)
Gene Expression Regulation, Bacterial; Gene Regulatory Networks/genetics; Staphylococcus aureus/genetics; Staphylococcus aureus/physiology; Transcriptome; Bacterial Proteins/genetics; DNA-Binding Proteins/genetics; Metabolic Networks and Pathways; Repressor Proteins/genetics; Sequence Analysis, RNA; Sigma Factor/genetics; Staphylococcal Infections; Virulence/genetics; Virulence Factors/genetics
7.
Int J Neurosci ; 133(11): 1262-1270, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35698427

ABSTRACT

BACKGROUND: The aim of the study was to investigate the clinical profile, disease burden, quality of life, and treatment patterns of various headache subtypes. METHOD: In this prospective observational study, 815 patients presenting with chief complaints of headache between January 2020 and September 2021 were registered. After a detailed history, clinical examination, and subtyping, they were assessed at baseline with well-validated scales for severity (Visual Analogue Scale, VAS), disability burden (Migraine Disability Assessment, MIDAS), humanistic burden (Headache Impact Test, HIT-6), and quality of life (World Health Organization Quality of Life, WHO-QoL-8). After initiating adequate management, parameters were reassessed at 3 and 6 months. RESULTS: 549 (67.7%) patients had migraine (395 episodic, 144 chronic), and 266 (32.2%) patients had tension-type headache (TTH). Loss of sleep, prolonged working hours, and stress were common triggers. Disease burden, severity, and poor quality of life were high in migraine patients (76.5% with moderate to severe disability, 61.7% with severe headache at onset, and 72% with poor quality of life). All parameters improved statistically significantly with preventive medication and lifestyle changes. CONCLUSION: In our study, migraine was the most common primary headache, followed by TTH. Migraine patients had greater severity and disease burden and a poorer quality of life at onset than patients with other headaches. With early and proper diagnosis as well as preventive treatment (including lifestyle modifications), all parameters could be improved in a brief time. This is the first study on headache burden and its effect on quality of life in the north Indian population.

8.
Metab Eng ; 72: 376-390, 2022 07.
Article in English | MEDLINE | ID: mdl-35598887

ABSTRACT

Membrane transport proteins are potential targets for medical and biotechnological applications. However, more than 30% of reported membrane transporter families are either poorly characterized or lack adequate functional annotation. Here, adaptive laboratory evolution was leveraged to identify membrane transporters for a set of four amino acids as well as specific mutations that modulate the activities of these transporters. Specifically, Escherichia coli was adaptively evolved under increasing concentrations of L-histidine, L-phenylalanine, L-threonine, and L-methionine separately, with multiple replicate evolutions. Evolved populations and isolated clones displayed growth rates comparable to the unstressed ancestral strain at elevated concentrations (four- to six-fold increases) of the targeted amino acids. Whole genome sequencing of the evolved strains revealed a diverse set of key mutations, including SNPs, small deletions, and copy number variants targeting the transporters leuE for histidine, yddG for phenylalanine, yedA for methionine, and brnQ and rhtC for threonine. Reverse engineering of the mutations in the ancestral strain established the causality of the specific mutations for the tolerant phenotypes. The functional roles of yedA and brnQ in the transport of methionine and threonine, respectively, are novel assignments, and these roles were validated using a flow cytometry cellular accumulation assay. To demonstrate how the identified transporters can be leveraged for production, an L-phenylalanine overproduction strain was shown to be a superior producer when the identified yddG exporter was overexpressed. Overall, the results revealed the striking efficiency of laboratory evolution in identifying transporters and specific mutational mechanisms that modulate their activities, thereby demonstrating promising applicability in transporter discovery efforts and strain engineering.


Subject(s)
Amino Acid Transport Systems, Neutral; Escherichia coli Proteins; Amino Acid Transport Systems, Neutral/genetics; Amino Acid Transport Systems, Neutral/metabolism; Amino Acids/genetics; Escherichia coli/metabolism; Escherichia coli Proteins/genetics; Gene Expression Regulation, Bacterial; Membrane Transport Proteins/genetics; Methionine/genetics; Phenylalanine/genetics; Threonine/genetics
9.
Metab Eng ; 72: 297-310, 2022 07.
Article in English | MEDLINE | ID: mdl-35489688

ABSTRACT

Bacterial gene expression is orchestrated by numerous transcription factors (TFs). Elucidating how gene expression is regulated is fundamental to understanding bacterial physiology and engineering it for practical use. In this study, a machine-learning approach was applied to uncover the genome-scale transcriptional regulatory network (TRN) in Pseudomonas putida KT2440, an important organism for bioproduction. We performed independent component analysis of a compendium of 321 high-quality gene expression profiles, which were previously published or newly generated in this study. We identified 84 groups of independently modulated genes (iModulons) that explain 75.7% of the total variance in the compendium. With these iModulons, we (i) expand our understanding of the regulatory functions of 39 iModulon-associated TFs (e.g., HexR, Zur) by systematic comparison with 1993 previously reported TF-gene interactions; (ii) outline transcriptional changes after the transition from the exponential growth phase to the stationary phase; (iii) capture groups of genes required for utilizing diverse carbon sources and an increased stationary response with slower growth rates; (iv) unveil multiple evolutionary strategies of transcriptome reallocation to achieve fast growth rates; and (v) define an osmotic stimulon, which includes the Type VI secretion system, as the coordination of multiple iModulon activity changes. Taken together, this study provides the first quantitative genome-scale TRN for P. putida KT2440 and a basis for a comprehensive understanding of its complex transcriptome changes in a variety of physiological states.


Subject(s)
Pseudomonas putida; Gene Expression Regulation, Bacterial; Gene Regulatory Networks; Machine Learning; Pseudomonas putida/genetics; Pseudomonas putida/metabolism; Transcription Factors/genetics; Transcription Factors/metabolism; Transcriptome
10.
PLoS Comput Biol ; 17(2): e1008647, 2021 02.
Article in English | MEDLINE | ID: mdl-33529205

ABSTRACT

The availability of bacterial transcriptomes has dramatically increased in recent years. This data deluge could result in detailed inference of underlying regulatory networks, but the diversity of experimental platforms and protocols introduces critical biases that could hinder scalable analysis of existing data. Here, we show that the underlying structure of the E. coli transcriptome, as determined by Independent Component Analysis (ICA), is conserved across multiple independent datasets, including both RNA-seq and microarray datasets. We subsequently combined five transcriptomics datasets into a large compendium containing over 800 expression profiles and discovered that its underlying ICA-based structure was still comparable to that of the individual datasets. With this understanding, we expanded our analysis to over 3,000 E. coli expression profiles and predicted three high-impact regulons that respond to oxidative stress, anaerobiosis, and antibiotic treatment. ICA thus enables deep analysis of disparate data to uncover new insights that were not visible in the individual datasets.


Subject(s)
Computational Biology/methods; Databases, Genetic; Escherichia coli/genetics; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Transcriptome; Algorithms; Linear Models; Principal Component Analysis; RNA-Seq
11.
Nucleic Acids Res ; 48(18): 10157-10163, 2020 10 09.
Article in English | MEDLINE | ID: mdl-32976587

ABSTRACT

A genome contains the information underlying an organism's form and function. Yet, we lack a formal framework to represent and study this information. Here, we introduce the Bitome, a matrix composed of binary digits (bits) representing the genomic positions of genomic features. We form a Bitome for the genome of Escherichia coli K-12 MG1655. We find that: (i) genomic features are encoded unevenly, both spatially and categorically; (ii) coding and intergenic features are recapitulated at high resolution; (iii) adaptive mutations are skewed towards genomic positions with fewer features; and (iv) the Bitome enhances prediction of adaptively mutated and essential genes. The Bitome is a formal representation of a genome and may be used to study its fundamental organizational properties.
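
The idea of a feature-by-position bit matrix can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the 20-position genome, the feature names, the coordinates, and the helper `build_bit_matrix` are all hypothetical. Each row is one feature type, each column is one genomic position, and a 1 marks a position covered by that feature.

```python
def build_bit_matrix(genome_length, features):
    """Build a feature-by-position matrix of 0/1 values.

    `features` maps a feature name to a list of half-open (start, end)
    intervals, using 0-based positions.
    """
    names = sorted(features)
    matrix = []
    for name in names:
        row = [0] * genome_length
        for start, end in features[name]:
            for pos in range(start, end):
                row[pos] = 1
        matrix.append(row)
    return names, matrix


# Hypothetical annotations on a 20-position toy genome.
toy_features = {
    "gene": [(2, 10), (14, 18)],
    "promoter": [(0, 2), (12, 14)],
    "terminator": [(10, 12)],
    "tf_binding_site": [(1, 3)],
}

names, bit_matrix = build_bit_matrix(20, toy_features)

# Column sums give the number of features annotated at each position.
density = [sum(column) for column in zip(*bit_matrix)]

# Positions that carry no annotated feature at all.
empty_positions = [pos for pos, count in enumerate(density) if count == 0]

print(names)
print(density)
print(empty_positions)
```

Column sums of such a matrix give a per-position feature density, the kind of quantity that would be needed to ask whether mutations fall in positions with fewer features.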


Subject(s)
Escherichia coli K12/genetics; Genome, Bacterial; Genomics
12.
Proc Natl Acad Sci U S A ; 116(50): 25287-25292, 2019 12 10.
Article in English | MEDLINE | ID: mdl-31767748

ABSTRACT

Evolution fine-tunes biological pathways to achieve a robust cellular physiology. Two and a half billion years ago, rapidly rising levels of oxygen as a byproduct of blooming cyanobacterial photosynthesis resulted in a redox upshift in microbial energetics. The appearance of the higher-redox-potential respiratory quinone, ubiquinone (UQ), is believed to be an adaptive response to this environmental transition. However, the majority of bacterial species are still dependent on the ancient respiratory quinone, naphthoquinone (NQ). Gammaproteobacteria can biosynthesize both of these respiratory quinones, where UQ has been associated with an aerobic lifestyle and NQ with an anaerobic lifestyle. We engineered an obligate NQ-dependent γ-proteobacterium, Escherichia coli ΔubiC, and performed adaptive laboratory evolution to understand the selection against the use of NQ in an oxic environment and also the adaptation required to support the NQ-driven aerobic electron transport chain. A comparative systems-level analysis of pre- and postevolved NQ-dependent strains revealed a clear shift from fermentative to oxidative metabolism enabled by higher periplasmic superoxide defense. This metabolic shift was driven by the concerted activity of 3 transcriptional regulators (PdhR, RpoS, and Fur). Analysis of these findings using a genome-scale model suggested that resource allocation to reactive oxygen species (ROS) mitigation results in lower growth rates. These results provide a direct elucidation of a resource allocation tradeoff between growth rate and ROS mitigation costs associated with NQ usage under oxygen-replete conditions.


Subject(s)
Escherichia coli/growth & development; Escherichia coli/metabolism; Naphthoquinones/metabolism; Oxidative Stress; Oxygen/metabolism; Aerobiosis; Biological Evolution; Electron Transport; Escherichia coli/genetics; Oxo-Acid-Lyases/genetics; Oxo-Acid-Lyases/metabolism; Reactive Oxygen Species/metabolism
13.
Proc Natl Acad Sci U S A ; 116(28): 14368-14373, 2019 07 09.
Article in English | MEDLINE | ID: mdl-31270234

ABSTRACT

Catalysis using iron-sulfur clusters and transition metals can be traced back to the last universal common ancestor. The damage to metalloproteins caused by reactive oxygen species (ROS) can prevent cell growth and survival when unmanaged, thus eliciting an essential stress response that is universal and fundamental in biology. Here we develop a computable multiscale description of the ROS stress response in Escherichia coli, called OxidizeME. We use OxidizeME to explain four key responses to oxidative stress: 1) ROS-induced auxotrophy for branched-chain, aromatic, and sulfurous amino acids; 2) nutrient-dependent sensitivity of growth rate to ROS; 3) ROS-specific differential gene expression separate from global growth-associated differential expression; and 4) coordinated expression of iron-sulfur cluster (ISC) and sulfur assimilation (SUF) systems for iron-sulfur cluster biosynthesis. These results show that we can now develop fundamental and quantitative genotype-phenotype relationships for stress responses on a genome-wide basis.


Subject(s)
Iron-Sulfur Proteins/genetics; Iron/metabolism; Metalloproteins/genetics; Reactive Oxygen Species/metabolism; Catalysis; Cell Proliferation/genetics; Escherichia coli/genetics; Escherichia coli/metabolism; Gene Expression Regulation/genetics; Hydrogen Peroxide/metabolism; Operon/genetics; Oxidative Stress/genetics; Sulfur/metabolism
14.
BMC Bioinformatics ; 22(1): 584, 2021 Dec 08.
Article in English | MEDLINE | ID: mdl-34879815

ABSTRACT

BACKGROUND: Independent component analysis is an unsupervised machine learning algorithm that separates a set of mixed signals into a set of statistically independent source signals. Applied to high-quality gene expression datasets, independent component analysis effectively reveals both the source signals of the transcriptome as co-regulated gene sets, and the activity levels of the underlying regulators across diverse experimental conditions. Two major variables that affect the final gene sets are the diversity of the expression profiles contained in the underlying data, and the user-defined number of independent components, or dimensionality, to compute. Availability of high-quality transcriptomic datasets has grown exponentially as high-throughput technologies have advanced; however, optimal dimensionality selection remains an open question. METHODS: We computed independent components across a range of dimensionalities for four gene expression datasets with varying dimensions (both in terms of number of genes and number of samples). We computed the correlation between independent components across different dimensionalities to understand how the overall structure evolves as the number of user-defined components increases. We then measured how well the resulting gene clusters reflected known regulatory mechanisms, and developed a set of metrics to assess the accuracy of the decomposition at a given dimension. RESULTS: We found that over-decomposition results in many independent components dominated by a single gene, whereas under-decomposition results in independent components that poorly capture the known regulatory structure. From these results, we developed a new method, called OptICA, for finding the optimal dimensionality that controls for both over- and under-decomposition. Specifically, OptICA selects the highest dimension that produces a low number of components that are dominated by a single gene. We show that OptICA outperforms two previously proposed methods for selecting the number of independent components across four transcriptomic databases of varying sizes. CONCLUSIONS: OptICA avoids both over-decomposition and under-decomposition of transcriptomic datasets, resulting in the best representation of the organism's underlying transcriptional regulatory network.
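
The selection rule stated in the RESULTS can be sketched as follows. This is a loose illustration of the sentence "the highest dimension that produces a low number of components that are dominated by a single gene", not the published OptICA criterion: the dominance test (`ratio`), the cutoff for "low" (`max_single`), the function names, and the toy gene weights are all assumptions.

```python
def is_single_gene(weights, ratio=2.0):
    """Assumed test: the largest |weight| exceeds `ratio` times the runner-up."""
    ranked = sorted((abs(w) for w in weights), reverse=True)
    if len(ranked) == 1:
        return True
    return ranked[0] > ratio * ranked[1]


def count_single_gene(components, ratio=2.0):
    """Number of components that are dominated by one gene."""
    return sum(is_single_gene(c, ratio) for c in components)


def pick_dimension(decompositions, max_single=1, ratio=2.0):
    """Highest dimensionality with at most `max_single` single-gene components.

    `decompositions` maps a dimensionality to its list of components,
    each component being a list of gene weights.
    """
    acceptable = [
        dim for dim, comps in decompositions.items()
        if count_single_gene(comps, ratio) <= max_single
    ]
    return max(acceptable) if acceptable else None


# Hypothetical gene weights (4 genes) for decompositions at three dimensionalities.
toy_decompositions = {
    2: [[0.9, 0.8, 0.1, 0.0], [0.0, 0.1, 0.7, 0.6]],
    3: [[0.9, 0.8, 0.1, 0.0], [0.0, 0.1, 0.7, 0.6], [0.1, 0.0, 0.1, 1.0]],
    4: [[0.9, 0.8, 0.0, 0.0], [0.0, 0.1, 0.7, 0.6],
        [0.1, 0.0, 0.1, 1.0], [0.0, 0.0, 1.2, 0.1]],
}

single_counts = {d: count_single_gene(c) for d, c in toy_decompositions.items()}
selected = pick_dimension(toy_decompositions)
print(single_counts, selected)
```

With these toy weights, the two lower dimensionalities stay within the assumed cutoff and the highest one exceeds it, so the rule picks the middle value.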


Subject(s)
Gene Regulatory Networks; Transcriptome; Algorithms; Databases, Factual; Gene Expression Profiling
15.
Mol Biol Evol ; 37(3): 660-667, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31651953

ABSTRACT

Oxidative stress is concomitant with aerobic metabolism. Thus, bacterial genomes encode elaborate mechanisms to achieve redox homeostasis. Here we report that the peroxide-sensing transcription factor, oxyR, is a common mutational target using bacterial species belonging to two genera, Escherichia coli and Vibrio natriegens, in separate growth conditions implemented during laboratory evolution. The mutations clustered in the redox active site, dimer interface, and flexible redox loop of the protein. These mutations favor the oxidized conformation of OxyR that results in constitutive expression of the genes it regulates. Independent component analysis of the transcriptome revealed that the constitutive activity of OxyR reduces DNA damage from reactive oxygen species, as inferred from the activity of the SOS response regulator LexA. This adaptation to peroxide stress came at a cost of lower growth, as revealed by calculations of proteome allocation using genome-scale models of metabolism and macromolecular expression. Further, identification of similar sequence changes in natural isolates of E. coli indicates that adaptation to oxidative stress through genetic changes in oxyR can be a common occurrence.


Subject(s)
Escherichia coli Proteins/genetics; Escherichia coli/growth & development; Repressor Proteins/genetics; Transcription Factors/genetics; Vibrio/growth & development; Adaptation, Physiological; Bacterial Proteins/genetics; Catalytic Domain; Directed Molecular Evolution; Escherichia coli/genetics; Escherichia coli Proteins/chemistry; Gene Expression Regulation, Bacterial; Models, Molecular; Mutation; Oxidative Stress; Protein Conformation; Reactive Oxygen Species/metabolism; Repressor Proteins/chemistry; Transcription Factors/chemistry; Vibrio/genetics
16.
Nucleic Acids Res ; 47(5): 2446-2454, 2019 03 18.
Article in English | MEDLINE | ID: mdl-30698741

ABSTRACT

Experimental studies of Escherichia coli K-12 MG1655 often implicate poorly annotated genes in cellular phenotypes. However, we lack a systematic understanding of these genes. How many are there? What information is available for them? And what features do they share that could explain the gap in our understanding? Efforts to build predictive, whole-cell models of E. coli inevitably face this knowledge gap. We approached these questions systematically by assembling annotations from the knowledge bases EcoCyc, EcoGene, UniProt and RegulonDB. We identified the genes that lack experimental evidence of function (the 'y-ome') which include 1600 of 4623 unique genes (34.6%), of which 111 have absolutely no evidence of function. An additional 220 genes (4.7%) are pseudogenes or phantom genes. y-ome genes tend to have lower expression levels and are enriched in the termination region of the E. coli chromosome. Where evidence is available for y-ome genes, it most often points to them being membrane proteins and transporters. We resolve the misconception that a gene in E. coli whose primary name starts with 'y' is unannotated, and we discuss the value of the y-ome for systematic improvement of E. coli knowledge bases and its extension to other organisms.


Subject(s)
Escherichia coli K12/genetics; Escherichia coli Proteins/genetics; Genome, Bacterial/genetics; Software; Databases, Genetic; Gene Expression Regulation, Bacterial; Molecular Sequence Annotation
17.
Microbiology (Reading) ; 166(2): 141-148, 2020 02.
Article in English | MEDLINE | ID: mdl-31625833

ABSTRACT

The ability of Escherichia coli to tolerate acid stress is important for its survival and colonization in the human digestive tract. Here, we performed adaptive laboratory evolution of the laboratory strain E. coli K-12 MG1655 at pH 5.5 in glucose minimal medium. After 800 generations, six independent populations under evolution had reached 18.0% higher growth rates than their starting strain at pH 5.5, while maintaining growth rates comparable to the starting strain at pH 7. We characterized the evolved strains and found that: (1) whole genome sequencing of isolated clones from each evolved population revealed mutations in rpoC appearing in five of six sequenced clones; and (2) gene expression profiles revealed different strategies to mitigate acid stress, which are related to amino acid metabolism and energy production and conversion. Thus, a combination of adaptive laboratory evolution, genome resequencing and expression profiling revealed, on a genome scale, the strategies that E. coli uses to mitigate acid stress.


Subject(s)
Acids/metabolism; Adaptation, Physiological/physiology; Escherichia coli/physiology; Adaptation, Physiological/genetics; Biological Evolution; Culture Media/chemistry; Culture Media/metabolism; Escherichia coli/genetics; Escherichia coli/growth & development; Escherichia coli/metabolism; Escherichia coli Proteins/genetics; Gene Expression Profiling; Gene Expression Regulation, Bacterial; Genome, Bacterial/genetics; Glucose/metabolism; Metabolic Networks and Pathways/genetics; Mutation
18.
Metab Eng ; 61: 360-368, 2020 09.
Article in English | MEDLINE | ID: mdl-32710928

ABSTRACT

Achieving the predictable expression of heterologous genes in a production host has proven difficult. Each heterologous gene expressed in the same host seems to elicit a different host response governed by unknown mechanisms. Historically, most studies have approached this challenge by manipulating the properties of the heterologous gene through methods like codon optimization. Here we approach this challenge from the host side. We express a set of 45 heterologous genes in the same Escherichia coli strain, using the same expression system and culture conditions. We collect a comprehensive RNAseq set to characterize the host's transcriptional response. Independent Component Analysis of the RNAseq data set reveals independently modulated gene sets (iModulons) that characterize the host response to heterologous gene expression. We relate 55% of variation of the host response to: Fear vs Greed (16.5%), Metal Homeostasis (19.0%), Respiration (6.0%), Protein folding (4.5%), and Amino acid and nucleotide biosynthesis (9.0%). If these responses can be controlled, then the success rate with predicting heterologous gene expression should increase.


Subject(s)
Escherichia coli , Gene Expression Regulation, Bacterial , RNA-Seq , Transcriptome , Escherichia coli/genetics , Escherichia coli/metabolism
19.
PLoS Comput Biol ; 15(4): e1006971, 2019 04.
Article in English | MEDLINE | ID: mdl-31009451

ABSTRACT

Genome-scale metabolic models (GEMs) are mathematically structured knowledge bases of metabolism that provide phenotypic predictions from genomic information. GEM-guided predictions of growth phenotypes rely on the accurate definition of a biomass objective function (BOF) that is designed to include key cellular biomass components such as the major macromolecules (DNA, RNA, proteins), lipids, coenzymes, inorganic ions and species-specific components. Despite its importance, no standardized computational platform is currently available to generate species-specific biomass objective functions in a data-driven, unbiased fashion. To fill this gap in the metabolic modeling software ecosystem, we implemented BOFdat, a Python package for the definition of a Biomass Objective Function from experimental data. BOFdat has a modular implementation that divides the BOF definition process into three independent modules defined here as steps: 1) the coefficients for major macromolecules are calculated, 2) coenzymes and inorganic ions are identified and their stoichiometric coefficients estimated, 3) the remaining species-specific metabolic biomass precursors are algorithmically extracted in an unbiased way from experimental data. We used BOFdat to reconstruct the BOF of the Escherichia coli model iML1515, a gold standard in the field. The BOF generated by BOFdat resulted in the most concordant biomass composition, growth rate, and gene essentiality prediction accuracy when compared to other methods. Installation instructions for BOFdat are available in the documentation and the source code is available on GitHub (https://github.com/jclachance/BOFdat).
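
The first step described above, calculating coefficients for the major macromolecules, can be sketched with a simple unit conversion. This is not BOFdat's API: the function `monomer_coefficients`, the monomer names, and all numbers below are hypothetical, and the sketch ignores the mass lost as water during polymerization as well as the sign convention used in a real biomass reaction.

```python
def monomer_coefficients(weight_fraction, molar_fractions, molar_masses):
    """Convert one macromolecule's weight fraction into monomer coefficients.

    weight_fraction: grams of the macromolecule per gram dry weight (gDW).
    molar_fractions: monomer name -> mole fraction (expected to sum to 1).
    molar_masses:    monomer name -> molar mass in g/mol.
    Returns monomer name -> coefficient in mmol/gDW.
    """
    if abs(sum(molar_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("molar fractions must sum to 1")
    # Average molar mass of one monomer unit in this macromolecule.
    average_mass = sum(
        fraction * molar_masses[name] for name, fraction in molar_fractions.items()
    )
    # Total mmol of monomer units contained in one gram of dry weight.
    total_mmol = weight_fraction / average_mass * 1000.0
    return {name: fraction * total_mmol for name, fraction in molar_fractions.items()}


# Hypothetical inputs: a macromolecule making up half of the dry weight,
# built from two equally abundant monomers "A" and "B".
masses = {"A": 100.0, "B": 200.0}
coefficients = monomer_coefficients(0.5, {"A": 0.5, "B": 0.5}, masses)

# Sanity check: the coefficients should recover the original weight fraction.
recovered_fraction = sum(coefficients[n] * masses[n] for n in masses) / 1000.0
```

The mass-balance check at the end is a useful habit when assembling any biomass equation by hand: the coefficients, weighted by molar mass, should add back up to the measured weight fraction.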


Subject(s)
Biomass , Genomics/methods , Metabolic Networks and Pathways , Models, Biological , Software , Escherichia coli/genetics , Escherichia coli/metabolism , Genome, Bacterial
20.
Nucleic Acids Res ; 46(20): 10682-10696, 2018 11 16.
Article in English | MEDLINE | ID: mdl-30137486

ABSTRACT

Transcriptional regulation enables cells to respond to environmental changes. Of the estimated 304 candidate transcription factors (TFs) in Escherichia coli K-12 MG1655, 185 have been experimentally identified, but ChIP methods have been used to fully characterize only a few dozen. Identifying these remaining TFs is key to improving our knowledge of the E. coli transcriptional regulatory network (TRN). Here, we developed an integrated workflow for the computational prediction and comprehensive experimental validation of TFs using a suite of genome-wide experiments. We applied this workflow to (i) identify 16 candidate TFs from over a hundred uncharacterized genes; (ii) capture a total of 255 DNA binding peaks for ten candidate TFs resulting in six high-confidence binding motifs; (iii) reconstruct the regulons of these ten TFs by determining gene expression changes upon deletion of each TF and (iv) identify the regulatory roles of three TFs (YiaJ, YdcI, and YeiE) as regulators of l-ascorbate utilization, proton transfer and acetate metabolism, and iron homeostasis under iron-limited conditions, respectively. Together, these results demonstrate how this workflow can be used to discover, characterize, and elucidate regulatory functions of uncharacterized TFs in parallel.


Subject(s)
Escherichia coli K12/genetics , Escherichia coli Proteins/genetics , Gene Expression Profiling , Transcription Factors/genetics , Escherichia coli K12/metabolism , Escherichia coli Proteins/metabolism , Gene Expression Regulation, Bacterial , Gene Regulatory Networks , Oligonucleotide Array Sequence Analysis/methods , Transcription Factors/metabolism