ABSTRACT
Current machine learning techniques enable robust association of biological signals with measured phenotypes, but these approaches are incapable of identifying causal relationships. Here, we develop an integrated "white-box" biochemical screening, network modeling, and machine learning approach for revealing causal mechanisms and apply this approach to understanding antibiotic efficacy. We counter-screen diverse metabolites against bactericidal antibiotics in Escherichia coli and simulate their corresponding metabolic states using a genome-scale metabolic network model. Regression of the measured screening data on model simulations reveals that purine biosynthesis participates in antibiotic lethality, which we validate experimentally. We show that antibiotic-induced adenine limitation increases ATP demand, which elevates central carbon metabolism activity and oxygen consumption, enhancing the killing effects of antibiotics. This work demonstrates how prospective network modeling can couple with machine learning to identify complex causal mechanisms underlying drug efficacy.
Subject(s)
Anti-Bacterial Agents/metabolism , Anti-Bacterial Agents/pharmacology , Metabolic Networks and Pathways/drug effects , Adenine/metabolism , Computational Biology/methods , Drug Evaluation, Preclinical/methods , Escherichia coli/metabolism , Machine Learning , Metabolic Networks and Pathways/immunology , Models, Theoretical , Purines/metabolism
ABSTRACT
Metabologenomics integrates metabolomics with other omics data types to comprehensively study the genetic and environmental factors that influence metabolism. These multi-omics data can be incorporated into genome-scale metabolic models (GEMs), which are highly curated knowledge bases that explicitly account for genes, transcripts, proteins and metabolites. By including all known biochemical reactions catalysed by enzymes and transporters encoded in the human genome, GEMs analyse and predict the behaviour of complex metabolic networks. Continued advancements to the scale and scope of GEMs - from cells and tissues to microbiomes and the whole body - have helped to design effective treatments and develop better diagnostic tools for metabolic diseases. Furthermore, increasing amounts of multi-omics data are incorporated into GEMs to better identify the underlying mechanisms, biomarkers and potential drug targets of metabolic diseases.
ABSTRACT
Constraint-based reconstruction and analysis (COBRA) methods at the genome scale have been under development since the first whole-genome sequences appeared in the mid-1990s. A few years ago, this approach began to demonstrate the ability to predict a range of cellular functions, including cellular growth capabilities on various substrates and the effect of gene knockouts at the genome scale. Thus, much interest has developed in understanding and applying these methods to areas such as metabolic engineering, antibiotic design, and organismal and enzyme evolution. This Primer will get you started.
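As a minimal sketch of the constraint idea behind COBRA methods: a flux vector v is admissible when it satisfies steady-state mass balance (S·v = 0) and per-reaction flux bounds, and a gene knockout can be represented by forcing the bounds of the affected reaction to zero. The three-reaction toy network, the reaction names and the helper `is_feasible` below are illustrative assumptions, not taken from the Primer; a real analysis would use a genome-scale reconstruction and a linear-programming solver to optimize an objective over the feasible set.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy network: uptake -> A, A -> B, B -> biomass.
# Rows are metabolites (A, B); columns are reactions (uptake, convert, biomass).
S: List[List[float]] = [
    [1, -1, 0],   # A: produced by uptake, consumed by convert
    [0, 1, -1],   # B: produced by convert, consumed by biomass
]

def is_feasible(
    S: Sequence[Sequence[float]],
    v: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
    tol: float = 1e-9,
) -> bool:
    """Return True if v satisfies S.v = 0 and lower <= v_j <= upper for every reaction."""
    balanced = all(abs(sum(s * x for s, x in zip(row, v))) <= tol for row in S)
    bounded = all(lo - tol <= x <= hi + tol for x, (lo, hi) in zip(v, bounds))
    return balanced and bounded

# Assumed bounds: uptake limited to 10 flux units, other reactions effectively unbounded.
wild_type_bounds = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0)]
# Knockout of the (hypothetical) gene for 'convert': its flux is forced to zero.
knockout_bounds = [(0.0, 10.0), (0.0, 0.0), (0.0, 1000.0)]

flux = [10.0, 10.0, 10.0]
print(is_feasible(S, flux, wild_type_bounds))             # True
print(is_feasible(S, flux, knockout_bounds))              # False
print(is_feasible(S, [0.0, 0.0, 0.0], knockout_bounds))   # True
```

In this toy case the knockout leaves only the zero-flux state feasible, which is how a constraint-based model would express "no growth" for that deletion.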
Subject(s)
Models, Genetic , Systems Biology/methods , Computer Simulation , Escherichia coli/genetics , Humans , Metabolic Engineering , Protein Interaction Maps , Thermotoga maritima/genetics , Yeasts/genetics
ABSTRACT
Synechococcus elongatus is an important cyanobacterium that serves as a versatile and robust model for studying circadian biology and photosynthetic metabolism. Its transcriptional regulatory network (TRN) is of fundamental interest, as it orchestrates the cell's adaptation to the environment, including its response to sunlight. Despite the previous characterization of constituent parts of the S. elongatus TRN, a comprehensive layout of its topology remains to be established. Here, we decomposed a compendium of 300 high-quality RNA sequencing datasets of the model strain PCC 7942 using independent component analysis. We obtained 57 independently modulated gene sets, or iModulons, that explain 67% of the variance in the transcriptional response and 1) accurately reflect the activity of known transcriptional regulations, 2) capture functional components of photosynthesis, 3) provide hypotheses for regulon structures and functional annotations of poorly characterized genes, and 4) describe the transcriptional shifts under dynamic light conditions. This transcriptome-wide analysis of S. elongatus provides a quantitative reconstruction of the TRN and presents a knowledge base that can guide future investigations. Our systems-level analysis also provides a global TRN structure for S. elongatus PCC 7942.
Subject(s)
Circadian Rhythm , Gene Expression Regulation, Bacterial , Gene Regulatory Networks , Machine Learning , Synechococcus , Synechococcus/genetics , Synechococcus/metabolism , Circadian Rhythm/genetics , Circadian Rhythm/physiology , Photosynthesis/genetics , Transcriptome , Bacterial Proteins/genetics , Bacterial Proteins/metabolism
ABSTRACT
Despite enormous progress in understanding the fundamentals of bacterial gene regulation, our knowledge remains limited when compared with the number of bacterial genomes and regulatory systems to be discovered. Derived from a small number of initial studies, classic definitions for concepts of gene regulation have evolved as the number of characterized promoters has increased. Together with discoveries made using new technologies, this knowledge has led to revised generalizations and principles. In this Expert Recommendation, we suggest precise, updated definitions that support a logical, consistent conceptual framework of bacterial gene regulation, focusing on transcription initiation. The resulting concepts can be formalized by ontologies for computational modelling, laying the foundation for improved bioinformatics tools, knowledge-based resources and scientific communication. Thus, this work will help researchers construct better predictive models, with different formalisms, that will be useful in engineering, synthetic biology, microbiology and genetics.
Subject(s)
Bacteria/genetics , Gene Expression Regulation, Bacterial , Transcription Initiation, Genetic , Operon , Promoter Regions, Genetic , Regulon , Transcription Factors/physiology
ABSTRACT
Synthetic biology enables the reprogramming of cellular functions for various applications. However, challenges in scalability and predictability persist due to context-dependent performance and complex circuit-host interactions. This study introduces an iModulon-based engineering approach, utilizing machine learning-defined co-regulated gene groups (iModulons) as design parts containing essential genes for specific functions. This approach identifies the necessary components for genetic circuits across different contexts, enhancing genome engineering by improving target selection and predicting module behavior. We demonstrate several distinct uses of iModulons: (i) discovery of unknown iModulons to increase protein productivity, heat tolerance and fructose utilization; (ii) an iModulon boosting approach, which amplifies the activity of specific iModulons, improved cell growth under osmotic stress with minimal host regulation disruption; (iii) an iModulon rebalancing strategy, which adjusts the activity levels of iModulons to balance cellular functions, significantly increased oxidative stress tolerance while minimizing trade-offs and (iv) iModulon-based gene annotation enabled natural competence activation by predictably rewiring iModulons. Comparative experiments with traditional methods showed our approach offers advantages in efficiency and predictability of strain engineering. This study demonstrates the potential of iModulon-based strategies to systematically and predictably reprogram cellular functions, offering refined and adaptable control over complex regulatory networks.
Subject(s)
Gene Expression Regulation, Bacterial , Gene Regulatory Networks , Knowledge Bases , Synthetic Biology/methods , Escherichia coli/genetics , Escherichia coli/metabolism , Machine Learning , Genetic Engineering/methods , Transcription, Genetic , Oxidative Stress/genetics
ABSTRACT
Genome mining is revolutionizing natural products discovery efforts. The rapid increase in available genomes demands comprehensive computational platforms to effectively extract biosynthetic knowledge encoded across bacterial pangenomes. Here, we present BGCFlow, a novel systematic workflow integrating analytics for large-scale genome mining of bacterial pangenomes. BGCFlow incorporates several genome analytics and mining tools grouped into five common stages of analysis: (i) data selection, (ii) functional annotation, (iii) phylogenetic analysis, (iv) genome mining, and (v) comparative analysis. Furthermore, BGCFlow provides easy configuration of different projects, parallel distribution, scheduled job monitoring, an interactive database to visualize tables, exploratory Jupyter Notebooks, and customized reports. Here, we demonstrate the application of BGCFlow by investigating the phylogenetic distribution of various biosynthetic gene clusters detected across 42 genomes of the Saccharopolyspora genus, known to produce industrially important secondary/specialized metabolites. The BGCFlow-guided analysis predicted more accurate dereplication of BGCs and guided the targeted comparative analysis of selected RiPPs. The scalable, interoperable, adaptable, re-entrant, and reproducible nature of BGCFlow will provide an effective, novel way to extract biosynthetic knowledge from the ever-growing genomic datasets of biotechnologically relevant bacterial species.
Subject(s)
Biosynthetic Pathways , Genomics , Multigene Family , Workflow , Biosynthetic Pathways/genetics , Data Mining , Databases, Genetic , Genome, Bacterial , Genomics/methods , Phylogeny , Software
ABSTRACT
Filamentous Actinobacteria, recently renamed Actinomycetia, are the most prolific source of microbial bioactive natural products. Studies on biosynthetic gene clusters benefit from or require chromosome-level assemblies. Here, we provide DNA sequences from >1000 isolates: 881 complete genomes and 153 near-complete genomes, representing 28 genera and 389 species, including 244 likely novel species. All genomes are from filamentous isolates of the class Actinomycetia from the NBC culture collection. The largest genus is Streptomyces with 886 genomes including 742 complete assemblies. We use these data to show that analysis of complete genomes can bring biological understanding not previously derived from more fragmented sequences or less systematic datasets. We document the central and structured location of core genes and distal location of specialized metabolite biosynthetic gene clusters and duplicate core genes on the linear Streptomyces chromosome, and analyze the content and length of the terminal inverted repeats, which are characteristic of Streptomyces. We then analyze the diversity of trans-AT polyketide synthase biosynthetic gene clusters, which encode the machinery of a biotechnologically highly interesting compound class. These insights have both ecological and biotechnological implications in understanding the importance of high-quality genomic resources and the complex role synteny plays in Actinomycetia biology.
Subject(s)
Actinobacteria , Genome, Bacterial , Multigene Family , Polyketide Synthases , Genome, Bacterial/genetics , Actinobacteria/genetics , Actinobacteria/classification , Actinobacteria/metabolism , Polyketide Synthases/genetics , Polyketide Synthases/metabolism , Streptomyces/genetics , Streptomyces/classification , Streptomyces/metabolism , Phylogeny , Genomics/methods
ABSTRACT
Metabolism constitutes the core chemistry of life. How it began on the early Earth and whether it had a cellular origin are still uncertain. A leading hypothesis for life's origins postulates that metabolism arose from geochemical CO2-fixing pathways, driven by inorganic catalysts and energy sources, long before enzymes or genes existed. The acetyl-CoA pathway and the reductive tricarboxylic acid cycle are considered ancient reaction networks that hold relics of early carbon-fixing pathways. Although transition metals can promote many steps of these pathways, whether they form a functional metabolic network in abiotic cells has not been demonstrated. Here, we formulate a nonenzymatic carbon-fixing network from these pathways and determine its functional feasibility in abiotic cells by imposing fundamental physicochemical constraints. Using first principles, we show that abiotic cells can sustain a steady carbon-fixing cycle that performs a systemic function over a relatively narrow range of conditions. Furthermore, we find that in all feasible steady states, the operation of the cycle elevates the osmotic pressure, leading to volume expansion. These results suggest that achieving homeostatic metabolic states under prebiotic conditions was possible, but challenging, and volume growth was a fundamental property of early metabolism.
Subject(s)
Citric Acid Cycle , Metabolic Networks and Pathways , Carbon Cycle , Homeostasis , Carbon/metabolism
ABSTRACT
The genomic diversity across strains of a species forms the genetic basis for differences in their behavior. A large-scale assessment of sequence variation has been made possible by the growing availability of strain-specific whole-genome sequences (WGS) and with the advent of large-scale databases of laboratory-acquired mutations. We define the Escherichia coli "alleleome" through a genome-scale assessment of amino acid (AA) sequence diversity in open reading frames across 2,661 WGS from wild-type strains. We observe a highly conserved alleleome enriched in mutations unlikely to affect protein function. In contrast, 33,000 mutations acquired in laboratory evolution experiments result in more severe AA substitutions that are rarely achieved by natural selection. Large-scale assessment of the alleleome establishes a method for the quantification of bacterial allelic diversity, reveals opportunities for synthetic biology to explore novel sequence space, and offers insights into the constraints governing evolution.
Subject(s)
Escherichia coli , Genetic Variation , Mutation , Escherichia coli/genetics , Genome, Bacterial/genetics , Amino Acid Sequence
ABSTRACT
Recognizing binding sites of DNA-binding proteins is a key factor for elucidating transcriptional regulation in organisms. ChIP-exo enables researchers to delineate genome-wide binding landscapes of DNA-binding proteins with near single base-pair resolution. However, the peak calling step hinders ChIP-exo application since the published algorithms tend to generate false-positive and false-negative predictions. Here, we report the development of DEOCSU (DEep-learning Optimized ChIP-exo peak calling SUite), a novel machine learning-based ChIP-exo peak calling suite. DEOCSU entails a deep convolutional neural network model that was trained with curated ChIP-exo peak data to distinguish the visualized data of bona fide peaks from false ones. Performance validation of the trained deep-learning model indicated high accuracy, precision and recall of over 95%. Applying the new suite to both in-house and publicly available ChIP-exo datasets obtained from bacteria, eukaryotes and archaea revealed an accurate prediction of peaks containing canonical motifs, highlighting the versatility and efficiency of DEOCSU. Furthermore, DEOCSU can be executed on a cloud computing platform or in a local environment. With visualization software included in the suite, adjustable options such as the threshold of peak probability, and iterative updating of the pre-trained model, DEOCSU can be optimized for users' specific needs.
Subject(s)
Chromatin Immunoprecipitation Sequencing , Deep Learning , Chromatin Immunoprecipitation , DNA-Binding Proteins/metabolism , Software , Algorithms , Binding Sites , Sequence Analysis, DNA
ABSTRACT
Generalist microbes have adapted to a multitude of environmental stresses through their integrated stress response system. Individual stress responses have been quantified by E. coli metabolism and expression (ME) models under thermal, oxidative and acid stress, respectively. However, the systematic quantification of cross-stress and cross-talk among these stress responses remains lacking. Here, we present StressME: the unified stress response model of E. coli combining thermal (FoldME), oxidative (OxidizeME) and acid (AcidifyME) stress responses. StressME is the most up-to-date ME model for E. coli and it reproduces all published single-stress ME models. Additionally, it includes refined rate constants to improve prediction accuracy for wild-type and stress-evolved strains. StressME revealed certain optimal proteome allocation strategies associated with cross-stress and cross-talk responses. These stress-optimal proteomes were shaped by trade-offs between protective and metabolic enzymes, between cytoplasmic and periplasmic chaperones, and in the expression of stress-specific proteins. As StressME is tuned to compute metabolic and gene expression responses under mild acid, oxidative, and thermal stresses, it is useful for engineering and health applications. The modular design of our open-source package also facilitates model expansion (e.g., to new stress mechanisms) by the computational biology community.
Subject(s)
Escherichia coli Proteins , Escherichia coli , Escherichia coli/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Stress, Physiological/genetics , Oxidation-Reduction , Heat-Shock Proteins/metabolism , Acids/metabolism , Gene Expression
ABSTRACT
The transcriptional regulatory network (TRN) of E. coli consists of thousands of interactions between regulators and DNA sequences. Regulons are typically either determined from resource-intensive experimental measurement of functional binding sites or inferred from analysis of high-throughput gene expression datasets. Recently, independent component analysis (ICA) of RNA-seq compendia has been shown to be a powerful method for inferring bacterial regulons. However, it remains unclear to what extent regulons predicted by ICA structure have a biochemical basis in promoter sequences. Here, we address this question by developing machine learning models that predict inferred regulon structures in E. coli based on promoter sequence features. Models were constructed successfully (cross-validation AUROC ≥ 0.8) for 85% (40/47) of ICA-inferred E. coli regulons. We found that: 1) the presence of a high-scoring regulator motif in the promoter region was sufficient to specify regulatory activity in 40% (19/47) of the regulons; 2) additional features, such as DNA shape and extended motifs that can account for regulator multimeric binding, helped to specify regulon structure for the remaining 60% of regulons (28/47); and 3) investigating regulons where initial machine learning models failed revealed new regulator-specific sequence features that improved model accuracy. Finally, we found that strong regulatory binding sequences underlie both the genes shared between ICA-inferred and experimental regulons and the genes in the E. coli core pan-regulon of Fur. This work demonstrates that the structure of ICA-inferred regulons can largely be understood through the strength of regulator binding sites in promoter regions, reinforcing the utility of top-down inference for regulon discovery.
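The success criterion quoted above is an AUROC threshold of 0.8. As a hedged illustration of what that metric measures, the sketch below computes AUROC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one (ties count as one half). The labels, the scores and the function name `auroc` are hypothetical; the sketch does not reproduce the authors' models, features or cross-validation procedure.

```python
from typing import Sequence

def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Each (positive, negative) pair contributes 1 if ranked correctly, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: label 1 = gene belongs to the regulon, score = classifier output.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]

value = auroc(labels, scores)
print(round(value, 3))   # 0.917 (11 of 12 pairs ranked correctly)
print(value >= 0.8)      # True: this toy model would pass the 0.8 threshold
```

The pairwise form is quadratic in the number of examples, which is fine for a small illustration; rank-based formulas give the same value more efficiently on large datasets.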
Subject(s)
Escherichia coli , Regulon , Regulon/genetics , Escherichia coli/genetics , Escherichia coli/metabolism , Bacteria/genetics , Binding Sites/genetics , Promoter Regions, Genetic/genetics , Gene Expression Regulation, Bacterial/genetics , Bacterial Proteins/metabolism
ABSTRACT
Public gene expression databases are a rapidly expanding resource of organism responses to diverse perturbations, presenting both an opportunity and a challenge for bioinformatics workflows to extract actionable knowledge of transcription regulatory network function. Here, we introduce a five-step computational pipeline, called iModulonMiner, to compile, process, curate, analyze, and characterize the totality of RNA-seq data for a given organism or cell type. This workflow is centered around the data-driven computation of co-regulated gene sets using Independent Component Analysis, called iModulons, which have been shown to have broad applications. As a demonstration, we applied this workflow to generate the iModulon structure of Bacillus subtilis using all high-quality, publicly-available RNA-seq data. Using this structure, we predicted regulatory interactions for multiple transcription factors, identified groups of co-expressed genes that are putatively regulated by undiscovered transcription factors, and predicted properties of a recently discovered single-subunit phage RNA polymerase. We also present a Python package, PyModulon, with functions to characterize, visualize, and explore computed iModulons. The pipeline, available at https://github.com/SBRG/iModulonMiner, can be readily applied to diverse organisms to gain a rapid understanding of their transcriptional regulatory network structure and condition-specific activity.
ABSTRACT
While global transcription factors (TFs) have been studied extensively in Escherichia coli model strains, conservation and diversity in TF regulation between strains are still unknown. Here we combine ChIP-exo, to define ferric uptake regulator (Fur) binding sites, with differential gene expression analysis to define the Fur regulon in nine E. coli strains. We then define a pan-regulon consisting of 469 target genes that includes all Fur target genes in all nine strains. The pan-regulon is then divided into the core regulon (target genes found in all the strains, n = 36), the accessory regulon (target genes found in two to eight strains, n = 158) and the unique regulon (target genes found in one strain, n = 275). Thus, there is a small set of Fur-regulated genes common to all nine strains, but a large number of regulatory targets unique to a particular strain. Many of the unique regulatory targets are genes unique to that strain. This first-established pan-regulon reveals a common core of conserved regulatory targets and significant diversity in transcriptional regulation amongst E. coli strains, reflecting diverse niche specification and strain history.
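The core/accessory/unique split described above can be sketched as a simple count of how many strains each target gene appears in. The strain and gene names below are placeholders and the function name `partition_pan_regulon` is an assumption made for illustration; the sketch is not the authors' pipeline and does not use their data.

```python
from collections import Counter
from typing import Dict, Set, Tuple

def partition_pan_regulon(
    targets_by_strain: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split the union of target genes into (core, accessory, unique) sets."""
    n_strains = len(targets_by_strain)
    if n_strains < 2:
        raise ValueError("need at least two strains to separate core from unique")
    # Count, for every gene, the number of strains in which it is a target.
    counts = Counter(g for targets in targets_by_strain.values() for g in set(targets))
    core = {g for g, c in counts.items() if c == n_strains}   # present in all strains
    unique = {g for g, c in counts.items() if c == 1}         # present in one strain
    accessory = set(counts) - core - unique                   # everything in between
    return core, accessory, unique

# Placeholder input: three hypothetical strains with hypothetical target genes.
targets = {
    "strain_1": {"gene_a", "gene_b", "gene_c"},
    "strain_2": {"gene_a", "gene_b", "gene_d"},
    "strain_3": {"gene_a", "gene_e"},
}

core, accessory, unique = partition_pan_regulon(targets)
print(sorted(core))        # ['gene_a']
print(sorted(accessory))   # ['gene_b']
print(sorted(unique))      # ['gene_c', 'gene_d', 'gene_e']
```

The three sets are disjoint and together make up the pan-regulon, which matches the counts reported in the abstract (36 + 158 + 275 = 469).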
Subject(s)
Escherichia coli Proteins , Escherichia coli , Regulon , Repressor Proteins , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Expression Regulation, Bacterial , Iron/metabolism , Regulon/genetics , Repressor Proteins/genetics , Repressor Proteins/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Transcription Factors
ABSTRACT
Transcriptomic data is accumulating rapidly; thus, scalable methods for extracting knowledge from this data are critical. Here, we assembled a top-down expression and regulation knowledge base for Escherichia coli. The expression component is a 1035-sample, high-quality RNA-seq compendium consisting of data generated in our lab using a single experimental protocol. The compendium contains diverse growth conditions, including: 9 media; 39 supplements, including antibiotics; 42 heterologous proteins; and 76 gene knockouts. Using this resource, we elucidated global expression patterns. We used machine learning to extract 201 modules that account for 86% of known regulatory interactions, creating the regulatory component. With these modules, we identified two novel regulons and quantified systems-level regulatory responses. We also integrated 1675 curated, publicly-available transcriptomes into the resource. We demonstrated workflows for analyzing new data against this knowledge base via deconstruction of regulation during aerobic transition. This resource illuminates the E. coli transcriptome at scale and provides a blueprint for top-down transcriptomic analysis of non-model organisms.
Subject(s)
Escherichia coli , Knowledge Bases , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Gene Expression Profiling , Gene Expression Regulation, Bacterial , Transcriptome
ABSTRACT
UV radiation (UVR) has significant physiological effects on organisms living at or near the Earth's surface, yet the full suite of genes required for fitness of a photosynthetic organism in a UVR-rich environment remains unknown. This study reports a genome-wide fitness assessment of the genes that affect UVR tolerance under environmentally relevant UVR dosages in the model cyanobacterium Synechococcus elongatus PCC 7942. Our results highlight the importance of specific genes that encode proteins involved in DNA repair, glutathione synthesis, and the assembly and maintenance of photosystem II, as well as genes that encode hypothetical proteins and others without an obvious connection to canonical methods of UVR tolerance. Disruption of a gene that encodes a leucyl aminopeptidase (LAP) conferred the greatest UVR-specific decrease in fitness. Enzymatic assays demonstrated a strong pH-dependent affinity of the LAP for the dipeptide cysteinyl-glycine, suggesting an involvement in glutathione catabolism as a function of night-time cytosolic pH level. A low differential expression of the LAP gene under acute UVR exposure suggests that its relative importance would be overlooked in transcript-dependent screens. Subsequent experiments revealed a similar UVR-sensitivity phenotype in LAP knockouts of other organisms, indicating conservation of the functional role of LAPs in UVR tolerance.
Subject(s)
Leucyl Aminopeptidase , Ultraviolet Rays , Photosynthesis/radiation effects , DNA Repair , Glutathione
ABSTRACT
Human infections with methicillin-resistant Staphylococcus aureus (MRSA) are commonly treated with vancomycin, and strains with decreased susceptibility, designated as vancomycin-intermediate S. aureus (VISA), are associated with treatment failure. Here, we profiled the phenotypic, mutational, and transcriptional landscape of 10 VISA strains adapted by laboratory evolution from one common MRSA ancestor, the USA300 strain JE2. Using functional and independent component analysis, we found that: 1) despite the common genetic background and environmental conditions, the mutational landscape diverged between evolved strains and included mutations previously associated with vancomycin resistance (in vraT, graS, vraFG, walKR, and rpoBCD) as well as novel adaptive mutations (SAUSA300_RS04225, ssaA, pitAR, and sagB); 2) the first wave of mutations affected transcriptional regulators and the second affected genes involved in membrane biosynthesis; 3) expression profiles were predominantly strain-specific except for sceD and lukG, which were the only two genes significantly differentially expressed in all clones; 4) three independent virulence systems (φSa3, SaeR, and T7SS) featured as the most transcriptionally perturbed gene sets across clones; 5) there was a striking variation in oxacillin susceptibility across the evolved lineages (from a 10-fold increase to a 63-fold decrease) that also arose in clinical MRSA isolates exposed to vancomycin and correlated with susceptibility to teichoic acid inhibitors; and 6) constitutive expression of the VraR regulon explained cross-susceptibility, while mutations in walK were associated with cross-resistance. Our results show that adaptation to vancomycin involves a surprising breadth of mutational and transcriptional pathways that affect antibiotic susceptibility and possibly the clinical outcome of infections.
Subject(s)
Anti-Bacterial Agents , Methicillin-Resistant Staphylococcus aureus , Staphylococcal Infections , Staphylococcus aureus , Vancomycin Resistance , Vancomycin , Anti-Bacterial Agents/chemistry , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/therapeutic use , Evolution, Molecular , Humans , Methicillin-Resistant Staphylococcus aureus/metabolism , Microbial Sensitivity Tests , Oxacillin/chemistry , Oxacillin/pharmacology , Staphylococcal Infections/drug therapy , Staphylococcus aureus/drug effects , Staphylococcus aureus/genetics , Staphylococcus aureus/pathogenicity , Vancomycin/chemistry , Vancomycin/pharmacology , Vancomycin/therapeutic use , Vancomycin Resistance/genetics , Virulence/genetics
ABSTRACT
Combatting Clostridioides difficile infections, a dominant cause of hospital-associated infections with incidence and resulting deaths increasing worldwide, is complicated by the frequent emergence of new virulent strains. Here, we employ whole-genome sequencing, high-throughput phenotypic screenings, and genome-scale models of metabolism to evaluate the genetic diversity of 451 strains of C. difficile. Constructing the C. difficile pangenome based on this set revealed 9,924 distinct gene clusters, of which 2,899 (29%) are defined as core, 2,968 (30%) are defined as unique, and the remaining 4,057 (41%) are defined as accessory. We develop a strain typing method, sequence typing by accessory genome (STAG), that identifies 176 genetically distinct groups of strains and allows for explicit interrogation of accessory gene content. Thirty-five strains representative of the overall set were experimentally profiled on 95 different nutrient sources, revealing 26 distinct growth profiles and unique nutrient preferences; 451 strain-specific genome scale models of metabolism were constructed, allowing us to computationally probe phenotypic diversity in 28,864 unique conditions. The models create a mechanistic link between the observed phenotypes and strain-specific genetic differences and exhibit an ability to correctly predict growth in 76% of measured cases. The typing and model predictions are used to identify and contextualize discriminating genetic features and phenotypes that may contribute to the emergence of new problematic strains.
Subject(s)
Clostridioides difficile , Cross Infection , Clostridioides , Clostridioides difficile/genetics , Genetic Variation , Humans , Systems Biology
ABSTRACT
Understanding diverse bacterial nutritional requirements and responses is foundational in microbial research and biotechnology. In this study, we employed knowledge-enriched transcriptomic analytics to decipher complex stress responses of Vibrio natriegens to supplied nutrients, aiming to enhance microbial engineering efforts. We computed 64 independently modulated gene sets that comprise a quantitative basis for transcriptome dynamics across a comprehensive transcriptomics dataset containing a broad array of nutrient conditions. Our approach led to i) the identification of novel transporter systems for diverse substrates, ii) a detailed understanding of how trace elements affect metabolism and growth, and iii) an extensive characterization of nutrient-induced stress responses, including osmotic stress, low glycolytic flux, proteostasis, and altered protein expression. By clarifying the relationship between the acetate-associated regulon and the glycolytic flux status of various nutrients, we have showcased its vital role in directing optimal carbon source selection. Our findings offer deep insights into the transcriptional landscape of bacterial nutrition and underscore its significance in tailoring strain engineering strategies, thereby facilitating the development of more efficient and robust microbial systems for biotechnological applications.