Results 1 - 20 of 545
1.
NAR Genom Bioinform ; 6(2): lqae041, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38774514

ABSTRACT

Microbial genome sequences are rapidly accumulating, enabling large-scale studies of sequence variation. Existing studies primarily focus on coding regions to study amino acid substitution patterns in proteins. However, non-coding regulatory regions also play a distinct role in determining physiologic responses. To investigate intergenic sequence variation at large scale, we identified non-coding regulatory region alleles across 2350 Escherichia coli strains. This 'alleleome' consists of 117 781 unique alleles for 1169 reference regulatory regions (transcribing 1975 genes) at single base-pair resolution. We find that 64% of nucleotide positions are invariant, and variant positions vary in a median of just 0.6% of strains. Additionally, non-coding alleles are sufficient to recover E. coli phylogroups. We find that core promoter elements and transcription factor binding sites are significantly conserved, especially those located upstream of essential or highly expressed genes. However, variability in conservation of transcription factor binding sites is significant both within and across regulons. Finally, we contrast mutations acquired during adaptive laboratory evolution with wild-type variation, finding that the former preferentially alter positions that the latter conserves. Overall, this analysis elucidates the wealth of information found in E. coli non-coding sequence variation and expands pangenomic studies to non-coding regulatory regions at single-nucleotide resolution.
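
As a rough illustration of the per-position statistics quoted in this abstract (the share of invariant positions and the fraction of strains that differ at each variant position), the following minimal sketch works on a toy alignment. The sequences, the equal-length assumption, and the function name `position_variation` are hypothetical and are not taken from the study's pipeline.

```python
from statistics import median


def position_variation(reference, alleles):
    """Return, per position, the fraction of strains whose base differs from the reference.

    Assumes every allele is already aligned to the reference and has the same
    length (a simplification: real alleles also contain insertions and deletions).
    """
    if any(len(a) != len(reference) for a in alleles):
        raise ValueError("all alleles must match the reference length")
    n = len(alleles)
    return [sum(a[i] != ref_base for a in alleles) / n
            for i, ref_base in enumerate(reference)]


# Toy data (hypothetical): one reference regulatory region and four strain alleles.
reference = "ACGTAC"
alleles = ["ACGTAC", "ACGTAC", "ACGAAC", "TCGAAC"]

fractions = position_variation(reference, alleles)
variant = [f for f in fractions if f > 0]

# Share of positions at which no strain differs from the reference.
invariant_share = 1 - len(variant) / len(fractions)
# Median, over variant positions only, of the fraction of strains that differ.
median_variant_fraction = median(variant)
```

On real data the same two summaries would be computed over all reference regions rather than a single six-base toy sequence.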

2.
Metab Eng Commun ; 18: e00234, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38711578

ABSTRACT

Kinetic models of metabolism are promising platforms for studying complex metabolic systems and designing production strains. Given the availability of enzyme kinetic data from historical experiments and machine learning estimation tools, a straightforward modeling approach is to assemble kinetic data enzyme by enzyme until a desired scale is reached. However, this type of 'bottom up' parameterization of kinetic models has been difficult due to a number of issues including gaps in kinetic parameters, the complexity of enzyme mechanisms, inconsistencies between parameters obtained from different sources, and in vitro-in vivo differences. Here, we present a computational workflow for the robust estimation of kinetic parameters for detailed mass action enzyme models while taking into account parameter uncertainty. The resulting software package, termed MASSef (the Mass Action Stoichiometry Simulation Enzyme Fitting package), can handle standard 'macroscopic' kinetic parameters, including Km, kcat, Ki, Keq, and nh, as well as diverse reaction mechanisms defined in terms of mass action reactions and 'microscopic' rate constants. We provide three enzyme case studies demonstrating that this approach can identify and reconcile inconsistent data either within in vitro experiments or between in vitro and in vivo enzyme function. We further demonstrate how parameterized enzyme modules can be used to assemble pathway-scale kinetic models consistent with in vivo behavior. This work builds on the legacy of knowledge on kinetic behavior of enzymes by enabling robust parameterization of enzyme kinetic models at scale utilizing the abundance of historical literature data and machine learning parameter estimates.
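
To make the distinction between 'microscopic' rate constants and 'macroscopic' parameters concrete, here is a minimal sketch for the simplest textbook mechanism, E + S <-> ES -> E + P, under the quasi-steady-state assumption. It is not the MASSef API; the function names and the numerical rate constants are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def macroscopic_from_microscopic(k_on, k_off, k_cat):
    """Macroscopic (Km, kcat) for E + S <-> ES -> E + P.

    Under the quasi-steady-state assumption, Km = (k_off + k_cat) / k_on,
    and kcat is simply the rate constant of the catalytic step.
    """
    km = (k_off + k_cat) / k_on
    return km, k_cat


def michaelis_menten_rate(s, e_total, km, kcat):
    """Michaelis-Menten rate law v = kcat * E_total * S / (Km + S)."""
    return kcat * e_total * s / (km + s)


# Hypothetical microscopic rate constants (units: 1/(mM*s), 1/s, 1/s).
k_on, k_off, k_cat = 2.0, 3.0, 1.0

km, kcat = macroscopic_from_microscopic(k_on, k_off, k_cat)

# At S = Km the rate is half of Vmax = kcat * E_total.
rate_at_km = michaelis_menten_rate(km, 1.0, km, kcat)
```

More detailed mechanisms have more microscopic constants than macroscopic ones, which is one reason fitting them requires handling parameter uncertainty, as the abstract notes.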

3.
bioRxiv ; 2024 Apr 28.
Article in English | MEDLINE | ID: mdl-38712217

ABSTRACT

A critical body of knowledge has developed through advances in protein microscopy, protein-fold modeling, structural biology software, availability of sequenced bacterial genomes, large-scale mutation databases, and genome-scale models. Based on these recent advances, we develop a computational framework that: i) identifies the oligomeric structural proteome encoded by an organism's genome from available structural resources; ii) maps multi-strain alleleomic variation, resulting in the structural proteome for a species; and iii) calculates the 3D orientation of proteins across subcellular compartments with residue-level precision. Using the platform, we: iv) compute the quaternary E. coli K-12 MG1655 structural proteome; v) use a dataset of 12,000 mutations to build Random Forest classifiers that can predict the severity of mutations; and, in combination with a genome-scale model that computes proteome allocation, vi) obtain the spatial allocation of the E. coli proteome. Thus, in conjunction with relevant datasets and increasingly accurate computational models, we can now annotate quaternary structural proteomes, at genome scale, to obtain a molecular-level understanding of whole-cell functions.

Significance: Advancements in experimental and computational methods have revealed the shapes of multi-subunit proteins. The absence of a unified platform that maps actionable datatypes onto these increasingly accurate structures creates a barrier to structural analyses, especially at the genome scale. Here, we describe QSPACE, a computational annotation platform that evaluates existing resources to identify the best-available structure for each protein in a user's query, maps the 3D location of actionable datatypes (e.g., active sites, published mutations) onto the selected structures, and uses third-party APIs to determine the subcellular compartment of all amino acids of a protein. As proof-of-concept, we deployed QSPACE to generate the quaternary structural proteome of E. coli MG1655 and demonstrate two use-cases involving large-scale mutant analysis and genome-scale modelling.

4.
Metab Eng ; 83: 160-171, 2024 May.
Article in English | MEDLINE | ID: mdl-38636729

ABSTRACT

Microbes have inherent capacities for utilizing various carbon sources; however, they often exhibit sub-par fitness due to low metabolic efficiency. To test whether a bacterial strain can optimally utilize multiple carbon sources, Escherichia coli was serially evolved in L-lactate and glycerol. This yielded two end-point strains that evolved first in L-lactate then in glycerol, and vice versa. The end-point strains displayed a universal growth advantage on both single and mixed adaptive carbon sources, enabled by the concerted action of carbon source-specialist and generalist mutants. The combination of just four variants of glpK, ppsA, ydcI, and rph-pyrE accounted for more than 80% of end-point strain fitness. In addition, machine learning analysis revealed a coordinated activity of transcriptional regulators imparting condition-specific regulation of gene expression. The effectiveness of the serial adaptive laboratory evolution (ALE) scheme in bioproduction applications was assessed under single and mixed-carbon culture conditions, in which the serial ALE strain exhibited superior productivity of acetoin compared to ancestral strains. Together, systems-level analysis elucidated the molecular basis of serial evolution, which holds potential utility in bioproduction applications.


Subject(s)
Carbon, Directed Molecular Evolution, Escherichia coli, Escherichia coli/genetics, Escherichia coli/metabolism, Carbon/metabolism, Escherichia coli Proteins/genetics, Escherichia coli Proteins/metabolism, Glycerol/metabolism, Lactic Acid/metabolism, Metabolic Engineering

5.
Nucleic Acids Res ; 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38686794

ABSTRACT

Genome mining is revolutionizing natural products discovery efforts. The rapid increase in available genomes demands comprehensive computational platforms to effectively extract biosynthetic knowledge encoded across bacterial pangenomes. Here, we present BGCFlow, a novel systematic workflow integrating analytics for large-scale genome mining of bacterial pangenomes. BGCFlow incorporates several genome analytics and mining tools grouped into five common stages of analysis: (i) data selection, (ii) functional annotation, (iii) phylogenetic analysis, (iv) genome mining, and (v) comparative analysis. Furthermore, BGCFlow provides easy configuration of different projects, parallel distribution, scheduled job monitoring, an interactive database to visualize tables, exploratory Jupyter Notebooks, and customized reports. We demonstrate the application of BGCFlow by investigating the phylogenetic distribution of various biosynthetic gene clusters (BGCs) detected across 42 genomes of the Saccharopolyspora genus, known to produce industrially important secondary/specialized metabolites. The BGCFlow-guided analysis predicted more accurate dereplication of BGCs and guided the targeted comparative analysis of selected RiPPs. The scalable, interoperable, adaptable, re-entrant, and reproducible nature of BGCFlow will provide an effective novel way to extract biosynthetic knowledge from the ever-growing genomic datasets of biotechnologically relevant bacterial species.

6.
Nat Commun ; 15(1): 2356, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38490991

ABSTRACT

Machine learning applied to large compendia of transcriptomic data has enabled the decomposition of bacterial transcriptomes to identify independently modulated sets of genes; such iModulons represent specific cellular functions. The identification of iModulons enables accurate identification of genes necessary and sufficient for cross-species transfer of cellular functions. We demonstrate cross-species transfer of: 1) the biotransformation of vanillate to protocatechuate, 2) a malonate catabolic pathway, 3) a catabolic pathway for 2,3-butanediol, and 4) an antimicrobial resistance to ampicillin found in multiple Pseudomonas species to Escherichia coli. iModulon-based engineering is a transformative strategy as it includes all genes comprising the transferred cellular function, including genes without functional annotation. Adaptive laboratory evolution was deployed to optimize the cellular function transferred, revealing mutations in the host. Combining big data analytics and laboratory evolution thus enhances the level of understanding of systems biology and synthetic biology for strain design and development.


Subject(s)
Escherichia coli, Synthetic Biology, Escherichia coli/genetics, Escherichia coli/metabolism, Bacterial Genes, Pseudomonas/genetics

7.
Article in English | MEDLINE | ID: mdl-38439699

ABSTRACT

The demand for discovering novel microbial secondary metabolites is growing to address the limitations in bioactivities such as antibacterial, antifungal, anticancer, anthelmintic, and immunosuppressive functions. Among microbes, the genus Streptomyces holds particular significance for secondary metabolite discovery. Each Streptomyces species typically encodes approximately 30 secondary metabolite biosynthetic gene clusters (smBGCs) within its genome, which are mostly uncharacterized in terms of their products and bioactivities. The development of next-generation sequencing has enabled the identification of a large number of potent smBGCs for novel secondary metabolites, a number disproportionate to that of the secondary metabolites discovered so far. The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated (Cas) system has revolutionized the translation of enormous genomic potential into the discovery of secondary metabolites as the most efficient genetic engineering tool for Streptomyces. In this review, the current status of CRISPR/Cas applications in Streptomyces is summarized, with particular focus on the identification of secondary metabolite biosynthesis gene clusters and their potential applications. ONE-SENTENCE SUMMARY: This review summarizes the broad range of CRISPR/Cas applications in Streptomyces for natural product discovery and production.


Subject(s)
Biological Products, Streptomyces, Streptomyces/genetics, Streptomyces/metabolism, CRISPR-Cas Systems, Genetic Engineering, Bacterial Genome, Biological Products/metabolism, Gene Editing

8.
PLoS Comput Biol ; 20(2): e1011865, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38346086

ABSTRACT

Generalist microbes have adapted to a multitude of environmental stresses through their integrated stress response system. Individual stress responses have been quantified by E. coli metabolism and expression (ME) models under thermal, oxidative, and acid stress. However, the systematic quantification of cross-stress and cross-talk among these stress responses remains lacking. Here, we present StressME: the unified stress response model of E. coli combining thermal (FoldME), oxidative (OxidizeME) and acid (AcidifyME) stress responses. StressME is the most up-to-date ME model for E. coli and it reproduces all published single-stress ME models. Additionally, it includes refined rate constants to improve prediction accuracy for wild-type and stress-evolved strains. StressME revealed certain optimal proteome allocation strategies associated with cross-stress and cross-talk responses. These stress-optimal proteomes were shaped by trade-offs between protective vs. metabolic enzymes; cytoplasmic vs. periplasmic chaperones; and expression of stress-specific proteins. As StressME is tuned to compute metabolic and gene expression responses under mild acid, oxidative, and thermal stresses, it is useful for engineering and health applications. The modular design of our open-source package also facilitates model expansion (e.g., to new stress mechanisms) by the computational biology community.


Subject(s)
Escherichia coli Proteins, Escherichia coli, Escherichia coli/metabolism, Escherichia coli Proteins/genetics, Escherichia coli Proteins/metabolism, Physiological Stress/genetics, Oxidation-Reduction, Heat-Shock Proteins/metabolism, Acids/metabolism, Gene Expression

9.
mSystems ; 9(3): e0094223, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38323821

ABSTRACT

There is growing interest in engineering Pseudomonas putida KT2440 as a microbial chassis for the conversion of renewable and waste-based feedstocks, and metabolic engineering of P. putida relies on the understanding of the functional relationships between genes. In this work, independent component analysis (ICA) was applied to a compendium of existing fitness data from randomly barcoded transposon insertion sequencing (RB-TnSeq) of P. putida KT2440 grown in 179 unique experimental conditions. ICA identified 84 independent groups of genes, which we call fModules ("functional modules"), where gene members displayed shared functional influence in a specific cellular process. This machine learning-based approach both successfully recapitulated previously characterized functional relationships and established hitherto unknown associations between genes. Selected gene members from fModules for hydroxycinnamate metabolism and stress resistance, acetyl coenzyme A assimilation, and nitrogen metabolism were validated with engineered mutants of P. putida. Additionally, functional gene clusters from ICA of RB-TnSeq data sets were compared with regulatory gene clusters from prior ICA of RNAseq data sets to draw connections between gene regulation and function. Because ICA profiles the functional role of several distinct gene networks simultaneously, it can reduce the time required to annotate gene function relative to manual curation of RB-TnSeq data sets.

IMPORTANCE: This study demonstrates a rapid, automated approach for elucidating functional modules within complex genetic networks. While Pseudomonas putida randomly barcoded transposon insertion sequencing data were used as a proof of concept, this approach is applicable to any organism with existing functional genomics data sets and may serve as a useful tool for many valuable applications, such as guiding metabolic engineering efforts in other microbes or understanding functional relationships between virulence-associated genes in pathogenic microbes. Furthermore, this work demonstrates that comparison of data obtained from independent component analysis of transcriptomics and gene fitness datasets can elucidate regulatory-functional relationships between genes, which may have utility in a variety of applications, such as metabolic modeling, strain engineering, or identification of antimicrobial drug targets.


Subject(s)
Pseudomonas putida, Pseudomonas putida/genetics, Gene Regulatory Networks, Genomics

10.
mSystems ; 9(3): e0125723, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38349131

ABSTRACT

Limosilactobacillus reuteri, a probiotic microbe instrumental to human health and sustainable food production, adapts to diverse environmental shifts via dynamic gene expression. We applied independent component analysis (ICA) to 117 RNA-seq data sets to decode its transcriptional regulatory network (TRN), identifying 35 distinct signals that modulate specific gene sets. Our findings indicate that ICA provides a qualitative advancement and captures nuanced relationships within gene clusters that other methods may miss. This study uncovers the fundamental properties of L. reuteri's TRN and deepens our understanding of its arginine metabolism and the co-regulation of riboflavin metabolism and fatty acid conversion. It also sheds light on conditions that regulate genes within a specific biosynthetic gene cluster and allows for speculation on the potential role of isoprenoid biosynthesis in L. reuteri's adaptive response to environmental changes. By integrating transcriptomics and machine learning, we provide a system-level understanding of L. reuteri's response mechanism to environmental fluctuations, thus setting the stage for modeling the probiotic transcriptome for applications in microbial food production.

IMPORTANCE: We have studied Limosilactobacillus reuteri, a beneficial probiotic microbe that plays a significant role in our health and in the production of sustainable foods, which are nutritionally dense, healthier, and lower in carbon emissions than traditional foods. Similar to how humans adapt their lifestyles to different environments, this microbe adjusts its behavior by modulating the expression of genes. We applied machine learning to analyze large-scale data sets on how these genes behave across diverse conditions. From this, we identified 35 unique patterns demonstrating how L. reuteri adjusts its genes based on 50 unique environmental conditions (such as various sugars, salts, microbial cocultures, human milk, and fruit juice). This research helps us better understand how L. reuteri functions, especially in processes like breaking down certain nutrients and adapting to stressful changes. More importantly, our findings bring us closer to using this knowledge to improve how we produce more sustainable and healthier foods with the help of microbes.


Subject(s)
Limosilactobacillus reuteri, Probiotics, Humans, Limosilactobacillus reuteri/genetics, Gene Expression Profiling, Transcriptome/genetics, Machine Learning

11.
Trends Biotechnol ; 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38423803

ABSTRACT

Advances in systems and synthetic biology have propelled the construction of reduced bacterial genomes. Genome reduction was initially focused on exploring properties of minimal genomes, but more recently it has been deployed as an engineering strategy to enhance strain performance. This review provides the latest updates on reduced genomes, focusing on dual-track approaches of top-down reduction and bottom-up synthesis for their construction. Using cases from studies that are based on established industrial workhorse strains, we discuss the construction of a series of synthetic phenotypes that are candidates for biotechnological applications. Finally, we address the possible uses of reduced genomes for biotechnological applications and the needed future research directions that may ultimately lead to the total synthesis of rationally designed genomes.

12.
Article in English | MEDLINE | ID: mdl-38415197

ABSTRACT

Over the past two decades Biomedical Engineering has emerged as a major discipline that bridges societal needs of human health care with the development of novel technologies. Every medical institution is now equipped at varying degrees of sophistication with the ability to monitor human health in both non-invasive and invasive modes. The multiple scales at which human physiology can be interrogated provide a profound perspective on health and disease. We are at the nexus of creating "avatars" (herein defined as an extension of "digital twins") of human patho/physiology to serve as paradigms for interrogation and potential intervention. Motivated by the emergence of these new capabilities, the IEEE Engineering in Medicine and Biology Society, the Departments of Biomedical Engineering at Johns Hopkins University and Bioengineering at University of California at San Diego sponsored an interdisciplinary workshop to define the grand challenges that face biomedical engineering and the mechanisms to address these challenges. The Workshop identified five grand challenges with cross-cutting themes and provided a roadmap for new technologies, identified new training needs, and defined the types of interdisciplinary teams needed for addressing these challenges. The themes presented in this paper include: 1) accumedicine through creation of avatars of cells, tissues, organs and whole human; 2) development of smart and responsive devices for human function augmentation; 3) exocortical technologies to understand brain function and treat neuropathologies; 4) the development of approaches to harness the human immune system for health and wellness; and 5) new strategies to engineer genomes and cells.

13.
mSystems ; 9(2): e0060623, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38189271

ABSTRACT

Acinetobacter baumannii causes severe infections in humans, resists multiple antibiotics, and survives in stressful environmental conditions due to modulations of its complex transcriptional regulatory network (TRN). Unfortunately, our global understanding of the TRN in this emerging opportunistic pathogen is limited. Here, we apply independent component analysis, an unsupervised machine learning method, to a compendium of 139 RNA-seq data sets of three multidrug-resistant A. baumannii international clonal complex I strains (AB5075, AYE, and AB0057). This analysis allows us to define 49 independently modulated gene sets, which we call iModulons. Analysis of the identified A. baumannii iModulons reveals validating parallels to previously defined biological operons/regulons and provides a framework for defining unknown regulons. By utilizing the iModulons, we uncover potential mechanisms for a RpoS-independent general stress response, define global stress-virulence trade-offs, and identify conditions that may induce plasmid-borne multidrug resistance. The iModulons provide a model of the TRN that emphasizes the importance of transcriptional regulation of virulence phenotypes in A. baumannii. Furthermore, they suggest the possibility of future interventions to guide gene expression toward diminished pathogenic potential.

IMPORTANCE: The rise in hospital outbreaks of multidrug-resistant Acinetobacter baumannii infections underscores the urgent need for alternatives to traditional broad-spectrum antibiotic therapies. The success of A. baumannii as a significant nosocomial pathogen is largely attributed to its ability to resist antibiotics and survive environmental stressors. However, there is limited literature available on the global, complex regulatory circuitry that shapes these phenotypes. Computational tools that can assist in the elucidation of A. baumannii's transcriptional regulatory network architecture can provide much-needed context for a comprehensive understanding of pathogenesis and virulence, as well as for the development of targeted therapies that modulate these pathways.


Subject(s)
Acinetobacter Infections, Acinetobacter baumannii, Humans, Acinetobacter baumannii/genetics, Acinetobacter Infections/drug therapy, Virulence/genetics, Gene Expression Regulation, Anti-Bacterial Agents/pharmacology

14.
mSystems ; 9(2): e0100123, 2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38259168

ABSTRACT

Understanding the dynamics of biological systems in evolving environments is a challenge due to their scale and complexity. Here, we present a computational framework for the timescale decomposition of biochemical reaction networks to distill essential patterns from their intricate dynamics. This approach identifies timescale hierarchies, concentration pools, and coherent structures from time-series data, providing a system-level description of reaction networks at physiologically important timescales. We apply this technique to kinetic models of hypothetical and biological pathways, validating it by reproducing analytically characterized or previously known concentration pools of these pathways. Moreover, by analyzing the timescale hierarchy of the glycolytic pathway, we elucidate the connections between the stoichiometric and dissipative structures of reaction networks and the temporal organization of coherent structures. Specifically, we show that glycolysis is a cofactor-driven pathway, the slowest dynamics of which are described by a balance between high-energy phosphate bond and redox trafficking. Overall, this approach provides more biologically interpretable characterizations of network dynamics than large-scale kinetic models, thus facilitating model reduction and personalized medicine applications.

IMPORTANCE: Complex interactions within interconnected biochemical reaction networks enable cellular responses to a wide range of unpredictable environmental perturbations. Understanding how biological functions arise from these intricate interactions has been a long-standing problem in biology. Here, we introduce a computational approach to dissect complex biological systems' dynamics in evolving environments. This approach characterizes the timescale hierarchies of complex reaction networks, offering a system-level understanding at physiologically relevant timescales. Analyzing various hypothetical and biological pathways, we show how stoichiometric properties shape the way energy is dissipated throughout reaction networks. Notably, we establish that glycolysis operates as a cofactor-driven pathway, where the slowest dynamics are governed by a balance between high-energy phosphate bonds and redox trafficking. This approach enhances our understanding of network dynamics and facilitates the development of reduced-order kinetic models with biologically interpretable components.
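
For orientation, the classical linear-analysis view of timescale hierarchies and concentration pools can be written compactly. This is a sketch under assumptions: the paper works from time-series data, and the Jacobian-based formulation below is the standard textbook route, not necessarily the authors' procedure. The symbols $\mathbf{x}'$ (concentration deviations from a steady state), $\mathbf{J}$ (Jacobian), $\mathbf{M}$ (matrix of right eigenvectors), and $\mathbf{p}$ (pools) are introduced here for illustration.

```latex
% Linearized dynamics around a steady state and their modal decomposition
\frac{d\mathbf{x}'}{dt} = \mathbf{J}\,\mathbf{x}', \qquad
\mathbf{J} = \mathbf{M}\,\boldsymbol{\Lambda}\,\mathbf{M}^{-1}, \qquad
\boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1, \dots, \lambda_n)

% Pools are the modal coordinates; each relaxes on its own timescale
\mathbf{p} = \mathbf{M}^{-1}\mathbf{x}', \qquad
\frac{dp_i}{dt} = \lambda_i\,p_i
\;\Longrightarrow\;
p_i(t) = p_i(0)\,e^{\lambda_i t}, \qquad
\tau_i = -\frac{1}{\operatorname{Re}(\lambda_i)}
```

Each row of $\mathbf{M}^{-1}$ defines a linear combination of concentrations (a pool) that moves on timescale $\tau_i$; sorting the $\tau_i$ gives a timescale hierarchy, with the largest $\tau_i$ corresponding to the slowest dynamics.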


Subject(s)
Cell Physiological Phenomena, Glycolysis, Kinetics, Phosphates

15.
PLoS Comput Biol ; 20(1): e1011824, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38252668

ABSTRACT

The transcriptional regulatory network (TRN) of E. coli consists of thousands of interactions between regulators and DNA sequences. Regulons are typically determined either from resource-intensive experimental measurement of functional binding sites, or inferred from analysis of high-throughput gene expression datasets. Recently, independent component analysis (ICA) of RNA-seq compendia has been shown to be a powerful method for inferring bacterial regulons. However, it remains unclear to what extent regulons predicted by ICA structure have a biochemical basis in promoter sequences. Here, we address this question by developing machine learning models that predict inferred regulon structures in E. coli based on promoter sequence features. Models were constructed successfully (cross-validation AUROC >= 0.8) for 85% (40/47) of ICA-inferred E. coli regulons. We found that: 1) the presence of a high-scoring regulator motif in the promoter region was sufficient to specify regulatory activity in 40% (19/47) of the regulons; 2) additional features, such as DNA shape and extended motifs that can account for regulator multimeric binding, helped to specify regulon structure for the remaining 60% of regulons (28/47); and 3) investigating regulons where initial machine learning models failed revealed new regulator-specific sequence features that improved model accuracy. Finally, we found that strong regulatory binding sequences underlie both the genes shared between ICA-inferred and experimental regulons as well as genes in the E. coli core pan-regulon of Fur. This work demonstrates that the structure of ICA-inferred regulons largely can be understood through the strength of regulator binding sites in promoter regions, reinforcing the utility of top-down inference for regulon discovery.
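
The success criterion quoted in this abstract is a cross-validation AUROC of at least 0.8. As a minimal, dependency-free sketch of what that metric measures, the code below computes AUROC through its pairwise-ranking (Mann-Whitney) interpretation. The labels and scores are hypothetical held-out predictions for a single regulon, not data from the study, and `auroc` is an illustrative helper rather than the authors' code.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking identity.

    Equals the probability that a randomly chosen positive example is scored
    higher than a randomly chosen negative one; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions: label 1 means the gene belongs to the regulon.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]

value = auroc(labels, scores)     # 11 of the 12 positive/negative pairs are ranked correctly
passes_threshold = value >= 0.8   # the cutoff used in the abstract
```

The quadratic pair loop is adequate for a sketch; a sort-based rank computation would be the usual choice for genome-scale gene lists.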


Subject(s)
Escherichia coli, Regulon, Regulon/genetics, Escherichia coli/genetics, Escherichia coli/metabolism, Bacteria/genetics, Binding Sites/genetics, Genetic Promoter Regions/genetics, Bacterial Gene Expression Regulation/genetics, Bacterial Proteins/metabolism

16.
bioRxiv ; 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38260479

ABSTRACT

Mature red blood cells (RBCs) lack mitochondria, and thus exclusively rely on glycolysis to generate adenosine triphosphate (ATP) during aging in vivo and during storage in vitro in the blood bank. Here we identify an association between blood donor age, sex, ethnicity and end-of-storage levels of glycolytic metabolites in 13,029 volunteers from the Recipient Epidemiology and Donor Evaluation Study. Associations were also observed with ancestry-specific genetic polymorphisms in regions encoding phosphofructokinase 1, platelet (which we detected in mature RBCs), hexokinase 1, and ADP-ribosyl cyclase 1 and 2 (CD38/BST1). Gene-metabolite associations were validated in fresh and stored RBCs from 525 Diversity Outbred mice, and via multi-omics characterization of 1,929 samples from 643 human RBC units during storage. ATP levels, breakdown, and deamination into hypoxanthine were associated with hemolysis in vitro and in vivo, both in healthy autologous transfusion recipients and in 5,816 critically ill patients receiving heterologous transfusions.

Highlights: Blood donor age and sex affect glycolysis in stored RBCs from 13,029 volunteers; Ancestry, genetic polymorphisms in PFKP, HK1, CD38/BST1 influence RBC glycolysis; RBC PFKP boosts glycolytic fluxes when ATP is low, such as in stored RBCs; ATP and hypoxanthine are biomarkers of hemolysis in vitro and in vivo.

17.
Nat Commun ; 14(1): 7690, 2023 Nov 24.
Article in English | MEDLINE | ID: mdl-38001096

ABSTRACT

Surveillance programs for managing antimicrobial resistance (AMR) have yielded thousands of genomes suited for data-driven mechanism discovery. We present a workflow integrating pangenomics, gene annotation, and machine learning to identify AMR genes at scale. When applied to 12 species, 27,155 genomes, and 69 drugs, we 1) find AMR gene transfer mostly confined within related species, with 925 genes in multiple species but just eight in multiple phylogenetic classes, 2) demonstrate that discovery-oriented support vector machines outperform contemporary methods at recovering known AMR genes, recovering 263 genes compared to 145 by Pyseer, and 3) identify 142 AMR gene candidates. Validation of two candidates in E. coli BW25113 reveals cases of conditional resistance: ΔcycA confers ciprofloxacin resistance in minimal media with D-serine, and frdD V111D confers ampicillin resistance in the presence of ampC by modifying the overlapping promoter. We expect this approach to be adaptable to other species and phenotypes.


Subject(s)
Anti-Bacterial Agents, Escherichia coli, Anti-Bacterial Agents/pharmacology, Escherichia coli/genetics, Bacterial Drug Resistance/genetics, Phylogeny, Ciprofloxacin/pharmacology

18.
Nat Commun ; 14(1): 7370, 2023 Nov 14.
Article in English | MEDLINE | ID: mdl-37963869

ABSTRACT

Functional annotation of open reading frames in microbial genomes remains substantially incomplete. Enzymes constitute the most prevalent functional gene class in microbial genomes and can be described by their specific catalytic functions using the Enzyme Commission (EC) number. Consequently, the ability to predict EC numbers could substantially reduce the number of un-annotated genes. Here we present a deep learning model, DeepECtransformer, which utilizes transformer layers as a neural network architecture to predict EC numbers. Using the extensively studied Escherichia coli K-12 MG1655 genome, DeepECtransformer predicted EC numbers for 464 un-annotated genes. We experimentally validated the enzymatic activities predicted for three proteins (YgfF, YciO, and YjdM). Further examination of the neural network's reasoning process revealed that the trained neural network relies on functional motifs of enzymes to predict EC numbers. Thus, DeepECtransformer is a method that facilitates the functional annotation of uncharacterized genes.


Subject(s)
Deep Learning, Escherichia coli K12, Escherichia coli K12/genetics, Proteins/genetics, Genome, Escherichia coli/genetics, Molecular Sequence Annotation, Open Reading Frames

19.
Metabolites ; 13(11), 2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37999223

ABSTRACT

Pathway analysis is ubiquitous in biological data analysis due to the ability to integrate small simultaneous changes in functionally related components. While pathways are often defined based on either manual curation or network topological properties, an attractive alternative is to generate pathways around specific functions, in which metabolism can be defined as the production and consumption of specific metabolites. In this work, we present an algorithm, termed MetPath, that calculates pathways for condition-specific production and consumption of specific metabolites. We demonstrate that these pathways have several useful properties. Pathways calculated in this manner (1) take into account the condition-specific metabolic role of a gene product, (2) are localized around defined metabolic functions, and (3) quantitatively weigh the importance of expression to a function based on the flux contribution of the gene product. We demonstrate how these pathways elucidate network interactions between genes across different growth conditions and between cell types. Furthermore, the calculated pathways compare favorably to manually curated pathways in predicting the expression correlation between genes. To facilitate the use of these pathways, we have generated a large compendium of pathways under different growth conditions for E. coli. The MetPath algorithm provides a useful tool for metabolic network-based statistical analyses of high-throughput data.

20.
Metabolites ; 13(11), 2023 Nov 11.
Article in English | MEDLINE | ID: mdl-37999241

ABSTRACT

Red blood cells (RBCs) are abundant (more than 80% of the total cells in the human body), yet relatively simple, as they lack nuclei and organelles, including mitochondria. Since the earliest days of biochemistry, the accessibility of blood and RBCs has made them an ideal matrix for the characterization of metabolism. Because of this, investigations into RBC metabolism are highly relevant for research and diagnostic purposes in scientific and clinical endeavors. The relative simplicity of RBCs has made them a suitable model for the development of reconstruction maps of eukaryotic cell metabolism since the early days of systems biology. Computational models hold the potential to deepen knowledge of RBC metabolism and, foremost, to predict in silico RBC metabolic behaviors in response to environmental stimuli. Here, we review now-classic concepts on RBC metabolism, prior work in systems biology of unicellular organisms, and how this work paved the way for the development of reconstruction models of RBC metabolism. Translationally, we discuss how the fields of metabolomics and systems biology have generated evidence to advance our understanding of the RBC storage lesion, a process of decline in storage quality that impacts over a hundred million blood units transfused every year.
