Results 1 - 20 of 99
1.
PLoS Comput Biol ; 20(5): e1012100, 2024 May.
Article in English | MEDLINE | ID: mdl-38768223

ABSTRACT

The activities of most enzymes and drugs depend on interactions between proteins and small molecules. Accurate prediction of these interactions could greatly accelerate pharmaceutical and biotechnological research. Current machine learning models designed for this task have a limited ability to generalize beyond the proteins used for training. This limitation is likely due to a lack of information exchange between the protein and the small molecule during the generation of the required numerical representations. Here, we introduce ProSmith, a machine learning framework that employs a multimodal Transformer Network to simultaneously process protein amino acid sequences and small molecule strings in the same input. This approach facilitates the exchange of all relevant information between the two molecule types during the computation of their numerical representations, allowing the model to account for their structural and functional interactions. Our final model combines gradient boosting predictions based on the resulting multimodal Transformer Network with independent predictions based on separate deep learning representations of the proteins and small molecules. The resulting predictions outperform recently published state-of-the-art models for predicting protein-small molecule interactions across three diverse tasks: predicting kinase inhibitions; inferring potential substrates for enzymes; and predicting Michaelis constants KM. The Python code provided can be used to easily implement and improve machine learning predictions involving arbitrary protein-small molecule interactions.


Subject(s)
Computational Biology , Machine Learning , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/chemistry , Substrate Specificity , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Proteins/metabolism , Proteins/chemistry , Amino Acid Sequence , Deep Learning , Protein Binding , Protein Kinases/metabolism , Protein Kinases/chemistry , Humans
2.
Nat Biotechnol ; 42(1): 18-19, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38191661
3.
Science ; 382(6671): 653, 2023 11 10.
Article in English | MEDLINE | ID: mdl-37943930

ABSTRACT

A philosopher reflects on his influential interrogations of free will, consciousness, and artificial intelligence.

4.
mSystems ; 8(5): e0076023, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37795991

ABSTRACT

IMPORTANCE: Protein translation is the most expensive cellular process in fast-growing bacteria, and efficient proteome usage should thus be under strong natural selection. However, recent studies show that a considerable part of the proteome is unneeded for instantaneous cell growth in Escherichia coli. We still lack a systematic understanding of how this excess proteome is distributed across different pathways as a function of the growth conditions. We estimated the minimal required proteome across growth conditions in E. coli and compared the predictions with experimental data. We found that the proteome allocated to the most expensive internal pathways, including translation and the synthesis of amino acids and cofactors, is near the minimally required levels. In contrast, transporters and central carbon metabolism show much higher proteome levels than the predicted minimal abundance. Our analyses show that the proteome fraction unneeded for instantaneous cell growth decreases along the nutrient flow in E. coli.


Subject(s)
Escherichia coli Proteins , Escherichia coli , Escherichia coli/genetics , Proteome/chemistry , Metabolic Networks and Pathways , Escherichia coli Proteins/chemistry , Energy Metabolism
5.
Nat Commun ; 14(1): 4139, 2023 07 12.
Article in English | MEDLINE | ID: mdl-37438349

ABSTRACT

The turnover number kcat, a measure of enzyme efficiency, is central to understanding cellular physiology and resource allocation. As experimental kcat estimates are unavailable for the vast majority of enzymatic reactions, the development of accurate computational prediction methods is highly desirable. However, existing machine learning models are limited to a single, well-studied organism, or they provide inaccurate predictions except for enzymes that are highly similar to proteins in the training set. Here, we present TurNuP, a general and organism-independent model that successfully predicts turnover numbers for natural reactions of wild-type enzymes. We constructed model inputs by representing complete chemical reactions through differential reaction fingerprints and by representing enzymes through a modified and re-trained Transformer Network model for protein sequences. TurNuP outperforms previous models and generalizes well even to enzymes that are not similar to proteins in the training set. Parameterizing metabolic models with TurNuP-predicted kcat values leads to improved proteome allocation predictions. To provide a powerful and convenient tool for the study of molecular biochemistry and physiology, we implemented a TurNuP web server.


Subject(s)
Deep Learning , Amino Acid Sequence , Electric Power Supplies , Machine Learning , Proteome
6.
PLoS Comput Biol ; 19(6): e1011177, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37307285

ABSTRACT

A substantial fraction of the bacterial cytosol is occupied by catalysts and their substrates. While a higher volume density of catalysts and substrates might boost biochemical fluxes, the resulting molecular crowding can slow down diffusion, perturb the reactions' Gibbs free energies, and reduce the catalytic efficiency of proteins. Due to these tradeoffs, dry mass density likely possesses an optimum that facilitates maximal cellular growth and that is interdependent on the cytosolic molecule size distribution. Here, we analyze the balanced growth of a model cell, accounting systematically for crowding effects on reaction kinetics. Its optimal cytosolic volume occupancy depends on the nutrient-dependent resource allocation into large ribosomal vs. small metabolic macromolecules, reflecting a tradeoff between the saturation of metabolic enzymes, favoring larger occupancies with higher encounter rates, and the inhibition of the ribosomes, favoring lower occupancies with unhindered diffusion of tRNAs. Our predictions across growth rates are quantitatively consistent with the experimentally observed reduction in volume occupancy on rich media compared to minimal media in E. coli. Strong deviations from optimal cytosolic occupancy only lead to minute reductions in growth rate, which are nevertheless evolutionarily relevant due to large bacterial population sizes. In sum, cytosolic density variation in bacterial cells appears to be consistent with an optimality principle of cellular efficiency.


Subject(s)
Biochemical Phenomena , Escherichia coli , Escherichia coli/metabolism , Ribosomes/metabolism , Kinetics , Cell Proliferation
7.
PLoS Comput Biol ; 19(6): e1011156, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37279246

ABSTRACT

The physiology of biological cells evolved under physical and chemical constraints, such as mass conservation across the network of biochemical reactions, nonlinear reaction kinetics, and limits on cell density. For unicellular organisms, the fitness that governs this evolution is mainly determined by the balanced cellular growth rate. We previously introduced growth balance analysis (GBA) as a general framework to model and analyze such nonlinear systems, revealing important analytical properties of optimal balanced growth states. It has been shown that at optimality, only a minimal subset of reactions can have nonzero flux. However, no general principles have been established to determine if a specific reaction is active at optimality. Here, we extend the GBA framework to study the optimality of each biochemical reaction, and we identify the mathematical conditions determining whether a reaction is active or not at optimal growth in a given environment. We reformulate the mathematical problem in terms of a minimal number of dimensionless variables and use the Karush-Kuhn-Tucker (KKT) conditions to identify fundamental principles of optimal resource allocation in GBA models of any size and complexity. Our approach helps to identify from first principles the economic values of biochemical reactions, expressed as marginal changes in cellular growth rate; these economic values can be related to the costs and benefits of proteome allocation into the reactions' catalysts. Our formulation also generalizes the concepts of Metabolic Control Analysis to models of growing cells. We show how the extended GBA framework unifies and extends previous approaches of cellular modeling and analysis, putting forward a program to analyze cellular growth through the stationarity conditions of a Lagrangian function. GBA thereby provides a general theoretical toolbox for the study of fundamental mathematical properties of balanced cellular growth.


Subject(s)
Models, Biological , Cell Proliferation , Kinetics , Cell Cycle
8.
Nat Commun ; 14(1): 2787, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37188731

ABSTRACT

For most proteins annotated as enzymes, it is unknown which primary and/or secondary reactions they catalyze. Experimental characterizations of potential substrates are time-consuming and costly. Machine learning predictions could provide an efficient alternative, but are hampered by a lack of information regarding enzyme non-substrates, as available training data comprises mainly positive examples. Here, we present ESP, a general machine-learning model for the prediction of enzyme-substrate pairs with an accuracy of over 91% on independent and diverse test data. ESP can be applied successfully across widely different enzymes and a broad range of metabolites included in the training data, outperforming models designed for individual, well-studied enzyme families. ESP represents enzymes through a modified transformer model, and is trained on data augmented with randomly sampled small molecules assigned as non-substrates. By facilitating easy in silico testing of potential substrates, the ESP web server may support both basic and applied science.


Subject(s)
Deep Learning , Proteins , Machine Learning , Support Vector Machine , Catalysis
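
The data augmentation step mentioned in the abstract of entry 8 (randomly sampled small molecules assigned as non-substrates) can be illustrated with a minimal sketch. The function name, the `molecule_pool` argument, and the one-negative-per-positive default are assumptions made for illustration; they are not taken from the ESP code, and the published sampling scheme may apply further criteria.

```python
import random


def add_sampled_negatives(positive_pairs, molecule_pool, n_neg_per_pos=1, seed=0):
    """Return (enzyme, molecule, label) triples: known positives plus sampled negatives.

    Hypothetical helper: label 1 marks a recorded enzyme-substrate pair,
    label 0 marks a randomly sampled molecule assumed to be a non-substrate.
    """
    rng = random.Random(seed)
    used = set(positive_pairs)  # pairs that must not be drawn as negatives
    dataset = [(enzyme, molecule, 1) for enzyme, molecule in positive_pairs]
    for enzyme, _ in positive_pairs:
        # Candidates are molecules not yet paired with this enzyme.
        candidates = [m for m in molecule_pool if (enzyme, m) not in used]
        k = min(n_neg_per_pos, len(candidates))
        for molecule in rng.sample(candidates, k):
            dataset.append((enzyme, molecule, 0))
            used.add((enzyme, molecule))  # avoid duplicate negatives
    return dataset
```

Sampled negatives are only presumed non-substrates, so some label noise is expected with this kind of augmentation.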
9.
Nat Biotechnol ; 41(4): 450-451, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36973558
10.
Plant Physiol ; 191(2): 1214-1233, 2023 02 12.
Article in English | MEDLINE | ID: mdl-36423222

ABSTRACT

Reactive carbonyl species (RCS) such as methylglyoxal (MGO) and glyoxal (GO) are highly reactive, unwanted side-products of cellular metabolism maintained at harmless intracellular levels by specific scavenging mechanisms. MGO and GO are metabolized through the glyoxalase (GLX) system, which consists of two enzymes acting in sequence, GLXI and GLXII. While plant genomes encode a number of different GLX isoforms, their specific functions and how they arose during evolution are unclear. Here, we used Arabidopsis (Arabidopsis thaliana) as a model species to investigate the evolutionary history of GLXI and GLXII in plants and whether the GLX system can protect plant cells from the toxicity of RCS other than MGO and GO. We show that plants possess two GLX systems of different evolutionary origins and with distinct structural and functional properties. The first system is shared by all eukaryotes, scavenges MGO and GO, especially during seedling establishment, and features Zn2+-type GLXI proteins with a metal cofactor preference that were present in the last eukaryotic common ancestor. GLXI and GLXII of the second system, featuring Ni2+-type GLXI, were acquired by the last common ancestor of Viridiplantae through horizontal gene transfer from proteobacteria and can together metabolize keto-D-glucose (KDG, glucosone), a glucose-derived RCS, to D-gluconate. When plants displaying loss-of-function of a Viridiplantae-specific GLXI were grown in KDG, D-gluconate levels were reduced to 10%-15% of those in the wild type, while KDG levels showed an increase of 48%-67%. In contrast to bacterial GLXI homologs, which are active as dimers, plant Ni2+-type GLXI proteins contain a domain duplication, are active as monomers, and have a modified second active site. The acquisition and neofunctionalization of a structurally, biochemically, and functionally distinct GLX system indicates that Viridiplantae are under strong selection to detoxify diverse RCS.


Subject(s)
Arabidopsis , Lactoylglutathione Lyase , Magnesium Oxide , Lactoylglutathione Lyase/chemistry , Lactoylglutathione Lyase/genetics , Lactoylglutathione Lyase/metabolism , Protein Isoforms/genetics , Arabidopsis/genetics , Arabidopsis/metabolism
11.
Mol Syst Biol ; 18(9): e10490, 2022 09.
Article in English | MEDLINE | ID: mdl-36124745

ABSTRACT

Dose-response relationships are a general concept for quantitatively describing biological systems across multiple scales, from the molecular to the whole-cell level. A clinically relevant example is the bacterial growth response to antibiotics, which is routinely characterized by dose-response curves. The shape of the dose-response curve varies drastically between antibiotics and plays a key role in treatment, drug interactions, and resistance evolution. However, the mechanisms shaping the dose-response curve remain largely unclear. Here, we show in Escherichia coli that the distinctively shallow dose-response curve of the antibiotic trimethoprim is caused by a negative growth-mediated feedback loop: Trimethoprim slows growth, which in turn weakens the effect of this antibiotic. At the molecular level, this feedback is caused by the upregulation of the drug target dihydrofolate reductase (FolA/DHFR). We show that this upregulation is not a specific response to trimethoprim but follows a universal trend line that depends primarily on the growth rate, irrespective of its cause. Rewiring the feedback loop alters the dose-response curve in a predictable manner, which we corroborate using a mathematical model of cellular resource allocation and growth. Our results indicate that growth-mediated feedback loops may shape drug responses more generally and could be exploited to design evolutionary traps that enable selection against drug resistance.


Subject(s)
Anti-Bacterial Agents , Tetrahydrofolate Dehydrogenase , Anti-Bacterial Agents/pharmacology , Escherichia coli/genetics , Feedback , Tetrahydrofolate Dehydrogenase/genetics , Tetrahydrofolate Dehydrogenase/pharmacology , Trimethoprim/pharmacology
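
The notion of a "shallow" dose-response curve in the abstract of entry 11 can be made concrete with a Hill-type function, a common parameterization in which the exponent sets the steepness. This is a generic sketch, not the resource allocation model used in the paper; the function name and parameter values are assumptions for illustration.

```python
def relative_growth(drug_conc, ic50, hill_coeff):
    """Growth rate relative to the drug-free rate, using a Hill-type curve.

    ic50 is the concentration that halves growth; a smaller hill_coeff
    gives a shallower curve (assumed parameterization, not from the paper).
    """
    if drug_conc < 0 or ic50 <= 0 or hill_coeff <= 0:
        raise ValueError("drug_conc must be >= 0; ic50 and hill_coeff must be > 0")
    return 1.0 / (1.0 + (drug_conc / ic50) ** hill_coeff)


# Compare a shallow curve (exponent 1) and a steep one (exponent 4)
# at twice the half-inhibitory concentration.
shallow = relative_growth(2.0, ic50=1.0, hill_coeff=1.0)
steep = relative_growth(2.0, ic50=1.0, hill_coeff=4.0)
```

Above the half-inhibitory concentration, the shallow curve retains more growth than the steep one, which is the qualitative behavior that a growth-mediated negative feedback would be expected to produce.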
12.
Genome Biol ; 23(1): 179, 2022 Aug 25.
Article in English | MEDLINE | ID: mdl-36008862
13.
Science ; 374(6569): 828, 2021 Nov 12.
Article in English | MEDLINE | ID: mdl-34762478

ABSTRACT

A paleontologist's history of life highlights the recurring role played by geological, climatic, and atmospheric forces.

14.
PLoS Genet ; 17(11): e1009939, 2021 11.
Article in English | MEDLINE | ID: mdl-34843465

ABSTRACT

The distribution of cellular resources across bacterial proteins has been quantified through phenomenological growth laws. Here, we describe a complementary bacterial growth law for RNA composition, emerging from optimal cellular resource allocation into ribosomes and ternary complexes. The predicted decline of the tRNA/rRNA ratio with growth rate agrees quantitatively with experimental data. Its regulation appears to be implemented in part through chromosomal localization, as rRNA genes are typically closer to the origin of replication than tRNA genes and thus have increasingly higher gene dosage at faster growth. At the highest growth rates in E. coli, the tRNA/rRNA gene dosage ratio based on chromosomal positions is almost identical to the observed and theoretically optimal tRNA/rRNA expression ratio, indicating that the chromosomal arrangement has evolved to favor maximal transcription of both types of genes at this condition.


Subject(s)
Escherichia coli/genetics , Genome, Bacterial/genetics , Ribosomes/genetics , Transcription, Genetic , Chromosomes, Bacterial/genetics , Escherichia coli/growth & development , Gene Dosage/genetics , RNA, Bacterial/genetics , RNA, Ribosomal/genetics , RNA, Transfer/genetics
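
The link between chromosomal position and gene dosage described in the abstract of entry 14 can be sketched with the standard Cooper-Helmstetter relation for exponentially growing cells, in which the dosage ratio of two genes depends on their distance along the replication path, the replication time (C period), and the doubling time. The positions and the 40-minute C period below are hypothetical values for illustration, not data from the paper.

```python
def dosage_ratio(pos_a, pos_b, c_period_min, doubling_time_min):
    """Average copy-number ratio of gene A to gene B in a growing population.

    Positions are fractions of the origin-to-terminus distance
    (0.0 = origin of replication, 1.0 = terminus).
    """
    if doubling_time_min <= 0 or c_period_min <= 0:
        raise ValueError("periods must be positive")
    return 2.0 ** (c_period_min * (pos_b - pos_a) / doubling_time_min)


# Hypothetical origin-proximal gene (0.1) versus a more distal gene (0.5),
# assuming a 40-minute C period.
slow = dosage_ratio(0.1, 0.5, c_period_min=40.0, doubling_time_min=120.0)
fast = dosage_ratio(0.1, 0.5, c_period_min=40.0, doubling_time_min=25.0)
```

Under these assumptions the origin-proximal gene gains relative dosage as the doubling time shortens, consistent with the trend the abstract describes for rRNA versus tRNA genes.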
15.
PLoS Biol ; 19(10): e3001402, 2021 10.
Article in English | MEDLINE | ID: mdl-34665809

ABSTRACT

The Michaelis constant KM describes the affinity of an enzyme for a specific substrate and is a central parameter in studies of enzyme kinetics and cellular physiology. As measurements of KM are often difficult and time-consuming, experimental estimates exist for only a minority of enzyme-substrate combinations even in model organisms. Here, we build and train an organism-independent model that successfully predicts KM values for natural enzyme-substrate combinations using machine and deep learning methods. Predictions are based on a task-specific molecular fingerprint of the substrate, generated using a graph neural network, and on a deep numerical representation of the enzyme's amino acid sequence. We provide genome-scale KM predictions for 47 model organisms, which can be used to approximately relate metabolite concentrations to cellular physiology and to aid in the parameterization of kinetic models of cellular metabolism.


Subject(s)
Deep Learning , Genome , Databases, Genetic , Enzymes/metabolism , Kinetics , Metabolomics , Models, Biological , Neural Networks, Computer , Substrate Specificity
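
To illustrate how a KM value relates a metabolite concentration to enzyme saturation, as mentioned in the abstract of entry 15, the following minimal sketch evaluates the standard Michaelis-Menten rate law. The function name and the example numbers are illustrative assumptions and are not part of the published model.

```python
def michaelis_menten_rate(substrate_conc, km, vmax=1.0):
    """Reaction rate from the Michaelis-Menten equation, v = vmax * s / (KM + s).

    substrate_conc and km must use the same concentration unit.
    """
    if substrate_conc < 0 or km <= 0:
        raise ValueError("substrate_conc must be >= 0 and km must be > 0")
    return vmax * substrate_conc / (km + substrate_conc)


# With vmax = 1, the rate equals the fraction of enzyme saturated with substrate.
half_saturated = michaelis_menten_rate(substrate_conc=2.0, km=2.0)
nearly_saturated = michaelis_menten_rate(substrate_conc=200.0, km=2.0)
```

When the substrate concentration equals KM, the enzyme operates at half its maximal rate; concentrations far above KM bring it close to saturation.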
16.
PLoS Biol ; 19(10): e3001416, 2021 10.
Article in English | MEDLINE | ID: mdl-34699521

ABSTRACT

Much recent progress has been made to understand the impact of proteome allocation on bacterial growth; much less is known about the relationship between the abundances of the enzymes and their substrates, which jointly determine metabolic fluxes. Here, we report a correlation between the concentrations of enzymes and their substrates in Escherichia coli. We suggest this relationship to be a consequence of optimal resource allocation, subject to an overall constraint on the biomass density: For a cellular reaction network composed of effectively irreversible reactions, maximal reaction flux is achieved when the dry mass allocated to each substrate is equal to the dry mass of the unsaturated (or "free") enzymes waiting to consume it. Calculations based on this optimality principle successfully predict the quantitative relationship between the observed enzyme and metabolite abundances, parameterized only by molecular masses and enzyme-substrate dissociation constants (Km). The corresponding organizing principle provides a fundamental rationale for cellular investment into different types of molecules, which may aid in the design of more efficient synthetic cellular systems.


Subject(s)
Enzymes/metabolism , Escherichia coli/enzymology , Kinetics , Metabolome , Substrate Specificity
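
The optimality principle stated in the abstract of entry 16 can be checked numerically for the simplest case: a single irreversible Michaelis-Menten reaction whose substrate and enzyme share a fixed total mass density. This toy calculation is a sketch under those assumptions, not the authors' network-level analysis; all names and parameter values are hypothetical. Maximizing the flux subject to the density constraint gives a quadratic in the substrate concentration, whose positive root is used below.

```python
import math


def optimal_substrate_conc(km, substrate_mass, total_density):
    """Substrate concentration that maximizes flux at fixed total mass density.

    Positive root of substrate_mass * s**2 + 2 * substrate_mass * km * s
    - total_density * km = 0 (single-reaction toy model).
    """
    return -km + math.sqrt(km**2 + total_density * km / substrate_mass)


def enzyme_conc(s, substrate_mass, enzyme_mass, total_density):
    """Enzyme concentration left over after substrate takes its share of mass."""
    return (total_density - substrate_mass * s) / enzyme_mass


def flux(s, km, kcat, substrate_mass, enzyme_mass, total_density):
    """Irreversible Michaelis-Menten flux under the density constraint."""
    e = enzyme_conc(s, substrate_mass, enzyme_mass, total_density)
    return kcat * e * s / (km + s)


def free_enzyme_mass(s, km, substrate_mass, enzyme_mass, total_density):
    """Mass density of enzyme molecules not bound to substrate."""
    e = enzyme_conc(s, substrate_mass, enzyme_mass, total_density)
    return enzyme_mass * e * km / (km + s)


# Hypothetical parameters in arbitrary but consistent units.
params = dict(substrate_mass=1.0, enzyme_mass=100.0, total_density=3.0)
s_opt = optimal_substrate_conc(km=1.0, substrate_mass=1.0, total_density=3.0)
substrate_mass_density = params["substrate_mass"] * s_opt
free_mass_density = free_enzyme_mass(s_opt, km=1.0, **params)
```

In this toy case, the mass density of the substrate and that of the free enzyme coincide at the flux-maximizing concentration, in line with the relationship described in the abstract.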
17.
Sci Adv ; 7(37): eabg4298, 2021 Sep 10.
Article in English | MEDLINE | ID: mdl-34516872

ABSTRACT

Glutamate has dual roles in metabolism and signaling; thus, signaling functions must be isolatable and distinct from metabolic fluctuations, as seen in low-glutamate domains at synapses. In plants, wounding triggers electrical and calcium (Ca2+) signaling, which involve homologs of mammalian glutamate receptors. The hydraulic dispersal and squeeze-cell hypotheses implicate pressure as a key component of systemic signaling. Here, we identify the stretch-activated anion channel MSL10 as necessary for proper wound-induced electrical and Ca2+ signaling. Wound gene induction, genetics, and Ca2+ imaging indicate that MSL10 acts in the same pathway as the glutamate receptor-like proteins (GLRs). Analogous to mammalian NMDA glutamate receptors, GLRs may serve as coincidence detectors gated by the combined requirement for ligand binding and membrane depolarization, here mediated by stretch activation of MSL10. This study provides a molecular genetic basis for a role of mechanical signal perception and the transmission of long-distance electrical and Ca2+ signals in plants.

18.
Sci Rep ; 11(1): 15979, 2021 08 05.
Article in English | MEDLINE | ID: mdl-34354112

ABSTRACT

The regulation of resource allocation in biological systems observed today is the cumulative result of natural selection in ancestral and recent environments. To what extent are observed resource allocation patterns in different photosynthetic types optimally adapted to current conditions, and to what extent do they reflect ancestral environments? Here, we explore these questions for C3, C4, and C3-C4 intermediate plants of the model genus Flaveria. We developed a detailed mathematical model of carbon fixation, which accounts for various environmental parameters and for energy and nitrogen partitioning across photosynthetic components. This allows us to assess environment-dependent plant physiology and performance as a function of resource allocation patterns. Models of C4 plants optimized for conditions experienced by evolutionary ancestors perform better than models accounting for experimental growth conditions, indicating low phenotypic plasticity. Supporting this interpretation, the model predicts that C4 species need to re-allocate more nitrogen between photosynthetic components than C3 species to adapt to new environments. We thus hypothesize that observed resource distribution patterns in C4 plants still reflect optimality in ancestral environments, allowing the quantitative inference of these environments from today's plants. Our work allows us to quantify environmental effects on photosynthetic resource allocation and performance in the light of evolutionary history.

19.
Nat Commun ; 11(1): 5260, 2020 10 16.
Article in English | MEDLINE | ID: mdl-33067428

ABSTRACT

Protein synthesis is the most expensive process in fast-growing bacteria. Experimentally observed growth rate dependencies of the translation machinery form the basis of powerful phenomenological growth laws; however, a quantitative theory on the basis of biochemical and biophysical constraints is lacking. Here, we show that the growth rate-dependence of the concentrations of ribosomes, tRNAs, mRNA, and elongation factors observed in Escherichia coli can be predicted accurately from a minimization of cellular costs in a mechanistic model of protein translation. The model is constrained only by the physicochemical properties of the molecules and has no adjustable parameters. The costs of individual components (made of protein and RNA parts) can be approximated through molecular masses, which correlate strongly with alternative cost measures such as the molecules' carbon content or the requirement of energy or enzymes for their biosynthesis. Analogous cost minimization approaches may facilitate similar quantitative insights also for other cellular subsystems.


Subject(s)
Escherichia coli Proteins/genetics , Escherichia coli/genetics , Protein Biosynthesis , Escherichia coli/metabolism , Escherichia coli Proteins/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , RNA, Transfer/genetics , RNA, Transfer/metabolism , Ribosomes/genetics , Ribosomes/metabolism
20.
Nat Commun ; 11(1): 1226, 2020 03 06.
Article in English | MEDLINE | ID: mdl-32144263

ABSTRACT

The biological fitness of microbes is largely determined by the rate with which they replicate their biomass composition. Mathematical models that maximize this balanced growth rate while accounting for mass conservation, reaction kinetics, and limits on dry mass per volume are inevitably non-linear. Here, we develop a general theory for such models, termed Growth Balance Analysis (GBA), which provides explicit expressions for protein concentrations, fluxes, and growth rates. These variables are functions of the concentrations of cellular components, for which we calculate marginal fitness costs and benefits that are related to metabolic control coefficients. At maximal growth rate, the net benefits of all concentrations are equal. Based solely on physicochemical constraints, GBA unveils fundamental quantitative principles of cellular resource allocation and growth; it accurately predicts the relationship between growth rates and ribosome concentrations in E. coli and yeast and between growth rate and dry mass density in E. coli.


Subject(s)
Cell Proliferation/physiology , Escherichia coli/physiology , Models, Biological , Computer Simulation , Kinetics , Metabolic Networks and Pathways/physiology , Ribosomes/metabolism