Results 1 - 20 of 105
1.
Microb Cell Fact ; 23(1): 138, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38750569

ABSTRACT

BACKGROUND: Genome-scale metabolic models (GEMs) serve as effective tools for understanding cellular phenotypes and predicting engineering targets in the development of industrial strains. Enzyme-constrained genome-scale metabolic models (ecGEMs) have emerged as a valuable advancement, providing more accurate predictions and unveiling new engineering targets compared to models lacking enzyme constraints. In 2022, a stoichiometric GEM, iDL1450, was reconstructed for the industrially significant fungus Myceliophthora thermophila. To enhance the GEM's performance, an ecGEM was developed for M. thermophila in this study. RESULTS: Initially, the model iDL1450 underwent refinement and updates, resulting in a new version named iYW1475. These updates included adjustments to biomass components, correction of gene-protein-reaction (GPR) rules, and a consensus on metabolites. Subsequently, the first ecGEM for M. thermophila was constructed using machine learning-based kcat data predicted by TurNuP within the ECMpy framework. During the construction, three versions of ecGEMs were developed based on three distinct kcat collection methods, namely AutoPACMEN, DLKcat and TurNuP. After comparison, the ecGEM constructed using TurNuP-predicted kcat values performed better in several aspects and was selected as the definitive version of ecGEM for M. thermophila (ecMTM). Compared with iYW1475, ecMTM had a reduced solution space, and its growth simulation results more closely resembled realistic cellular phenotypes. Metabolic adjustment simulated by ecMTM revealed a trade-off between biomass yield and enzyme usage efficiency at varying glucose uptake rates. Notably, hierarchical utilization of five carbon sources derived from plant biomass hydrolysis was accurately captured and explained by ecMTM. Furthermore, based on enzyme cost considerations, ecMTM successfully predicted reported targets for metabolic engineering modification and introduced some new potential targets for chemicals produced in M. thermophila. CONCLUSIONS: In this study, the incorporation of enzyme constraints into iYW1475 not only improved prediction accuracy but also broadened the model's applicability. This research demonstrates the effectiveness of integrating machine learning-based kcat data in the construction of ecGEMs, especially in situations where there are limited measured enzyme kinetic parameters for a specific organism.


Subjects
Machine Learning; Metabolic Networks and Pathways; Sordariales; Sordariales/metabolism; Sordariales/enzymology; Sordariales/genetics; Metabolic Engineering/methods; Biomass; Models, Biological; Kinetics; Genome, Fungal
2.
Int J Mol Sci ; 25(9), 2024 Apr 28.
Article in English | MEDLINE | ID: mdl-38732022

ABSTRACT

The molecular weight (MW) of an enzyme is a critical parameter in enzyme-constrained models (ecModels). It is determined by two factors: the presence of subunits and the abundance of each subunit. Although the number of subunits (NS) can potentially be obtained from UniProt, this information is not readily available for most proteins. In this study, we addressed this gap by extracting and curating subunit information from the UniProt database to establish a robust benchmark dataset. Subsequently, we propose a novel model named DeepSub, which leverages the protein language model and Bi-directional Gated Recurrent Unit (GRU), to predict NS in homo-oligomers solely based on protein sequences. DeepSub demonstrates remarkable accuracy, achieving an accuracy rate as high as 0.967, surpassing the performance of QUEEN. To validate the effectiveness of DeepSub, we performed predictions for protein homo-oligomers that have been reported in the literature but are not documented in the UniProt database. Examples include homoserine dehydrogenase from Corynebacterium glutamicum, Matrilin-4 from Mus musculus and Homo sapiens, and the Multimerins protein family from M. musculus and H. sapiens. The predicted results align closely with the reported findings in the literature, underscoring the reliability and utility of DeepSub.


Subjects
Databases, Protein; Deep Learning; Protein Subunits; Protein Subunits/chemistry; Protein Subunits/metabolism; Animals; Humans; Protein Multimerization; Mice; Computational Biology/methods
3.
Nucleic Acids Res ; 2024 May 20.
Article in English | MEDLINE | ID: mdl-38769057

ABSTRACT

A key challenge in pathway design is finding proper enzymes that can be engineered to catalyze a non-natural reaction. Although existing tools can identify potential enzymes based on similar reactions, these tools encounter several issues. Firstly, the calculated similar reactions may not even have the same reaction type. Secondly, the associated enzymes are often numerous and identifying the most promising candidate enzymes is difficult due to the lack of data for evaluation. Thirdly, existing web tools do not provide interactive functions that enable users to fine-tune results based on their expertise. Here, we present REME (https://reme.biodesign.ac.cn/), the first integrated web platform for reaction enzyme mining and evaluation. Combining atom-to-atom mapping, atom type change identification, and reaction similarity calculation enables quick ranking and visualization of reactions similar to an objective non-natural reaction. Additional functionality enables users to filter similar reactions by their specified functional groups and candidate enzymes can be further filtered (e.g. by organisms) or expanded by Enzyme Commission number (EC) or sequence homology. Afterward, enzyme attributes (such as kcat, Km, optimal temperature and pH) can be assessed with deep learning-based methods, facilitating the swift identification of potential enzymes that can catalyze the non-natural reaction.

4.
Synth Syst Biotechnol ; 9(3): 494-502, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38651096

ABSTRACT

Genome-scale metabolic models (GEMs) have been widely employed to predict microorganism behaviors. However, GEMs only consider stoichiometric constraints, leading to a linear increase in simulated growth and product yields as substrate uptake rates rise. This divergence from experimental measurements prompted the creation of enzyme-constrained models (ecModels) for various species, successfully enhancing chemical production. Building upon studies that allocate macromolecule resources, we developed a Python-based workflow (ECMpy) that constructs an enzyme-constrained model. This involves directly imposing an enzyme amount constraint in GEM and accounting for protein subunit composition in reactions. However, this procedure demands manual collection of enzyme kinetic parameter information and subunit composition details, making it rather user-unfriendly. In this work, we've enhanced the ECMpy toolbox to version 2.0, broadening its scope to automatically generate ecGEMs for a wider array of organisms. ECMpy 2.0 automates the retrieval of enzyme kinetic parameters and employs machine learning for predicting these parameters, which significantly enhances parameter coverage. Additionally, ECMpy 2.0 introduces common analytical and visualization features for ecModels, rendering computational results more user accessible. Furthermore, ECMpy 2.0 seamlessly integrates three published algorithms that exploit ecModels to uncover potential targets for metabolic engineering. ECMpy 2.0 is available at https://github.com/tibbdc/ECMpy or as a pip package (https://pypi.org/project/ECMpy/).
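
The enzyme amount constraint mentioned in this abstract can be illustrated with a minimal sketch. It assumes one commonly used form of the constraint, in which each reaction's flux divided by its kcat gives the enzyme demand, which is then weighted by molecular weight and summed against a total enzyme budget. The function names, reaction IDs, and all numeric values below are hypothetical and are not taken from ECMpy or its API.

```python
from typing import Dict


def enzyme_cost(
    fluxes: Dict[str, float],        # mmol/gDW/h (hypothetical values)
    kcat_per_s: Dict[str, float],    # turnover numbers in 1/s (hypothetical)
    mw_g_per_mol: Dict[str, float],  # enzyme molecular weights in g/mol (hypothetical)
    saturation: float = 1.0,         # assumed average enzyme saturation
) -> float:
    """Total enzyme mass (g protein per gDW) needed to carry the given fluxes."""
    total = 0.0
    for rxn, v in fluxes.items():
        kcat_per_h = kcat_per_s[rxn] * 3600.0
        # flux / kcat gives mmol enzyme per gDW; times g/mol and /1000 gives g per gDW
        total += abs(v) * mw_g_per_mol[rxn] / (saturation * kcat_per_h * 1000.0)
    return total


def within_enzyme_budget(
    fluxes: Dict[str, float],
    kcat_per_s: Dict[str, float],
    mw_g_per_mol: Dict[str, float],
    budget_g_per_gdw: float,
    saturation: float = 1.0,
) -> bool:
    """True if the flux distribution fits inside the assumed enzyme budget."""
    cost = enzyme_cost(fluxes, kcat_per_s, mw_g_per_mol, saturation)
    return cost <= budget_g_per_gdw


# Toy two-reaction example with made-up parameters.
toy_fluxes = {"R1": 10.0, "R2": 5.0}
toy_kcat = {"R1": 100.0, "R2": 10.0}
toy_mw = {"R1": 50000.0, "R2": 80000.0}
print(enzyme_cost(toy_fluxes, toy_kcat, toy_mw))
```

In this toy case the slow enzyme (R2) accounts for most of the cost even though it carries half the flux of R1, which is why kcat coverage matters for such models.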

5.
Synth Syst Biotechnol ; 9(2): 304-311, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38510205

ABSTRACT

Proteins play a pivotal role in coordinating the functions of organisms, essentially governing their traits, as the dynamic arrangement of diverse amino acids leads to a multitude of folded configurations within peptide chains. Despite dynamic changes in the amino acid composition of individual proteins (referred to as AAP) and great variance in protein expression levels under different conditions, our study, utilizing transcriptomics data from four model organisms, uncovers surprising stability in the overall amino acid composition of the total cellular proteins (referred to as AACell). Although this value may vary between different species, we observed no significant differences among distinct strains of the same species. This indicates that organisms enforce system-level constraints to maintain a consistent AACell, even amid fluctuations in AAP and protein expression. Further exploration of this phenomenon promises insights into the intricate mechanisms orchestrating cellular protein expression and adaptation to varying environmental challenges.
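
One way to picture the AACell quantity is as an expression-weighted average of per-protein amino acid counts. The sketch below assumes transcript abundance is used as a proxy weight for each protein; the abstract does not give the exact weighting scheme, so this is an illustration only. The function name, protein IDs, sequences, and expression values are hypothetical.

```python
from collections import Counter
from typing import Dict


def aa_cell(proteins: Dict[str, str], expression: Dict[str, float]) -> Dict[str, float]:
    """Expression-weighted amino acid fractions over all proteins.

    proteins:   protein ID -> amino acid sequence (one-letter codes)
    expression: protein ID -> abundance weight (assumed proxy, e.g. transcript level)
    """
    totals: Counter = Counter()
    for pid, seq in proteins.items():
        weight = expression.get(pid, 0.0)  # proteins without data contribute nothing
        for aa, n in Counter(seq).items():
            totals[aa] += weight * n
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {aa: count / grand_total for aa, count in totals.items()}


# Toy input: two made-up "proteins" with different expression levels.
toy_proteins = {"p1": "AAG", "p2": "GK"}
toy_expression = {"p1": 2.0, "p2": 1.0}
print(aa_cell(toy_proteins, toy_expression))
```

Comparing the resulting fraction vectors across conditions or strains would be the natural next step under this reading of the abstract.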

6.
Front Microbiol ; 14: 1277847, 2023.
Article in English | MEDLINE | ID: mdl-38053556

ABSTRACT

Sulfur-oxidizing bacteria play a crucial role in various processes, including mine bioleaching, biodesulfurization, and treatment of sulfur-containing wastewater. Nevertheless, the pathway involved in sulfur oxidation is highly intricate, making its complete comprehension a formidable and protracted undertaking. The mechanisms of sulfur oxidation within the Acidithiobacillus genus, along with the process of energy production, remain areas that necessitate further research and elucidation. In this study, a novel strain of sulfur-oxidizing bacterium, Acidithiobacillus Ameehan, was isolated. Several physiological characteristics of the strain Ameehan were verified and its complete genome sequence was presented in the study. In addition, the first genome-scale metabolic network model (AMEE_WP1377) was reconstructed for Acidithiobacillus Ameehan to gain a comprehensive understanding of the metabolic capacity of the strain. The characteristics of Acidithiobacillus Ameehan included morphological size and an optimal growth temperature range of 37-45°C, as well as an optimal growth pH range of pH 2.0-8.0. The microbe was found to be capable of growth when sulfur and K2O6S4 were supplied as the energy source and electron donor for CO2 fixation. Conversely, it could not utilize Na2S2O3, FeS2, or FeSO4·7H2O as the energy source or electron donor for CO2 fixation, nor could it grow using glucose or yeast extract as a carbon source. Genome annotation revealed that the strain Ameehan possessed a series of sulfur-oxidizing genes that enabled it to oxidize elemental sulfur or various reduced inorganic sulfur compounds (RISCs). In addition, the bacterium possessed carbon-fixing genes involved in the incomplete Calvin-Benson-Bassham (CBB) cycle. However, the bacterium lacked the ability to oxidize iron and fix nitrogen. When constraint-based flux analysis was used to predict cellular growth in the presence of 71 carbon sources, 88.7% agreement with experimental Biolog data was observed. Five sulfur oxidation pathways were discovered through model simulations. The optimal sulfur oxidation pathway had the highest ATP production rate (14.81 mmol/gDW/h) and an NADH/NADPH production rate of 5.76 mmol/gDW/h, while consuming 1.575 mmol/gDW/h of CO2 and 1.5 mmol/gDW/h of sulfur. Our findings provide a comprehensive outlook on the most effective cellular metabolic pathways implicated in sulfur oxidation within Acidithiobacillus Ameehan. They suggest that the OMP (outer-membrane proteins) and SQR (sulfide:quinone oxidoreductase) enzymes have a significant impact on the energy production efficiency of sulfur oxidation, which could have potential biotechnological applications.

7.
Front Plant Sci ; 14: 1281348, 2023.
Article in English | MEDLINE | ID: mdl-38023876

ABSTRACT

Systematic characterization and understanding of metabolic behaviors are the basis of efficient plant metabolic engineering and synthetic biology. Genome-scale metabolic networks (GSMNs) are indispensable tools for the comprehensive characterization of the overall metabolic profile. Here we first constructed a GSMN of tobacco, which is one of the most widely used plant chassis, and then combined the tobacco GSMN and multiomics analysis to systematically elucidate the impact of in-vitro cultivation on the tobacco metabolic network. In-vitro cultivation is a widely used technique for plant cultivation, not only in basic research but also for the rapid propagation of valuable horticultural and pharmaceutical plants. However, the systemic effects of in-vitro cultivation on overall plant metabolism could easily be overlooked and are still poorly understood. We found that in-vitro tobacco showed slower growth, lower biomass and suppressed photosynthesis compared with soil-grown tobacco. Many changes in metabolites and metabolic pathways between in-vitro and soil-grown tobacco plants were identified, which notably revealed a significant increase in amino acid content under in-vitro conditions. The in silico investigation showed that in-vitro tobacco downregulated photosynthesis and primary carbon metabolism, while significantly upregulating the GS/GOGAT cycle and producing more energy and less NADH/NADPH to acclimate to in-vitro growth demands. Altogether, the combination of experimental and in silico analyses offers an unprecedented view of tobacco metabolism, with valuable insights into the impact of in-vitro cultivation, enabling more efficient utilization of in-vitro techniques for plant propagation and metabolic engineering.

8.
Synth Syst Biotechnol ; 8(4): 688-696, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37927897

ABSTRACT

Pseudomonas stutzeri A1501 is a non-fluorescent, gram-negative denitrifying bacterium. Although it is a prominent strain in agriculture and bioengineering, its metabolic capabilities, specifically central metabolism and substrate utilization, are still not comprehensively understood. Therefore, further exploration and extensive studies are required to gain a detailed insight into these aspects. This study reconstructed a genome-scale metabolic network model for P. stutzeri A1501 and conducted extensive curation, including correcting energy generation cycles, respiratory chains, and biomass composition. The final model, iQY1018, covers more genes and reactions and has higher prediction accuracy than the previously published model iPB890. The ability to utilize 71 carbon sources was investigated by BIOLOG experiments and used to validate the model quality. The model's prediction accuracy for substrate utilization by P. stutzeri A1501 reached 90%. Model analysis revealed new capabilities in central metabolism and predicted that the strain is a suitable chassis for the production of acetyl-CoA-derived products. This work provides an updated, high-quality model of P. stutzeri A1501 for further research and will further enhance our understanding of its metabolic capabilities.

9.
ACS Synth Biol ; 12(11): 3381-3392, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-37870756

ABSTRACT

Isopentyldiol (IPDO) is an important raw material in the cosmetic industry. So far, IPDO has been produced exclusively through chemical synthesis. Growing interest in natural personal care products has inspired the quest to develop a biobased process. We previously reported a biosynthetic route that produces IPDO by extending leucine catabolism (route A), the efficiency of which, however, is not satisfactory. To address this issue, we computationally designed a novel non-natural IPDO synthesis pathway (route B) using RetroPath RL, the state-of-the-art tool for bioretrosynthesis based on artificial intelligence methods. We compared this new pathway with route A and two other intuitively designed routes for IPDO biosynthesis from various perspectives. Route B, which exhibits the highest thermodynamic driving force, the fewest non-native reaction steps, and the lowest energy requirements, appeared to hold the greatest potential for IPDO production. All three newly designed routes were then implemented in the Escherichia coli BL21(DE3) strain. The results show that the computationally designed route B produced 2.2 mg/L IPDO from glucose, whereas routes C and D produced no IPDO. These results highlight the importance and usefulness of in silico design and comprehensive evaluation of the potential efficiencies of candidate pathways in constructing novel non-natural pathways for the production of biochemicals.


Subjects
Artificial Intelligence; Escherichia coli; Escherichia coli/genetics; Escherichia coli/metabolism; Biosynthetic Pathways; Metabolic Engineering/methods
10.
Synth Syst Biotechnol ; 8(4): 597-605, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37743907

ABSTRACT

Metabolic network models have become increasingly precise and accurate as the most widespread and practical digital representations of living cells. The prediction functions were significantly expanded by integrating cellular resources and abiotic constraints in recent years. However, if unreasonable modeling methods were adopted due to a lack of consideration of biological knowledge, the conflicts between stoichiometric and other constraints, such as thermodynamic feasibility and enzyme resource availability, would lead to distorted predictions. In this work, we investigated a prediction anomaly of EcoETM, a constraints-based metabolic network model, and introduced the idea of enzyme compartmentalization into the analysis process. Through rational combination of reactions, we avoid the false prediction of pathway feasibility caused by the unrealistic assumption of free intermediate metabolites. This allowed us to correct the pathway structures of l-serine and l-tryptophan. A specific analysis explains the application method of the EcoETM-like model and demonstrates its potential and value in correcting the prediction results in pathway structure by resolving the conflict between different constraints and incorporating the evolved roles of enzymes as reaction compartments. Notably, this work also reveals the trade-off between product yield and thermodynamic feasibility. Our work is of great value for the structural improvement of constraints-based models.

11.
J Biomol Struct Dyn ; : 1-11, 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37676253

ABSTRACT

Allosteric feedback inhibition of the committed step in amino acid biosynthetic pathways is a major concern for the production of amino acids at industrial scale. Anthranilate synthase (AS) catalyzes the first reaction of the tryptophan biosynthetic pathway found in microorganisms and is feedback inhibited by its own product, tryptophan. Here, we identified new mutant sites in AS using a computational mutagenesis approach. MD simulations (20 ns) followed by MMPBSA and per-residue decomposition energy analysis identified seven amino acid residues with the best binding affinity for tryptophan. All 19 mutant structures were generated for each identified amino acid residue, followed by simulation to evaluate the effect of mutation on protein stability. Later, molecular docking studies were employed to generate mutant-tryptophan complexes, and structures with binding energies (kcal/mol) much higher than wild-type AS were selected. Finally, two mutants, S37W and S37H, were identified on the basis of positive binding scores and loss of tryptophan binding inside the pocket. Further, 200 ns MD simulations were performed on these mutant-tryptophan complexes, followed by RMSD, RMSF, radius of gyration, solvent accessible surface area, intra-protein hydrogen bond number, principal component analysis, free energy landscape (FEL) and secondary structure analyses to rationalize the effect of the mutations on protein stability. Cross-correlation analysis of the mutant site amino acid (S37W) with key residues of the catalytic site (G325, T326, H395 and G482) was done to evaluate the effect of the mutations on catalytic site conformation. The current computational mutagenesis approach predicted two mutants, S37W and S37H, with proposed deregulated feedback inhibition by tryptophan and retained catalytic activity. Communicated by Ramaswamy H. Sarma.

12.
Microbiol Res ; 276: 127485, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37683565

ABSTRACT

Gene expression in bacteria is regulated by multiple transcription factors. Clarifying the regulation mechanism of gene expression is necessary to understand bacterial physiological activities. To further understand the structure of the transcriptional regulatory network of Corynebacterium glutamicum, we applied independent component analysis, an unsupervised machine learning algorithm, to the high-quality C. glutamicum gene expression profile which includes 263 samples from 29 independent projects. We obtained 87 robust independent regulatory modules (iModulons). These iModulons explain 76.7% of the variance in the expression profile and constitute the quantitative transcriptional regulatory network of C. glutamicum. By analyzing the constituent genes in iModulons, we identified potential targets for 20 transcription factors. We also captured the changes in iModulon activities under different growth rates and dissolved oxygen concentrations, demonstrating the ability of iModulons to comprehensively interpret transcriptional responses to environmental changes. In summary, this study provides a genome-scale quantitative transcriptional regulatory network for C. glutamicum and informs future research on complex changes in the transcriptome.


Subjects
Corynebacterium glutamicum; Corynebacterium glutamicum/genetics; Transcriptome/genetics; Gene Regulatory Networks; Transcription Factors/genetics
13.
Microb Cell Fact ; 22(1): 161, 2023 Aug 23.
Article in English | MEDLINE | ID: mdl-37612753

ABSTRACT

Regulation of amino acid biosynthetic pathways is of significant importance for maintaining homeostasis and cell functions. Amino acids regulate their biosynthetic pathways by end-product feedback inhibition of the enzymes catalyzing the committed steps of a pathway. Discovery of new feedback-resistant enzyme variants to enhance industrial production of amino acids is a key objective in industrial biotechnology. Deregulation of feedback inhibition has been achieved for various enzymes using in vitro and in silico mutagenesis techniques. As an enzyme's function, substrate binding capacity, catalytic activity, regulation and stability depend on its structural characteristics, here we provide a detailed structural analysis of all feedback-sensitive enzyme targets in amino acid biosynthetic pathways. This review summarizes information on the structural characteristics of various enzyme targets and the effect of mutations on their structures and functions, especially in terms of deregulation of feedback inhibition. Furthermore, the applicability of various experimental and computational mutagenesis techniques for achieving feedback resistance is discussed in detail to give insight into the research reported in this field.


Subjects
Amino Acids; Biotechnology; Feedback; Mutagenesis; Mutation
14.
Synth Syst Biotechnol ; 8(3): 498-508, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37554249

ABSTRACT

High-quality genome-scale metabolic models (GEMs) could play critical roles in the rational design of microbial cell factories in the classical Design-Build-Test-Learn cycle of synthetic biology studies. Despite the constant establishment and update of GEMs for model microorganisms such as Escherichia coli and Saccharomyces cerevisiae, high-quality GEMs for non-model industrial microorganisms are still scarce. Zymomonas mobilis subsp. mobilis ZM4 is a non-model ethanologenic microorganism with many excellent industrial characteristics that is being developed as a microbial cell factory for biochemical production. Although five GEMs of Z. mobilis have been constructed, these models either generate ATP incorrectly, lack information on plasmid genes, or do not provide a standard-format file. In this study, a high-quality GEM, iZM516, of Z. mobilis ZM4 was constructed. Information from the improved genome annotation, literature, datasets of Biolog Phenotype Microarray studies, and recently updated Gene-Protein-Reaction information was combined for the curation of iZM516. The final iZM516 includes 516 genes, 1389 reactions, 1437 metabolites, and 3 cell compartments, and has the highest MEMOTE score (91%) among all published GEMs of Z. mobilis. Cell growth predicted by iZM516 showed 79.4% agreement with the experimental substrate utilization results. In addition, a potential endogenous succinate synthesis pathway of Z. mobilis ZM4 was proposed through simulation and analysis using iZM516. Furthermore, metabolic engineering strategies to produce succinate and 1,4-butanediol (1,4-BDO) were designed and then simulated under anaerobic conditions using iZM516. The results indicated that yields of 1.68 mol/mol succinate and 1.07 mol/mol 1,4-BDO can be achieved through combinatorial metabolic engineering strategies, which is comparable to yields in the model species E. coli. Our study thus not only established a high-quality GEM, iZM516, to help understand and design microbial cell factories for economic biochemical production using Z. mobilis as the chassis, but also provided guidance on building accurate GEMs for other non-model industrial microorganisms.

15.
Nat Commun ; 14(1): 5304, 2023 Aug 31.
Article in English | MEDLINE | ID: mdl-37652926

ABSTRACT

Vitamin B6 is an essential nutrient with extensive applications in the medicine, food, animal feed, and cosmetics industries. Pyridoxine (PN), the most common commercial form of vitamin B6, is currently chemically synthesized using expensive and toxic chemicals. However, the low catalytic efficiencies of natural enzymes and the tight regulation of the metabolic pathway have hindered PN production by microbial fermentation. Here, we report an engineered Escherichia coli strain for PN production. Parallel pathway engineering is performed to decouple PN production and cell growth. Further, rational protein engineering is applied to the inefficient enzymes PdxA and PdxJ and the initial enzymes Epd and Dxs. Using an iterative multimodule optimization strategy, the final strain produces 1.4 g/L of PN with a productivity of 29.16 mg/L/h in fed-batch fermentation. The strategies reported here will be useful for developing microbial strains for the production of vitamins and other bioproducts having inherently low metabolic fluxes.


Subjects
Escherichia coli Proteins; Pyridoxine; Animals; Vitamin B 6; Vitamins; Protein Engineering; Escherichia coli/genetics; Ligases; Escherichia coli Proteins/genetics
16.
Biotechnol J ; 18(9): e2200578, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37300341

ABSTRACT

Recent advances in biofoundries have enabled the construction of a large quantity of strains in parallel, accelerating the design-build-test-learn (DBTL) cycles for strain development. However, the construction of a large number of strains by iterative gene manipulation is still time-consuming and costly, posing a challenge for the development of commercial strains. Common gene manipulations among different objective strains open up the possibility of reducing cost and time for strain construction in biofoundries by optimizing genetic manipulation schedules. A method is introduced consisting of two complementary algorithms for designing optimal parent-children manipulation schedules for strain construction: greedy search of common ancestor strains (GSCAS) and minimizing total manipulations (MTM). By reusing common ancestor strains, the number of strains to be constructed can be effectively reduced, resulting in a tree-like structure of descendants instead of linear lineages for each strain. The GSCAS algorithm can quickly find common ancestor strains and cluster them together based on their genetic makeup, and the MTM algorithm subsequently minimizes the genetic manipulations required, resulting in a further reduction in the total number of genetic manipulations. The effectiveness of our method is demonstrated through a case study of 94 target strains, where GSCAS reduces the total gene manipulations by an average of 36%, and MTM reduces them by an additional 10%. The performance of both algorithms is robust among case studies with different average occurrences of gene manipulations across objective strains. Our method potentially improves cost efficiency and accelerates the development of commercial strains significantly. The implementation of the methods can be freely accessed via https://gscas-mtm.biodesign.ac.cn/.
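
The idea of reusing common ancestor strains can be sketched with a toy calculation. The code below is not the published GSCAS or MTM algorithm. It is a simplified greedy scheme, assumed here for illustration, in which each target strain is an unordered set of gene manipulations and the most widely shared manipulation is always performed first on a common parent. The manipulation labels and target sets are hypothetical.

```python
from collections import Counter
from typing import FrozenSet, List


def linear_cost(targets: List[FrozenSet[str]]) -> int:
    """Manipulations needed when every strain is built in its own lineage."""
    return sum(len(t) for t in targets)


def greedy_shared_cost(targets: List[FrozenSet[str]]) -> int:
    """Manipulations needed when strains sharing a manipulation reuse one ancestor.

    Greedy rule (an assumption of this sketch): apply the manipulation shared by
    the most remaining targets once, then recurse on both resulting groups.
    """
    remaining = [t for t in targets if t]  # empty sets are already built
    if not remaining:
        return 0
    counts = Counter(m for t in remaining for m in t)
    # Most frequent manipulation first; ties broken alphabetically for determinism.
    best = min(counts, key=lambda m: (-counts[m], m))
    with_best = [t - {best} for t in remaining if best in t]
    without_best = [t for t in remaining if best not in t]
    return 1 + greedy_shared_cost(with_best) + greedy_shared_cost(without_best)


# Hypothetical targets: "d" = deletion, "o" = overexpression.
toy_targets = [
    frozenset({"dA", "dB", "oC"}),
    frozenset({"dA", "dB"}),
    frozenset({"dA", "oD"}),
    frozenset({"oE"}),
]
print(linear_cost(toy_targets), greedy_shared_cost(toy_targets))
```

In this toy case the shared tree needs 5 manipulations instead of 8, because the `dA` and `dA`+`dB` intermediates are each built once and reused.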


Subjects
Algorithms; Genetic Engineering; Genetic Engineering/methods; Biotechnology/methods
17.
Research (Wash D C) ; 6: 0153, 2023.
Article in English | MEDLINE | ID: mdl-37275124

ABSTRACT

Enzyme commission (EC) numbers, which associate a protein sequence with the biochemical reactions it catalyzes, are essential for the accurate understanding of enzyme functions and cellular metabolism. Many ab initio computational approaches have been proposed to predict EC numbers for given input protein sequences. However, the prediction performance (accuracy, recall, and precision), usability, and efficiency of existing methods decrease markedly when dealing with recently discovered proteins, leaving much room for improvement. Here, we report HDMLF, a hierarchical dual-core multitask learning framework for accurately predicting EC numbers based on novel deep learning techniques. HDMLF is composed of an embedding core and a learning core; the embedding core adopts the latest protein language model for protein sequence embedding, and the learning core conducts the EC number prediction. Specifically, HDMLF is designed on the basis of a gated recurrent unit framework to perform EC number prediction in a multi-objective, hierarchical, multitasking manner. Additionally, we introduced an attention layer to optimize the EC prediction and employed a greedy strategy to integrate and fine-tune the final model. Comparative analyses against 4 representative methods demonstrate that HDMLF stably delivers the highest performance, improving accuracy and F1 score by 60% and 40% over the state of the art, respectively. An additional case study of tyrB, predicted to compensate for the loss of aspartate aminotransferase aspC as reported in a previous experimental study, shows that our model can also be used to uncover enzyme promiscuity. Finally, we established a web platform, namely ECRECer (https://ecrecer.biodesign.ac.cn), using an entirely cloud-based serverless architecture and provided an offline bundle to improve usability.

18.
Nucleic Acids Res ; 51(W1): W70-W77, 2023 Jul 05.
Article in English | MEDLINE | ID: mdl-37158271

ABSTRACT

Flux balance analysis (FBA) is an important method for calculating optimal pathways to produce industrially important chemicals in genome-scale metabolic models (GEMs). However, for biologists, the requirement of coding skills poses a significant obstacle to using FBA for pathway analysis and engineering target identification. Additionally, a time-consuming manual drawing process is often needed to illustrate the mass flow in an FBA-calculated pathway, making it challenging to detect errors or discover interesting metabolic features. To solve this problem, we developed CAVE, a cloud-based platform for the integrated calculation, visualization, examination and correction of metabolic pathways. CAVE can analyze and visualize pathways for over 100 published GEMs or user-uploaded GEMs, allowing for quicker examination and identification of special metabolic features in a particular GEM. Additionally, CAVE offers model modification functions, such as gene/reaction removal or addition, making it easy for users to correct errors found in pathway analysis and obtain more reliable pathways. With a focus on the design and analysis of optimal pathways for biochemicals, CAVE complements existing visualization tools based on manually drawn global maps and can be applied to a broader range of organisms for rational metabolic engineering. CAVE is available at https://cave.biodesign.ac.cn/.


Subjects
Cloud Computing; Data Visualization; Metabolic Networks and Pathways; Metabolomics; Genome; Metabolic Networks and Pathways/genetics; Models, Biological; Software; Metabolomics/instrumentation; Metabolomics/methods
19.
Bioengineering (Basel) ; 10(4), 2023 Mar 26.
Article in English | MEDLINE | ID: mdl-37106602

ABSTRACT

The naturally occurring one-carbon assimilation pathways for the production of acetyl-CoA and its derivatives often have low product yields because of carbon loss as CO2. We constructed a methanol assimilation pathway to produce poly-3-hydroxybutyrate (P3HB) using the MCC pathway, which included the ribulose monophosphate (RuMP) pathway for methanol assimilation and non-oxidative glycolysis (NOG) for acetyl-CoA (precursor for PHB synthesis) production. The theoretical product carbon yield of the new pathway is 100%, hence no carbon loss. We constructed this pathway in E. coli JM109 by introducing methanol dehydrogenase (Mdh), a fused Hps-phi (hexulose-6-phosphate synthase and 3-phospho-6-hexuloisomerase), phosphoketolase, and the genes for PHB synthesis. We also knocked out the frmA gene (encoding formaldehyde dehydrogenase) to prevent the dehydrogenation of formaldehyde to formate. Mdh is the primary rate-limiting enzyme in methanol uptake; thus, we compared the activities of three Mdhs in vitro and in vivo and then selected the one from Bacillus methanolicus MGA3 for further study. Experimental results indicate that, in agreement with the computational analysis results, the introduction of the NOG pathway is essential for improving PHB production (65% increase in PHB concentration, up to 6.19% of dry cell weight). We demonstrated that PHB can be produced from methanol via metabolic engineering, which provides the foundation for the future large-scale use of one-carbon compounds for biopolymer production.
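
The "theoretical product carbon yield" in this abstract is the fraction of substrate carbon atoms that end up in the product. The short sketch below works through that arithmetic. The stoichiometries are illustrative assumptions, not figures from the paper: a carbon-conserving route turns 4 methanol (1 carbon each) into one 3-hydroxybutyrate monomer (4 carbons), whereas a route that forms acetyl-CoA by decarboxylating a 3-carbon intermediate needs 6 methanol for the same monomer.

```python
def carbon_yield(
    product_carbons: int,
    product_mol: float,
    substrate_carbons: int,
    substrate_mol: float,
) -> float:
    """Fraction of substrate carbon recovered in the product (1.0 means no loss)."""
    substrate_total = substrate_carbons * substrate_mol
    if substrate_total <= 0:
        raise ValueError("substrate carbon must be positive")
    return (product_carbons * product_mol) / substrate_total


# Assumed carbon-conserving route: 4 CH3OH -> 1 C4 monomer, nothing lost as CO2.
conserving = carbon_yield(product_carbons=4, product_mol=1, substrate_carbons=1, substrate_mol=4)

# Assumed decarboxylating route: 6 CH3OH -> 2 C3 -> 2 C2 + 2 CO2 -> 1 C4 monomer.
decarboxylating = carbon_yield(product_carbons=4, product_mol=1, substrate_carbons=1, substrate_mol=6)

print(f"conserving: {conserving:.1%}, decarboxylating: {decarboxylating:.1%}")
```

Under these assumed stoichiometries, one third of the methanol carbon is lost as CO2 in the decarboxylating route, which is the loss a carbon-conserving design avoids.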

20.
Genes (Basel) ; 14(3), 2023 Feb 28.
Article in English | MEDLINE | ID: mdl-36980878

ABSTRACT

DNA synthesis is widely used in synthetic biology to construct and assemble sequences ranging from short RBSs to ultra-long synthetic genomes. Many sequence features, such as the GC content and repeat sequences, are known to affect the synthesis difficulty and subsequently the synthesis cost. In addition, there are latent sequence features, especially local characteristics of the sequence, which might affect the DNA synthesis process as well. Reliable prediction of the synthesis difficulty for a given sequence is important for reducing the cost, but this remains a challenge. In this study, we propose a new automated machine learning (AutoML) approach to predict the DNA synthesis difficulty, which achieves an F1 score of 0.930 and outperforms the current state-of-the-art model. We found local sequence features that were neglected in previous methods and that might also affect the difficulty of DNA synthesis. Moreover, experimental validation based on ten genes of Escherichia coli strain MG1655 shows that our model can achieve 80% accuracy, which is also better than the state of the art. In addition, we developed the cloud platform SCP4SSD using an entirely cloud-based serverless architecture for the convenience of end users.
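
To make the distinction between global and local sequence features concrete, the sketch below computes overall GC content, GC content in sliding windows, and the longest single-base run for a DNA string. These are generic features chosen for illustration; they are not the feature set of the model described in the abstract, and the function names, window size, and example sequence are hypothetical.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int) -> List[float]:
    """GC content of every full-length sliding window (a local feature)."""
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    if len(seq) < window:
        return [gc_content(seq)] if seq else []
    return [gc_content(seq[i:i + window]) for i in range(len(seq) - window + 1)]


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base, a simple repeat feature."""
    best = run = 0
    previous = ""
    for base in seq.upper():
        run = run + 1 if base == previous else 1
        best = max(best, run)
        previous = base
    return best


example = "ATGCGCGCAAAAAT"  # made-up sequence
local = windowed_gc(example, window=4)
print(gc_content(example), max(local), min(local), longest_homopolymer(example))
```

The example sequence has moderate overall GC content but contains both a fully GC window and a fully AT window, which is the kind of local extreme a global average hides.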


Subjects
Escherichia coli; Machine Learning; Base Sequence; Escherichia coli/genetics; Base Composition; DNA/genetics