Results 1 - 10 of 10
1.
Metab Eng ; 67: 227-236, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34242777

ABSTRACT

Predicting bioproduction titers from microbial hosts has been challenging due to complex interactions between microbial regulatory networks, stress responses, and suboptimal cultivation conditions. This study integrated knowledge mining, feature extraction, genome-scale modeling (GSM), and machine learning (ML) to develop a model for predicting Yarrowia lipolytica chemical titers (e.g., organic acids and terpenoids). First, Y. lipolytica production data, including cultivation conditions, genetic engineering strategies, and product information, were manually collected from the literature (~100 papers) and stored as either numerical (e.g., substrate concentrations) or categorical (e.g., bioreactor modes) variables. For each case recorded, central pathway fluxes were estimated using GSMs and flux balance analysis (FBA) to provide metabolic features. Second, an ML ensemble learner was trained to predict strain production titers. Accurate predictions on the test data were obtained for instances with production titers >1 g/L (R² = 0.87). However, the model had reduced predictive power for low-performance strains (0.01-1 g/L, R² = 0.29), potentially due to biosynthesis bottlenecks not captured in the features. Feature ranking indicated that the FBA fluxes, the number of enzyme steps, the substrate inputs, and thermodynamic barriers (i.e., Gibbs free energy of reaction) were the most influential factors. Third, the model was evaluated on other oleaginous yeasts and indicated that some hosts share conserved features that could potentially be exploited by transfer learning. The platform was also designed to assist computational strain design tools (such as OptKnock) in screening genetic targets for improved microbial production in light of experimental conditions.


Subjects
Yarrowia, Machine Learning, Metabolic Engineering, Terpenes, Yarrowia/genetics
2.
Theor Appl Genet ; 133(11): 3177-3186, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32785738

ABSTRACT

KEY MESSAGE: Up to five chromosomes that carry targeted recombinations can be stacked via a multiple funnel scheme, provided that the probabilities of inheriting intact chromosomes from donor parents are high. Targeted recombination involves inducing or selecting for recombination events at specific points in the genome to maximize genetic gain. Practical application of targeted recombination requires efficient breeding strategies to stack multiple chromosomes that carry such recombinations. Our objectives were to determine how many chromosomes with targeted recombinations can be feasibly stacked in a breeding program, and how the feasibility of stacking is affected by the crossing design, homozygosity versus heterozygosity of the donor lines, size of the chromosomal segment showing recombination, and probability of an intact chromosome being inherited. Based on a genetic model for maize (Zea mays L.) with 10 pairs of chromosomes, we examined different crossing schemes by simulation experiments and analytical studies with the goal of minimizing the number of generations and population sizes required for stacking. We found that targeted recombinations on up to five chromosomes can be stacked within practical constraints on time and resources. Linear and funnel schemes were less efficient than a multiple funnel scheme, which involved making all possible crosses in the first generation and stacking two additional chromosomes across multiple lines in subsequent generations. Homozygosity versus heterozygosity of the donor lines did not affect stacking efficiency. Population sizes and stacking efficiency were largely determined by the probability of intact chromosomal transfer from a donor parent to offspring. Such probability increased as the size of the chromosome segment from the donor decreased. When the probability of inheriting an intact chromosome was less than 0.15, population sizes needed for stacking became infeasibly large.


Subjects
Plant Chromosomes/genetics, Genetic Models, Plant Breeding/methods, Genetic Recombination, Zea mays/genetics, Computer Simulation, Genetic Crosses, Population Density
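
The link between the probability of inheriting an intact chromosome and the required population size, described in the abstract above, can be illustrated with a standard back-of-the-envelope calculation. This is only a minimal sketch, not the simulation model used in the study. It assumes that each of k chromosomes is inherited intact independently with the same probability p, and asks how many progeny are needed to recover at least one individual carrying all k at a chosen confidence level. The function name, the independence assumption, and the default 95% confidence are illustrative choices.

```python
import math


def min_population_size(p_intact, n_chromosomes, confidence=0.95):
    """Smallest n such that P(at least one desired progeny) >= confidence.

    Assumption (hypothetical, for illustration): chromosomes are inherited
    independently, so one progeny is "desired" with probability
    q = p_intact ** n_chromosomes.
    """
    if not 0.0 < p_intact <= 1.0:
        raise ValueError("p_intact must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if n_chromosomes < 1:
        raise ValueError("n_chromosomes must be at least 1")

    q = p_intact ** n_chromosomes
    if q >= 1.0:
        return 1  # every progeny is the desired type

    # Solve 1 - (1 - q)^n >= confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - q))


if __name__ == "__main__":
    # Compare a low and a moderate probability for stacking two chromosomes.
    for p in (0.15, 0.5):
        print(p, min_population_size(p, n_chromosomes=2))
```

Under these assumptions the required population grows rapidly as p falls, which is consistent in direction with the abstract's observation about probabilities below 0.15; the actual thresholds in the study come from its own crossing schemes and simulations.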
3.
Bioinformatics ; 33(4): 608-611, 2017 Feb 15.
Article in English | MEDLINE | ID: mdl-27797784

ABSTRACT

Motivation: Metabolic network reconstructions are often incomplete. Constraint-based and pattern-based methodologies have been used for automated gap filling of these networks, each with its own strengths and weaknesses. Moreover, since validation of hypotheses made by gap filling tools requires experimentation, it is challenging to benchmark performance and make improvements other than those related to speed and scalability. Results: We present BoostGAPFILL, an open source tool that leverages both constraint-based and machine learning methodologies for hypothesis generation in gap filling and metabolic model refinement. BoostGAPFILL uses metabolite patterns in the incomplete network, captured using a matrix factorization formulation, to constrain the set of reactions used to fill gaps in a metabolic network. We formulate a testing framework based on the available metabolic reconstructions and demonstrate the superiority of BoostGAPFILL to state-of-the-art gap filling tools. We randomly delete a number of reactions from a metabolic network and rate the different algorithms on their ability both to predict the deleted reactions from a universal set and to fill gaps. For most metabolic network reconstructions tested, BoostGAPFILL shows above 60% precision and recall, which is more than twice that of other existing tools. Availability and Implementation: MATLAB open source implementation (https://github.com/Tolutola/BoostGAPFILL). Contacts: toyetunde@wustl.edu or muhan@wustl.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods, Metabolic Networks and Pathways, Biological Models, Software, Algorithms, Machine Learning
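
The benchmarking idea in the abstract above (randomly delete reactions, then score a tool on how well it recovers them) can be sketched in a few lines. This is an illustrative outline only, not BoostGAPFILL's code, which is in MATLAB; the function names, the use of plain reaction-ID strings, and the fixed random seed are assumptions made for the example.

```python
import random


def delete_reactions(network, n_delete, seed=0):
    """Randomly remove n_delete reactions from a set of reaction IDs.

    Returns (incomplete_network, deleted_reactions). The seed is fixed
    only so that the example is reproducible.
    """
    rng = random.Random(seed)
    deleted = set(rng.sample(sorted(network), n_delete))
    return set(network) - deleted, deleted


def precision_recall(predicted, deleted):
    """Score predicted reactions against the truly deleted ones."""
    predicted, deleted = set(predicted), set(deleted)
    true_pos = len(predicted & deleted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(deleted) if deleted else 0.0
    return precision, recall


if __name__ == "__main__":
    full = {"R1", "R2", "R3", "R4", "R5", "R6"}  # hypothetical reaction IDs
    incomplete, removed = delete_reactions(full, n_delete=2)
    # A gap-filling tool would propose reactions from a universal set here;
    # this stand-in simply proposes two arbitrary candidates.
    proposed = {"R1", "R2"}
    print(precision_recall(proposed, removed))
```

Precision is the fraction of proposed reactions that were actually deleted, and recall is the fraction of deleted reactions that were proposed, which are the two quantities the abstract reports.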
4.
PLoS Comput Biol ; 12(4): e1004838, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27092947

ABSTRACT

13C metabolic flux analysis (13C-MFA) has been widely used to measure in vivo enzyme reaction rates (i.e., metabolic flux) in microorganisms. Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. In this paper, we present a web-based platform, MFlux (http://mflux.org), that predicts bacterial central metabolism via machine learning, leveraging data from approximately 100 13C-MFA papers on heterotrophic bacterial metabolism. Three machine learning methods, namely Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Decision Tree, were employed to study the sophisticated relationship between influential factors and metabolic fluxes. We performed a grid search for the best parameter set for each algorithm and verified their performance through 10-fold cross-validation. SVM yielded the highest accuracy among the three algorithms. Further, we employed quadratic programming to adjust flux profiles to satisfy stoichiometric constraints. Multiple case studies have shown that MFlux can reasonably predict fluxomes as a function of bacterial species, substrate types, growth rate, oxygen conditions, and cultivation methods. Because published studies concentrate on model organisms grown on particular carbon sources, fluxome bias in the dataset may limit the applicability of the machine learning models. This problem can be resolved as more 13C-MFA papers on non-model species are published.


Subjects
Bacteria/metabolism, Metabolic Flux Analysis/methods, Algorithms, Carbon Isotopes/metabolism, Computational Biology, Decision Trees, Machine Learning, Metabolic Flux Analysis/statistics & numerical data, Metabolic Networks and Pathways, Biological Models, Support Vector Machine, Systems Biology
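
The abstract above mentions using quadratic programming to adjust predicted flux profiles so that they satisfy stoichiometric constraints. A simplified version of that idea is sketched here. It is not MFlux's implementation: it assumes the constraint is the steady-state balance S v = 0, ignores flux bounds and any inequality constraints, and therefore reduces to a least-squares projection with a closed-form solution. The function names and the toy three-reaction pathway are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: rows of S must be independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def project_onto_steady_state(S, v0):
    """Return the flux vector closest to v0 (least squares) with S v = 0.

    Solves min ||v - v0||^2 subject to S v = 0 through the normal
    equations (S S^T) lam = S v0, then v = v0 - S^T lam.
    """
    n_met, n_rxn = len(S), len(v0)
    sst = [[sum(S[i][k] * S[j][k] for k in range(n_rxn))
            for j in range(n_met)] for i in range(n_met)]
    sv0 = [sum(S[i][k] * v0[k] for k in range(n_rxn)) for i in range(n_met)]
    lam = solve_linear(sst, sv0)
    return [v0[k] - sum(S[i][k] * lam[i] for i in range(n_met))
            for k in range(n_rxn)]


if __name__ == "__main__":
    # Toy linear pathway (hypothetical): -> A -> B ->
    # Rows are metabolites A and B; columns are the three reactions.
    S = [[1, -1, 0],
         [0, 1, -1]]
    predicted = [10.0, 9.0, 11.0]  # unbalanced prediction from a model
    print(project_onto_steady_state(S, predicted))
```

For this toy pathway the only balanced solutions have all three fluxes equal, so the projection replaces the unbalanced prediction with their mean. A realistic setting would add reversibility and capacity bounds, which requires a general QP solver rather than this closed form.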
5.
Int J Mol Sci ; 18(1), 2017 Jan 04.
Article in English | MEDLINE | ID: mdl-28054959

ABSTRACT

A mathematical model for the theoretical evaluation of microbial electrochemical technologies (METs) is presented that incorporates a detailed physico-chemical framework, includes multiple reactions (both at the electrodes and in the bulk phase) and involves a variety of microbial functional groups. The model is applied to two theoretical case studies: (i) A microbial electrolysis cell (MEC) for continuous anodic volatile fatty acids (VFA) oxidation and cathodic VFA reduction to alcohols, for which the theoretical system response to changes in applied voltage and VFA feed ratio (anode-to-cathode) as well as membrane type is investigated. This case involves multiple parallel electrode reactions in both anode and cathode compartments; (ii) A microbial fuel cell (MFC) for cathodic perchlorate reduction, in which the theoretical impact of feed flow rates and concentrations on the overall system performance is investigated. This case involves multiple electrode reactions in series in the cathode compartment. The model structure captures interactions between important system variables based on first principles and provides a platform for the dynamic description of METs involving electrode reactions both in parallel and in series and in both MFC and MEC configurations. Such a theoretical modelling approach, largely based on first principles, appears promising for the development and testing of MET control and optimization strategies.


Subjects
Bioelectric Energy Sources/microbiology, Electrochemical Techniques, Environmental Biodegradation, Butanols/analysis, Butanols/metabolism, Computer Simulation, Electrodes, Electrolysis, Ethanol/analysis, Ethanol/metabolism, Fatty Acids/analysis, Fatty Acids/metabolism, Biological Models, Oxidation-Reduction, Perchlorates/isolation & purification, Perchlorates/metabolism
6.
Nat Commun ; 14(1): 1159, 2023 Mar 01.
Article in English | MEDLINE | ID: mdl-36859392

ABSTRACT

Extracting quantitative information about highly scattering surfaces from an imaging system is challenging because the phase of the scattered light undergoes multiple folds upon propagation, resulting in complex speckle patterns. One specific application is the drying of wet powders in the pharmaceutical industry, where quantifying the particle size distribution (PSD) is of particular interest. A non-invasive and real-time monitoring probe in the drying process is required, but there is no suitable candidate for this purpose. In this report, we develop a theoretical relationship from the PSD to the speckle image and describe a physics-enhanced autocorrelation-based estimator (PEACE) machine learning algorithm for speckle analysis to measure the PSD of a powder surface. This method solves both the forward and inverse problems together and enjoys increased interpretability, since the machine learning approximator is regularized by the physical law.

7.
PLoS One ; 14(1): e0210558, 2019.
Article in English | MEDLINE | ID: mdl-30645629

ABSTRACT

Metabolic models can estimate intrinsic product yields for microbial factories, but such frameworks struggle to predict cell performance (including product titer or rate) under suboptimal metabolism and complex bioprocess conditions. On the other hand, machine learning, which is complementary to metabolic modeling, necessitates large amounts of data. Building such a database for metabolic engineering designs requires significant manpower and is prone to human error and bias. We propose an approach that integrates data-driven methods with a genome-scale metabolic model for assessment of microbial bio-production (yield, titer and rate). Using engineered E. coli as an example, we manually extracted and curated a data set comprising about 1200 experimentally realized cell factories from ~100 papers. We furthermore augmented the key design features (e.g., genetic modifications and bioprocess variables) extracted from the literature with additional features derived from simulations of the genome-scale metabolic model iML1515 with constraints that match the experimental data. Then, data augmentation and ensemble learning (e.g., support vector machines, gradient boosted trees, and neural networks in a stacked regressor model) are employed to alleviate the challenges of sparse, non-standardized, and incomplete data sets, while multiple correspondence analysis/principal component analysis are used to rank influential factors on bio-production. The hybrid framework demonstrates a reasonably high cross-validation accuracy for prediction of E. coli factory performance metrics under presumed bioprocess and pathway conditions (Pearson correlation coefficients between 0.8 and 0.93 on new data not seen by the model).


Subjects
Algorithms, Escherichia coli/genetics, Machine Learning, Metabolic Engineering/methods, Biological Models, Computer Simulation, Factual Databases, Escherichia coli/metabolism, Reproducibility of Results
8.
Biotechnol Adv ; 36(4): 1308-1315, 2018.
Article in English | MEDLINE | ID: mdl-29729378

ABSTRACT

Genome-scale modeling (GSM) predicts the performance of microbial workhorses and helps identify beneficial gene targets. GSM integrated with intracellular flux dynamics, omics, and thermodynamics has shown remarkable progress in both elucidating complex cellular phenomena and computational strain design (CSD). Nonetheless, these models still show high uncertainty due to a poor understanding of innate pathway regulations, metabolic burdens, and other factors (such as stress tolerance and metabolite channeling). Moreover, the engineered hosts may have genetic mutations or non-genetic variations in bioreactor conditions, and thus CSD rarely foresees fermentation rate and titer. Metabolic models play an important role in design-build-test-learn cycles for strain improvement, and machine learning (ML) may provide a viable complementary approach for driving strain design and deciphering cellular processes. In order to develop quality ML models, knowledge engineering leverages and standardizes the wealth of information in the literature (e.g., genomic/phenomic data, synthetic biology strategies, and bioprocess variables). Data-driven frameworks can offer new constraints for mechanistic models to describe cellular regulations, to design pathways, to search gene targets, and to estimate fermentation titer/rate/yield under specified growth conditions (e.g., mixing, nutrients, and O2). This review highlights the scope of information collections, database constructions, and machine learning techniques (such as deep learning and transfer learning), which may facilitate "Learn and Design" for strain development.


Subjects
Bioreactors, Machine Learning, Metabolic Engineering, Biological Models, Synthetic Biology
9.
ACS Synth Biol ; 6(8): 1596-1604, 2017 Aug 18.
Article in English | MEDLINE | ID: mdl-28459541

ABSTRACT

Synthetic biology aspires to develop frameworks that enable the construction of complex and reliable gene networks with predictable functionalities. A key limitation is that increasing network complexity increases the demand for cellular resources, potentially causing resource-associated interference among noninteracting circuits. Although recent studies have shown the effects of resource competition on circuit behaviors, mechanisms that decouple such interference remain unclear. Here, we constructed three systems in Escherichia coli, each consisting of two independent circuit modules where the complexity of one module (Circuit 2) was systematically increased while the other (Circuit 1) remained identical. By varying the expression level of Circuit 1 and measuring its effect on the expression level of Circuit 2, we demonstrated computationally and experimentally that indirect coupling between these seemingly unconnected genetic circuits can occur in three different regulatory topologies. More importantly, we experimentally verified the computational prediction that negative feedback can significantly reduce resource-coupled interference in regulatory circuits. Our results reveal a design principle that enables cells to reliably multitask while tightly controlling cellular resources.


Subjects
Escherichia coli Proteins/genetics, Escherichia coli/genetics, Bacterial Gene Expression Regulation/genetics, Gene Regulatory Networks/genetics, Bacterial Genes/genetics, Genetic Models, Signal Transduction/genetics, Computer Simulation, Physiological Feedback/physiology, Synthetic Biology/methods
10.
Biotechnol Biofuels ; 10: 22, 2017.
Article in English | MEDLINE | ID: mdl-28149324

ABSTRACT

BACKGROUND: C1 substrates (such as formate and methanol) are promising feedstocks for biochemical/biofuel production. Numerous studies have focused on engineering heterologous pathways to incorporate C1 substrates into biomass, but the engineered microbial hosts often demonstrate inferior fermentation performance due to substrate toxicity, metabolic burdens from engineered pathways, and poor enzyme activities. Alternatively, exploring native C1 pathways in non-model microbes could be a better solution to address these challenges. RESULTS: An oleaginous fungus, Umbelopsis isabellina, demonstrates an excellent capability of metabolizing formate to promote growth and lipid accumulation. By co-feeding formate with glucose at a mole ratio of 3.9:1, biomass and lipid productivities of the culture in 7.5 L bioreactors were improved by 20 and 70%, respectively. 13C-metabolite analysis, genome annotations, and enzyme assays further revealed that formate not only provides an auxiliary energy source [promoting NAD(P)H and ATP] for cell anabolism, but also contributes carbon backbones via folate-mediated C1 pathways. More interestingly, formate addition can tune the fatty acid profile and increase the proportion of medium-chain fatty acids, which would benefit conversion of fungal lipids for high-quality biofuel production. Flux balance analysis further indicates that formate co-utilization can power microbial metabolism to improve biosynthesis, particularly in glucose-limited cultures. CONCLUSION: This study demonstrates Umbelopsis isabellina's strong capability for co-utilizing formate to produce biomass and enhance fatty acid production. It is a promising non-model platform that can be potentially integrated with photochemical/electrochemical processes to efficiently convert carbon dioxide into biofuels and value-added chemicals.
