ABSTRACT
Sustainably grown biomass is a promising alternative to produce fuels and chemicals and reduce the dependency on fossil energy sources. However, the efficient conversion of lignocellulosic biomass into biofuels and bioproducts often requires extensive testing of components and reaction conditions used in the pretreatment, saccharification, and bioconversion steps. This restriction can result in a significant and unwieldy number of combinations of biomass types, solvents, microbial strains, and operational parameters that need to be characterized, turning these efforts into a daunting and time-consuming task. Here we developed a high-throughput feedstocks-to-fuels screening platform to address these challenges. The result is a miniaturized semi-automated platform that leverages the capabilities of a solid handling robot, a liquid handling robot, analytical instruments, and a centralized data repository, adapted to operate as an ionic-liquid-based biomass conversion pipeline. The pipeline was tested by using sorghum as feedstock, the biocompatible ionic liquid cholinium phosphate as pretreatment solvent, a "one-pot" process configuration that does not require ionic liquid removal after pretreatment, and an engineered strain of the yeast Rhodosporidium toruloides that produces the jet-fuel precursor bisabolene as a conversion microbe. By the simultaneous processing of 48 samples, we show that this configuration and reaction conditions result in sugar yields (~70%) and bisabolene titers (~1500 mg/L) that are comparable to the efficiencies observed at larger scales but require only a fraction of the time. We expect that this Feedstocks-to-Fuels pipeline will become an effective tool to screen thousands of bioenergy crop and feedstock samples and assist process optimization efforts and the development of predictive deconstruction approaches.
Subjects
Biofuels , Biomass , Lignin , Sorghum , Lignin/metabolism , Lignin/chemistry , Biofuels/analysis , Sorghum/metabolism , Ionic Liquids/chemistry
ABSTRACT
Engineering metabolism to efficiently produce chemicals from multi-step pathways requires optimizing multi-gene expression programs to achieve enzyme balance. CRISPR-Cas transcriptional control systems are emerging as important tools for programming multi-gene expression, but poor predictability of guide RNA folding can disrupt expression control. Here, we correlate efficacy of modified guide RNAs (scRNAs) for CRISPR activation (CRISPRa) in E. coli with a computational kinetic parameter describing scRNA folding rate into the active structure (rS = 0.8). This parameter also enables forward design of scRNAs, allowing us to design a system of three synthetic CRISPRa promoters that can orthogonally activate (>35-fold) expression of chosen outputs. Through combinatorial activation tuning, we profile a three-dimensional design space expressing two different biosynthetic pathways, demonstrating variable production of pteridine and human milk oligosaccharide products. This RNA design approach aids combinatorial optimization of metabolic pathways and may accelerate routine design of effective multi-gene regulation programs in bacterial hosts.
Subjects
CRISPR-Cas Systems , Escherichia coli , RNA, Guide, CRISPR-Cas Systems , Escherichia coli/genetics , Escherichia coli/metabolism , RNA, Guide, CRISPR-Cas Systems/genetics , RNA, Guide, CRISPR-Cas Systems/metabolism , Metabolic Engineering/methods , Biosynthetic Pathways/genetics , Promoter Regions, Genetic , Humans , Gene Expression Regulation, Bacterial , RNA Folding
ABSTRACT
Metabolic fluxes, the number of metabolites traversing each biochemical reaction in a cell per unit time, are crucial for assessing and understanding cell function. 13C Metabolic Flux Analysis (13C MFA) is considered to be the gold standard for measuring metabolic fluxes. 13C MFA typically works by leveraging extracellular exchange fluxes as well as data from 13C labeling experiments to calculate the flux profile that best fits the data for a small central carbon metabolic model. However, the nonlinear nature of the 13C MFA fitting procedure means that several flux profiles fit the experimental data within the experimental error, and traditional optimization methods offer only a partial or skewed picture, especially in "non-Gaussian" situations where multiple very distinct flux regions fit the data equally well. Here, we present a method for flux space sampling through Bayesian inference (BayFlux) that identifies the full distribution of fluxes compatible with experimental data for a comprehensive genome-scale model. This Bayesian approach allows us to accurately quantify uncertainty in calculated fluxes. We also find that, surprisingly, the genome-scale model of metabolism produces narrower flux distributions (reduced uncertainty) than the small core metabolic models traditionally used in 13C MFA. The different results for some reactions when using genome-scale models vs core metabolic models advise caution in assuming strong inferences from 13C MFA, since the results may depend significantly on the completeness of the model used. Based on BayFlux, we developed and evaluated novel methods (P-13C MOMA and P-13C ROOM) to predict the biological results of a gene knockout that improve on the traditional MOMA and ROOM methods by quantifying prediction uncertainty.
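The idea of sampling a flux distribution rather than reporting one best-fit value can be illustrated with a toy Metropolis sampler. This is only a minimal sketch under strong assumptions: a single flux, one noisy measurement, a Gaussian likelihood, and a uniform prior between hypothetical bounds. It is not the BayFlux implementation, and the names `log_posterior` and `metropolis` are illustrative.

```python
import math
import random


def log_posterior(flux, measured, sigma, lower, upper):
    """Gaussian log-likelihood with a uniform prior on [lower, upper] (toy assumption)."""
    if not lower <= flux <= upper:
        return -math.inf
    return -0.5 * ((flux - measured) / sigma) ** 2


def metropolis(measured, sigma, lower=0.0, upper=10.0,
               steps=20000, step_size=0.5, seed=0):
    """Draw samples of one flux from its posterior with a random-walk Metropolis chain."""
    rng = random.Random(seed)
    current = (lower + upper) / 2.0
    current_lp = log_posterior(current, measured, sigma, lower, upper)
    samples = []
    for _ in range(steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_posterior(proposal, measured, sigma, lower, upper)
        # Accept with probability min(1, posterior ratio); out-of-bounds proposals
        # have log-posterior -inf and are always rejected.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


if __name__ == "__main__":
    # Hypothetical measurement: flux of 3.0 with standard deviation 0.5.
    chain = metropolis(measured=3.0, sigma=0.5)[2000:]  # discard burn-in
    ordered = sorted(chain)
    low = ordered[int(0.025 * len(ordered))]
    high = ordered[int(0.975 * len(ordered))]
    print(f"95% credible interval: [{low:.2f}, {high:.2f}]")
```

The spread of the retained samples gives a credible interval for the flux, which is the kind of uncertainty estimate a single optimized flux profile cannot provide.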
Subjects
Metabolic Flux Analysis , Models, Biological , Bayes Theorem , Uncertainty , Metabolic Flux Analysis/methods , Carbon Isotopes/metabolism
ABSTRACT
Rhodococcus opacus is a bacterium that has a high tolerance to aromatic compounds and can produce significant amounts of triacylglycerol (TAG). Here, we present iGR1773, the first genome-scale model (GSM) of R. opacus PD630 metabolism based on its genomic sequence and associated data. The model includes 1773 genes, 3025 reactions, and 1956 metabolites, was developed in a reproducible manner using CarveMe, and was evaluated through Metabolic Model tests (MEMOTE). We combine the model with two Constraint-Based Reconstruction and Analysis (COBRA) methods that use transcriptomics data to predict growth rates and fluxes: E-Flux2 and SPOT (Simplified Pearson Correlation with Transcriptomic data). Growth rates are best predicted by E-Flux2. Flux profiles are more accurately predicted by E-Flux2 than flux balance analysis (FBA) and parsimonious FBA (pFBA), when compared to 44 central carbon fluxes measured by 13C-Metabolic Flux Analysis (13C-MFA). Under glucose-fed conditions, E-Flux2 presents an R2 value of 0.54, while predictions based on pFBA had an inferior R2 of 0.28. We attribute this improved performance to the extra activity information provided by the transcriptomics data. For phenol-fed metabolism, in which the substrate first enters the TCA cycle, E-Flux2's flux predictions display a high R2 of 0.96 while pFBA showed an R2 of 0.93. We also show that glucose metabolism and phenol metabolism function with similar relative ATP maintenance costs. These findings demonstrate that iGR1773 can help the metabolic engineering community predict aromatic substrate utilization patterns and perform computational strain design.
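For readers who want to reproduce this kind of comparison, the sketch below computes an R2 value between measured and predicted fluxes. It assumes that R2 means the coefficient of determination (1 minus the ratio of residual to total sum of squares); the abstract does not state which definition was used. The function name `r_squared` and the example numbers are hypothetical.

```python
def r_squared(measured, predicted):
    """Coefficient of determination between measured and predicted values."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    if ss_tot == 0:
        raise ValueError("measured values have zero variance")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical fluxes, not values from the study.
    measured_fluxes = [1.0, 2.0, 3.0, 4.0]
    predicted_fluxes = [1.1, 1.9, 3.2, 3.8]
    print(round(r_squared(measured_fluxes, predicted_fluxes), 3))
```

Under this definition a perfect prediction gives 1, and a prediction no better than the mean of the measurements gives 0.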
Subjects
Metabolic Engineering , Rhodococcus , Metabolic Engineering/methods , Metabolic Flux Analysis/methods , Rhodococcus/genetics , Rhodococcus/metabolism , Phenols/metabolism
ABSTRACT
The growing capabilities of synthetic biology and organic chemistry demand tools to guide syntheses toward useful molecules. Here, we present Molecular AutoenCoding Auto-Workaround (MACAW), a tool that uses a novel approach to generate molecules predicted to meet a desired property specification (e.g., a binding affinity of 50 nM or an octane number of 90). MACAW describes molecules by embedding them into a smooth multidimensional numerical space, avoiding uninformative dimensions that previous methods often introduce. The coordinates in this embedding provide a natural choice of features for accurately predicting molecular properties, which we demonstrate with examples for cetane and octane numbers, flash points, and histamine H1 receptor binding affinity. The approach is computationally efficient and well-suited to the small- and medium-size datasets commonly used in biosciences. We showcase the utility of MACAW for virtual screening by identifying molecules with high predicted binding affinity to the histamine H1 receptor and limited affinity to the muscarinic M2 receptor, which are targets of medicinal relevance. Combining these predictive capabilities with a novel generative algorithm for molecules allows us to recommend molecules with a desired property value (i.e., inverse molecular design). We demonstrate this capability by recommending molecules with predicted octane numbers of 40, 80, and 120, which is an important characteristic of biofuels. Thus, MACAW augments classical retrosynthesis tools by providing recommendations for molecules on specification.
Subjects
Octanes , Receptors, Histamine H1 , Algorithms , Protein Binding
ABSTRACT
We present a droplet-based microfluidic system that enables CRISPR-based gene editing and high-throughput screening on a chip. The microfluidic device contains a 10 × 10 element array, and each element contains sets of electrodes for two electric field-actuated operations: electrowetting for merging droplets to mix reagents and electroporation for transformation. This device can perform up to 100 genetic modification reactions in parallel, providing a scalable platform for generating the large number of engineered strains required for the combinatorial optimization of genetic pathways and predictable bioengineering. We demonstrate the system's capabilities through the CRISPR-based engineering of two test cases: (1) disruption of the function of the enzyme galactokinase (galK) in E. coli and (2) targeted engineering of the glutamine synthetase gene (glnA) and the blue-pigment synthetase gene (bpsA) to improve indigoidine production in E. coli.
ABSTRACT
Concerns over climate change have necessitated a rethinking of our transportation infrastructure. One possible alternative to carbon-polluting fossil fuels is biofuels produced by engineered microorganisms that use a renewable carbon source. Two biofuels, ethanol and biodiesel, have made inroads in displacing petroleum-based fuels, but their uptake has been limited by the amounts that can be used in conventional engines and by their cost. Advanced biofuels that mimic petroleum-based fuels are not limited by the amounts that can be used in existing transportation infrastructure but have had limited uptake due to costs. In this Review, we discuss engineering metabolic pathways to produce advanced biofuels, challenges with substrate and product toxicity with regard to host microorganisms and methods to engineer tolerance, and the use of functional genomics and machine learning approaches to produce advanced biofuels and prospects for reducing their costs.
Subjects
Bacteria/metabolism , Biofuels/economics , Genetic Engineering , Genomics , Machine Learning
ABSTRACT
Biology has changed radically in the past two decades, growing from a purely descriptive science into also a design science. The availability of tools that enable the precise modification of cells, as well as the ability to collect large amounts of multimodal data, open the possibility of sophisticated bioengineering to produce fuels, specialty and commodity chemicals, materials, and other renewable bioproducts. However, despite new tools and exponentially increasing data volumes, synthetic biology cannot yet fulfill its true potential due to our inability to predict the behavior of biological systems. Here, we showcase a set of computational tools that, combined, provide the ability to store, visualize, and leverage multiomics data to predict the outcome of bioengineering efforts. We show how to upload, visualize, and output multiomics data, as well as strain information, into online repositories for several isoprenol-producing strain designs. We then use these data to train machine learning algorithms that recommend new strain designs that are correctly predicted to improve isoprenol production by 23%. This demonstration is done by using synthetic data, as provided by a novel library, that can produce credible multiomics data for testing algorithms and computational tools. In short, this paper provides a step-by-step tutorial to leverage these computational tools to improve production in bioengineered strains.
ABSTRACT
Machine learning provides researchers a unique opportunity to make metabolic engineering more predictable. In this review, we offer an introduction to this discipline in terms that are relatable to metabolic engineers, as well as providing in-depth illustrative examples leveraging omics data and improving production. We also include practical advice for the practitioner in terms of data management, algorithm libraries, computational resources, and important non-technical issues. A variety of applications ranging from pathway construction and optimization, to genetic editing optimization, cell factory testing, and production scale-up are discussed. Moreover, the promising relationship between machine learning and mechanistic models is thoroughly reviewed. Finally, the future perspectives and most promising directions for this combination of disciplines are examined.
Subjects
Machine Learning , Metabolic Engineering , Algorithms , Gene Editing
ABSTRACT
Synthetic biology allows us to bioengineer cells to synthesize novel valuable molecules such as renewable biofuels or anticancer drugs. However, traditional synthetic biology approaches involve ad-hoc engineering practices, which lead to long development times. Here, we present the Automated Recommendation Tool (ART), a tool that leverages machine learning and probabilistic modeling techniques to guide synthetic biology in a systematic fashion, without the need for a full mechanistic understanding of the biological system. Using sampling-based optimization, ART provides a set of recommended strains to be built in the next engineering cycle, alongside probabilistic predictions of their production levels. We demonstrate the capabilities of ART on simulated data sets, as well as experimental data from real metabolic engineering projects producing renewable biofuels, hoppy flavored beer without hops, fatty acids, and tryptophan. Finally, we discuss the limitations of this approach, and the practical consequences of the underlying assumptions failing.
Subjects
Machine Learning , Metabolic Engineering/methods , Synthetic Biology/methods , Bayes Theorem , Beer , Biofuels , Dodecanol/metabolism , Escherichia coli/metabolism , Fatty Acids/metabolism , Saccharomyces cerevisiae/metabolism
ABSTRACT
A significant bottleneck in synthetic biology involves screening large genetically encoded libraries for desirable phenotypes such as chemical production. However, transcription factor-based biosensors can be leveraged to screen thousands of genetic designs for optimal chemical production in engineered microbes. In this study we characterize two glutarate sensing transcription factors (CsiR and GcdR) from Pseudomonas putida. The genomic contexts of csiR homologues were analyzed, and their DNA binding sites were bioinformatically predicted. Both CsiR and GcdR were purified and shown to bind upstream of their coding sequences in vitro. CsiR was shown to dissociate from DNA in vitro when exogenous glutarate was added, confirming that it acts as a genetic repressor. Both transcription factors and cognate promoters were then cloned into broad host range vectors to create two glutarate biosensors. Their respective sensing performance features were characterized, and more sensitive derivatives of the GcdR biosensor were created by manipulating the expression of the transcription factor. Sensor vectors were then reintroduced into P. putida and evaluated for their ability to respond to glutarate and various lysine metabolites. Additionally, we developed a novel mathematical approach to describe the usable range of detection for genetically encoded biosensors, which may be broadly useful in future efforts to better characterize biosensor performance.
Subjects
Glutarates/metabolism , Lysine/metabolism , Pseudomonas putida/metabolism , Transcription Factors/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Metabolic Engineering/methods , Promoter Regions, Genetic/genetics , Pseudomonas putida/genetics , Synthetic Biology/methods
ABSTRACT
Despite broad scientific interest in harnessing the power of Earth's microbiomes, knowledge gaps hinder their efficient use for addressing urgent societal and environmental challenges. We argue that structuring research and technology developments around a design-build-test-learn (DBTL) cycle will advance microbiome engineering and spur new discoveries of the basic scientific principles governing microbiome function. In this Review, we present key elements of an iterative DBTL cycle for microbiome engineering, focusing on generalizable approaches, including top-down and bottom-up design processes, synthetic and self-assembled construction methods, and emerging tools to analyse microbiome function. These approaches can be used to harness microbiomes for broad applications related to medicine, agriculture, energy and the environment. We also discuss key challenges and opportunities of each approach and synthesize them into best practice guidelines for engineering microbiomes. We anticipate that adoption of a DBTL framework will rapidly advance microbiome-based biotechnologies aimed at improving human and animal health, agriculture and enabling the bioeconomy.
Subjects
Bioengineering/methods , Microbiota , Practice Guidelines as Topic , Agriculture/methods , Bioelectric Energy Sources , Biological Therapy/methods , Environmental Microbiology , Guidelines as Topic , Humans
ABSTRACT
Mass spectrometry-based quantitative proteomic analysis has proven valuable for clinical and biotechnology-related research and development. Improvements in sensitivity, resolution, and robustness of mass analyzers have also added value. However, manual sample preparation protocols are often a bottleneck for sample throughput and can lead to poor reproducibility, especially for applications where thousands of samples per month must be analyzed. To alleviate these issues, we developed a "cells-to-peptides" automated workflow for Gram-negative bacteria and fungi that includes cell lysis, protein precipitation, resuspension, quantification, normalization, and tryptic digestion. The workflow takes 2 h to process 96 samples from cell pellets to the initiation of the tryptic digestion step and can process 384 samples in parallel. We measured the efficiency of protein extraction from various amounts of cell biomass and optimized the process for standard liquid chromatography-mass spectrometry systems. The automated workflow was tested by preparing 96 Escherichia coli samples and quantifying over 600 peptides that resulted in a median coefficient of variation of 15.8%. Similar technical variance was observed for three other organisms as measured by highly multiplexed LC-MRM-MS acquisition methods. These results show that this automated sample preparation workflow provides robust, reproducible proteomic samples for high-throughput applications.
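The reproducibility metric quoted above, a median coefficient of variation (CV) across peptides, can be computed as in the following minimal sketch. It assumes the CV is the sample standard deviation divided by the mean of replicate intensities for each peptide, expressed as a percentage. The function names and the example intensities are hypothetical and are not data from the study.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Sample CV (standard deviation / mean) of replicate values, in percent."""
    return 100.0 * stdev(values) / mean(values)


def median_cv(peptide_intensities):
    """Median CV across peptides; input maps peptide name to replicate intensities."""
    return median(coefficient_of_variation(v) for v in peptide_intensities.values())


# Hypothetical replicate intensities for three peptides.
example = {
    "peptide_A": [100.0, 110.0, 90.0],
    "peptide_B": [200.0, 200.0, 200.0],
    "peptide_C": [50.0, 60.0, 40.0],
}

if __name__ == "__main__":
    print(f"median CV: {median_cv(example):.1f}%")
```

Using the median rather than the mean keeps a few poorly measured peptides from dominating the summary.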
Subjects
Cells/chemistry , Microbiological Techniques/methods , Peptides/isolation & purification , Proteomics/methods , Specimen Handling/methods , Workflow , Automation , Bacterial Proteins/analysis , Bacterial Proteins/isolation & purification , Escherichia coli/chemistry , Fungal Proteins/analysis , Fungal Proteins/isolation & purification , Fungi/chemistry , Gram-Negative Bacteria/chemistry , Humans , Peptides/analysis , Specimen Handling/standards
ABSTRACT
Our inability to predict the behavior of biological systems severely hampers progress in bioengineering and biomedical applications. We cannot predict the effect of genotype changes on phenotype, nor extrapolate the large-scale behavior from small-scale experiments. Machine learning techniques recently reached a new level of maturity, and are capable of providing the needed predictive power without a detailed mechanistic understanding. However, they require large amounts of data to be trained. The amount and quality of data required can only be produced through a combination of synthetic biology and automation, so as to generate a large diversity of biological systems with high reproducibility. A sustained investment in the intersection of synthetic biology, machine learning, and automation will drive forward predictive biology, and produce improved machine learning algorithms.
Subjects
Automation/methods , Synthetic Biology/methods , Algorithms , Animals , Bioengineering/methods , Computational Biology/methods , Genotype , Humans , Machine Learning , Phenotype , Reproducibility of Results
ABSTRACT
The Design-Build-Test-Learn (DBTL) cycle, facilitated by exponentially improving capabilities in synthetic biology, is an increasingly adopted metabolic engineering framework that represents a more systematic and efficient approach to strain development than historical efforts in biofuels and biobased products. Here, we report on implementation of two DBTL cycles to optimize 1-dodecanol production from glucose using 60 engineered Escherichia coli MG1655 strains. The first DBTL cycle employed a simple strategy to learn efficiently from a relatively small number of strains (36), wherein only the choice of ribosome-binding sites and an acyl-ACP/acyl-CoA reductase were modulated in a single pathway operon including genes encoding a thioesterase (UcFatB1), an acyl-ACP/acyl-CoA reductase (Maqu_2507, Maqu_2220, or Acr1), and an acyl-CoA synthetase (FadD). Measured variables included concentrations of dodecanol and all proteins in the engineered pathway. We used the data produced in the first DBTL cycle to train several machine-learning algorithms and to suggest protein profiles for the second DBTL cycle that would increase production. These strategies resulted in a 21% increase in dodecanol titer in Cycle 2 (up to 0.83 g/L, which is more than 6-fold greater than previously reported batch values for minimal medium). Beyond specific lessons learned about optimizing dodecanol titer in E. coli, this study had findings of broader relevance across synthetic biology applications, such as the importance of sequencing checks on plasmids in production strains as well as in cloning strains, and the critical need for more accurate protein expression predictive tools.
Subjects
Dodecanol/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Machine Learning , Metabolic Engineering/methods , Algorithms , Metabolic Networks and Pathways/genetics , Synthetic Biology
ABSTRACT
Synthetic biology is a rapidly developing field that pursues the application of engineering principles and development approaches to biological engineering. Synthetic biology is poised to change the way biology is practiced, and has important practical applications: for example, building genetically engineered organisms to produce biofuels, medicines, and other chemicals. Traditionally, synthetic biology has focused on manipulating a few genes (e.g., in a single pathway or genetic circuit), but its combination with systems biology holds the promise of creating new cellular architectures and constructing complex biological systems from the ground up. Enabling this merge of synthetic and systems biology will require greater predictive capability for modeling the behavior of cellular systems, and more comprehensive data sets for building and calibrating these models. The so-called "-omics" data sets can now be generated via high throughput techniques in the form of genomic, proteomic, transcriptomic, and metabolomic information on the engineered biological system. Of particular interest with respect to the engineering of microbes capable of producing biofuels and other chemicals economically and at scale are metabolomic datasets, and their insights into intracellular metabolic fluxes. Metabolic fluxes provide a rapid and easy to understand picture of how carbon and energy flow throughout the cell. Here, we present a detailed guide to performing metabolic flux analysis and modeling using the open source JBEI Quantitative Metabolic Modeling (jQMM) library. This library allows the user to transform metabolomics data in the form of isotope labeling data from a 13C labeling experiment into a determination of cellular fluxes that can be used to develop genetic engineering strategies for metabolic engineering. The jQMM library presents a complete toolbox for performing a range of different tasks of interest in metabolic engineering.
Various different types of flux analysis and modeling can be performed such as flux balance analysis, 13C metabolic flux analysis, and two-scale 13C metabolic flux analysis (2S-13C MFA). 2S-13C MFA is a novel method that determines genome-scale fluxes without the need of every single carbon transition in the metabolic network. In addition to several other capabilities, the jQMM library can make model based predictions for how various genetic engineering strategies can be incorporated toward bioengineering goals: it can predict the effects of reaction knockouts on metabolism using both the MoMA and ROOM methodologies. In this chapter, we will illustrate the use of the jQMM library through a step-by-step demonstration of flux determination and knockout prediction in a complex eukaryotic model organism: Saccharomyces cerevisiae (S. cerevisiae). Included with this chapter is a digital Jupyter Notebook file that provides a computable appendix showing a self-contained example of jQMM usage, which can be changed to fit the user's specific needs. As an open source software project, users can modify and extend the code base to make improvements at will, allowing them to share their development work and contribute back to the jQMM modeling community.
Subjects
Metabolic Engineering/methods , Metabolomics/methods , Models, Biological , Saccharomyces cerevisiae/metabolism , Carbon Isotopes/chemistry , Data Analysis , Gene Knockout Techniques , Genetic Engineering/methods , Metabolic Flux Analysis/methods , Metabolic Networks and Pathways , Saccharomyces cerevisiae/genetics , Software
ABSTRACT
Robust fermentation of biomass-derived sugars into bioproducts demands the reliable microbial expression of metabolic pathways. Plasmid-based expression systems may suffer from instability and result in highly variable titers, rates, and yields. An established mitigation approach, chemical induced chromosomal expansion (CIChE), expands a singly integrated pathway to plasmid-like copy numbers while maintaining stability in the absence of antibiotic selection pressure. Here, we report parallel integration and chromosomal expansion (PIACE), extensions to CIChE that enable independent expansions of pathway components across multiple loci, use suicide vectors to achieve high-efficiency site-specific integration of sequence-validated multigene components, and introduce a heat-curable plasmid to obviate recA deletion post pathway expansion. We applied PIACE to stabilize an isopentenol pathway across three loci in E. coli DH1 and then generate libraries of pathway component copy number variants to screen for improved titers. Polynomial regressor statistical modeling of the production screening data suggests that increasing copy numbers of all isopentenol pathway components would further improve titers.
Subjects
Chromosomes, Bacterial/genetics , Metabolic Engineering/methods , Metabolic Networks and Pathways/physiology , Chromosomes, Bacterial/metabolism , DNA Copy Number Variations , DNA-Binding Proteins/genetics , Escherichia coli/genetics , Genetic Loci , Machine Learning , Pentanols/metabolism , Plasmids/genetics , Plasmids/metabolism
ABSTRACT
Determination of internal metabolic fluxes is crucial for fundamental and applied biology because they map how carbon and electrons flow through metabolism to enable cell function. 13C Metabolic Flux Analysis (13C MFA) and Two-Scale 13C Metabolic Flux Analysis (2S-13C MFA) are two techniques used to determine such fluxes. Both operate on the simplifying approximation that metabolic flux from peripheral metabolism into central "core" carbon metabolism is minimal, and can be omitted when modeling isotopic labeling in core metabolism. The validity of this "two-scale" or "bow tie" approximation is supported both by the ability to accurately model experimental isotopic labeling data, and by experimentally verified metabolic engineering predictions using these methods. However, the boundaries of core metabolism that satisfy this approximation can vary across species, and across cell culture conditions. Here, we present a set of algorithms that (1) systematically calculate flux bounds for any specified "core" of a genome-scale model so as to satisfy the bow tie approximation and (2) automatically identify an updated set of core reactions that can satisfy this approximation more efficiently. First, we leverage linear programming to simultaneously identify the lowest fluxes from peripheral metabolism into core metabolism compatible with the observed growth rate and extracellular metabolite exchange fluxes. Second, we use Simulated Annealing to identify an updated set of core reactions that allow for a minimum of fluxes into core metabolism to satisfy these experimental constraints. Together, these methods accelerate and automate the identification of a biologically reasonable set of core reactions for use with 13C MFA or 2S-13C MFA, as well as provide for a substantially lower set of flux bounds for fluxes into the core as compared with previous methods. We provide an open source Python implementation of these algorithms at https://github.com/JBEI/limitfluxtocore.
ABSTRACT
ClusterCAD is a web-based toolkit designed to leverage the collinear structure and deterministic logic of type I modular polyketide synthases (PKSs) for synthetic biology applications. The unique organization of these megasynthases, combined with the diversity of their catalytic domain building blocks, has fueled an interest in harnessing the biosynthetic potential of PKSs for the microbial production of both novel natural product analogs and industrially relevant small molecules. However, a limited theoretical understanding of the determinants of PKS fold and function poses a substantial barrier to the design of active variants, and identifying strategies to reliably construct functional PKS chimeras remains an active area of research. In this work, we formalize a paradigm for the design of PKS chimeras and introduce ClusterCAD as a computational platform to streamline and simplify the process of designing experiments to test strategies for engineering PKS variants. ClusterCAD provides chemical structures with stereochemistry for the intermediates generated by each PKS module, as well as sequence- and structure-based search tools that allow users to identify modules based either on amino acid sequence or on the chemical structure of the cognate polyketide intermediate. ClusterCAD can be accessed at https://clustercad.jbei.org and at http://clustercad.igb.uci.edu.