Search | BVS Nicaragua

1.

EnzymeML: seamless data flow and modeling of enzymatic data.

Lauterbach, Simone; Dienhart, Hannah; Range, Jan; Malzacher, Stephan; Spöring, Jan-Dirk; Rother, Dörte; Pinto, Maria Filipa; Martins, Pedro; Lagerman, Colton E; Bommarius, Andreas S; Høst, Amalie Vang; Woodley, John M; Ngubane, Sandile; Kudanga, Tukayi; Bergmann, Frank T; Rohwer, Johann M; Iglezakis, Dorothea; Weidemann, Andreas; Wittig, Ulrike; Kettner, Carsten; Swainston, Neil; Schnell, Santiago; Pleiss, Jürgen.

Nat Methods ; 20(3): 400-402, 2023 03.

Article in English | MEDLINE | ID: mdl-36759590

ABSTRACT

The design of biocatalytic reaction systems is highly complex owing to the dependency of the estimated kinetic parameters on the enzyme, the reaction conditions, and the modeling method. Consequently, reproducibility of enzymatic experiments and reusability of enzymatic data are challenging. We developed the XML-based markup language EnzymeML to enable storage and exchange of enzymatic data such as reaction conditions, the time course of the substrate and the product, kinetic parameters and the kinetic model, thus making enzymatic data findable, accessible, interoperable and reusable (FAIR). The feasibility and usefulness of the EnzymeML toolbox are demonstrated in six scenarios, for which data and metadata of different enzymatic reactions are collected and analyzed. EnzymeML serves as a seamless communication channel between experimental platforms, electronic lab notebooks, tools for modeling of enzyme kinetics, publication platforms and enzymatic reaction databases. EnzymeML is open and transparent, and invites the community to contribute. All documents and codes are freely available at https://enzymeml.org.

Subject(s)

Data Management; Metadata; Reproducibility of Results; Databases, Factual; Kinetics

2.

The automated Galaxy-SynBioCAD pipeline for synthetic biology design and engineering.

Hérisson, Joan; Duigou, Thomas; du Lac, Melchior; Bazi-Kabbaj, Kenza; Sabeti Azad, Mahnaz; Buldum, Gizem; Telle, Olivier; El Moubayed, Yorgo; Carbonell, Pablo; Swainston, Neil; Zulkower, Valentin; Kushwaha, Manish; Baldwin, Geoff S; Faulon, Jean-Loup.

Nat Commun ; 13(1): 5082, 2022 08 29.

Article in English | MEDLINE | ID: mdl-36038542

ABSTRACT

Here we introduce the Galaxy-SynBioCAD portal, a toolshed for synthetic biology, metabolic engineering, and industrial biotechnology. The tools and workflows currently shared on the portal enable one to build libraries of strains producing desired chemical targets, covering an end-to-end metabolic pathway design and engineering process from the selection of strains and targets, through the design of DNA parts to be assembled, to the generation of scripts driving liquid handlers for plasmid assembly and strain transformations. Standard formats like SBML and SBOL are used throughout to enforce the compatibility of the tools. In a study carried out at four different sites, we illustrate the link between pathway design and engineering with the building of a library of E. coli lycopene-producing strains. We also benchmark our workflows on literature- and expert-validated pathways. Overall, we find an 83% success rate in retrieving the validated pathways among the top 10 pathways generated by the workflows.
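The "success rate among the top 10 pathways" reported above is a top-k retrieval metric. The following is a minimal sketch of how such a figure could be computed, assuming each benchmark case has one validated pathway and a ranked list of generated pathways; the function name and the pathway identifiers are hypothetical and are not taken from the Galaxy-SynBioCAD tools.

```python
def top_k_success_rate(ranked_predictions, validated, k=10):
    """Fraction of benchmark cases whose validated pathway appears
    among the first k entries of the ranked predictions.

    ranked_predictions: list of lists, best-ranked pathway first (hypothetical IDs).
    validated: list with one validated pathway ID per benchmark case.
    """
    if len(ranked_predictions) != len(validated):
        raise ValueError("one ranked list is required per validated pathway")
    if not validated:
        raise ValueError("at least one benchmark case is required")
    hits = sum(
        1
        for predictions, truth in zip(ranked_predictions, validated)
        if truth in predictions[:k]
    )
    return hits / len(validated)


if __name__ == "__main__":
    # Toy data for illustration only; not results from the paper.
    ranked = [["p1", "p2"], ["p3", "p4"], ["p5"]]
    truth = ["p2", "p9", "p5"]
    print(top_k_success_rate(ranked, truth, k=2))
```

With the toy data, two of the three validated pathways are recovered within the top 2, so the function returns 2/3.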

Subject(s)

Escherichia coli; Synthetic Biology; Biotechnology; Escherichia coli/genetics; Metabolic Engineering; Software

3.

SpeedyGenesXL: an Automated, High-Throughput Platform for the Preparation of Bespoke Ultralarge Variant Libraries for Directed Evolution.

Sadler, Joanna C; Swainston, Neil; Dunstan, Mark S; Currin, Andrew; Kell, Douglas B.

Methods Mol Biol ; 2461: 67-83, 2022.

Article in English | MEDLINE | ID: mdl-35727444

ABSTRACT

Directed evolution of proteins is a highly effective strategy for tailoring biocatalysts to a particular application, and is capable of engineering improvements such as kcat, thermostability and organic solvent tolerance. It is recognized that large and systematic libraries are required to navigate a protein's vast and rugged sequence landscape effectively, yet their preparation is nontrivial and commercial libraries are extremely costly. To address this, we have developed SpeedyGenesXL, an automated, high-throughput platform for the production of wild-type genes, Boolean OR, combinatorial, or combinatorial-OR-type libraries based on the SpeedyGenes methodology. Together this offers a flexible platform for library synthesis, capable of generating many different bespoke, diverse libraries simultaneously.

Subject(s)

Directed Molecular Evolution; Proteins; Directed Molecular Evolution/methods; Gene Library

4.

EnzymeML-a data exchange format for biocatalysis and enzymology.

Range, Jan; Halupczok, Colin; Lohmann, Jens; Swainston, Neil; Kettner, Carsten; Bergmann, Frank T; Weidemann, Andreas; Wittig, Ulrike; Schnell, Santiago; Pleiss, Jürgen.

FEBS J ; 289(19): 5864-5874, 2022 10.

Article in English | MEDLINE | ID: mdl-34890097

ABSTRACT

EnzymeML is an XML-based data exchange format that supports the comprehensive documentation of enzymatic data by describing reaction conditions, time courses of substrate and product concentrations, the kinetic model, and the estimated kinetic constants. EnzymeML is based on the Systems Biology Markup Language, which was extended by implementing the STRENDA Guidelines. An EnzymeML document serves as a container to transfer data between experimental platforms, modeling tools, and databases. EnzymeML supports the scientific community by introducing a standardized data exchange format to make enzymatic data findable, accessible, interoperable, and reusable according to the FAIR data principles. An application programming interface in Python supports the integration of software tools for data acquisition, data analysis, and publication. The feasibility of a seamless data flow using EnzymeML is demonstrated by creating an EnzymeML document from a structured spreadsheet or from a STRENDA DB database entry, by kinetic modeling using the modeling platform COPASI, and by uploading to the enzymatic reaction kinetics database SABIO-RK.

Subject(s)

Software; Biocatalysis; Databases, Factual

5.

MassGenie: A Transformer-Based Deep Learning Method for Identifying Small Molecules from Their Mass Spectra.

Shrivastava, Aditya Divyakant; Swainston, Neil; Samanta, Soumitra; Roberts, Ivayla; Wright Muelas, Marina; Kell, Douglas B.

Biomolecules ; 11(12), 2021 11 30.

Article in English | MEDLINE | ID: mdl-34944436

ABSTRACT

The 'inverse problem' of mass spectrometric molecular identification ('given a mass spectrum, calculate/predict the 2D structure of the molecule whence it came') is largely unsolved, and is especially acute in metabolomics where many small molecules remain unidentified. This is largely because the number of experimentally available electrospray mass spectra of small molecules is quite limited. However, the forward problem ('calculate a small molecule's likely fragmentation and hence at least some of its mass spectrum from its structure alone') is much more tractable, because the strengths of different chemical bonds are roughly known. This kind of molecular identification problem may be cast as a language translation problem in which the source language is a list of high-resolution mass spectral peaks and the 'translation' a representation (for instance in SMILES) of the molecule. It is thus suitable for attack using the deep neural networks known as transformers. We here present MassGenie, a method that uses a transformer-based deep neural network, trained on ~6 million chemical structures with augmented SMILES encoding and their paired molecular fragments as generated in silico, explicitly including the protonated molecular ion. This architecture (containing some 400 million elements) is used to predict the structure of a molecule from the various fragments that may be expected to be observed when some of its bonds are broken. Despite being given essentially no detailed nor explicit rules about molecular fragmentation methods, isotope patterns, rearrangements, neutral losses, and the like, MassGenie learns the effective properties of the mass spectral fragment and valency space, and can generate candidate molecular structures that are very close or identical to those of the 'true' molecules. We also use VAE-Sim, a previously published variational autoencoder, to generate candidate molecules that are 'similar' to the top hit. 
In addition to using the 'top hits' directly, we can produce a rank order of these by 'round-tripping' candidate molecules and comparing them with the true molecules, where known. As a proof of principle, we confine ourselves to positive electrospray mass spectra from molecules with a molecular mass of 500Da or lower, including those in the last CASMI challenge (for which the results are known), getting 49/93 (53%) precisely correct. The transformer method, applied here for the first time to mass spectral interpretation, works extremely effectively both for mass spectra generated in silico and on experimentally obtained mass spectra from pure compounds. It seems to act as a Las Vegas algorithm, in that it either gives the correct answer or simply states that it cannot find one. The ability to create and to 'learn' millions of fragmentation patterns in silico, and therefrom generate candidate structures (that do not have to be in existing libraries) directly, thus opens up entirely the field of de novo small molecule structure prediction from experimental mass spectra.

Subject(s)

Metabolomics/methods; Small Molecule Libraries/analysis; Algorithms; Deep Learning; Mass Spectrometry; Molecular Structure

6.

Deep learning and generative methods in cheminformatics and chemical biology: navigating small molecule space intelligently.

Kell, Douglas B; Samanta, Soumitra; Swainston, Neil.

Biochem J ; 477(23): 4559-4580, 2020 12 11.

Article in English | MEDLINE | ID: mdl-33290527

ABSTRACT

The number of 'small' molecules that may be of interest to chemical biologists - chemical space - is enormous, but the fraction that have ever been made is tiny. Most strategies are discriminative, i.e. have involved 'forward' problems (have molecule, establish properties). However, we normally wish to solve the much harder generative or inverse problem (describe desired properties, find molecule). 'Deep' (machine) learning based on large-scale neural networks underpins technologies such as computer vision, natural language processing, driverless cars, and world-leading performance in games such as Go; it can also be applied to the solution of inverse problems in chemical biology. In particular, recent developments in deep learning admit the in silico generation of candidate molecular structures and the prediction of their properties, thereby allowing one to navigate (bio)chemical space intelligently. These methods are revolutionary but require an understanding of both (bio)chemistry and computer science to be exploited to best advantage. We give a high-level (non-mathematical) background to the deep learning revolution, and set out the crucial issue for chemical biology and informatics as a two-way mapping from the discrete nature of individual molecules to the continuous but high-dimensional latent representation that may best reflect chemical space. A variety of architectures can do this; we focus on a particular type known as variational autoencoders. We then provide some examples of recent successes of these kinds of approach, and a look towards the future.

Subject(s)

Cheminformatics; Computer Simulation; Deep Learning

7.

Engineering Escherichia coli towards de novo production of gatekeeper (2S)-flavanones: naringenin, pinocembrin, eriodictyol and homoeriodictyol.

Dunstan, Mark S; Robinson, Christopher J; Jervis, Adrian J; Yan, Cunyu; Carbonell, Pablo; Hollywood, Katherine A; Currin, Andrew; Swainston, Neil; Le Feuvre, Rosalind; Micklefield, Jason; Faulon, Jean-Loup; Breitling, Rainer; Turner, Nicholas; Takano, Eriko; Scrutton, Nigel S.

Synth Biol (Oxf) ; 5(1): ysaa012, 2020.

Article in English | MEDLINE | ID: mdl-33195815

ABSTRACT

Natural plant-based flavonoids have drawn significant attention as dietary supplements due to their potential health benefits, including anti-cancer, anti-oxidant and anti-asthmatic activities. Naringenin, pinocembrin, eriodictyol and homoeriodictyol are classified as (2S)-flavanones, an important sub-group of naturally occurring flavonoids, with wide-reaching applications in human health and nutrition. These four compounds occupy a central position as branch point intermediates towards a broad spectrum of naturally occurring flavonoids. Here, we report the development of Escherichia coli production chassis for each of these key gatekeeper flavonoids. Selection of key enzymes, genetic construct design and the optimization of process conditions resulted in the highest reported titers for naringenin (484 mg/l), improved production of pinocembrin (198 mg/l) and eriodictyol (55 mg/l from caffeic acid), and provided the first example of in vivo production of homoeriodictyol directly from glycerol (17 mg/l). This work provides a springboard for future production of diverse downstream natural and non-natural flavonoid targets.

8.

VAE-Sim: A Novel Molecular Similarity Measure Based on a Variational Autoencoder.

Samanta, Soumitra; O'Hagan, Steve; Swainston, Neil; Roberts, Timothy J; Kell, Douglas B.

Molecules ; 25(15), 2020 Jul 29.

Article in English | MEDLINE | ID: mdl-32751155

ABSTRACT

Molecular similarity is an elusive but core "unsupervised" cheminformatics concept, yet different "fingerprint" encodings of molecular structures return very different similarity values, even when using the same similarity metric. Each encoding may be of value when applied to other problems with objective or target functions, implying that a priori none are "better" than the others, nor than encoding-free metrics such as maximum common substructure (MCSS). We here introduce a novel approach to molecular similarity, in the form of a variational autoencoder (VAE). This learns the joint distribution p(z|x) where z is a latent vector and x are the (same) input/output data. It takes the form of a "bowtie"-shaped artificial neural network. In the middle is a "bottleneck layer" or latent vector in which inputs are transformed into, and represented as, a vector of numbers (encoding), with a reverse process (decoding) seeking to return the SMILES string that was the input. We train a VAE on over six million druglike molecules and natural products (including over one million in the final holdout set). The VAE vector distances provide a rapid and novel metric for molecular similarity that is both easily and rapidly calculated. We describe the method and its application to a typical similarity problem in cheminformatics.
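The idea of using distances between latent vectors as a similarity measure can be sketched as follows. This is an illustration only: the abstract does not state which distance is used, so Euclidean distance is an assumption here, and the short latent vectors and molecule names are invented placeholders rather than outputs of the published VAE.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two latent vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("latent vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query_vector, library):
    """Return library keys ordered from most to least similar to the query,
    where a smaller latent-space distance means a more similar molecule.

    library: dict mapping a molecule name to its latent vector
    (hypothetical encoder output).
    """
    return sorted(
        library,
        key=lambda name: euclidean_distance(query_vector, library[name]),
    )


if __name__ == "__main__":
    # Invented 2-D latent vectors; a real encoder would produce many more dimensions.
    query = [0.0, 0.0]
    library = {"mol_a": [3.0, 4.0], "mol_b": [1.0, 0.0]}
    print(rank_by_similarity(query, library))
```

Because each comparison is a single vector operation, ranking a library against a query is cheap once the molecules have been encoded, which is consistent with the abstract's description of the metric as rapidly calculated.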

Subject(s)

Cheminformatics/methods; Models, Molecular; Molecular Structure; Algorithms; Drug Discovery

9.

SBML Level 3: an extensible format for the exchange and reuse of biological models.

Keating, Sarah M; Waltemath, Dagmar; König, Matthias; Zhang, Fengkai; Dräger, Andreas; Chaouiya, Claudine; Bergmann, Frank T; Finney, Andrew; Gillespie, Colin S; Helikar, Tomás; Hoops, Stefan; Malik-Sheriff, Rahuman S; Moodie, Stuart L; Moraru, Ion I; Myers, Chris J; Naldi, Aurélien; Olivier, Brett G; Sahle, Sven; Schaff, James C; Smith, Lucian P; Swat, Maciej J; Thieffry, Denis; Watanabe, Leandro; Wilkinson, Darren J; Blinov, Michael L; Begley, Kimberly; Faeder, James R; Gómez, Harold F; Hamm, Thomas M; Inagaki, Yuichiro; Liebermeister, Wolfram; Lister, Allyson L; Lucio, Daniel; Mjolsness, Eric; Proctor, Carole J; Raman, Karthik; Rodriguez, Nicolas; Shaffer, Clifford A; Shapiro, Bruce E; Stelling, Joerg; Swainston, Neil; Tanimura, Naoki; Wagner, John; Meier-Schellersheim, Martin; Sauro, Herbert M; Palsson, Bernhard; Bolouri, Hamid; Kitano, Hiroaki; Funahashi, Akira; Hermjakob, Henning.

Mol Syst Biol ; 16(8): e9110, 2020 08.

Article in English | MEDLINE | ID: mdl-32845085

ABSTRACT

Systems biology has experienced dramatic growth in the number, size, and complexity of computational models. To reproduce simulation results and reuse models, researchers must exchange unambiguous model descriptions. We review the latest edition of the Systems Biology Markup Language (SBML), a format designed for this purpose. A community of modelers and software authors developed SBML Level 3 over the past decade. Its modular form consists of a core suited to representing reaction-based models and packages that extend the core with features suited to other model types including constraint-based models, reaction-diffusion models, logical network models, and rule-based models. The format leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models. More recently, the rise of multiscale models of whole cells and organs, and new data sources such as single-cell measurements and live imaging, has precipitated new ways of integrating data with models. We provide our perspectives on the challenges presented by these developments and how SBML Level 3 provides the foundation needed to support this evolution.

Subject(s)

Systems Biology/methods; Animals; Humans; Logistic Models; Models, Biological; Software

10.

Rapid prototyping of microbial production strains for the biomanufacture of potential materials monomers.

Robinson, Christopher J; Carbonell, Pablo; Jervis, Adrian J; Yan, Cunyu; Hollywood, Katherine A; Dunstan, Mark S; Currin, Andrew; Swainston, Neil; Spiess, Reynard; Taylor, Sandra; Mulherin, Paul; Parker, Steven; Rowe, William; Matthews, Nicholas E; Malone, Kirk J; Le Feuvre, Rosalind; Shapira, Philip; Barran, Perdita; Turner, Nicholas J; Micklefield, Jason; Breitling, Rainer; Takano, Eriko; Scrutton, Nigel S.

Metab Eng ; 60: 168-182, 2020 07.

Article in English | MEDLINE | ID: mdl-32335188

ABSTRACT

Bio-based production of industrial chemicals using synthetic biology can provide alternative green routes from renewable resources, allowing for cleaner production processes. To efficiently produce chemicals on-demand through microbial strain engineering, biomanufacturing foundries have developed automated pipelines that are largely compound agnostic in their time to delivery. Here we benchmark the capabilities of a biomanufacturing pipeline to enable rapid prototyping of microbial cell factories for the production of chemically diverse industrially relevant material building blocks. Over 85 days the pipeline was able to produce 17 potential material monomers and key intermediates by combining 160 genetic parts into 115 unique biosynthetic pathways. To explore the scale-up potential of our prototype production strains, we optimized the enantioselective production of mandelic acid and hydroxymandelic acid, achieving gram-scale production in fed-batch fermenters. The high success rate in the rapid design and prototyping of microbially-produced material building blocks reveals the potential role of biofoundries in leading the transition to sustainable materials production.

Subject(s)

Bacteria/metabolism; Industrial Microbiology/methods; Metabolic Engineering/methods; Benchmarking; Biosynthetic Pathways; Chemical Industry; Computer Simulation; Fermentation; Mandelic Acids/metabolism; Stereoisomerism

11.

DeepGraphMolGen, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach.

Khemchandani, Yash; O'Hagan, Stephen; Samanta, Soumitra; Swainston, Neil; Roberts, Timothy J; Bollegala, Danushka; Kell, Douglas B.

J Cheminform ; 12(1): 53, 2020 Sep 04.

Article in English | MEDLINE | ID: mdl-33431037

ABSTRACT

We address the problem of generating novel molecules with desired interaction properties as a multi-objective optimization problem. Interaction binding models are learned from binding data using graph convolution networks (GCNs). Since the experimentally obtained property scores are recognised as having potentially gross errors, we adopted a robust loss for the model. Combinations of these terms, including drug likeness and synthetic accessibility, are then optimized using reinforcement learning based on a graph convolution policy approach. Some of the molecules generated, while legitimate chemically, can have excellent drug-likeness scores but appear unusual. We provide an example based on the binding potency of small molecules to dopamine transporters. We extend our method successfully to use a multi-objective reward function, in this case for generating novel molecules that bind with dopamine transporters but not with those for norepinephrine. Our method should be generally applicable to the generation in silico of molecules with desirable properties.

12.

An automated pipeline for the screening of diverse monoterpene synthase libraries.

Leferink, Nicole G H; Dunstan, Mark S; Hollywood, Katherine A; Swainston, Neil; Currin, Andrew; Jervis, Adrian J; Takano, Eriko; Scrutton, Nigel S.

Sci Rep ; 9(1): 11936, 2019 08 15.

Article in English | MEDLINE | ID: mdl-31417136

ABSTRACT

Monoterpenoids are a structurally diverse group of natural products with applications as pharmaceuticals, flavourings, fragrances, pesticides, and biofuels. Recent advances in synthetic biology offer new routes to this chemical diversity through the introduction of heterologous isoprenoid production pathways into engineered microorganisms. Due to the nature of the branched reaction mechanism, monoterpene synthases often produce multiple products when expressed in monoterpenoid production platforms. Rational engineering of terpene synthases is challenging due to a lack of correlation between protein sequence and cyclisation reaction catalysed. Directed evolution offers an attractive alternative protein engineering strategy as limited prior sequence-function knowledge is required. However, directed evolution of terpene synthases is hampered by the lack of a convenient high-throughput screening assay for the detection of multiple volatile terpene products. Here we applied an automated pipeline for the screening of diverse monoterpene synthase libraries, employing robotic liquid handling platforms coupled to GC-MS, and automated data extraction. We used the pipeline to screen pinene synthase variant libraries, with mutations in three areas of plasticity, capable of producing multiple monoterpene products. We successfully identified variants with altered product profiles and demonstrated good agreement between the results of the automated screen and traditional shake-flask cultures. In addition, useful insights into the cyclisation reaction catalysed by pinene synthase were obtained, including the identification of positions with the highest level of plasticity, and the significance of region 2 in carbocation cyclisation. The results obtained will aid the prediction and design of novel terpene synthase activities towards clean monoterpenoid products.

Subject(s)

Alkyl and Aryl Transferases/metabolism; High-Throughput Screening Assays; Monoterpenes/metabolism; Alkyl and Aryl Transferases/chemistry; Automation; Cyclization; Intramolecular Lyases/chemistry; Intramolecular Lyases/metabolism; Monoterpenes/chemistry; Protein Domains; Reproducibility of Results

13.

GeneORator: An Effective Strategy for Navigating Protein Sequence Space More Efficiently through Boolean OR-Type DNA Libraries.

Currin, Andrew; Kwok, Jane; Sadler, Joanna C; Bell, Elizabeth L; Swainston, Neil; Ababi, Maria; Day, Philip; Turner, Nicholas J; Kell, Douglas B.

ACS Synth Biol ; 8(6): 1371-1378, 2019 06 21.

Article in English | MEDLINE | ID: mdl-31132850

ABSTRACT

Directed evolution requires the creation of genetic diversity and subsequent screening or selection for improved variants. For DNA mutagenesis, conventional site-directed methods implicitly utilize the Boolean AND operator (creating all mutations simultaneously), producing a combinatorial explosion in the number of genetic variants as the number of mutations increases. We introduce GeneORator, a novel strategy for creating DNA libraries based on the Boolean logical OR operator. Here, a single library is divided into many subsets, each containing different combinations of the desired mutations. Consequently, the effect of adding more mutations on the number of genetic combinations is additive (Boolean OR logic) and not exponential (AND logic). We demonstrate this strategy with large-scale mutagenesis studies, using monoamine oxidase-N (Aspergillus niger) as the exemplar target. First, we mutated every residue in the secondary structure-containing regions (276 out of a total 495 amino acids) to screen for improvements in kcat. Second, combinatorial OR-type libraries permitted screening of diverse mutation combinations in the enzyme active site to detect activity toward novel substrates. In both examples, OR-type libraries effectively reduced the number of variants searched up to 10^10-fold, dramatically reducing the screening effort required to discover variants with improved and/or novel activity. Importantly, this approach enables the screening of a greater diversity of mutation combinations, accessing a larger area of a protein's sequence space. OR-type libraries can be applied to any biological engineering objective requiring DNA mutagenesis, and the approach has wide ranging applications in, for example, enzyme engineering, antibody engineering, and synthetic biology.
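The contrast between exponential (AND) and additive (OR) library growth can be made concrete with a small calculation. The sketch below uses a deliberately simplified model that is an assumption, not the GeneORator protocol itself: an AND-type library contains every combination of residues across all targeted sites, whereas the OR-type library here contains the wild type plus variants mutated at one site at a time. The function names are hypothetical.

```python
from math import prod


def and_library_size(variants_per_site):
    """Number of variants when all sites are varied simultaneously
    (Boolean AND): the product of the choices at each site."""
    return prod(variants_per_site)


def or_library_size(variants_per_site):
    """Number of variants when sites are varied one at a time
    (simplified Boolean OR): the wild type plus each non-wild-type
    choice at each site, so the count grows additively."""
    return 1 + sum(v - 1 for v in variants_per_site)


if __name__ == "__main__":
    # Five sites, each allowed all 20 amino acids (wild type included).
    sites = [20] * 5
    print(and_library_size(sites))  # 3200000
    print(or_library_size(sites))   # 96
```

Under this simplified model, adding a sixth fully randomized site multiplies the AND-type library by 20 but adds only 19 variants to the OR-type library.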

Subject(s)

Directed Molecular Evolution/methods; Gene Library; Mutagenesis, Site-Directed/methods; Proteins/genetics; Synthetic Biology/methods; Amino Acid Sequence/genetics; Catalytic Domain/genetics; Proteins/chemistry

14.

Highly multiplexed, fast and accurate nanopore sequencing for verification of synthetic DNA constructs and sequence libraries.

Currin, Andrew; Swainston, Neil; Dunstan, Mark S; Jervis, Adrian J; Mulherin, Paul; Robinson, Christopher J; Taylor, Sandra; Carbonell, Pablo; Hollywood, Katherine A; Yan, Cunyu; Takano, Eriko; Scrutton, Nigel S; Breitling, Rainer.

Synth Biol (Oxf) ; 4(1): ysz025, 2019.

Article in English | MEDLINE | ID: mdl-32995546

ABSTRACT

Synthetic biology utilizes the Design-Build-Test-Learn pipeline for the engineering of biological systems. Typically, this requires the construction of specifically designed, large and complex DNA assemblies. The availability of cheap DNA synthesis and automation enables high-throughput assembly approaches, which generates a heavy demand for DNA sequencing to verify correctly assembled constructs. Next-generation sequencing is ideally positioned to perform this task; however, with expensive hardware costs and bespoke data analysis requirements, few laboratories utilize this technology in-house. Here a workflow for highly multiplexed sequencing is presented, capable of fast and accurate sequence verification of DNA assemblies using nanopore technology. A novel sample barcoding system using polymerase chain reaction is introduced, and sequencing data are analyzed through a bespoke analysis algorithm. Crucially, this algorithm overcomes the problem of high-error rate nanopore data (which typically prevents identification of single nucleotide variants) through statistical analysis of strand bias, permitting accurate sequence analysis with single-base resolution. As an example, 576 constructs (6 × 96 well plates) were processed in a single workflow in 72 h (from Escherichia coli colonies to analyzed data). Given our procedure's low hardware costs and highly multiplexed capability, this provides cost-effective access to powerful DNA sequencing for any laboratory, with applications beyond synthetic biology including directed evolution, single nucleotide polymorphism analysis and gene synthesis.

15.

Machine Learning of Designed Translational Control Allows Predictive Pathway Optimization in Escherichia coli.

Jervis, Adrian J; Carbonell, Pablo; Vinaixa, Maria; Dunstan, Mark S; Hollywood, Katherine A; Robinson, Christopher J; Rattray, Nicholas J W; Yan, Cunyu; Swainston, Neil; Currin, Andrew; Sung, Rehana; Toogood, Helen; Taylor, Sandra; Faulon, Jean-Loup; Breitling, Rainer; Takano, Eriko; Scrutton, Nigel S.

ACS Synth Biol ; 8(1): 127-136, 2019 01 18.

Article in English | MEDLINE | ID: mdl-30563328

ABSTRACT

The field of synthetic biology aims to make the design of biological systems predictable, shrinking the huge design space to practical numbers for testing. When designing microbial cell factories, most optimization efforts have focused on enzyme and strain selection/engineering, pathway regulation, and process development. In silico tools for the predictive design of bacterial ribosome binding sites (RBSs) and RBS libraries now allow translational tuning of biochemical pathways; however, methods for predicting optimal RBS combinations in multigene pathways are desirable. Here we present the implementation of machine learning algorithms to model the RBS sequence-phenotype relationship from representative subsets of large combinatorial RBS libraries allowing the accurate prediction of optimal high-producers. Applied to a recombinant monoterpenoid production pathway in Escherichia coli, our approach was able to boost production titers by over 60% when screening under 3% of a library. To facilitate library screening, a multiwell plate fermentation procedure was developed, allowing increased screening throughput with sufficient resolution to discriminate between high and low producers. High producers from one library did not translate during scale-up, but the reduced screening requirements allowed rapid rescreening at the larger scale. This methodology is potentially compatible with any biochemical pathway and provides a powerful tool toward predictive design of bacterial production chassis.

Subject(s)

Escherichia coli/metabolism; Machine Learning; Escherichia coli/genetics; Ribosomes/genetics; Ribosomes/metabolism; Synthetic Biology/methods

16.

An automated Design-Build-Test-Learn pipeline for enhanced microbial production of fine chemicals.

Carbonell, Pablo; Jervis, Adrian J; Robinson, Christopher J; Yan, Cunyu; Dunstan, Mark; Swainston, Neil; Vinaixa, Maria; Hollywood, Katherine A; Currin, Andrew; Rattray, Nicholas J W; Taylor, Sandra; Spiess, Reynard; Sung, Rehana; Williams, Alan R; Fellows, Donal; Stanford, Natalie J; Mulherin, Paul; Le Feuvre, Rosalind; Barran, Perdita; Goodacre, Royston; Turner, Nicholas J; Goble, Carole; Chen, George Guoqiang; Kell, Douglas B; Micklefield, Jason; Breitling, Rainer; Takano, Eriko; Faulon, Jean-Loup; Scrutton, Nigel S.

Commun Biol ; 1: 66, 2018.

Article in English | MEDLINE | ID: mdl-30271948

ABSTRACT

The microbial production of fine chemicals provides a promising biosustainable manufacturing solution that has led to the successful production of a growing catalog of natural products and high-value chemicals. However, development at industrial levels has been hindered by the large resource investments required. Here we present an integrated Design-Build-Test-Learn (DBTL) pipeline for the discovery and optimization of biosynthetic pathways, which is designed to be compound agnostic and automated throughout. We initially applied the pipeline for the production of the flavonoid (2S)-pinocembrin in Escherichia coli, to demonstrate rapid iterative DBTL cycling with automation at every stage. In this case, application of two DBTL cycles successfully established a production pathway improved by 500-fold, with competitive titers up to 88 mg L⁻¹. The further application of the pipeline to optimize an alkaloids pathway demonstrates how it could facilitate the rapid optimization of microbial strains for production of any chemical compound of interest.

17.

Multifragment DNA Assembly of Biochemical Pathways via Automated Ligase Cycling Reaction.

Robinson, Christopher J; Dunstan, Mark S; Swainston, Neil; Titchmarsh, James; Takano, Eriko; Scrutton, Nigel S; Jervis, Adrian J.

Methods Enzymol ; 608: 369-392, 2018.

Article in English | MEDLINE | ID: mdl-30173770

ABSTRACT

The microbial production of commodity, fine, and specialty chemicals is a driving force in biotechnology. An essential requirement is to introduce biosynthetic pathways to the target compound(s) into chassis organisms. First, suitable enzymes must be selected and characterized, and then genetic pathways must be designed and assembled into suitable expression vectors. The design of these pathways is crucial for balancing the pathway for efficient in vivo activity. This can be achieved through optimization of the pathway regulation by altering transcription and translation rates. The possible permutations of a multigene pathway create a vast design space which is intractable to explore using traditional time-consuming and laborious pathway assembly methods. The advent of multifragment DNA assembly technologies has enabled simultaneous, multiplexed pathway construction allowing an increased capability to sample the design space. Furthermore, the implementation of laboratory automation allows error-reduced, high-throughput (HTP) construction of pathways. In this chapter, we present a workflow that combines automated in silico design of DNA parts followed by pathway assembly using the ligase cycling reaction on robotics platforms, to allow multiplexed assembly of plasmid-borne gene pathways with high efficiency. Details and considerations in designing DNA parts for expression in bacterial chassis are discussed, followed by laboratory protocols for HTP pathway assembly and screening using robotics platforms. This workflow is employed in the SYNBIOCHEM Synthetic Biology Research Center, providing the capability to assemble over 96 plasmids simultaneously, with over 40% of clones from each assembly harboring the correctly assembled plasmids. This workflow is easy to modify for use in other laboratories and will help to accelerate synthetic biology projects with diverse applications.

Subject(s)

Biosynthetic Pathways , DNA/genetics , Escherichia coli/genetics , Ligases/genetics , Plasmids/genetics , Software , Computer Simulation , DNA/metabolism , Escherichia coli/metabolism , Ligases/metabolism , Plasmids/metabolism , Synthetic Biology/methods , Workflow

18.

Fast and Flexible Synthesis of Combinatorial Libraries for Directed Evolution.

Sadler, Joanna C; Green, Lucy; Swainston, Neil; Kell, Douglas B; Currin, Andrew.

Methods Enzymol ; 608: 59-79, 2018.

Article in English | MEDLINE | ID: mdl-30173773

ABSTRACT

Directed evolution (DE) is a powerful tool for optimizing an enzyme's properties toward a particular objective, such as broader substrate scope, greater thermostability, or increased kcat. A successful DE project requires the generation of genetic diversity and subsequent screening or selection to identify variants with improved fitness. In contrast to random methods (error-prone PCR or DNA shuffling), site-directed mutagenesis enables the rational design of variant libraries and provides control over the nature and frequency of the encoded mutations. Knowledge of protein structure, dynamics, enzyme mechanisms, and natural evolution demonstrates that multiple (combinatorial) mutations are required to discover the most improved variants. To this end, we describe an experimentally straightforward and low-cost method for the preparation of combinatorial variant libraries. Our approach employs a two-step PCR protocol, first producing mutagenic megaprimers, which can then be combined in a "mix-and-match" fashion to generate diverse sets of combinatorial variant libraries both quickly and accurately.

Subject(s)

Directed Molecular Evolution/methods , Protein Engineering/methods , Base Sequence , Biocatalysis , DNA/genetics , DNA Primers/genetics , Directed Molecular Evolution/economics , Gene Library , Mutagenesis , Polymerase Chain Reaction/economics , Polymerase Chain Reaction/methods , Protein Engineering/economics , Synthetic Biology/economics , Synthetic Biology/methods

19.

PartsGenie: an integrated tool for optimizing and sharing synthetic biology parts.

Swainston, Neil; Dunstan, Mark; Jervis, Adrian J; Robinson, Christopher J; Carbonell, Pablo; Williams, Alan R; Faulon, Jean-Loup; Scrutton, Nigel S; Kell, Douglas B.

Bioinformatics ; 34(13): 2327-2329, 2018 07 01.

Article in English | MEDLINE | ID: mdl-29949952

ABSTRACT

Motivation: Synthetic biology is typified by developing novel genetic constructs from the assembly of reusable synthetic DNA parts, which contain one or more features such as promoters, ribosome binding sites, coding sequences and terminators. PartsGenie is introduced to facilitate the computational design of such synthetic biology parts, bridging the gap between optimization tools for the design of novel parts, the representation of such parts in community-developed data standards such as Synthetic Biology Open Language, and their sharing in journal-recommended data repositories. Consisting of a drag-and-drop web interface, a number of DNA optimization algorithms, and an interface to the well-used data repository JBEI ICE, PartsGenie facilitates the design, optimization and dissemination of reusable synthetic biology parts through an integrated application. Availability and implementation: PartsGenie is freely available at https://parts.synbiochem.co.uk.

Subject(s)

DNA/analysis , Software , Synthetic Biology , Algorithms , DNA/chemistry

20.

Rationalizing Context-Dependent Performance of Dynamic RNA Regulatory Devices.

Kent, Ross; Halliwell, Samantha; Young, Kate; Swainston, Neil; Dixon, Neil.

ACS Synth Biol ; 7(7): 1660-1668, 2018 07 20.

Article in English | MEDLINE | ID: mdl-29928800

ABSTRACT

The ability of RNA to sense, regulate, and store information is an attractive attribute for a variety of functional applications including the development of regulatory control devices for synthetic biology. RNA folding and function are known to be highly context sensitive, which limits the modularity and reuse of RNA regulatory devices to control different heterologous sequences and genes. We explored the cause and effect of sequence context sensitivity for translational ON riboswitches located in the 5' UTR, by constructing and screening a library of N-terminal synonymous codon variants. By altering the N-terminal codon usage we were able to obtain RNA devices with a broad range of functional performance properties (ON, OFF, fold-change). Linear regression and calculated metrics were used to rationalize the major determining features leading to optimal riboswitch performance, and to identify multiple interactions between the explanatory metrics. Finally, partial least squares (PLS) analysis was employed in order to understand the metrics and their respective effect on performance. This PLS model was shown to provide a good explanation of our library. This study provides a novel multivariate analysis framework to rationalize the codon context performance of allosteric RNA devices. The framework will also serve as a platform for future riboswitch context engineering endeavors.

Subject(s)

RNA/chemistry , RNA/metabolism , Animals , Codon/genetics , Humans , RNA Folding , Riboswitch/genetics , Synthetic Biology/methods
