1.
Nucleic Acids Res ; 51(D1): D532-D538, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36416273

ABSTRACT

Megasynthase enzymes such as type I modular polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs) play a central role in microbial chemical warfare because they can evolve rapidly by shuffling parts (catalytic domains) to produce novel chemicals. If we can understand the design rules to reshuffle these parts, PKSs and NRPSs will provide a systematic and modular way to synthesize millions of molecules including pharmaceuticals, biomaterials, and biofuels. However, PKS and NRPS engineering remains difficult due to a limited understanding of the determinants of PKS and NRPS fold and function. We developed ClusterCAD to streamline and simplify the process of designing and testing engineered PKS variants. Here, we present the highly improved ClusterCAD 2.0 release, available at https://clustercad.jbei.org. ClusterCAD 2.0 boasts support for PKS-NRPS hybrid and NRPS clusters in addition to PKS clusters; a vastly enlarged database of curated PKS, PKS-NRPS hybrid, and NRPS clusters; a diverse set of chemical 'starters' and loading modules; the new Domain Architecture Cluster Search Tool; and an offline Jupyter Notebook workspace, among other improvements. Together these features massively expand the chemical space that can be accessed by enzymes engineered with ClusterCAD.


Subjects
Peptide Synthases, Polyketide Synthases, Software, Peptide Synthases/biosynthesis, Peptide Synthases/chemistry, Peptide Synthases/genetics, Polyketide Synthases/biosynthesis, Polyketide Synthases/chemistry, Polyketide Synthases/genetics, Biotechnology/methods
2.
PLoS Comput Biol ; 19(11): e1011111, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37948450

ABSTRACT

Metabolic fluxes, the number of metabolites traversing each biochemical reaction in a cell per unit time, are crucial for assessing and understanding cell function. 13C Metabolic Flux Analysis (13C MFA) is considered to be the gold standard for measuring metabolic fluxes. 13C MFA typically works by leveraging extracellular exchange fluxes as well as data from 13C labeling experiments to calculate the flux profile that best fits the data for a small, central carbon, metabolic model. However, the nonlinear nature of the 13C MFA fitting procedure means that several flux profiles fit the experimental data within the experimental error, and traditional optimization methods offer only a partial or skewed picture, especially in "non-gaussian" situations where multiple very distinct flux regions fit the data equally well. Here, we present a method for flux space sampling through Bayesian inference (BayFlux) that identifies the full distribution of fluxes compatible with experimental data for a comprehensive genome-scale model. This Bayesian approach allows us to accurately quantify uncertainty in calculated fluxes. We also find that, surprisingly, the genome-scale model of metabolism produces narrower flux distributions (reduced uncertainty) than the small core metabolic models traditionally used in 13C MFA. The different results for some reactions when using genome-scale models vs core metabolic models advise caution in assuming strong inferences from 13C MFA since the results may depend significantly on the completeness of the model used. Based on BayFlux, we developed and evaluated novel methods (P-13C MOMA and P-13C ROOM) to predict the biological results of a gene knockout that improve on the traditional MOMA and ROOM methods by quantifying prediction uncertainty.


Subjects
Metabolic Flux Analysis, Biological Models, Bayes Theorem, Uncertainty, Metabolic Flux Analysis/methods, Carbon Isotopes/metabolism
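
The abstract of record 2 describes sampling the full distribution of fluxes compatible with the data instead of reporting a single best fit. The following is a minimal sketch of that idea on a toy problem: one free flux, a uniform prior between hypothetical bounds, a Gaussian measurement likelihood, and a plain Metropolis sampler. All function names, parameters, and numbers are illustrative assumptions and are not taken from BayFlux.

```python
import math
import random


def log_posterior(v, measured, sigma, lower, upper):
    """Unnormalized log posterior of a single flux v (toy model).

    Assumes a uniform prior on [lower, upper] and a Gaussian likelihood
    centered on the measured value with standard deviation sigma.
    """
    if not (lower <= v <= upper):
        return -math.inf
    return -0.5 * ((v - measured) / sigma) ** 2


def metropolis(measured, sigma, lower, upper, n_samples=20000, step=0.5, seed=0):
    """Draw correlated samples of the flux with a random-walk Metropolis sampler."""
    rng = random.Random(seed)
    v = (lower + upper) / 2.0  # start at the midpoint of the allowed range
    logp = log_posterior(v, measured, sigma, lower, upper)
    samples = []
    for _ in range(n_samples):
        proposal = v + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, measured, sigma, lower, upper)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            v, logp = proposal, logp_new
        samples.append(v)
    return samples


if __name__ == "__main__":
    draws = metropolis(measured=5.0, sigma=1.0, lower=0.0, upper=10.0)
    mean = sum(draws) / len(draws)
    print(f"posterior mean flux is approximately {mean:.2f}")
```

The spread of the returned samples, not only their mean, is the quantity of interest: it is a direct estimate of the uncertainty in the flux under the stated assumptions.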
3.
J Am Chem Soc ; 142(22): 9896-9901, 2020 06 03.
Article in English | MEDLINE | ID: mdl-32412752

ABSTRACT

Polyketide synthase (PKS) engineering is an attractive method to generate new molecules such as commodity, fine and specialty chemicals. A significant challenge is re-engineering a partially reductive PKS module to produce a saturated β-carbon through a reductive loop (RL) exchange. In this work, we sought to establish that chemoinformatics, a field traditionally used in drug discovery, offers a viable strategy for RL exchanges. We first introduced a set of donor RLs of diverse genetic origin and chemical substrates into the first extension module of the lipomycin PKS (LipPKS1). Product titers of these engineered unimodular PKSs correlated with chemical structure similarity between the substrate of the donor RLs and recipient LipPKS1, reaching a titer of 165 mg/L of short-chain fatty acids produced by the host Streptomyces albus J1074. Expanding this method to larger intermediates that require bimodular communication, we introduced RLs of divergent chemosimilarity into LipPKS2 and determined triketide lactone production. Collectively, we observed a statistically significant correlation between atom pair chemosimilarity and production, establishing a new chemoinformatic method that may aid in the engineering of PKSs to produce desired, unnatural products.


Subjects
Computational Biology, Polyketide Synthases/chemistry, Protein Engineering, Molecular Structure, Polyketide Synthases/metabolism
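
The abstract of record 3 relies on atom pair chemosimilarity between substrates. As a rough illustration of the general idea, the sketch below builds simplified atom-pair features (element, element, shortest bond-path distance) from a heavy-atom graph and compares two molecules with the Tanimoto coefficient. This is an assumption-laden simplification: published atom-pair descriptors also encode neighbor counts and pi electrons, and the function names and example molecules here are hypothetical, not the authors' pipeline.

```python
from collections import deque


def shortest_paths(adjacency, start):
    """Breadth-first search: bond-count distance from start to every reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def atom_pairs(elements, bonds):
    """Return the set of simplified atom-pair features for one molecule.

    elements: list of element symbols for the heavy atoms.
    bonds: list of (i, j) index pairs; bond order is ignored in this sketch.
    """
    adjacency = {i: [] for i in range(len(elements))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    features = set()
    for i in range(len(elements)):
        dist = shortest_paths(adjacency, i)
        for j in range(i + 1, len(elements)):
            if j in dist:
                left, right = sorted((elements[i], elements[j]))
                features.add((left, right, dist[j]))
    return features


def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Heavy-atom skeletons of propionate and butyrate (illustrative inputs).
    propionate = atom_pairs(["C", "C", "C", "O", "O"],
                            [(0, 1), (1, 2), (2, 3), (2, 4)])
    butyrate = atom_pairs(["C", "C", "C", "C", "O", "O"],
                          [(0, 1), (1, 2), (2, 3), (3, 4), (3, 5)])
    print(tanimoto(propionate, butyrate))
```

Using sets rather than counted multisets keeps the example short; a counted variant would weight repeated atom pairs and generally gives different scores.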
4.
Nucleic Acids Res ; 46(D1): D509-D515, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29040649

ABSTRACT

ClusterCAD is a web-based toolkit designed to leverage the collinear structure and deterministic logic of type I modular polyketide synthases (PKSs) for synthetic biology applications. The unique organization of these megasynthases, combined with the diversity of their catalytic domain building blocks, has fueled an interest in harnessing the biosynthetic potential of PKSs for the microbial production of both novel natural product analogs and industrially relevant small molecules. However, a limited theoretical understanding of the determinants of PKS fold and function poses a substantial barrier to the design of active variants, and identifying strategies to reliably construct functional PKS chimeras remains an active area of research. In this work, we formalize a paradigm for the design of PKS chimeras and introduce ClusterCAD as a computational platform to streamline and simplify the process of designing experiments to test strategies for engineering PKS variants. ClusterCAD provides chemical structures with stereochemistry for the intermediates generated by each PKS module, as well as sequence- and structure-based search tools that allow users to identify modules based either on amino acid sequence or on the chemical structure of the cognate polyketide intermediate. ClusterCAD can be accessed at https://clustercad.jbei.org and at http://clustercad.igb.uci.edu.


Subjects
Anti-Bacterial Agents/biosynthesis, Bacterial Proteins/genetics, Polyketide Synthases/genetics, Polyketides/metabolism, Protein Engineering/methods, Software, Synthetic Biology/methods, Amino Acid Sequence, Anti-Bacterial Agents/chemistry, Bacterial Proteins/metabolism, Biocatalysis, Catalytic Domain, Drug Design, Gene Expression, Internet, Multigene Family, Polyketide Synthases/metabolism, Polyketides/chemistry, Streptomyces/chemistry, Streptomyces/enzymology, Streptomyces/genetics, Structure-Activity Relationship, Substrate Specificity
5.
J Ind Microbiol Biotechnol ; 46(8): 1225-1235, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31115703

ABSTRACT

Engineered polyketide synthases (PKSs) are promising synthetic biology platforms for the production of chemicals with diverse applications. The dehydratase (DH) domain within modular type I PKSs generates an α,β-unsaturated bond in nascent polyketide intermediates through a dehydration reaction. Several crystal structures of DH domains have been solved, providing important structural insights into substrate selection and dehydration. Here, we present two DH domain structures from two chemically diverse PKSs. The first DH domain, isolated from the third module in the borrelidin PKS, is specific towards a trans-cyclopentane-carboxylate-containing polyketide substrate. The second DH domain, isolated from the first module in the fluvirucin B1 PKS, accepts an amide-containing polyketide intermediate. Sequence-structure analysis of these domains, in addition to previously published DH structures, displays many significant similarities and key differences pertaining to substrate selection. The two major differences between BorA DH M3, FluA DH M1 and other DH domains are found in regions of unmodeled residues or residues containing high B-factors. These two regions are located between α3-β11 and β7-α2. From the catalytic Asp located in α3 to a conserved Pro in β11, the residues between them form part of the bottom of the substrate-binding cavity responsible for binding to acyl-ACP intermediates.


Subjects
Polyketide Synthases/chemistry, Binding Sites, Fatty Alcohols/chemistry, Fatty Alcohols/metabolism, Molecular Models, Polyketide Synthases/metabolism, Tertiary Protein Structure, Substrate Specificity
6.
J Ind Microbiol Biotechnol ; 45(7): 621-633, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29423743

ABSTRACT

Complex reduced polyketides represent the largest class of natural products that have applications in medicine, agriculture, and animal health. This structurally diverse class of compounds shares a common methodology of biosynthesis employing modular enzyme systems called polyketide synthases (PKSs). The modules are composed of enzymatic domains that share sequence and functional similarity across all known PKSs. We have used the nomenclature of synthetic biology to classify the enzymatic domains and modules as parts and devices, respectively, and have generated detailed lists of both. In addition, we describe the chassis (hosts) that are used to assemble, express, and engineer the parts and devices to produce polyketides. We describe a recently developed software tool to design PKS systems and provide an example of its use. Finally, we provide perspectives of what needs to be accomplished to fully realize the potential that synthetic biology approaches bring to this class of molecules.


Subjects
Biological Products/metabolism, Genetic Engineering/methods, Polyketide Synthases/metabolism, Synthetic Biology/methods, Animals, Polyketides, Software
7.
BMC Bioinformatics ; 18(1): 205, 2017 Apr 05.
Article in English | MEDLINE | ID: mdl-28381205

ABSTRACT

BACKGROUND: Modeling of microbial metabolism is a topic of growing importance in biotechnology. Mathematical modeling helps provide a mechanistic understanding for the studied process, separating the main drivers from the circumstantial ones, bounding the outcomes of experiments and guiding engineering approaches. Among different modeling schemes, the quantification of intracellular metabolic fluxes (i.e. the rate of each reaction in cellular metabolism) is of particular interest for metabolic engineering because it describes how carbon and energy flow throughout the cell. In addition to flux analysis, new methods for the effective use of the ever more readily available and abundant -omics data (i.e. transcriptomics, proteomics and metabolomics) are urgently needed. RESULTS: The jQMM library presented here provides an open-source, Python-based framework for modeling internal metabolic fluxes and leveraging other -omics data for the scientific study of cellular metabolism and bioengineering purposes. Firstly, it presents a complete toolbox for simultaneously performing two different types of flux analysis that are typically disjoint: Flux Balance Analysis and 13C Metabolic Flux Analysis. Moreover, it introduces the capability to use 13C labeling experimental data to constrain comprehensive genome-scale models through a technique called two-scale 13C Metabolic Flux Analysis (2S-13C MFA). In addition, the library includes a demonstration of a method that uses proteomics data to produce actionable insights to increase biofuel production. Finally, the use of the jQMM library is illustrated through the addition of several Jupyter notebook demonstration files that enhance reproducibility and provide the capability to be adapted to the user's specific needs. CONCLUSIONS: jQMM will facilitate the design and metabolic engineering of organisms for biofuels and other chemicals, as well as investigations of cellular metabolism and leveraging -omics data. 
As an open source software project, we hope it will attract additions from the community and grow with the rapidly changing field of metabolic engineering.


Subjects
User-Computer Interface, Biofuels, Carbon Isotopes/chemistry, Escherichia coli/metabolism, Internet, Metabolic Flux Analysis/methods, Metabolomics, Biological Models, Principal Component Analysis, Proteomics
8.
BMC Bioinformatics ; 17: 388, 2016 Sep 20.
Article in English | MEDLINE | ID: mdl-27650223

ABSTRACT

BACKGROUND: Next-generation sequencing (NGS) has revolutionized how research is carried out in many areas of biology and medicine. However, the analysis of NGS data remains a major obstacle to the efficient utilization of the technology, as it requires complex multi-step processing of big data demanding considerable computational expertise from users. While substantial effort has been invested on the development of software dedicated to the individual analysis steps of NGS experiments, insufficient resources are currently available for integrating the individual software components within the widely used R/Bioconductor environment into automated workflows capable of running the analysis of most types of NGS applications from start-to-finish in a time-efficient and reproducible manner. RESULTS: To address this need, we have developed the R/Bioconductor package systemPipeR. It is an extensible environment for both building and running end-to-end analysis workflows with automated report generation for a wide range of NGS applications. Its unique features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software on local computers and computer clusters. A flexible sample annotation infrastructure efficiently handles complex sample sets and experimental designs. To simplify the analysis of widely used NGS applications, the package provides pre-configured workflows and reporting templates for RNA-Seq, ChIP-Seq, VAR-Seq and Ribo-Seq. Additional workflow templates will be provided in the future. CONCLUSIONS: systemPipeR accelerates the extraction of reproducible analysis results from NGS experiments. By combining the capabilities of many R/Bioconductor and command-line tools, it makes efficient use of existing software resources without limiting the user to a set of predefined methods or environments. 
systemPipeR is freely available for all common operating systems from Bioconductor ( http://bioconductor.org/packages/devel/systemPipeR ).


Subjects
High-Throughput Nucleotide Sequencing/methods, Software, DNA Sequence Analysis/methods, RNA Sequence Analysis/methods, Workflow
9.
J Chem Inf Model ; 56(7): 1237-42, 2016 07 25.
Article in English | MEDLINE | ID: mdl-27367556

ABSTRACT

Despite a large and rapidly growing body of small molecule bioactivity screens available in the public domain, systematic leverage of the data to assess target druggability and compound selectivity has been confounded by a lack of suitable cross-target analysis software. We have developed bioassayR, a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data. bioassayR is implemented as an open-source R/Bioconductor package available from https://bioconductor.org/packages/bioassayR/ .


Subjects
Biological Assay, Computational Biology/methods, Small Molecule Libraries/pharmacology, Pharmaceutical Databases, Software
10.
Proc Natl Acad Sci U S A ; 110(24): E2173-81, 2013 Jun 11.
Article in English | MEDLINE | ID: mdl-23633570

ABSTRACT

Juvenile hormone III (JH) plays a key role in regulating the reproduction of female mosquitoes. Microarray time-course analysis revealed dynamic changes in gene expression during posteclosion (PE) development in the fat body of female Aedes aegypti. Hierarchical clustering identified three major gene clusters: 1,843 early-PE (EPE) genes maximally expressed at 6 h PE, 457 mid-PE (MPE) genes at 24 h PE, and 1,815 late-PE (LPE) genes at 66 h PE. The RNAi microarray screen for the JH receptor Methoprene-tolerant (Met) showed that 27% of EPE and 40% of MPE genes were up-regulated whereas 36% of LPE genes were down-regulated in the absence of this receptor. Met repression of EPE and MPE and activation of LPE genes were validated by an in vitro fat-body culture experiment using Met RNAi. Sequence motif analysis revealed the consensus for a 9-mer Met-binding motif, CACG(C)/TG(A)/G(T)/AG. Met-binding motif variants were overrepresented within the first 300 bases of the promoters of Met RNAi-down-regulated (LPE) genes but not in Met RNAi-up-regulated (EPE) genes. EMSAs using a combination of mutational and anti-Met antibody supershift analyses confirmed the binding properties of the Met consensus motif variants. There was a striking temporal separation of expression profiles among major functional gene groups, with carbohydrate, lipid, and xenobiotics metabolism belonging to the EPE and MPE clusters and transcription and translation to the LPE cluster. This study represents a significant advancement in the understanding of the regulation of gene expression by JH and its receptor Met during female mosquito reproduction.


Subjects
Aedes/genetics, Gene Expression Profiling, Juvenile Hormones/metabolism, Methoprene/metabolism, Aedes/growth & development, Aedes/metabolism, Animals, Base Sequence, Binding Sites/genetics, Cluster Analysis, Fat Body/growth & development, Fat Body/metabolism, Female, Developmental Gene Expression Regulation/drug effects, Juvenile Hormones/pharmacology, Methoprene/pharmacology, Nucleotide Motifs/genetics, Oligonucleotide Array Sequence Analysis, Reverse Transcriptase Polymerase Chain Reaction, Time Factors
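
The abstract of record 10 reports a 9-mer Met-binding consensus and its enrichment in the first 300 bases of certain promoters. The sketch below shows how such a degenerate motif could be scanned for with a regular expression. It rests on two assumptions that the listing does not confirm: that the consensus, whose notation is flattened in this record, reads as CACG[C/T]G[A/G][T/A]G, and that "the first 300 bases" means the first 300 characters of the supplied promoter string. Only the forward strand is scanned, and the function name is hypothetical.

```python
import re

# Assumed reading of the flattened consensus: C A C G [C/T] G [A/G] [T/A] G (9 positions).
# The lookahead lets overlapping occurrences be reported.
MET_MOTIF = re.compile(r"(?=(CACG[CT]G[AG][TA]G))")


def find_met_motifs(promoter, window=300):
    """Return (position, matched 9-mer) pairs within the first `window` bases."""
    region = promoter[:window].upper()
    return [(m.start(), m.group(1)) for m in MET_MOTIF.finditer(region)]


if __name__ == "__main__":
    example = "TTT" + "CACGCGATG" + "AAAA" + "CACGTGGAG" + "TTTT"
    for position, site in find_met_motifs(example):
        print(position, site)
```

A fuller analysis would also scan the reverse complement and compare motif counts against a background set of promoters before calling any enrichment.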
11.
Bioinformatics ; 29(21): 2792-4, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-23962615

ABSTRACT

MOTIVATION: The ability to accurately measure structural similarities among small molecules is important for many analysis routines in drug discovery and chemical genomics. Algorithms used for this purpose include fragment-based fingerprint and graph-based maximum common substructure (MCS) methods. MCS approaches provide one of the most accurate similarity measures. However, their rigid matching policies limit them to the identification of perfect MCSs. To eliminate this restriction, we introduce a new mismatch tolerant search method for identifying flexible MCSs (FMCSs) containing a user-definable number of atom and/or bond mismatches. RESULTS: The fmcsR package provides an R interface, with the time-consuming steps of the FMCS algorithm implemented in C++. It includes utilities for pairwise compound comparisons, structure similarity searching, clustering and visualization of MCSs. In comparison with an existing MCS tool, fmcsR shows better time performance over a wide range of compound sizes. When mismatching of atoms or bonds is turned on, the compute times increase as expected, and the resulting FMCSs are often substantially larger than their strict MCS counterparts. Based on extensive virtual screening (VS) tests, the flexible matching feature enhances the enrichment of active structures at the top of MCS-based similarity search results. With respect to overall and early enrichment performance, FMCS outperforms most of the seven other VS methods considered in these tests. AVAILABILITY: fmcsR is freely available for all common operating systems from the Bioconductor site (http://www.bioconductor.org/packages/devel/bioc/html/fmcsR.html). CONTACT: thomas.girke@ucr.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Molecular Conformation, Software, Algorithms, Cluster Analysis, Computational Biology/methods, Drug Discovery
12.
Nucleic Acids Res ; 39(Web Server issue): W486-91, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21576229

ABSTRACT

ChemMine Tools is an online service for small molecule data analysis. It provides a web interface to a set of cheminformatics and data mining tools that are useful for various analysis routines performed in chemical genomics and drug discovery. The service also offers programmable access options via the R library ChemmineR. The primary functionalities of ChemMine Tools fall into five major application areas: data visualization, structure comparisons, similarity searching, compound clustering and prediction of chemical properties. First, users can upload compound data sets to the online Compound Workbench. Numerous utilities are provided for compound viewing, structure drawing and format interconversion. Second, pairwise structural similarities among compounds can be quantified. Third, interfaces to ultra-fast structure similarity search algorithms are available to efficiently mine the chemical space in the public domain. These include fingerprint and embedding/indexing algorithms. Fourth, the service includes a Clustering Toolbox that integrates cheminformatic algorithms with data mining utilities to enable systematic structure and activity based analyses of custom compound sets. Fifth, physicochemical property descriptors of custom compound sets can be calculated. These descriptors are important for assessing the bioactivity profile of compounds in silico and quantitative structure-activity relationship (QSAR) analyses. ChemMine Tools is available at: http://chemmine.ucr.edu.


Subjects
Drug Discovery, Software, Algorithms, Cluster Analysis, Data Mining, Genomics, Internet, Pharmaceutical Preparations/chemistry, Small Molecule Libraries, Structure-Activity Relationship
13.
Curr Opin Biotechnol ; 79: 102881, 2023 02.
Article in English | MEDLINE | ID: mdl-36603501

ABSTRACT

Self-driving labs (SDLs) combine fully automated experiments with artificial intelligence (AI) that decides the next set of experiments. Taken to their ultimate expression, SDLs could usher in a new paradigm of scientific research, where the world is probed, interpreted, and explained by machines for human benefit. While there are functioning SDLs in the fields of chemistry and materials science, we contend that synthetic biology provides a unique opportunity since the genome provides a single target for affecting the incredibly wide repertoire of biological cell behavior. However, the level of investment required for the creation of biological SDLs is only warranted if directed toward solving difficult and enabling biological questions. Here, we discuss challenges and opportunities in creating SDLs for synthetic biology.


Subjects
Artificial Intelligence, Synthetic Biology, Humans
15.
RNA ; 15(5): 992-1002, 2009 May.
Article in English | MEDLINE | ID: mdl-19307293

ABSTRACT

The advent of high-throughput sequencing (HTS) methods has enabled direct approaches to quantitatively profile small RNA populations. However, these methods have been limited by several factors, including representational artifacts and lack of established statistical methods of analysis. Furthermore, massive HTS data sets present new problems related to data processing and mapping to a reference genome. Here, we show that cluster-based sequencing-by-synthesis technology is highly reproducible as a quantitative profiling tool for several classes of small RNA from Arabidopsis thaliana. We introduce the use of synthetic RNA oligoribonucleotide standards to facilitate objective normalization between HTS data sets, and adapt microarray-type methods for statistical analysis of multiple samples. These methods were tested successfully using mutants with small RNA biogenesis (miRNA-defective dcl1 mutant and siRNA-defective dcl2 dcl3 dcl4 triple mutant) or effector protein (ago1 mutant) deficiencies. Computational methods were also developed to rapidly and accurately parse, quantify, and map small RNA data.


Subjects
Arabidopsis/genetics, Gene Expression Profiling, Plant RNA/genetics, Computational Biology, RNA Sequence Analysis
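
The abstract of record 15 introduces synthetic RNA standards to normalize between sequencing data sets. One simple way such spike-ins can be used is to rescale each sample so that its spike-in read total matches a common reference; the sketch below shows that arithmetic. It is an illustrative assumption, not the normalization procedure of the paper, and the sample names, RNA names, and counts are made up.

```python
def spike_in_scale_factors(spike_counts):
    """Compute one scale factor per sample from spike-in read totals.

    spike_counts: dict mapping sample name to the number of reads that
    matched the synthetic standards in that sample.
    The reference is the mean spike-in total across samples.
    """
    if any(count <= 0 for count in spike_counts.values()):
        raise ValueError("every sample needs a positive spike-in read count")
    reference = sum(spike_counts.values()) / len(spike_counts)
    return {sample: reference / count for sample, count in spike_counts.items()}


def normalize(read_counts, spike_counts):
    """Rescale raw small RNA read counts by the per-sample spike-in factor."""
    factors = spike_in_scale_factors(spike_counts)
    return {
        sample: {rna: n * factors[sample] for rna, n in counts.items()}
        for sample, counts in read_counts.items()
    }


if __name__ == "__main__":
    spikes = {"wild_type": 1000, "mutant": 2000}  # hypothetical spike-in totals
    reads = {"wild_type": {"miR160": 300}, "mutant": {"miR160": 60}}
    print(normalize(reads, spikes))
```

After scaling, differences between samples reflect changes relative to a constant amount of added standard rather than differences in sequencing depth.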
16.
Front Bioeng Biotechnol ; 9: 612893, 2021.
Article in English | MEDLINE | ID: mdl-33634086

ABSTRACT

Biology has changed radically in the past two decades, growing from a purely descriptive science into also a design science. The availability of tools that enable the precise modification of cells, as well as the ability to collect large amounts of multimodal data, open the possibility of sophisticated bioengineering to produce fuels, specialty and commodity chemicals, materials, and other renewable bioproducts. However, despite new tools and exponentially increasing data volumes, synthetic biology cannot yet fulfill its true potential due to our inability to predict the behavior of biological systems. Here, we showcase a set of computational tools that, combined, provide the ability to store, visualize, and leverage multiomics data to predict the outcome of bioengineering efforts. We show how to upload, visualize, and output multiomics data, as well as strain information, into online repositories for several isoprenol-producing strain designs. We then use these data to train machine learning algorithms that recommend new strain designs that are correctly predicted to improve isoprenol production by 23%. This demonstration is done by using synthetic data, as provided by a novel library, that can produce credible multiomics data for testing algorithms and computational tools. In short, this paper provides a step-by-step tutorial to leverage these computational tools to improve production in bioengineered strains.

17.
Nucleic Acids Res ; 36(Database issue): D982-5, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17999994

ABSTRACT

Development of the Arabidopsis Small RNA Project (ASRP) Database, which provides information and tools for the analysis of microRNA, endogenous siRNA and other small RNA-related features, has been driven by the introduction of high-throughput sequencing technology. To accommodate the demands of increased data, numerous improvements and updates have been made to ASRP, including new ways to access data, more efficient algorithms for handling data, and increased integration with community-wide resources. New search and visualization tools have also been developed to improve access to small RNA classes and their targets. ASRP is publicly available through a web interface at http://asrp.cgrb.oregonstate.edu/db/.


Subjects
Arabidopsis/genetics, Nucleic Acid Databases, MicroRNAs/chemistry, Plant RNA/chemistry, Small Interfering RNA/chemistry, Internet, Untranslated RNA/chemistry, User-Computer Interface
18.
Metabolites ; 8(1)2018 Jan 04.
Article in English | MEDLINE | ID: mdl-29300340

ABSTRACT

Determination of internal metabolic fluxes is crucial for fundamental and applied biology because they map how carbon and electrons flow through metabolism to enable cell function. 13C Metabolic Flux Analysis (13C MFA) and Two-Scale 13C Metabolic Flux Analysis (2S-13C MFA) are two techniques used to determine such fluxes. Both operate on the simplifying approximation that metabolic flux from peripheral metabolism into central "core" carbon metabolism is minimal, and can be omitted when modeling isotopic labeling in core metabolism. The validity of this "two-scale" or "bow tie" approximation is supported both by the ability to accurately model experimental isotopic labeling data, and by experimentally verified metabolic engineering predictions using these methods. However, the boundaries of core metabolism that satisfy this approximation can vary across species, and across cell culture conditions. Here, we present a set of algorithms that (1) systematically calculate flux bounds for any specified "core" of a genome-scale model so as to satisfy the bow tie approximation and (2) automatically identify an updated set of core reactions that can satisfy this approximation more efficiently. First, we leverage linear programming to simultaneously identify the lowest fluxes from peripheral metabolism into core metabolism compatible with the observed growth rate and extracellular metabolite exchange fluxes. Second, we use Simulated Annealing to identify an updated set of core reactions that allow for a minimum of fluxes into core metabolism to satisfy these experimental constraints. Together, these methods accelerate and automate the identification of a biologically reasonable set of core reactions for use with 13C MFA or 2S-13C MFA, as well as provide for a substantially lower set of flux bounds for fluxes into the core as compared with previous methods. We provide an open source Python implementation of these algorithms at https://github.com/JBEI/limitfluxtocore.

19.
PLoS One ; 12(2): e0171413, 2017.
Article in English | MEDLINE | ID: mdl-28178331

ABSTRACT

This study presents an analysis of the small molecule bioactivity profiles across large quantities of diverse protein families represented in PubChem BioAssay. We compared the bioactivity profiles of FDA approved drugs to non-FDA approved compounds, and report several distinct patterns characteristic of the approved drugs. We found that a large fraction of the previously reported higher target promiscuity among FDA approved compounds, compared to non-FDA approved bioactives, was frequently due to cross-reactivity within rather than across protein families. We identified 804 potentially novel protein target candidates for FDA approved drugs, as well as 901 potentially novel target candidates with active non-FDA approved compounds, but no FDA approved drugs with activity against these targets. We also identified 486348 potentially novel compounds active against the same targets as FDA approved drugs, as well as 153402 potentially novel compounds active against targets without active FDA approved drugs. By quantifying the agreement among replicated screens, we estimated that more than half of these novel outcomes are reproducible. Using biclustering, we identified many dense clusters of FDA approved drugs with enriched activity against a common set of protein targets. We also report the distribution of compound promiscuity using a Bayesian statistical model, and report the sensitivity and specificity of two common methods for identifying promiscuous compounds. Aggregator assays exhibited greater accuracy in identifying highly promiscuous compounds, while PAINS substructures were able to identify a much larger set of "middle range" promiscuous compounds. Additionally, we report a large number of promiscuous compounds not identified as aggregators or PAINS. In summary, the results of this study represent a rich reference for selecting novel drug and target protein candidates, as well as for eliminating candidate compounds with unselective activities.


Subjects
Drug Discovery, Proteome, Proteomics, Small Molecule Libraries, Cluster Analysis, Computational Biology/methods, Data Mining, Drug Discovery/methods, High-Throughput Screening Assays, Statistical Models, Protein Binding, Proteomics/methods, Reproducibility of Results
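
The abstract of record 19 reports the sensitivity and specificity of two methods for flagging promiscuous compounds. For readers who want the definitions made concrete, the sketch below computes both from sets of compound identifiers. The function name, the identifiers, and the example sets are hypothetical and carry no data from the study.

```python
def sensitivity_specificity(flagged, positives, universe):
    """Return (sensitivity, specificity) for a flagging method.

    flagged: compounds the method labels as promiscuous.
    positives: compounds considered truly promiscuous.
    universe: all compounds evaluated.
    Sensitivity = true positives / all positives.
    Specificity = true negatives / all negatives.
    """
    flagged, positives, universe = set(flagged), set(positives), set(universe)
    negatives = universe - positives
    true_positives = len(flagged & positives)
    true_negatives = len(negatives - flagged)
    sensitivity = true_positives / len(positives) if positives else float("nan")
    specificity = true_negatives / len(negatives) if negatives else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    compounds = {f"c{i}" for i in range(1, 11)}   # ten hypothetical compounds
    promiscuous = {"c1", "c2", "c3", "c4"}        # assumed ground truth
    flagged_by_filter = {"c1", "c2", "c5"}        # assumed method output
    print(sensitivity_specificity(flagged_by_filter, promiscuous, compounds))
```

A method that flags few compounds, as in this toy input, tends toward high specificity and low sensitivity; a broader filter usually trades in the opposite direction.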
20.
ACS Synth Biol ; 6(12): 2248-2259, 2017 12 15.
Article in English | MEDLINE | ID: mdl-28826210

ABSTRACT

Although recent advances in synthetic biology allow us to produce biological designs more efficiently than ever, our ability to predict the end result of these designs is still nascent. Predictive models require large amounts of high-quality data to be parametrized and tested, which are not generally available. Here, we present the Experiment Data Depot (EDD), an online tool designed as a repository of experimental data and metadata. EDD provides a convenient way to upload a variety of data types, visualize these data, and export them in a standardized fashion for use with predictive algorithms. In this paper, we describe EDD and showcase its utility for three different use cases: storage of characterized synthetic biology parts, leveraging proteomics data to improve biofuel yield, and the use of extracellular metabolite concentrations to predict intracellular metabolic fluxes.


Subjects
Information Storage and Retrieval, Metadata, Biological Models, User-Computer Interface