Results 1 - 20 of 40
1.
Proc Natl Acad Sci U S A ; 118(17)2021 04 27.
Article in English | MEDLINE | ID: mdl-33883278

ABSTRACT

Cancer cells can survive chemotherapy-induced stress, but how they recover from it is not known. Using a temporal multiomics approach, we delineate the global mechanisms of proteotoxic stress resolution in multiple myeloma cells recovering from proteasome inhibition. Our observations define layered and protracted programs for stress resolution that encompass extensive changes across the transcriptome, proteome, and metabolome. Cellular recovery from proteasome inhibition involved protracted and dynamic changes of glucose and lipid metabolism and suppression of mitochondrial function. We demonstrate that recovering cells are more vulnerable to specific insults than acutely stressed cells and identify the general control nonderepressable 2 (GCN2)-driven cellular response to amino acid scarcity as a key recovery-associated vulnerability. Using a transcriptome analysis pipeline, we further show that GCN2 is also a stress-independent bona fide target in transcriptional signature-defined subsets of solid cancers that share molecular characteristics. Thus, identifying cellular trade-offs tied to the resolution of chemotherapy-induced stress in tumor cells may reveal new therapeutic targets and routes for cancer therapy optimization.


Subjects
Neoplasms/drug therapy; Stress, Physiological/drug effects; Antineoplastic Agents/pharmacology; Autophagy/physiology; Cell Line, Tumor; Humans; Metabolome/genetics; Mitochondria/metabolism; Multiple Myeloma/metabolism; Neoplasms/metabolism; Neoplasms/physiopathology; Proteasome Inhibitors/pharmacology; Proteolysis; Proteome/genetics; Systems Analysis; Transcriptome/genetics
2.
Semin Cancer Biol ; 86(Pt 3): 706-731, 2022 11.
Article in English | MEDLINE | ID: mdl-34062265

ABSTRACT

Microbial polysaccharides (MPs) offer immense diversity in structural and functional properties. They are extensively used in advanced biomedical science owing to their superior biodegradability, hemocompatibility, and capability to imitate the natural extracellular matrix microenvironment. Ease in tailoring, inherent bio-activity, distinct mucoadhesiveness, ability to absorb hydrophobic drugs, and plentiful availability of MPs make them prolific green biomaterials to overcome the significant constraints of cancer chemotherapeutics. Many studies have demonstrated that MPs can obstruct tumor development and extend survival through immune activation, apoptosis induction, and cell cycle arrest. Synoptic investigations of MPs are compulsory to decode applied basics in recent inclinations towards cancer regimens. The current review focuses on the anticancer properties of commercially available and newly explored MPs, and outlines their direct and indirect modes of action. The review also highlights cutting-edge MP-based drug delivery systems to augment the specificity and efficiency of available chemotherapeutics, as well as their emerging role in theranostics.


Subjects
Biocompatible Materials; Neoplasms; Humans; Biocompatible Materials/therapeutic use; Biocompatible Materials/chemistry; Polysaccharides/therapeutic use; Polysaccharides/chemistry; Polysaccharides/pharmacology; Drug Delivery Systems; Neoplasms/diagnosis; Neoplasms/drug therapy; Tumor Microenvironment
3.
Biochem Soc Trans ; 51(5): 1871-1879, 2023 10 31.
Article in English | MEDLINE | ID: mdl-37656433

ABSTRACT

Dynamic pathway engineering aims to build metabolic production systems embedded with intracellular control mechanisms for improved performance. These control systems enable host cells to self-regulate the temporal activity of a production pathway in response to perturbations, using a combination of biosensors and feedback circuits for controlling expression of heterologous enzymes. Pathway design, however, requires assembling together multiple biological parts into suitable circuit architectures, as well as careful calibration of the function of each component. This results in a large design space that is costly to navigate through experimentation alone. Methods from artificial intelligence (AI) and machine learning are gaining increasing attention as tools to accelerate the design cycle, owing to their ability to identify hidden patterns in data and rapidly screen through large collections of designs. In this review, we discuss recent developments in the application of machine learning methods to the design of dynamic pathways and their components. We cover recent successes and offer perspectives for future developments in the field. The integration of AI into metabolic engineering pipelines offers great opportunities to streamline design and discover control systems for improved production of high-value chemicals.


Subjects
Artificial Intelligence; Biosensing Techniques; Machine Learning; Metabolic Engineering/methods
4.
Biophys J ; 119(5): 1002-1014, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32814062

ABSTRACT

Transcriptional bursting is a major source of noise in gene expression. The telegraph model of gene expression, whereby transcription switches between on and off states, is the dominant model for bursting. Recently, it was shown that the telegraph model cannot explain a number of experimental observations from perturbation data. Here, we study an alternative model that is consistent with the data and which explicitly describes RNA polymerase recruitment and polymerase pause release, two steps necessary for messenger RNA (mRNA) production. We derive the exact steady-state distribution of mRNA numbers and an approximate steady-state distribution of protein numbers, which are given by generalized hypergeometric functions. The theory is used to calculate the relative sensitivity of the coefficient of variation of mRNA fluctuations for thousands of genes in mouse fibroblasts. This indicates that the size of fluctuations is mostly sensitive to the rate of burst initiation and the mRNA degradation rate. Furthermore, we show that 1) the time-dependent distribution of mRNA numbers is accurately approximated by a modified telegraph model with a Michaelis-Menten-like dependence of the effective transcription rate on RNA polymerase abundance, and 2) the model predicts that if the polymerase recruitment rate is comparable to or less than the pause release rate, then upon gene replication, the mean number of RNA per cell remains approximately constant. This gene dosage compensation property has been experimentally observed and cannot be explained by the telegraph model with constant rates.


Subjects
Models, Genetic; RNA Stability; Animals; Gene Expression; Mice; RNA, Messenger/genetics; Stochastic Processes; Transcription, Genetic
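
For orientation on the telegraph model named in entry 4, here is a minimal sketch of the standard two-state telegraph model only. It is not the polymerase recruitment and pause-release model that the paper proposes. The rate names `k_on`, `k_off`, `rho` and `d`, and both function names, are illustrative assumptions that do not appear in the abstract. The sketch pairs the textbook steady-state mean and Fano factor of mRNA with a small Gillespie simulation as a cross-check.

```python
import random


def telegraph_moments(k_on, k_off, rho, d):
    """Steady-state mean and Fano factor of mRNA in the standard telegraph model.

    k_on / k_off: gene activation / inactivation rates (assumed names)
    rho: transcription rate while the gene is on; d: mRNA degradation rate
    """
    p_on = k_on / (k_on + k_off)  # fraction of time the gene is on
    mean = rho * p_on / d
    fano = 1.0 + rho * k_off / ((k_on + k_off) * (k_on + k_off + d))
    return mean, fano


def simulate_telegraph(k_on, k_off, rho, d, t_end, seed=0):
    """Gillespie simulation; returns the time-averaged mRNA count on [0, t_end]."""
    rng = random.Random(seed)
    t, gene_on, m = 0.0, False, 0
    weighted_sum = 0.0  # integral of m(t) dt
    while t < t_end:
        a_switch = k_off if gene_on else k_on  # promoter switching propensity
        a_prod = rho if gene_on else 0.0       # transcription propensity
        a_deg = d * m                          # degradation propensity
        total = a_switch + a_prod + a_deg
        dt = rng.expovariate(total)
        if t + dt >= t_end:
            weighted_sum += m * (t_end - t)
            break
        weighted_sum += m * dt
        t += dt
        u = rng.random() * total
        if u < a_switch:
            gene_on = not gene_on
        elif u < a_switch + a_prod:
            m += 1
        else:
            m -= 1
    return weighted_sum / t_end


if __name__ == "__main__":
    mean, fano = telegraph_moments(1.0, 1.0, 20.0, 1.0)
    print("analytic mean:", mean, "Fano factor:", fano)
    print("simulated mean:", simulate_telegraph(1.0, 1.0, 20.0, 1.0, 2000.0, seed=1))
```

A Fano factor above 1 is the signature of bursting in this baseline model; the abstract's point is that constant-rate versions of it cannot reproduce gene dosage compensation.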
5.
J Theor Biol ; 462: 259-269, 2019 02 07.
Article in English | MEDLINE | ID: mdl-30445000

ABSTRACT

Interactions between gene regulatory networks and metabolism produce a diversity of dynamics, including multistability and oscillations. Here, we characterize a regulatory mechanism that drives the emergence of periodic oscillations in metabolic networks subject to genetic feedback regulation by pathway intermediates. We employ a qualitative formalism based on piecewise linear models to systematically analyze the behavior of gene-regulated metabolic pathways. For a pathway with two metabolites and three enzymes, we prove the existence of two co-existing oscillatory behaviors: damped oscillations towards a fixed point or sustained oscillations along a periodic orbit. We show that this mechanism closely resembles the "metabolator", a genetic-metabolic circuit engineered to produce autonomous oscillations in vivo.


Subjects
Biological Clocks; Gene Regulatory Networks; Linear Models; Metabolic Networks and Pathways; Feedback; Models, Genetic
6.
J Ind Microbiol Biotechnol ; 45(7): 535-543, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29380150

ABSTRACT

Advances in metabolic engineering have led to the synthesis of a wide variety of valuable chemicals in microorganisms. The key to commercializing these processes is the improvement of titer, productivity, yield, and robustness. Traditional approaches to enhancing production use the "push-pull-block" strategy that modulates enzyme expression under static control. However, strains are often optimized for a specific laboratory set-up and are sensitive to environmental fluctuations. Exposure to sub-optimal growth conditions during large-scale fermentation often reduces their production capacity. Moreover, static control of engineered pathways may imbalance cofactors or cause the accumulation of toxic intermediates, which imposes a burden on the host and results in decreased production. To overcome these problems, the last decade has witnessed the emergence of a new technology that uses synthetic regulation to control heterologous pathways dynamically, in ways akin to regulatory networks found in nature. Here, we review natural metabolic control strategies and recent developments in how they inspire the engineering of dynamically regulated pathways. We further discuss the challenges of designing and engineering dynamic control and highlight how model-based design can provide a powerful formalism to engineer dynamic control circuits, which together with the tools of synthetic biology, can work to enhance microbial production.


Subjects
Bacterial Proteins/metabolism; Metabolic Engineering/methods; Metabolic Networks and Pathways; Synthetic Biology/methods; Biosensing Techniques; Fermentation
7.
Proc Natl Acad Sci U S A ; 112(9): E1038-47, 2015 Mar 03.
Article in English | MEDLINE | ID: mdl-25695966

ABSTRACT

Intracellular processes rarely work in isolation but continually interact with the rest of the cell. In microbes, for example, we now know that gene expression across the whole genome typically changes with growth rate. The mechanisms driving such global regulation, however, are not well understood. Here we consider three trade-offs that, because of limitations in levels of cellular energy, free ribosomes, and proteins, are faced by all living cells and we construct a mechanistic model that comprises these trade-offs. Our model couples gene expression with growth rate and growth rate with a growing population of cells. We show that the model recovers Monod's law for the growth of microbes and two other empirical relationships connecting growth rate to the mass fraction of ribosomes. Further, we can explain growth-related effects in dosage compensation by paralogs and predict host-circuit interactions in synthetic biology. Simulating competitions between strains, we find that the regulation of metabolic pathways may have evolved not to match expression of enzymes to levels of extracellular substrates in changing environments but rather to balance a trade-off between exploiting one type of nutrient over another. Although coarse-grained, the trade-offs that the model embodies are fundamental, and, as such, our modeling framework has potentially wide application, including in both biotechnology and medicine.


Subjects
Bacteria/metabolism; Bacterial Physiological Phenomena; Cell Proliferation/physiology; Gene Expression Regulation, Bacterial/physiology; Models, Biological
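
Entry 7 states that the model recovers Monod's law. As a reminder of what that law says, the sketch below evaluates the standard Monod form, in which growth rate rises hyperbolically with substrate concentration and reaches half its maximum when the substrate equals the half-saturation constant. The function name, parameter names and numbers are illustrative assumptions and are not taken from the paper.

```python
def monod_growth_rate(s, mu_max, k_s):
    """Monod's law: mu = mu_max * s / (k_s + s).

    s: substrate concentration (>= 0)
    mu_max: maximal growth rate
    k_s: half-saturation constant (> 0), same units as s
    """
    if s < 0 or k_s <= 0:
        raise ValueError("require s >= 0 and k_s > 0")
    return mu_max * s / (k_s + s)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the saturating shape.
    for s in (0.0, 0.1, 0.5, 5.0, 50.0):
        print(s, round(monod_growth_rate(s, mu_max=1.2, k_s=0.5), 4))
```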
8.
J Theor Biol ; 365: 469-85, 2015 Jan 21.
Article in English | MEDLINE | ID: mdl-25451533

ABSTRACT

The regulation of metabolic activity by tuning enzyme expression levels is crucial to sustain cellular growth in changing environments. Metabolic networks are often studied at steady state using constraint-based models and optimization techniques. However, metabolic adaptations driven by changes in gene expression cannot be analyzed by steady state models, as these do not account for temporal changes in biomass composition. Here we present a dynamic optimization framework that integrates the metabolic network with the dynamics of biomass production and composition. An approximation by a timescale separation leads to a coupled model of quasi-steady state constraints on the metabolic reactions, and differential equations for the substrate concentrations and biomass composition. We propose a dynamic optimization approach to determine reaction fluxes for this model, explicitly taking into account enzyme production costs and enzymatic capacity. In contrast to the established dynamic flux balance analysis, our approach allows predicting dynamic changes in both the metabolic fluxes and the biomass composition during metabolic adaptations. Discretization of the optimization problems leads to a linear program that can be efficiently solved. We applied our algorithm in two case studies: a minimal nutrient uptake network, and an abstraction of core metabolic processes in bacteria. In the minimal model, we show that the optimized uptake rates reproduce the empirical Monod growth for bacterial cultures. For the network of core metabolic processes, the dynamic optimization algorithm predicted commonly observed metabolic adaptations, such as a diauxic switch with a preference ranking for different nutrients, re-utilization of waste products after depletion of the original substrate, and metabolic adaptation to an impending nutrient depletion. These examples illustrate how dynamic adaptations of enzyme expression can be predicted solely from an optimization principle.


Subjects
Gene Expression Regulation; Metabolic Networks and Pathways/genetics; Biocatalysis; Biomass; Carbon/metabolism; Computer Simulation; Gene Regulatory Networks; Kinetics; Models, Biological; Oxygen/metabolism; Time Factors
10.
Methods Mol Biol ; 2760: 345-369, 2024.
Article in English | MEDLINE | ID: mdl-38468098

ABSTRACT

The identification of essential genes is a key challenge in systems and synthetic biology, particularly for engineering metabolic pathways that convert feedstocks into valuable products. Assessment of gene essentiality at a genome scale requires large and costly growth assays of knockout strains. Here we describe a strategy to predict the essentiality of metabolic genes using binary classification algorithms. The approach combines elements from genome-scale metabolic models, directed graphs, and machine learning into a predictive model that can be trained on small knockout data. We demonstrate the efficacy of this approach using the most complete metabolic model of Escherichia coli and various machine learning algorithms for binary classification.


Subjects
Algorithms; Machine Learning; Escherichia coli/genetics; Escherichia coli/metabolism; Genes, Essential; Metabolic Networks and Pathways/genetics
11.
NPJ Syst Biol Appl ; 10(1): 24, 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38448436

ABSTRACT

Genome-scale metabolic models are powerful tools for understanding cellular physiology. Flux balance analysis (FBA), in particular, is an optimization-based approach widely employed for predicting metabolic phenotypes. In model microbes such as Escherichia coli, FBA has been successful at predicting essential genes, i.e. those genes that impair survival when deleted. A central assumption in this approach is that both wild type and deletion strains optimize the same fitness objective. Although the optimality assumption may hold for the wild type metabolic network, deletion strains are not subject to the same evolutionary pressures and knock-out mutants may steer their metabolism to meet other objectives for survival. Here, we present FlowGAT, a hybrid FBA-machine learning strategy for predicting essentiality directly from wild type metabolic phenotypes. The approach is based on graph-structured representation of metabolic fluxes predicted by FBA, where nodes correspond to enzymatic reactions and edges quantify the propagation of metabolite mass flow between a reaction and its neighbours. We integrate this information into a graph neural network that can be trained on knock-out fitness assay data. Comparisons across different model architectures reveal that FlowGAT predictions for E. coli are close to those of FBA for several growth conditions. This suggests that essentiality of enzymatic genes can be predicted by exploiting the inherent network structure of metabolism. Our approach demonstrates the benefits of combining the mechanistic insights afforded by genome-scale models with the ability of deep learning to infer patterns from complex datasets.


Subjects
Escherichia coli; Machine Learning; Escherichia coli/genetics; Neural Networks, Computer; Phenotype
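
Entry 11 describes a graph whose nodes are reactions and whose edges quantify metabolite mass flow between a reaction and its neighbours. The abstract does not give the edge-weight formula, so the sketch below assumes one plausible mass-flow construction: for each metabolite, the amount produced by reaction i is split among consuming reactions j in proportion to their share of total consumption. The function name, the data layout and the toy network are hypothetical, and the fluxes are hand-picked steady-state values rather than FBA output.

```python
def mass_flow_graph(stoich, flux):
    """Build weighted reaction-to-reaction edges from stoichiometry and fluxes.

    stoich: dict metabolite -> dict reaction -> stoichiometric coefficient
    flux:   dict reaction -> flux value (negative if running in reverse)
    Returns a dict (producer, consumer) -> mass flow carried along that edge.
    """
    edges = {}
    for met, coeffs in stoich.items():
        # Net production (> 0) or consumption (< 0) of this metabolite per reaction.
        rates = {r: c * flux[r] for r, c in coeffs.items()}
        produced = {r: x for r, x in rates.items() if x > 0}
        consumed = {r: -x for r, x in rates.items() if x < 0}
        total = sum(consumed.values())
        if total == 0:
            continue  # metabolite is not turned over under these fluxes
        for i, p in produced.items():
            for j, q in consumed.items():
                edges[(i, j)] = edges.get((i, j), 0.0) + p * q / total
    return edges


if __name__ == "__main__":
    # Toy network: R1: -> A, R2: A -> B, R3: A -> (export), R4: B -> (export)
    stoich = {
        "A": {"R1": 1, "R2": -1, "R3": -1},
        "B": {"R2": 1, "R4": -1},
    }
    flux = {"R1": 2.0, "R2": 1.0, "R3": 1.0, "R4": 1.0}
    for edge, weight in sorted(mass_flow_graph(stoich, flux).items()):
        print(edge, weight)
```

In a pipeline like the one the abstract outlines, edge weights of this kind would serve as input features for a graph neural network; that step is not sketched here.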
12.
Curr Opin Biotechnol ; 81: 102941, 2023 06.
Article in English | MEDLINE | ID: mdl-37087839

ABSTRACT

Advances in high-throughput DNA synthesis and sequencing have fuelled the use of massively parallel reporter assays for strain characterization. These experiments produce large datasets that map DNA sequences to protein expression levels, and have sparked increased interest in data-driven methods for sequence-to-expression modeling. Here, we highlight progress in deep learning models of protein expression and their potential for optimizing strains engineered to produce recombinant proteins. We discuss recent works that built highly accurate models as well as the challenges that hinder wider adoption by end users. There is a need to better align this technology with the requirements and capabilities encountered in strain engineering, particularly the cost of data acquisition and the need for interpretable models that generalize beyond the training data. Overcoming these barriers will help to incentivize academic and industrial laboratories to tap into a new era of data-centric strain engineering.


Subjects
Bioengineering; Deep Learning; Proteins; Recombinant Proteins
13.
ACS Synth Biol ; 12(7): 2073-2082, 2023 07 21.
Article in English | MEDLINE | ID: mdl-37339382

ABSTRACT

Recent advances in synthetic biology have enabled the construction of molecular circuits that operate across multiple scales of cellular organization, such as gene regulation, signaling pathways, and cellular metabolism. Computational optimization can effectively aid the design process, but current methods are generally unsuited for systems with multiple temporal or concentration scales, as these are slow to simulate due to their numerical stiffness. Here, we present a machine learning method for the efficient optimization of biological circuits across scales. The method relies on Bayesian optimization, a technique commonly used to fine-tune deep neural networks, to learn the shape of a performance landscape and iteratively navigate the design space toward an optimal circuit. This strategy allows the joint optimization of both circuit architecture and parameters, and provides a feasible approach to solve a highly nonconvex optimization problem in a mixed-integer input space. We illustrate the applicability of the method on several gene circuits for controlling biosynthetic pathways with strong nonlinearities, multiple interacting scales, and using various performance objectives. The method efficiently handles large multiscale problems and enables parametric sweeps to assess circuit robustness to perturbations, serving as an efficient in silico screening method prior to experimental implementation.


Subjects
Gene Expression Regulation; Gene Regulatory Networks; Bayes Theorem; Gene Regulatory Networks/genetics; Signal Transduction; Neural Networks, Computer
14.
ACS Synth Biol ; 12(3): 709-721, 2023 03 17.
Article in English | MEDLINE | ID: mdl-36802585

ABSTRACT

The discovery of clustered, regularly interspaced, short palindromic repeats (CRISPR) and the Cas9 RNA-guided nuclease provides unprecedented opportunities to selectively kill specific populations or species of bacteria. However, the use of CRISPR-Cas9 to clear bacterial infections in vivo is hampered by the inefficient delivery of cas9 genetic constructs into bacterial cells. Here, we use a broad-host-range P1-derived phagemid to deliver the CRISPR-Cas9 chromosomal-targeting system into Escherichia coli and the dysentery-causing Shigella flexneri to achieve DNA sequence-specific killing of targeted bacterial cells. We show that genetic modification of the helper P1 phage DNA packaging site (pac) significantly enhances the purity of packaged phagemid and improves the Cas9-mediated killing of S. flexneri cells. We further demonstrate that P1 phage particles can deliver chromosomal-targeting cas9 phagemids into S. flexneri in vivo using a zebrafish larvae infection model, where they significantly reduce the bacterial load and promote host survival. Our study highlights the potential of combining P1 bacteriophage-based delivery with the CRISPR chromosomal-targeting system to achieve DNA sequence-specific cell lethality and efficient clearance of bacterial infection.


Subjects
Anti-Infective Agents; CRISPR-Cas Systems; CRISPR-Cas Systems/genetics; Gene Editing; Bacteriophage P1/genetics; Zebrafish/genetics; Shigella flexneri/genetics; Animals
15.
Nat Commun ; 14(1): 3445, 2023 06 10.
Article in English | MEDLINE | ID: mdl-37301862

ABSTRACT

Cellular senescence is a stress response involved in ageing and diverse disease processes including cancer, type-2 diabetes, osteoarthritis and viral infection. Despite growing interest in targeted elimination of senescent cells, only a few senolytics are known due to the lack of well-characterised molecular targets. Here, we report the discovery of three senolytics using cost-effective machine learning algorithms trained solely on published data. We computationally screened various chemical libraries and validated the senolytic action of ginkgetin, periplocin and oleandrin in human cell lines under various modalities of senescence. The compounds have potency comparable to known senolytics, and we show that oleandrin has improved potency over its target as compared to best-in-class alternatives. Our approach led to a several hundred-fold reduction in drug screening costs and demonstrates that artificial intelligence can take maximum advantage of small and heterogeneous drug screening data, paving the way for new open science approaches to early-stage drug discovery.


Subjects
Artificial Intelligence; Senotherapeutics; Humans; Aging/physiology; Cellular Senescence; Machine Learning
16.
J Theor Biol ; 295: 139-53, 2012 Feb 21.
Article in English | MEDLINE | ID: mdl-22137968

ABSTRACT

Genetic control of enzyme activity drives metabolic adaptations to environmental changes, and therefore the feedback interaction between gene expression and metabolism is essential to cell fitness. In this paper we develop a new formalism to detect the equilibrium regimes of an unbranched metabolic network under transcriptional feedback from one metabolite. Our results indicate that one-to-all transcriptional feedback can induce a wide range of metabolic phenotypes, including mono-, multistability and oscillatory behavior. The analysis is based on the use of switch-like models for transcriptional control and the exploitation of the time scale separation between metabolic and genetic dynamics. For any combination of activation and repression feedback loops, we derive conditions for the emergence of a specific phenotype in terms of genetic parameters such as enzyme expression rates and regulatory thresholds. We find that metabolic oscillations can emerge under uniform thresholds and, in the case of operon-controlled networks, the analysis reveals how nutrient-induced bistability and oscillations can emerge as a consequence of the transcriptional feedback.


Subjects
Biological Clocks/physiology; Gene Regulatory Networks/physiology; Metabolic Networks and Pathways/genetics; Models, Genetic; Animals; Feedback, Physiological/physiology; Gene Expression Regulation, Enzymologic/physiology; Operon/genetics; Transcription, Genetic/physiology
17.
J Pharmacokinet Pharmacodyn ; 39(2): 125-39, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22399130

ABSTRACT

Cell-level kinetic models for therapeutically relevant processes increasingly benefit the early stages of drug development. Later stages of the drug development processes, however, rely on pharmacokinetic compartment models while cell-level dynamics are typically neglected. We here present a systematic approach to integrate cell-level kinetic models and pharmacokinetic compartment models. Incorporating target dynamics into pharmacokinetic models is especially useful for the development of therapeutic antibodies because their effect and pharmacokinetics are inherently interdependent. The approach is illustrated by analysing the F(ab)-mediated inhibitory effect of therapeutic antibodies targeting the epidermal growth factor receptor. We build a multi-level model for anti-EGFR antibodies by combining a systems biology model with in vitro determined parameters and a pharmacokinetic model based on in vivo pharmacokinetic data. Using this model, we investigated in silico the impact of biochemical properties of anti-EGFR antibodies on their F(ab)-mediated inhibitory effect. The multi-level model suggests that the F(ab)-mediated inhibitory effect saturates with increasing drug-receptor affinity, thereby limiting the impact of increasing antibody affinity on improving the effect. This indicates that observed differences in the therapeutic effects of high affinity antibodies in the market and in clinical development may result mainly from Fc-mediated indirect mechanisms such as antibody-dependent cell cytotoxicity.


Subjects
Antibodies, Monoclonal/pharmacokinetics; Cell Membrane/metabolism; Immunoglobulin Fab Fragments/physiology; Models, Biological; Animals; Antibodies, Monoclonal, Humanized; Cell Line, Tumor; Cell Membrane/drug effects; Forecasting; Humans; Macaca fascicularis; Signal Transduction/drug effects; Signal Transduction/physiology
18.
J R Soc Interface ; 19(188): 20210762, 2022 03.
Article in English | MEDLINE | ID: mdl-35259958

ABSTRACT

A key goal in synthetic biology is the construction of molecular circuits that robustly adapt to perturbations. Although many natural systems display perfect adaptation, whereby stationary molecular concentrations are insensitive to perturbations, its de novo engineering has proven elusive. The discovery of the antithetic control motif was a significant step towards a universal mechanism for engineering perfect adaptation. Antithetic control provides perfect adaptation in a wide range of systems, but it can lead to oscillatory dynamics due to loss of stability; moreover, it can lose perfect adaptation in fast growing cultures. Here, we introduce an extended antithetic control motif that resolves these limitations. We show that molecular buffering, a widely conserved mechanism for homeostatic control in Nature, stabilizes oscillations and allows for near-perfect adaptation during rapid growth. We study multiple buffering topologies and compare their performance in terms of their stability and adaptation properties. We illustrate the benefits of our proposed strategy in exemplar models for biofuel production and growth rate control in bacterial cultures. Our results provide an improved circuit for robust control of biomolecular systems.


Subjects
Models, Biological; Synthetic Biology; Acclimatization; Adaptation, Physiological; Homeostasis
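
To make "perfect adaptation" in entry 18 concrete, the following is a rough sketch of the basic antithetic integral feedback motif, without dilution and without the molecular buffering extension that the paper introduces. Two controller species `z1` and `z2` annihilate each other, `z1` drives production of the output `x`, and `x` drives production of `z2`. In this idealised form the steady-state output is `mu / theta` whatever the degradation rate `gamma` of the output. All symbol names, parameter values and the forward-Euler integrator are assumptions made for illustration.

```python
def simulate_antithetic(gamma, mu=2.0, theta=1.0, eta=10.0, k=1.0,
                        t_end=60.0, dt=1e-3):
    """Integrate the basic antithetic motif with forward Euler; return final x.

    dz1/dt = mu - eta*z1*z2        (reference species)
    dz2/dt = theta*x - eta*z1*z2   (sensor species)
    dx/dt  = k*z1 - gamma*x        (regulated output)
    """
    z1 = z2 = x = 0.0
    for _ in range(int(t_end / dt)):
        sequestration = eta * z1 * z2  # mutual annihilation of z1 and z2
        dz1 = mu - sequestration
        dz2 = theta * x - sequestration
        dx = k * z1 - gamma * x
        z1 += dt * dz1
        z2 += dt * dz2
        x += dt * dx
    return x


if __name__ == "__main__":
    # gamma acts as the perturbation: the output should settle near mu/theta = 2.
    for gamma in (1.0, 3.0):
        print("gamma =", gamma, "-> final x =", round(simulate_antithetic(gamma), 4))
```

Adding a dilution term to `z1` and `z2` breaks the exact cancellation behind this result, which is the loss of perfect adaptation in fast-growing cultures that the abstract mentions.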
19.
ACS Synth Biol ; 11(1): 228-240, 2022 01 21.
Article in English | MEDLINE | ID: mdl-34968029

ABSTRACT

Recent progress in synthetic biology allows the construction of dynamic control circuits for metabolic engineering. This technology promises to overcome many challenges encountered in traditional pathway engineering, thanks to its ability to self-regulate gene expression in response to bioreactor perturbations. The central components in these control circuits are metabolite biosensors that read out pathway signals and actuate enzyme expression. However, the construction of metabolite biosensors is a major bottleneck for strain design, and a key challenge is to understand the relation between biosensor dose-response curves and pathway performance. Here we employ multiobjective optimization to quantify performance trade-offs that arise in the design of metabolite biosensors. Our approach reveals strategies for tuning dose-response curves along an optimal trade-off between production flux and the cost of an increased expression burden on the host. We explore properties of control architectures built in the literature and identify their advantages and caveats in terms of performance and robustness to growth conditions and leaky promoters. We demonstrate the optimality of a control circuit for glucaric acid production in Escherichia coli, which has been shown to increase the titer by 2.5-fold as compared to static designs. Our results lay the groundwork for the automated design of control circuits for pathway engineering, with applications in the food, energy, and pharmaceutical sectors.


Subjects
Biosensing Techniques; Metabolic Engineering; Biosensing Techniques/methods; Escherichia coli/genetics; Escherichia coli/metabolism; Metabolic Engineering/methods; Promoter Regions, Genetic; Synthetic Biology/methods
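
The trade-off analysis in entry 19 rests on the idea of a Pareto front: the set of designs for which production flux cannot be raised without also raising expression burden. The sketch below shows only that filtering step on a handful of made-up candidate designs. It is not the optimization method used in the paper, and the design names and numbers are hypothetical.

```python
def pareto_front(designs):
    """Return non-dominated designs, maximizing flux and minimizing burden.

    designs: list of (name, flux, burden) tuples.
    A design is dominated if another is at least as good on both objectives
    and strictly better on at least one.
    """
    front = []
    for name, flux, burden in designs:
        dominated = any(
            f2 >= flux and b2 <= burden and (f2 > flux or b2 < burden)
            for _, f2, b2 in designs
        )
        if not dominated:
            front.append((name, flux, burden))
    return sorted(front, key=lambda d: d[2])  # order by increasing burden


if __name__ == "__main__":
    # Hypothetical biosensor designs: (name, production flux, expression burden)
    candidates = [
        ("A", 1.0, 0.2),
        ("B", 2.0, 0.5),
        ("C", 1.5, 0.6),
        ("D", 3.0, 0.9),
        ("E", 2.0, 0.7),
    ]
    for design in pareto_front(candidates):
        print(design)
```

Here designs C and E drop out because B delivers at least as much flux at lower burden; the remaining points trace the trade-off curve along which a dose-response curve would be tuned.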
20.
Nat Commun ; 13(1): 7755, 2022 12 15.
Article in English | MEDLINE | ID: mdl-36517468

ABSTRACT

Synthetic biology often involves engineering microbial strains to express high-value proteins. Thanks to progress in rapid DNA synthesis and sequencing, deep learning has emerged as a promising approach to build sequence-to-expression models for strain optimization. But such models need large and costly training data that create steep entry barriers for many laboratories. Here we study the relation between accuracy and data efficiency in an atlas of machine learning models trained on datasets of varied size and sequence diversity. We show that deep learning can achieve good prediction accuracy with much smaller datasets than previously thought. We demonstrate that controlled sequence diversity leads to substantial gains in data efficiency and employ Explainable AI to show that convolutional neural networks can finely discriminate between input DNA sequences. Our results provide guidelines for designing genotype-phenotype screens that balance cost and quality of training data, thus helping promote the wider adoption of deep learning in the biotechnology sector.


Subjects
Deep Learning; Neural Networks, Computer; Machine Learning; Proteins