ABSTRACT
3α-Hydroxysteroid dehydrogenase (3α-HSD) from Comamonas testosteroni is widely used in clinical settings to measure serum total bile acid levels. However, its low enzymatic activity leads to high operational costs. In this study, we employed a combinatorial mutagenesis approach to systematically identify potential key mutation sites within the enzyme. The enzyme molecule was segmented into distinct regions, and a comprehensive strategy integrating substrate pocket engineering, binding energy calculations, and deep learning techniques was used. Experimental verification identified single-point mutants from the mutation library with at least 1.5-fold higher enzymatic activity. Iterative combination of these mutations yielded the optimal mutant H119A/R201G/R216L. This mutant exhibited a specific activity of 34.18 U/mg towards deoxycholic acid, representing a 6.85-fold increase over the wild-type (WT) enzyme. Additionally, the optimal temperature of the mutant increased from 35 °C to 40 °C, and its turnover number and catalytic efficiency increased by 6.4-fold and 9.4-fold, respectively. Quantum mechanics/molecular mechanics (QM/MM) calculations indicated that the energy barrier of the dehydrogenase reaction was reduced in the H119A/R201G/R216L mutant compared to that of the WT enzyme. Specifically, the R201G mutation significantly reduced the electric field strength along the 3α-hydroxyl group, facilitating its deprotonation. This study provides insights into enhancing enzymatic efficiency through strategic mutagenesis and elucidates mechanistic changes that optimize enzyme performance for clinical and biotechnological applications.
ABSTRACT
A strategy to obtain the greatest number of best-performing variants with the least amount of experimental effort over the vast combinatorial mutational landscape would have enormous utility in boosting resource producibility for protein engineering. Toward this goal, we present a simple and effective machine learning-based strategy that outperforms other state-of-the-art methods. Our strategy integrates zero-shot prediction and multi-round sampling to direct active learning by experimenting with only a few predicted top variants. We find that four rounds of low-N pick-and-validate sampling of 12 variants for machine learning yielded the best accuracy of up to 92.6% in selecting the true top 1% variants in combinatorial mutant libraries, whereas two rounds of 24 variants can also be used. We demonstrate our strategy in successfully discovering high-performance protein variants from diverse families including the CRISPR-based genome editors, supporting its generalizable application for solving protein engineering tasks. A record of this paper's transparent peer review process is included in the supplemental information.
Subjects
Machine Learning, Protein Engineering, Humans, Mutation/genetics, Genome
ABSTRACT
Combinatorial mutagenesis is a method where multiple user-defined mutations are encoded at defined positions in a sequence. Combinatorial mutagenic libraries can be used in a variety of applications including evaluating fundamental questions about molecular evolution, directed evolution workflows for enzyme engineering, and in better understanding of biological processes like antibody affinity maturation. Here, we show a method of combinatorial mutagenesis utilizing the template-based nicking mutagenesis with several modifications. We show an example for generating a combinatorial library with 14 mutated positions, a total of 16,384 library variants, and a protocol for the generation of large, user-defined combinatorial libraries. The reader can use this protocol to create such libraries in 2 days.
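The library size quoted above matches a binary design, since 2^14 = 16,384. The sketch below shows that count and a way to enumerate such a library, assuming each position carries either the wild-type residue or one defined substitution; the abstract does not state this explicitly. The function names and the toy sequence are hypothetical and are not part of the published protocol.

```python
from itertools import product


def library_size(n_positions, states_per_position=2):
    """Number of variants when every position independently takes one of the states."""
    return states_per_position ** n_positions


def enumerate_library(wild_type, mutations):
    """Yield every combination of wild-type and mutant residues.

    `mutations` maps a 0-based position to its single alternative residue
    (an assumed binary design, one substitution per position).
    """
    positions = sorted(mutations)
    for choice in product((False, True), repeat=len(positions)):
        seq = list(wild_type)
        for pos, use_mutant in zip(positions, choice):
            if use_mutant:
                seq[pos] = mutations[pos]
        yield "".join(seq)


# Toy example with two mutated positions (hypothetical sequence).
toy_variants = list(enumerate_library("ACDE", {0: "G", 2: "V"}))
```

For the toy input, `toy_variants` holds the four sequences formed by combining the two substitutions, and `library_size(14)` gives the 16,384 variants mentioned in the abstract.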
Subjects
Directed Molecular Evolution, Protein Engineering, Directed Molecular Evolution/methods, Gene Library, Mutagenesis, Site-Directed Mutagenesis, Mutation, Protein Engineering/methods
ABSTRACT
Zika virus (ZIKV) is a flavivirus that can cause severe disease, but there are no approved treatments or vaccines. A complication for flavivirus vaccine development is the potential of immunogens to enhance infection via antibody-dependent enhancement (ADE), a process mediated by poorly neutralizing and cross-reactive antibodies. Thus, there is a great need to develop immunogens that minimize the potential to elicit enhancing antibodies. Here we utilized structure-based protein engineering to develop "resurfaced" (rs) ZIKV immunogens based on E glycoprotein domain III (ZDIIIs), in which epitopes bound by variably neutralizing antibodies were masked by combinatorial mutagenesis. We identified one resurfaced ZDIII immunogen (rsZDIII-2.39) that elicited a protective but immune-focused response. Compared to wild-type ZDIII, immunization with resurfaced rsZDIII-2.39 protein nanoparticles produced fewer ZIKV EDIII antigen-reactive B cells and elicited serum that had a lower magnitude of induced ADE against dengue virus serotype 1 (DENV1). Our findings enhance our understanding of the structural and functional determinants of antibody protection against ZIKV.
Subjects
Dengue Virus, Nanoparticles, Zika Virus Infection, Zika Virus, Neutralizing Antibodies, Antiviral Antibodies, Dengue Virus/chemistry, Humans, Viral Envelope Proteins/chemistry, Viral Envelope Proteins/genetics, Zika Virus Infection/prevention & control
ABSTRACT
As one of the main influenza antigens, neuraminidase (NA) in H3N2 virus has evolved extensively for more than 50 years due to continuous immune pressure. While NA has recently emerged as an effective vaccine target, biophysical constraints on the antigenic evolution of NA remain largely elusive. Here, we apply combinatorial mutagenesis and next-generation sequencing to characterize the local fitness landscape in an antigenic region of NA in six different human H3N2 strains that were isolated around 10 years apart. The local fitness landscape correlates well among strains and the pairwise epistasis is highly conserved. Our analysis further demonstrates that local net charge governs the pairwise epistasis in this antigenic region. In addition, we show that residue coevolution in this antigenic region is correlated with the pairwise epistasis between charge states. Overall, this study demonstrates the importance of quantifying epistasis and the underlying biophysical constraint for building a model of influenza evolution.
Subjects
Viral Antigens/immunology, Molecular Evolution, Influenza A Virus H3N2 Subtype/immunology, Neuraminidase/genetics, Viral Proteins/genetics, Humans, Human Influenza/immunology, Neuraminidase/immunology, Viral Proteins/immunology
ABSTRACT
Generating combinatorial libraries of specific sets of mutations is essential for addressing protein engineering questions involving contingency in molecular evolution, epistatic relationships between mutations, as well as functional antibody and enzyme engineering. Here we present optimization of a combinatorial mutagenesis method involving template-based nicking mutagenesis, which allows for the generation of libraries with >99% coverage for tens of thousands of user-defined variants. The non-optimized method resulted in low library coverage, which could be rationalized by a model of oligonucleotide annealing bias resulting from the nucleotide mismatch free-energy difference between mutagenic oligo and template. The optimized method mitigated this thermodynamic bias using longer primer sets and faster annealing conditions. Our updated method, applied to two antibody fragments, delivered between 99.0% (32451/32768 library members) and >99.9% (32757/32768) coverage for our desired libraries in 2 days and at an approximate 140-fold sequencing depth of coverage.
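The coverage figures above can be reproduced from the reported counts. The sketch below does so, and adds a rough Poisson estimate of how many variants would receive zero reads if sequencing depth were uniform. That estimate is an illustrative assumption, not an analysis from the abstract, and the function names are hypothetical.

```python
import math


def coverage(observed_variants, designed_variants):
    """Fraction of designed library members that were observed."""
    return observed_variants / designed_variants


def expected_dropout_fraction(mean_depth):
    """Chance that a variant gets zero reads, assuming Poisson-distributed
    reads with the same mean depth for every variant (a simplification)."""
    return math.exp(-mean_depth)


low = coverage(32451, 32768)    # about 0.9903
high = coverage(32757, 32768)   # about 0.9997
dropout = expected_dropout_fraction(140)
```

Under the uniform-depth assumption, `dropout` is vanishingly small at 140-fold depth. The few hundred missing variants in the lower-coverage library are therefore more plausibly explained by residual construction bias than by sequencing sampling alone.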
Subjects
Protein Engineering, Gene Library, Mutagenesis, Mutation
ABSTRACT
Directed evolution of proteins often involves a greedy optimization in which the mutation in the highest-fitness variant identified in each round of single-site mutagenesis is fixed. The efficiency of such a single-step greedy walk depends on the order in which beneficial mutations are identified; in other words, the process is path dependent. Here, we investigate and optimize a path-independent machine learning-assisted directed evolution (MLDE) protocol that allows in silico screening of full combinatorial libraries. In particular, we evaluate the importance of different protein encoding strategies, training procedures, models, and training set design strategies on MLDE outcome, finding the most important consideration to be the implementation of strategies that reduce inclusion of minimally informative "holes" (protein variants with zero or extremely low fitness) in training data. When applied to an epistatic, hole-filled, four-site combinatorial fitness landscape, our optimized protocol achieved the global fitness maximum up to 81-fold more frequently than single-step greedy optimization. A record of this paper's transparent peer review process is included in the supplemental information.
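To illustrate why a single-step greedy walk can miss the global optimum, the sketch below runs a greedy walk on a toy two-site landscape with reciprocal sign epistasis. The landscape values, the two-letter alphabet and the function name are hypothetical. The sketch illustrates the greedy baseline only and does not implement the MLDE protocol.

```python
def greedy_walk(start, alphabet, fitness):
    """Repeatedly fix the best single-site change until none improves fitness.

    `fitness` is a dict that must contain every variant reachable from `start`.
    """
    current = start
    while True:
        neighbours = [
            current[:i] + residue + current[i + 1:]
            for i in range(len(current))
            for residue in alphabet
            if residue != current[i]
        ]
        best = max(neighbours, key=fitness.__getitem__)
        if fitness[best] <= fitness[current]:
            return current  # local optimum: no single mutation helps
        current = best


# Hypothetical landscape: each single mutant is worse than the parent,
# but the double mutant is the global maximum.
toy_landscape = {"AA": 1.0, "AB": 0.5, "BA": 0.5, "BB": 3.0}

greedy_result = greedy_walk("AA", "AB", toy_landscape)
global_best = max(toy_landscape, key=toy_landscape.get)
```

Here the greedy walk stays at the starting variant `AA`, whereas scoring the full combinatorial library finds `BB`. Screening complete libraries in silico is intended to avoid this kind of trap.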
Subjects
Machine Learning, Proteins, Mutagenesis, Mutation/genetics, Proteins/genetics
ABSTRACT
Protein design plays an important role in recent medical advances from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used for the identification of the best design or for improvements to the wild-type variant. Even with high-throughput screening on pooled libraries and next-generation sequencing to boost the scale of read-outs, surveying all the variants with combinatorial mutations for their empirical fitness scores is still orders of magnitude beyond the capacity of existing experimental settings. To tackle this challenge, in silico approaches using machine learning to predict the fitness of novel variants based on a subset of empirical measurements are now employed. These machine learning models turn out to be useful in many cases, with the premise that the experimentally determined fitness scores and the amino-acid descriptors of the models are informative. The machine learning models can guide the search for the highest fitness variants, resolve complex epistatic relationships, and highlight biophysical rules for protein folding. Using machine learning-guided approaches, researchers can build more focused libraries, thus relieving themselves from labor-intensive screens and fast-tracking the optimization process. Here, we describe the current advances in massive-scale variant screens, and how machine learning and mutagenesis strategies can be integrated to accelerate protein engineering. More specifically, we examine strategies to make screens more economical, informative, and effective in discovery of useful variants.
ABSTRACT
Naringinase is mainly obtained by microbial fermentation, and mutagenesis is a major route to improved mutants. The aim of this study was to screen out a high naringinase-yielding mutant to enhance its potential for industrial application and to compare the effects of different mutagenic methods on the enzyme activity of the strain. A novel naringinase-producing strain, Aspergillus tubingensis MN589840, was isolated from mildewed pomelo peel and then subjected to mutagenesis by UV, ARTP and UV-ARTP. After five rounds of iterative mutagenesis, the mutants U1, A6 and UA13 were screened out with enzyme activities of 1448.49, 1848.71 and 2475.16 U mg⁻¹, with naringinase productivity raised by 79.08, 123.56 and 206%, respectively. In addition, the naringinase activity of the three mutants rose after each round of iterative mutagenesis. These results indicated that the mutagenesis efficiency of UV-ARTP was higher than that of ARTP alone, and that both were better than UV. In summary, iterative UV-ARTP mutagenesis is an effective strategy for screening high naringinase-producing strains.
Subjects
Aspergillus/genetics, Aspergillus/metabolism, Multienzyme Complexes/biosynthesis, beta-Glucosidase/biosynthesis, Aspergillus/classification, Fermentation, Multienzyme Complexes/genetics, Mutagenesis, beta-Glucosidase/genetics
ABSTRACT
Exploring how combinatorial mutations can be combined to optimize protein functions is important to guide protein engineering. Given the vast combinatorial space of changing multiple amino acids, identifying the top-performing variants from a large number of mutants might not be possible without a high-throughput gene assembly and screening strategy. Here we describe the CombiSEAL platform, a strategy that allows for modularization of any protein sequence into multiple segments for mutagenesis and barcoding, and seamless single-pot ligations of different segments to generate a library of combination mutants linked with concatenated barcodes at one end. By reading the barcodes using next-generation sequencing, activities of each protein variant during the protein selection process can be easily tracked in a high-throughput manner. CombiSEAL not only allows the identification of better protein variants but also enables the systematic analyses to distinguish the beneficial, deleterious, and neutral effects of combining different mutations on protein functions.
Subjects
High-Throughput Nucleotide Sequencing, Mutagenesis, Protein Engineering, Recombinant Proteins/genetics
ABSTRACT
The combination of high-quality mutagenesis and effective screening can improve the efficiency of enzyme directed evolution. In this study, a high-efficiency cloning construction method, Multi-points Combinational Mutagenesis (MCM), was developed. Efficient multi-point combination mutations were performed in this MCM method by introducing DNA assembly, fusion PCR and hybridization techniques. After optimization, the efficiency of MCM was tested by directed evolution of benzoylformate decarboxylase. The number of colony-forming units (CFUs) obtained by electroporation into competent E. coli Trelief™ 5α cells exceeded 10⁶ CFUs/µg DNA. Test results showed that 90/100 clones were precisely assembled. The efficiency of simultaneous mutation at 5 sites (L109, L110, H281, Q282 and A460) was up to 88%. Finally, a mutant enzyme (L109Y, L110D, H281G, Q282V and A460M) with a 10-fold increase in kcat/Km was obtained. Therefore, this method can effectively create diverse mutant libraries and promote the rapid development of enzyme directed evolution.
Subjects
Directed Molecular Evolution, Escherichia coli, Molecular Cloning, Gene Library, Mutagenesis
ABSTRACT
Conversion of the free energy of NTP hydrolysis efficiently into mechanical work and/or information by transducing enzymes sustains living systems far from equilibrium, and so has been of interest for many decades. Detailed molecular mechanisms, however, remain puzzling and incomplete. We previously reported that catalysis of tryptophan activation by tryptophanyl-tRNA synthetase, TrpRS, requires relative domain motion to re-position the catalytic Mg²⁺ ion, noting the analogy between that conditional hydrolysis of ATP and the escapement mechanism of a mechanical clock. The escapement allows the time-keeping mechanism to advance discretely, one gear at a time, if and only if the pendulum swings, thereby converting energy from the weight driving the pendulum into rotation of the hands. Coupling of catalysis to domain motion, however, mimics only half of the escapement mechanism, suggesting that domain motion may also be reciprocally coupled to catalysis, completing the escapement metaphor. Computational studies of the free energy surface restraining the domain motion later confirmed that reciprocal coupling: the catalytic domain motion is thermodynamically unfavorable unless the PPi product is released from the active site. These two conditional phenomena (demonstrated together only for the TrpRS mechanism) function as reciprocally coupled gates. As we and others have noted, such an escapement mechanism is essential to the efficient transduction of NTP hydrolysis free energy into other useful forms of mechanical or chemical work and/or information. Some implementation of both gating mechanisms (catalysis by domain motion and domain motion by catalysis) will thus likely be found in many other systems.
Subjects
Adenosine Triphosphate/chemistry, Bacterial Proteins/chemistry, Geobacillus stearothermophilus/enzymology, Magnesium/chemistry, Tryptophan-tRNA Ligase/chemistry, Tryptophan/chemistry, Adenosine Triphosphate/metabolism, Allosteric Regulation, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Binding Sites, Biocatalysis, Biomechanical Phenomena, Catalytic Domain, Divalent Cations, Geobacillus stearothermophilus/chemistry, Geobacillus stearothermophilus/genetics, Kinetics, Magnesium/metabolism, Molecular Models, Protein Binding, alpha-Helical Protein Conformation, beta-Strand Protein Conformation, Protein Interaction Domains and Motifs, Signal Transduction, Substrate Specificity, Thermodynamics, Tryptophan/metabolism, Tryptophan-tRNA Ligase/genetics, Tryptophan-tRNA Ligase/metabolism
ABSTRACT
Defining the extent of epistasis (the nonindependence of the effects of mutations) is essential for understanding the relationship of genotype, phenotype, and fitness in biological systems. The applications cover many areas of biological research, including biochemistry, genomics, protein and systems engineering, medicine, and evolutionary biology. However, the quantitative definitions of epistasis vary among fields, and the analysis beyond just pairwise effects can be problematic. Here, we demonstrate the application of a particular mathematical formalism, the weighted Walsh-Hadamard transform, which unifies a number of different definitions of epistasis. We provide a computational implementation of such analysis using a computer-generated higher-order mutational dataset. We discuss general considerations regarding the null hypothesis for independent mutational effects, which then allows a quantitative identification of epistasis in an experimental dataset.
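As a rough illustration of the formalism named above, the sketch below applies a plain Walsh-Hadamard transform to a toy binary fitness landscape. It is normalized by the number of genotypes, and the specific weighting used in the paper is not reproduced. Genotypes are assumed to be indexed by bitmask, so bit k of the index marks the presence of mutation k. All names and values are hypothetical.

```python
def walsh_hadamard(values):
    """Unnormalized fast Walsh-Hadamard transform of a list of length 2**n."""
    out = list(values)
    size = len(out)
    if size == 0 or size & (size - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < size:
        for start in range(0, size, 2 * half):
            for i in range(start, start + half):
                a, b = out[i], out[i + half]
                out[i], out[i + half] = a + b, a - b
        half *= 2
    return out


def epistasis_spectrum(fitness_by_genotype):
    """Transform divided by the number of genotypes.

    Entry 0 is the mean fitness; entries whose index has two or more
    bits set are the interaction (epistatic) terms in this convention.
    """
    n = len(fitness_by_genotype)
    return [c / n for c in walsh_hadamard(fitness_by_genotype)]


# Two mutations, genotypes ordered 00, 01, 10, 11 (hypothetical values).
additive = epistasis_spectrum([0.0, 1.0, 2.0, 3.0])
epistatic = epistasis_spectrum([0.0, 1.0, 2.0, 5.0])
```

For the additive landscape the pairwise term (index 3) is zero, while the second landscape, whose double mutant exceeds the additive expectation, gives a nonzero pairwise term.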
Subjects
Mutation/genetics, Proteins/genetics, Biological Evolution, Computational Biology, Genetic Epistasis/genetics, Molecular Evolution, Genotype, Genetic Models, Genetic Selection/genetics
ABSTRACT
Generating mutant strains is an essential component of microbial genetics. Natural genetic transformation, a process for the uptake and integration of foreign DNA, is shared by diverse microbial species and can be exploited for making mutant strains. Canonically, this process has been used to generate single mutants and sequentially to generate strains with multiple mutations. Recently, we have described a method for multiplex genome editing by natural transformation (MuGENT), which allows the generation of strains with multiple scarless mutations in a single step. Here, we provide a detailed description of the methods used for mutagenesis of the cholera pathogen Vibrio cholerae with a particular emphasis on mutagenesis via MuGENT.
Subjects
Gene Editing, Bacterial Genome, Bacterial Transformation, Vibrio cholerae/genetics, Molecular Evolution, Genomics/methods, Mutagenesis, Mutation
ABSTRACT
Strain engineering for industrial production requires a targeted improvement of multiple complex traits, which range from pathway flux to tolerance to mixed sugar utilization. Here, we report the use of an iterative CRISPR EnAbled Trackable genome Engineering (iCREATE) method to engineer rapid glucose and xylose co-consumption and tolerance to hydrolysate inhibitors in E. coli. Deep mutagenesis libraries were rationally designed, constructed, and screened to target ~40,000 mutations across 30 genes. These libraries included global and high-level regulators that regulate global gene expression, transcription factors that play important roles in genome-level transcription, enzymes that function in the sugar transport system, NAD(P)H metabolism, and the aldehyde reduction system. Specific mutants that conferred increased growth in mixed sugars and hydrolysate tolerance conditions were isolated, confirmed, and evaluated for changes in genome-wide expression levels. We tested the strain with positive combinatorial mutations for 3-hydroxypropionic acid (3HP) production under high furfural and high acetate hydrolysate fermentation, which demonstrated a 7- and 8-fold increase in 3HP productivity relative to the parent strain, respectively.
Subjects
Escherichia coli/genetics, Gene Editing/methods, Metabolic Engineering/methods, Mutagenesis, Escherichia coli/metabolism
ABSTRACT
Aiming to improve the thermostability of the mesophilic xylanase A from Bacillus subtilis (XynA), five single mutants (S22E, S27E, N32D, N54E and N181R) were used to construct a random combinatorial library, and screening of this library for thermostable XynA variants identified a double mutant (S22E/N32D). All 6 mutants were expressed in Escherichia coli (BL21) and purified. Xylanase activity assays showed that all mutants have an optimum catalytic temperature (Topt) of 55 °C and, with the exception of the S27E mutant, a higher specific activity than the wild-type XynA. The time for loss of 50% activity at 55 °C (t50) decreased in the order S22E/N32D>N181R>S22E>Wild-type>S27E=N32D≈N54E. The values of the van't Hoff denaturation enthalpy change (ΔHND), melting temperature (Tm) and heat capacity change at constant pressure (ΔCp) between the native and denatured states were estimated from thermal denaturation curves monitored by circular dichroism ellipticity changes. The decreasing order of Gibbs free energy change at 328 K (ΔG328), S22E/N32D>N181R>S22E>Wild-type>S27E≈N54E>N32D, correlates well with the thermotolerance results and is dominated by changes in ΔHND, which is consistent with increased hydrogen bonding in the thermostable mutants.
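The abstract does not say how ΔG at 328 K was computed from ΔHND, Tm and ΔCp. A common choice, assumed here, is the Gibbs-Helmholtz relation ΔG(T) = ΔH(1 − T/Tm) + ΔCp[(T − Tm) − T ln(T/Tm)]. The sketch below evaluates it with hypothetical parameter values, which are not data from the study.

```python
import math


def delta_g_unfolding(temp_k, tm_k, delta_h_tm, delta_cp):
    """Gibbs-Helmholtz estimate of the unfolding free energy at temp_k.

    delta_h_tm: denaturation enthalpy at Tm (kJ/mol)
    delta_cp:   heat capacity change on unfolding (kJ/mol/K)
    """
    enthalpy_term = delta_h_tm * (1.0 - temp_k / tm_k)
    heat_capacity_term = delta_cp * (
        (temp_k - tm_k) - temp_k * math.log(temp_k / tm_k)
    )
    return enthalpy_term + heat_capacity_term


# Hypothetical parameters for two variants that differ only in Tm.
dg_lower_tm = delta_g_unfolding(328.0, 335.0, 400.0, 5.0)
dg_higher_tm = delta_g_unfolding(328.0, 340.0, 400.0, 5.0)
```

By construction ΔG is zero at T = Tm. With these illustrative inputs, the variant with the higher Tm has the larger ΔG at 328 K, and a larger ΔH raises the enthalpy term directly below Tm.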
Subjects
Bacillus subtilis/enzymology, Endo-1,4-beta Xylanases/chemistry, Endo-1,4-beta Xylanases/genetics, Site-Directed Mutagenesis, Temperature, Bacillus subtilis/genetics, Endo-1,4-beta Xylanases/metabolism, Enzyme Stability, Molecular Models, Mutation, Secondary Protein Structure
ABSTRACT
DNA engineering is the fundamental motive driving the rapid development of modern biotechnology. Here, we present a versatile evolution method termed "rapidly efficient combinatorial oligonucleotides for directed evolution" (RECODE) for rapidly introducing multiple combinatorial mutations into the target DNA by the combined action of a thermostable high-fidelity DNA polymerase and a thermostable DNA ligase in one reaction system. By applying this method, we rapidly constructed a variant library of the rpoS promoters (with activity of 8-460%), generated a novel heparinase from the highly specific leech hyaluronidase (with more than 30 mutant residues) and optimized the heme biosynthetic pathway by combinatorial evolution of regulatory elements and pathway enzymes (2500 ± 120 mg L⁻¹, a 20-fold increase). The simple RECODE method gives researchers an unparalleled ability to efficiently create diverse mutant libraries for rapid evolution and optimization of enzymes and synthetic pathways.
Subjects
Directed Molecular Evolution, Heparin Lyase/genetics, Hyaluronoglucosaminidase/genetics, Amino Acid Sequence, Animals, Base Sequence, Molecular Cloning, DNA Ligases/metabolism, DNA-Directed DNA Polymerase/metabolism, Genetic Engineering, Heparin Lyase/metabolism, Hyaluronoglucosaminidase/metabolism, Leeches/enzymology, Molecular Sequence Data, Site-Directed Mutagenesis, Polymerase Chain Reaction, Genetic Promoter Regions, Substrate Specificity
ABSTRACT
In the current study, a three-tiered mutagenesis strategy was employed to simultaneously improve the thermostability and activity of halohydrin dehalogenase from Agrobacterium radiobacter AD1 (HheC) by engineering the last ten amino acids (Met245~Glu254) of its C-terminal region. Initially, truncation mutagenesis showed that C-terminal deletions decreased the thermostability and/or activity of HheC. Then the ten residues were subjected to single-site saturation mutagenesis, resulting in 20 beneficial single-point variants related to the thermostability or activity of HheC. The results clearly indicated that residues Met252~Glu254 and Trp249 are crucial for regulating enzyme thermostability and activity, respectively. Finally, the beneficial substitutions were combined using efficient multi-site combinatorial mutagenesis approaches, leading to an outstanding variant PX14 (Trp249Pro/Met252Leu/Pro253Asp), which had a 17.8-fold higher half-life and a 4.0-fold higher kcat value than wild-type HheC. These results indicated that the C-terminal residues play an important role in modulating both the thermostability and activity of HheC.
Subjects
Agrobacterium tumefaciens/enzymology, Amino Acids/genetics, Hydrolases/genetics, Hydrolases/metabolism, Amino Acid Sequence, Enzyme Stability, Escherichia coli/genetics, Escherichia coli/metabolism, Hot Temperature, Molecular Sequence Data, Mutagenesis, Sequence Alignment
ABSTRACT
Bacterial resistance to β-lactam antibiotics is a global issue threatening the success of infectious disease treatments worldwide. Mycobacterium tuberculosis has been particularly resilient to β-lactam treatment, primarily due to the chromosomally encoded BlaC β-lactamase, a broad-spectrum hydrolase that renders ineffective the vast majority of relevant β-lactam compounds currently in use. Recent laboratory and clinical studies have nevertheless shown that specific β-lactam-BlaC inhibitor combinations can be used to inhibit the growth of extensively drug-resistant strains of M. tuberculosis, effectively offering new tools for combined treatment regimens against resistant strains. In the present work, we performed combinatorial active-site replacements in BlaC to demonstrate that specific inhibitor-resistant (IRT) substitutions at positions 69, 130, 220, and/or 234 can act synergistically to yield active-site variants with several-thousand-fold greater in vitro resistance to clavulanate, the most common clinical β-lactamase inhibitor. While most single and double variants remain sensitive to clavulanate, double mutants R220S-K234R and S130G-K234R are substantially less affected by time-dependent clavulanate inactivation, showing residual β-lactam hydrolytic activities of 46% and 83% after 24 h incubation with a clinically relevant inhibitor concentration (5 µg/ml, 25 µM). These results demonstrate that active-site alterations in BlaC yield resistant variants that remain active and stable over prolonged bacterial generation times compatible with mycobacterial proliferation. These results also emphasize the formidable adaptive potential of inhibitor-resistant substitutions in β-lactamases, potentially casting a shadow on specific β-lactam-BlaC inhibitor combination treatments against M. tuberculosis.
Subjects
Clavulanic Acid/pharmacology, Bacterial Drug Resistance/genetics, Mycobacterium tuberculosis/drug effects, Mycobacterium tuberculosis/genetics, beta-Lactamases/chemistry, beta-Lactamases/genetics, Kinetics, Molecular Models, Mutation/genetics, beta-Lactamases/metabolism
ABSTRACT
Protein trans-splicing catalyzed by split inteins is a powerful technique for assembling a polypeptide backbone from two separate parts. However, split inteins with robust efficiencies and short fragments suitable for peptide synthesis are rare and have mostly been artificially created. The novel split intein AceL-TerL was identified from metagenomic data and characterized. It represents the first naturally occurring, atypically split intein. The N-terminal fragment of only 25 amino acids is the shortest natural intein fragment to date and was easily amenable to chemical synthesis with a fluorescent label. Optimal protein trans-splicing activity was observed at low temperatures. Further improved mutants were selected by directed protein evolution. The engineered intein variants, with up to 50-fold increased rates, showed unprecedented efficiency in the chemical labeling of a diverse set of proteins. These inteins should prove valuable tools for protein semi-synthesis and other intein-related biotechnological applications.