Results 1 - 20 of 1,345
1.
Cell ; 186(18): 3983-4002.e26, 2023 08 31.
Article in English | MEDLINE | ID: mdl-37657419

ABSTRACT

Prime editing enables a wide variety of precise genome edits in living cells. Here we use protein evolution and engineering to generate prime editors with reduced size and improved efficiency. Using phage-assisted evolution, we improved editing efficiencies of compact reverse transcriptases by up to 22-fold and generated prime editors that are 516-810 base pairs smaller than the current-generation editor PEmax. We discovered that different reverse transcriptases specialize in different types of edits and used this insight to generate reverse transcriptases that outperform PEmax and PEmaxΔRNaseH, the truncated editor used in dual-AAV delivery systems. Finally, we generated Cas9 domains that improve prime editing. These resulting editors (PE6a-g) enhance therapeutically relevant editing in patient-derived fibroblasts and primary human T-cells. PE6 variants also enable longer insertions to be installed in vivo following dual-AAV delivery, achieving 40% loxP insertion in the cortex of the murine brain, a 24-fold improvement compared to previous state-of-the-art prime editors.


Subject(s)
Bacteriophages , Protein Engineering , Humans , Animals , Mice , Bacteriophages/genetics , Brain , Cerebral Cortex , DNA-Directed RNA Polymerases
2.
Cell ; 185(21): 4008-4022.e14, 2022 10 13.
Article in English | MEDLINE | ID: mdl-36150393

ABSTRACT

The continual evolution of SARS-CoV-2 and the emergence of variants that show resistance to vaccines and neutralizing antibodies threaten to prolong the COVID-19 pandemic. Selection and emergence of SARS-CoV-2 variants are driven in part by mutations within the viral spike protein and in particular the ACE2 receptor-binding domain (RBD), a primary target site for neutralizing antibodies. Here, we develop deep mutational learning (DML), a machine-learning-guided protein engineering technology, which is used to investigate a massive sequence space of combinatorial mutations, representing billions of RBD variants, by accurately predicting their impact on ACE2 binding and antibody escape. A highly diverse landscape of possible SARS-CoV-2 variants is identified that could emerge from a multitude of evolutionary trajectories. DML may be used for predictive profiling on current and prospective variants, including highly mutated variants such as Omicron, thus guiding the development of therapeutic antibody treatments and vaccines for COVID-19.


Subject(s)
Angiotensin-Converting Enzyme 2/metabolism , COVID-19 , SARS-CoV-2 , Spike Glycoprotein, Coronavirus/metabolism , Angiotensin-Converting Enzyme 2/chemistry , Angiotensin-Converting Enzyme 2/genetics , Antibodies, Neutralizing , Antibodies, Viral , COVID-19 Vaccines , Humans , Mutation , Pandemics , Protein Binding , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/genetics
3.
Cell ; 184(19): 4919-4938.e22, 2021 09 16.
Article in English | MEDLINE | ID: mdl-34506722

ABSTRACT

Replacing or editing disease-causing mutations holds great promise for treating many human diseases. Yet, delivering therapeutic genetic modifiers to specific cells in vivo has been challenging, particularly in large, anatomically distributed tissues such as skeletal muscle. Here, we establish an in vivo strategy to evolve and stringently select capsid variants of adeno-associated viruses (AAVs) that enable potent delivery to desired tissues. Using this method, we identify a class of RGD motif-containing capsids that transduces muscle with superior efficiency and selectivity after intravenous injection in mice and non-human primates. We demonstrate substantially enhanced potency and therapeutic efficacy of these engineered vectors compared to naturally occurring AAV capsids in two mouse models of genetic muscle disease. The top capsid variants from our selection approach show conserved potency for delivery across a variety of inbred mouse strains, and in cynomolgus macaques and human primary myotubes, with transduction dependent on target cell expressed integrin heterodimers.


Subject(s)
Capsid/metabolism , Dependovirus/metabolism , Directed Molecular Evolution , Gene Transfer Techniques , Muscle, Skeletal/metabolism , Amino Acid Sequence , Animals , Capsid/chemistry , Cells, Cultured , Disease Models, Animal , HEK293 Cells , Humans , Integrins/metabolism , Macaca fascicularis , Mice, Inbred BALB C , Mice, Inbred C57BL , Muscle Fibers, Skeletal/metabolism , Muscular Dystrophy, Duchenne/pathology , Muscular Dystrophy, Duchenne/therapy , Myopathies, Structural, Congenital/pathology , Myopathies, Structural, Congenital/therapy , Protein Multimerization , Protein Tyrosine Phosphatases, Non-Receptor/genetics , Protein Tyrosine Phosphatases, Non-Receptor/metabolism , Protein Tyrosine Phosphatases, Non-Receptor/therapeutic use , RNA, Guide, Kinetoplastida/metabolism , Recombination, Genetic/genetics , Species Specificity , Transgenes
4.
Cell ; 178(3): 748-761.e17, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31280962

ABSTRACT

Directed evolution, artificial selection toward designed objectives, is routinely used to develop new molecular tools and therapeutics. Successful directed molecular evolution campaigns repeatedly test diverse sequences with a designed selective pressure. Unicellular organisms and their viral pathogens are exceptional for this purpose and have been used for decades. However, many desirable targets of directed evolution perform poorly or unnaturally in unicellular backgrounds. Here, we present a system for facile directed evolution in mammalian cells. Using the RNA alphavirus Sindbis as a vector for heredity and diversity, we achieved 24-h selection cycles surpassing 10⁻³ mutations per base. Selection is achieved through genetically actuated sequences internal to the host cell, thus the system's name: viral evolution of genetically actuating sequences, or "VEGAS." Using VEGAS, we evolve transcription factors, GPCRs, and allosteric nanobodies toward functional signaling endpoints each in less than 1 week's time.


Subject(s)
Directed Molecular Evolution/methods , Allosteric Regulation , Amino Acid Sequence , Animals , Fluorescence Resonance Energy Transfer , Genetic Vectors/genetics , Genetic Vectors/metabolism , HEK293 Cells , Humans , Mutation , Receptors, G-Protein-Coupled/chemistry , Receptors, G-Protein-Coupled/genetics , Receptors, G-Protein-Coupled/metabolism , Sequence Alignment , Sindbis Virus/genetics , Single-Domain Antibodies/chemistry , Single-Domain Antibodies/genetics , Single-Domain Antibodies/metabolism , Transcription Factors/chemistry , Transcription Factors/genetics , Transcription Factors/metabolism
5.
Annu Rev Biochem ; 87: 159-185, 2018 06 20.
Article in English | MEDLINE | ID: mdl-29589959

ABSTRACT

Flavin-dependent halogenases (FDHs) catalyze the halogenation of organic substrates by coordinating reactions of reduced flavin, molecular oxygen, and chloride. Targeted and random mutagenesis of these enzymes have been used to both understand and alter their reactivity. These studies have led to insights into residues essential for catalysis and FDH variants with improved stability, expanded substrate scope, and altered site selectivity. Mutations throughout FDH structures have contributed to all of these advances. More recent studies have sought to rationalize the impact of these mutations on FDH function and to identify new FDHs to deepen our understanding of this enzyme class and to expand their utility for biocatalytic applications.


Subject(s)
Flavins/metabolism , Halogenation/genetics , Halogenation/physiology , Oxidoreductases/genetics , Oxidoreductases/metabolism , Biocatalysis , Catalytic Domain/genetics , Directed Molecular Evolution , Drug Design , Enzyme Stability/genetics , Hydrocarbons, Halogenated/chemistry , Hydrocarbons, Halogenated/metabolism , Metabolic Networks and Pathways , Models, Molecular , Mutagenesis , Oxidoreductases/chemistry , Substrate Specificity
6.
Cell ; 175(7): 1946-1957.e13, 2018 12 13.
Article in English | MEDLINE | ID: mdl-30415839

ABSTRACT

Directed evolution is a powerful approach for engineering biomolecules and understanding adaptation. However, experimental strategies for directed evolution are notoriously labor intensive and low throughput, limiting access to demanding functions, multiple functions in parallel, and the study of molecular evolution in replicate. We report OrthoRep, an orthogonal DNA polymerase-plasmid pair in yeast that stably mutates ∼100,000-fold faster than the host genome in vivo, exceeding the error threshold of genomic replication that causes single-generation extinction. User-defined genes in OrthoRep continuously and rapidly evolve through serial passaging, a highly straightforward and scalable process. Using OrthoRep, we evolved drug-resistant malarial dihydrofolate reductases (DHFRs) in 90 independent replicates. We uncovered a more complex fitness landscape than previously realized, including common adaptive trajectories constrained by epistasis, rare outcomes that avoid a frequent early adaptive mutation, and a suboptimal fitness peak that occasionally traps evolving populations. OrthoRep enables a new paradigm of routine, high-throughput evolution of biomolecular and cellular function.


Subject(s)
Adaptation, Physiological/genetics , Genome, Fungal , Models, Genetic , Mutation Rate , Saccharomyces cerevisiae/genetics , DNA-Directed DNA Polymerase/genetics , DNA-Directed DNA Polymerase/metabolism , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
7.
Trends Biochem Sci ; 49(5): 457-469, 2024 May.
Article in English | MEDLINE | ID: mdl-38531696

ABSTRACT

Gene delivery vehicles based on adeno-associated viruses (AAVs) are enabling increasing success in human clinical trials, and they offer the promise of treating a broad spectrum of both genetic and non-genetic disorders. However, delivery efficiency and targeting must be improved to enable safe and effective therapies. In recent years, considerable effort has been invested in creating AAV variants with improved delivery, and computational approaches have been increasingly harnessed for AAV engineering. In this review, we discuss how computationally designed AAV libraries are enabling directed evolution. Specifically, we highlight approaches that harness sequences outputted by next-generation sequencing (NGS) coupled with machine learning (ML) to generate new functional AAV capsids and related regulatory elements, pushing the frontier of what vector engineering and gene therapy may achieve.


Subject(s)
Dependovirus , Gene Transfer Techniques , Dependovirus/genetics , Humans , Genetic Therapy/methods , Genetic Vectors/metabolism , Genetic Engineering , Animals , Computational Biology/methods
8.
Proc Natl Acad Sci U S A ; 121(11): e2311726121, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38451939

ABSTRACT

Proteins are a diverse class of biomolecules responsible for wide-ranging cellular functions, from catalyzing reactions to recognizing pathogens. The ability to evolve proteins rapidly and inexpensively toward improved properties is a common objective for protein engineers. Powerful high-throughput methods like fluorescence-activated cell sorting and next-generation sequencing have dramatically improved directed evolution experiments. However, it is unclear how to best leverage these data to characterize protein fitness landscapes more completely and identify lead candidates. In this work, we develop a simple yet powerful framework to improve protein optimization by predicting continuous protein properties from simple directed evolution experiments using interpretable, linear machine learning models. Importantly, we find that these models, which use data from simple but imprecise experimental estimates of protein fitness, have predictive capabilities that approach more precise but expensive data. Evaluated across five diverse protein engineering tasks, continuous properties are consistently predicted from readily available deep sequencing data, demonstrating that protein fitness space can be reasonably well modeled by linear relationships among sequence mutations. To prospectively test the utility of this approach, we generated a library of stapled peptides and applied the framework to predict affinity and specificity from simple cell sorting data. We then coupled integer linear programming, a method to optimize protein fitness from linear weights, with mutation scores from machine learning to identify variants in unseen sequence space that have improved and co-optimal properties. This approach represents a versatile tool for improved analysis and identification of protein variants across many domains of protein engineering.


Subject(s)
Machine Learning , Proteins , Proteins/metabolism , Protein Engineering/methods , Mutation , Gene Library
9.
Proc Natl Acad Sci U S A ; 121(44): e2413668121, 2024 Oct 29.
Article in English | MEDLINE | ID: mdl-39436654

ABSTRACT

An RNA ligase ribozyme that catalyzes the joining of RNA molecules of the opposite chiral handedness was optimized for the ability to synthesize its own enantiomer from two component fragments. The mirror-image D- and L-ligases operate in concert to provide a system for cross-chiral replication, whereby they catalyze each other's synthesis and undergo mutual amplification at constant temperature, with apparent exponential growth and a doubling time of about 1 h. Neither the D- nor the L-RNA components alone can achieve autocatalytic self-replication. Cross-chiral exponential amplification can be continued indefinitely through a serial-transfer process that provides an ongoing supply of the component D- and L-substrates. Unlike the familiar paradigm of semiconservative nucleic acid replication that relies on Watson-Crick pairing between complementary strands, cross-chiral replication relies on tertiary interactions between structured nucleic acids "across the mirror." There are few examples, outside of biology, of autocatalytic self-replication systems that undergo exponential amplification and there are no prior examples, in either biological or chemical systems, of cross-chiral replication enabling exponential amplification.


Subject(s)
RNA, Catalytic , RNA, Catalytic/chemistry , RNA, Catalytic/metabolism , Stereoisomerism , RNA Ligase (ATP)/metabolism , RNA Ligase (ATP)/chemistry , RNA Ligase (ATP)/genetics , Nucleic Acid Conformation , RNA/metabolism , RNA/chemistry
10.
Proc Natl Acad Sci U S A ; 121(11): e2321592121, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38437533

ABSTRACT

An RNA polymerase ribozyme that was obtained by directed evolution can propagate a functional RNA through repeated rounds of replication and selection, thereby enabling Darwinian evolution. Earlier versions of the polymerase did not have sufficient copying fidelity to propagate functional information, but a new variant with improved fidelity can replicate the hammerhead ribozyme through reciprocal synthesis of both the hammerhead and its complement, with the products then being selected for RNA-cleavage activity. Two evolutionary lineages were carried out in parallel, using either the prior low-fidelity or the newer high-fidelity polymerase. The former lineage quickly lost hammerhead functionality as the population diverged toward random sequences, whereas the latter evolved new hammerhead variants with improved fitness compared to the starting RNA. The increase in fitness was attributable to specific mutations that improved the replicability of the hammerhead, counterbalanced by a small decrease in hammerhead activity. Deep sequencing analysis was used to follow the course of evolution, revealing the emergence of a succession of variants that progressively diverged from the starting hammerhead as fitness increased. This study demonstrates the critical importance of replication fidelity for maintaining heritable information in an RNA-based evolving system, such as is thought to have existed during the early history of life on Earth. Attempts to recreate RNA-based life in the laboratory must achieve further improvements in replication fidelity to enable the fully autonomous Darwinian evolution of RNA enzymes as complex as the polymerase itself.


Subject(s)
RNA, Catalytic , RNA, Catalytic/genetics , RNA/genetics , Earth, Planet , Exercise , Nucleotidyltransferases , Catalysis
11.
Proc Natl Acad Sci U S A ; 121(31): e2403585121, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39042685

ABSTRACT

Nature is home to a variety of microorganisms that create materials under environmentally friendly conditions. While this offers an attractive approach for sustainable manufacturing, the production of materials by native microorganisms is usually slow and synthetic biology tools to engineer faster microorganisms are only available when prior knowledge of genotype-phenotype links is available. Here, we utilize a high-throughput directed evolution platform to enhance the fitness of whole microorganisms under selection pressure and identify genetic pathways to enhance the material production capabilities of native species. Using Komagataeibacter sucrofermentans as a model cellulose-producing microorganism, we show that our droplet-based microfluidic platform enables the directed evolution of these bacteria toward a small number of cellulose overproducers from an initial pool of 40,000 random mutants. Sequencing of the evolved strains reveals an unexpected link between the cellulose-forming ability of the bacteria and a gene encoding a protease complex responsible for protein turnover in the cell. The ability to enhance the fitness of microorganisms toward a specific phenotype and to unravel genotype-phenotype links makes this high-throughput directed evolution platform a promising tool for the development of new strains for the sustainable manufacturing of materials.


Subject(s)
Cellulose , Directed Molecular Evolution , Cellulose/metabolism , Cellulose/biosynthesis , Directed Molecular Evolution/methods , Acetobacteraceae/metabolism , Acetobacteraceae/genetics , Phenotype , Mutation
12.
Proc Natl Acad Sci U S A ; 121(32): e2400439121, 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-39074291

ABSTRACT

Protein engineering often targets binding pockets or active sites which are enriched in epistasis (nonadditive interactions between amino acid substitutions) and where the combined effects of multiple single substitutions are difficult to predict. Few existing sequence-fitness datasets capture epistasis at large scale, especially for enzyme catalysis, limiting the development and assessment of model-guided enzyme engineering approaches. We present here a combinatorially complete, 160,000-variant fitness landscape across four residues in the active site of an enzyme. Assaying the native reaction of a thermostable β-subunit of tryptophan synthase (TrpB) in a nonnative environment yielded a landscape characterized by significant epistasis and many local optima. These effects prevent simulated directed evolution approaches from efficiently reaching the global optimum. There is nonetheless wide variability in the effectiveness of different directed evolution approaches, which together provide experimental benchmarks for computational and machine learning workflows. The most-fit TrpB variants contain a substitution that is nearly absent in natural TrpB sequences, a result that conservation-based predictions would not capture. Thus, although fitness prediction using evolutionary data can enrich in more-active variants, these approaches struggle to identify and differentiate among the most-active variants, even for this near-native function. Overall, this work presents a large-scale testing ground for model-guided enzyme engineering and suggests that efficient navigation of epistatic fitness landscapes can be improved by advances in both machine learning and physical modeling.


Subject(s)
Catalytic Domain , Epistasis, Genetic , Tryptophan Synthase , Catalytic Domain/genetics , Tryptophan Synthase/genetics , Tryptophan Synthase/metabolism , Tryptophan Synthase/chemistry , Protein Engineering/methods , Amino Acid Substitution , Models, Molecular
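
Illustrative note on record 12: the 160,000-variant figure matches 20 amino acids at each of four sites (20^4), and the abstract's remark that epistasis traps simulated directed evolution can be sketched with a toy example. The code below is a minimal sketch only. The greedy single-mutation walk is one common way to simulate directed evolution and is an assumption here, since the abstract does not say which strategies were used. The `toy_fitness` function is invented for illustration and has nothing to do with the measured TrpB data.

```python
# Toy sketch: greedy single-mutation walk on a 4-site landscape.
# toy_fitness is a made-up landscape, NOT the TrpB measurements.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids
N_SITES = 4


def landscape_size(n_sites=N_SITES, alphabet=AMINO_ACIDS):
    """Number of variants in a combinatorially complete landscape."""
    return len(alphabet) ** n_sites


def toy_fitness(variant):
    """Hypothetical fitness: additive in 'A' count, with one epistatic peak.

    'GGGG' scores 10 even though each single G contributes nothing,
    so the peak is invisible to one-step-at-a-time improvement.
    """
    if variant == "GGGG":
        return 10
    return variant.count("A")


def greedy_walk(start, fitness, alphabet=AMINO_ACIDS):
    """Move to the best single-substitution neighbour until none improves."""
    current = start
    while True:
        best, best_f = current, fitness(current)
        for i in range(len(current)):
            for aa in alphabet:
                if aa == current[i]:
                    continue
                candidate = current[:i] + aa + current[i + 1:]
                f = fitness(candidate)
                if f > best_f:  # strict improvement guarantees termination
                    best, best_f = candidate, f
        if best == current:
            return current  # local optimum reached
        current = best


if __name__ == "__main__":
    print(landscape_size())                   # size of the full landscape
    print(greedy_walk("CCCC", toy_fitness))   # ends on the additive peak
    print(greedy_walk("GGGC", toy_fitness))   # starts next to the epistatic peak
```

In this toy landscape a walk started at `CCCC` stops at a local optimum and never finds the higher epistatic peak, which is the qualitative behaviour the abstract attributes to epistatic landscapes.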
13.
Proc Natl Acad Sci U S A ; 121(35): e2317027121, 2024 08 27.
Article in English | MEDLINE | ID: mdl-39159366

ABSTRACT

The enzyme 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) functions in the shikimate pathway which is responsible for the production of aromatic amino acids and precursors of other essential secondary metabolites in all plant species. EPSPS is also the molecular target of the herbicide glyphosate. While some plant EPSPS variants have been characterized with reduced glyphosate sensitivity and have been used in biotechnology, the glyphosate insensitivity typically comes with a cost to catalytic efficiency. Thus, there exists a need to generate additional EPSPS variants that maintain both high catalytic efficiency and high glyphosate tolerance. Here, we create a synthetic yeast system to rapidly study and evolve heterologous EPSP synthases for these dual traits. Using known EPSPS variants, we first validate that our synthetic yeast system is capable of recapitulating growth characteristics observed in plants grown in varying levels of glyphosate. Next, we demonstrate that variants from mutagenesis libraries with distinct phenotypic traits can be isolated depending on the selection criteria applied. By applying strong dual-trait selection pressure, we identify a notable EPSPS mutant after just a single round of evolution that displays robust glyphosate tolerance (Ki of nearly 1 mM) and improved enzymatic efficiency over the starting point (~2.5 fold). Finally, we show the crystal structure of corn EPSPS and the top resulting mutants and demonstrate that certain mutants have the potential to outperform previously reported glyphosate-resistant EPSPS mutants, such as T102I and P106S (denoted as TIPS), in whole-plant testing. Altogether, this platform helps explore the trade-off between glyphosate resistance and enzymatic efficiency.


Subject(s)
3-Phosphoshikimate 1-Carboxyvinyltransferase , Glycine , Glyphosate , Saccharomyces cerevisiae , 3-Phosphoshikimate 1-Carboxyvinyltransferase/genetics , 3-Phosphoshikimate 1-Carboxyvinyltransferase/metabolism , Glycine/analogs & derivatives , Glycine/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Herbicides/pharmacology , Herbicides/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Herbicide Resistance/genetics
14.
Trends Biochem Sci ; 47(5): 403-416, 2022 05.
Article in English | MEDLINE | ID: mdl-35427479

ABSTRACT

Noncovalent interactions between biomolecules such as proteins and nucleic acids coordinate all cellular processes through changes in proximity. Tools that perturb these interactions are and will continue to be highly valuable for basic and translational scientific endeavors. By taking cues from natural systems, such as the adaptive immune system, we can design directed evolution platforms that can generate proteins that bind to biomolecules of interest. In recent years, the platforms used to direct the evolution of biomolecular binders have greatly expanded the range of types of interactions one can evolve. Herein, we review recent advances in methods to evolve protein-protein, protein-RNA, and protein-DNA interactions.


Subject(s)
DNA , Nucleic Acids , Directed Molecular Evolution/methods , Proteins/genetics , RNA
15.
Trends Biochem Sci ; 47(5): 375-389, 2022 05.
Article in English | MEDLINE | ID: mdl-34544655

ABSTRACT

Recent years have seen an explosion of interest in understanding the physicochemical parameters that shape enzyme evolution, as well as substantial advances in computational enzyme design. This review discusses three areas where evolutionary information can be used as part of the design process: (i) using ancestral sequence reconstruction (ASR) to generate new starting points for enzyme design efforts; (ii) learning from how nature uses conformational dynamics in enzyme evolution to mimic this process in silico; and (iii) modular design of enzymes from smaller fragments, again mimicking the process by which nature appears to create new protein folds. Using showcase examples, we highlight the importance of incorporating evolutionary information to continue to push forward the boundaries of enzyme design studies.


Subject(s)
Evolution, Molecular , Proteins , Computational Biology , Proteins/genetics
16.
Semin Cell Dev Biol ; 155(Pt A): 37-47, 2024 03 01.
Article in English | MEDLINE | ID: mdl-37085353

ABSTRACT

Rubisco catalyses the entry of almost all CO2 into the biosphere and is often the rate-limiting step in plant photosynthesis and growth. Its notoriety as the most abundant protein on Earth stems from the slow and error-prone catalytic properties that require plants, cyanobacteria, algae and photosynthetic bacteria to produce it in high amounts. Efforts to improve the CO2-fixing properties of plant Rubisco have been spurred on by the discovery of more effective isoforms in some algae with the potential to significantly improve crop productivity. Incompatibilities between the protein folding machinery of leaf and algae chloroplasts have, so far, prevented efforts to transplant these more effective Rubisco variants into plants. There is therefore increasing interest in improving Rubisco catalysis by directed (laboratory) evolution. Here we review the advances being made in, and the ongoing challenges with, improving the solubility and/or carboxylation activity of differing non-plant Rubisco lineages. We provide perspectives on new opportunities for the directed evolution of crop Rubiscos and the existing plant transformation capabilities available to evaluate the extent to which Rubisco activity improvements can benefit agricultural productivity.


Subject(s)
Carbon Dioxide , Ribulose-Bisphosphate Carboxylase , Ribulose-Bisphosphate Carboxylase/genetics , Plant Leaves , Protein Folding
17.
Trends Genet ; 39(1): 9-14, 2023 01.
Article in English | MEDLINE | ID: mdl-36402624

ABSTRACT

The first step of viral evolution takes place during genome replication via the error-prone viral polymerase. Among the mutants that arise through this process, only a few well-adapted variants will be selected by natural selection, renewing the viral genome population. Viral polymerase-mediated errors are thought to occur stochastically. However, accumulating evidence suggests that viral polymerase-mediated mutations are heterogeneously distributed throughout the viral genome. Here, we review work that supports this concept and provides mechanistic insights into how specific features of the viral genome could modulate viral polymerase-mediated errors. A predisposition to accumulate viral polymerase-mediated errors at specific loci in the viral genome may guide evolution to specific pathways, thus opening new directions of research to better understand viral evolutionary dynamics.


Subject(s)
Genome, Viral , Mutation , Genome, Viral/genetics , Genotype
18.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39120645

ABSTRACT

Predicting the strength of promoters and guiding their directed evolution is a crucial task in synthetic biology. This approach significantly reduces the experimental costs in conventional promoter engineering. Previous studies employing machine learning or deep learning methods have shown some success in this task, but their outcomes were not satisfactory enough, primarily due to the neglect of evolutionary information. In this paper, we introduce the Chaos-Attention net for Promoter Evolution (CAPE) to address the limitations of existing methods. We comprehensively extract evolutionary information within promoters using merged chaos game representation and process the overall information with modified DenseNet and Transformer structures. Our model achieves state-of-the-art results on two kinds of distinct tasks related to prokaryotic promoter strength prediction. The incorporation of evolutionary information enhances the model's accuracy, with transfer learning further extending its adaptability. Furthermore, experimental results confirm CAPE's efficacy in simulating in silico directed evolution of promoters, marking a significant advancement in predictive modeling for prokaryotic promoter strength. Our paper also presents a user-friendly website for the practical implementation of in silico directed evolution on promoters. The source code implemented in this study and the instructions on accessing the website can be found in our GitHub repository https://github.com/BobYHY/CAPE.


Subject(s)
Deep Learning , Promoter Regions, Genetic , Algorithms , Evolution, Molecular , Computer Simulation , Nonlinear Dynamics , Computational Biology/methods
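
Illustrative note on record 18: the abstract names chaos game representation (CGR) as the encoding step. The sketch below shows the standard CGR of a DNA sequence, in which each base moves the current point halfway toward that base's corner of the unit square. It is a minimal sketch under stated assumptions: the corner assignment is one common convention, the grid binning in `cgr_matrix` is a simple illustrative choice, and the "merged" variant used by CAPE is not described in the abstract and is not reproduced here.

```python
# Standard chaos game representation (CGR) of a DNA sequence.
# Assumption: corner layout A=(0,0), C=(0,1), G=(1,1), T=(1,0).

CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return one (x, y) point per base, starting from the square's centre."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]  # raises KeyError for symbols other than A/C/G/T
        x, y = (x + cx) / 2.0, (y + cy) / 2.0  # step halfway toward the corner
        points.append((x, y))
    return points


def cgr_matrix(seq, k=3):
    """Bin the CGR points into a 2^k x 2^k count grid (an image-like input)."""
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    for x, y in cgr_points(seq):
        col = min(int(x * n), n - 1)
        row = min(int(y * n), n - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    print(cgr_points("AG"))
    for row in cgr_matrix("TTGACAATTAATCATCGGCTCGTATAATG", k=2):
        print(row)
```

A count grid of this kind can be fed to an image-style network; whether CAPE bins points this way is not stated in the abstract.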
19.
Trends Immunol ; 44(5): 384-396, 2023 05.
Article in English | MEDLINE | ID: mdl-37024340

ABSTRACT

Our immune systems constantly coevolve with the pathogens that challenge them, as pathogens adapt to evade our defense responses, with our immune repertoires shifting in turn. These coevolutionary dynamics take place across a vast and high-dimensional landscape of potential pathogen and immune receptor sequence variants. Mapping the relationship between these genotypes and the phenotypes that determine immune-pathogen interactions is crucial for understanding, predicting, and controlling disease. Here, we review recent developments applying high-throughput methods to create large libraries of immune receptor and pathogen protein sequence variants and measure relevant phenotypes. We describe several approaches that probe different regions of the high-dimensional sequence space and comment on how combinations of these methods may offer novel insight into immune-pathogen coevolution.


Subject(s)
Adaptation, Physiological , Phenotype , Genotype
20.
Proc Natl Acad Sci U S A ; 120(11): e2218428120, 2023 03 14.
Article in English | MEDLINE | ID: mdl-36893280

ABSTRACT

A versatile strategy to create an inducible protein assembly with predefined geometry is demonstrated. The assembly is triggered by a binding protein that staples two identical protein bricks together in a predictable spatial conformation. The brick and staple proteins are designed for mutual directional affinity and engineered by directed evolution from a synthetic modular repeat protein library. As a proof of concept, this article reports on the spontaneous, extremely fast and quantitative self-assembly of two designed alpha-repeat (αRep) brick and staple proteins into macroscopic tubular superhelices at room temperature. Small-angle X-ray scattering (SAXS) and transmission electron microscopy (TEM with staining agent and cryoTEM) elucidate the resulting superhelical arrangement that precisely matches the a priori intended 3D assembly. The highly ordered, macroscopic biomolecular construction sustains temperatures as high as 75 °C thanks to the robust αRep building blocks. Since the α-helices of the brick and staple proteins are highly programmable, their design allows encoding the geometry and chemical surfaces of the final supramolecular protein architecture. This work opens routes toward the design and fabrication of multiscale protein origami with arbitrarily programmed shapes and chemical functions.


Subject(s)
Nanostructures , Proteins , X-Ray Diffraction , Scattering, Small Angle , Proteins/chemistry , Temperature , Microscopy, Electron, Transmission , Nanostructures/chemistry , Nucleic Acid Conformation