ABSTRACT
RNA oxidation, predominantly through the accumulation of 8-oxo-7,8-dihydroguanosine (8-oxo-rG), represents an important biomarker for cellular oxidative stress. Polynucleotide phosphorylase (PNPase) is a 3'-5' exoribonuclease that has been shown to preferentially recognize 8-oxo-rG-containing RNA and protect Escherichia coli cells from oxidative stress. However, the impact of 8-oxo-rG on PNPase-mediated RNA degradation has not been studied. Here, we show that the presence of 8-oxo-rG in RNA leads to catalytic stalling of E. coli PNPase through in vitro RNA degradation experiments and electrophoretic analysis. We also link this stalling to the active site of the enzyme through resolution of single-particle cryo-EM structures for PNPase in complex with singly or doubly oxidized RNA oligonucleotides. Following identification of Arg399 as a key residue in recognition of both single and sequential 8-oxo-rG nucleotides, we perform follow-up in vitro analysis to confirm the importance of this residue in 8-oxo-rG-specific PNPase stalling. Finally, we investigate the effects of mutations to active site residues implicated in 8-oxo-rG binding through E. coli cell growth experiments under H2O2-induced oxidative stress. Specifically, Arg399 mutations show significant effects on cell growth under oxidative stress. Overall, we demonstrate that 8-oxo-rG-specific stalling of PNPase is relevant to bacterial survival under oxidative stress and speculate that this enzyme might associate with other cellular factors to mediate this stress.
Subjects
Catalytic Domain, Escherichia coli, Guanosine, Oxidative Stress, Polyribonucleotide Nucleotidyltransferase, Polyribonucleotide Nucleotidyltransferase/metabolism, Polyribonucleotide Nucleotidyltransferase/genetics, Escherichia coli/genetics, Escherichia coli/metabolism, Guanosine/analogs & derivatives, Guanosine/metabolism, RNA Stability, Oxidation-Reduction, Escherichia coli Proteins/metabolism, Escherichia coli Proteins/genetics, Escherichia coli Proteins/chemistry, Cryoelectron Microscopy
ABSTRACT
T7 RNA polymerase has enabled orthogonal control of gene expression and recombinant protein production across diverse prokaryotic host chassis organisms for decades. However, the absence of 5' methyl guanosine caps on T7 RNAP derived transcripts has severely limited its utility and widespread adoption in eukaryotic systems. To address this shortcoming, we evolved a fusion enzyme combining T7 RNAP with the single subunit capping enzyme from African swine fever virus using Saccharomyces cerevisiae. We isolated highly active variants of this fusion enzyme, which exhibited roughly two orders of magnitude higher protein expression compared to the wild-type enzyme. We demonstrate the programmable control of gene expression using T7 RNAP-based genetic circuits in yeast and validate enhanced performance of these engineered variants in mammalian cells. This study presents a robust, orthogonal gene regulatory system applicable across diverse eukaryotic hosts, enhancing the versatility and efficiency of synthetic biology applications.
ABSTRACT
A sustainable operation for harvesting metals in the lanthanide series is needed to meet the rising demand for rare earth elements across diverse global industries. However, existing methods are limited in their capacity for detection and capture at environmentally and industrially relevant lanthanide concentrations. Supercharged fluorescent proteins have solvent-exposed, negatively charged residues that potentially create multiple direct chelation pockets for free lanthanide cations. Here, we demonstrate that negatively supercharged proteins can bind and quantitatively report concentrations of lanthanides via an underutilized lanthanide-to-chromophore pathway of energy transfer. The top-performing sensors detect lanthanides in the micromolar to millimolar range and remain unperturbed by environmentally significant concentrations of competing metals. As a demonstration of the versatility and adaptability of this energy transfer method, we show proximity and signal transmission between the lanthanides and a supramolecular assembly of supercharged proteins, paving the way for the detection of lanthanides via programmable protein oligomers and materials.
Subjects
Lanthanoid Series Elements, Luminescent Proteins, Lanthanoid Series Elements/chemistry, Lanthanoid Series Elements/metabolism, Luminescent Proteins/metabolism, Luminescent Proteins/genetics, Luminescent Proteins/chemistry, Energy Transfer, Fluorescence Resonance Energy Transfer/methods
ABSTRACT
Engineering stabilized proteins is a fundamental challenge in the development of industrial and pharmaceutical biotechnologies. We present Stability Oracle: a structure-based graph-transformer framework that achieves state-of-the-art (SOTA) performance in identifying thermodynamically stabilizing mutations. Our framework introduces several innovations to overcome well-known challenges in data scarcity and bias, generalization, and computation time: Thermodynamic Permutations for data augmentation, structural amino acid embeddings to model a mutation with a single structure, and a protein structure-specific attention-bias mechanism that makes transformers a viable alternative to graph neural networks. We provide training/test splits that mitigate data leakage and ensure proper model evaluation. Furthermore, to examine our data engineering contributions, we fine-tune ESM2 representations (Prostata-IFML) and achieve SOTA for sequence-based models. Notably, Stability Oracle outperforms Prostata-IFML even though it was pretrained on 2000X fewer proteins and has 548X fewer parameters. Our framework establishes a path for fine-tuning structure-based transformers to virtually any phenotype, a necessary task for accelerating the development of protein-based biotechnologies.
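The abstract names Thermodynamic Permutations without defining it. One plausible reading, offered here only as an assumption, is that because free energy is a state function, stability changes measured from the wild type to several mutants at one position can be subtracted to obtain the change between any ordered pair of residues at that position. The function name, the "WT" label, and the toy ddG values below are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def thermodynamic_permutations(ddg_from_wt):
    """Expand per-position ddG values (wild type -> mutant) to all ordered pairs.

    ddg_from_wt maps a mutant residue label to its measured ddG relative to
    the wild type at a single position. Assumption: ddG is path independent,
    so ddG(a -> b) = ddG(WT -> b) - ddG(WT -> a).
    """
    # The wild type is the reference state, so its ddG is zero by definition.
    states = {"WT": 0.0, **ddg_from_wt}
    pairs = {}
    for a, b in permutations(states, 2):
        pairs[(a, b)] = states[b] - states[a]
    return pairs


# Toy values (hypothetical, in kcal/mol): two measured mutations at one site.
augmented = thermodynamic_permutations({"A": 1.0, "V": -0.5})
```

Under this reading, n characterized residues at a position (including the wild type) yield n(n-1) ordered pairs, so the two toy measurements expand to six labelled examples.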
Subjects
Mutation, Protein Stability, Proteins, Thermodynamics, Proteins/genetics, Proteins/chemistry, Protein Engineering/methods, Models, Molecular, Algorithms, Neural Networks, Computer, Protein Conformation, Computational Biology/methods
ABSTRACT
Prokaryotic transcription factors can be repurposed into biosensors for the ligand-inducible control of gene expression, but the landscape of chemical ligands for which biosensors exist is extremely limited. To expand this landscape, we developed Ligify, a web application that leverages information in enzyme reaction databases to predict transcription factors that may be responsive to user-defined chemicals. Candidate transcription factors are then incorporated into automatically generated plasmid sequences that are designed to express GFP in response to the target chemical. Our benchmarking analyses demonstrated that Ligify correctly predicted 31/100 previously validated biosensors and highlighted strategies for further improvement. We then used Ligify to build a panel of genetic circuits that could induce a 47-fold, 5-fold, 9-fold, and 27-fold change in fluorescence in response to D-ribose, L-sorbose, isoeugenol, and 4-vinylphenol, respectively. Ligify should enhance the ability of researchers to quickly develop biosensors for an expanded range of chemicals and is publicly available at https://ligify.groov.bio.
Subjects
Biosensing Techniques, Transcription Factors, Transcription Factors/genetics, Transcription Factors/metabolism, Biosensing Techniques/methods, Ligands, Plasmids/genetics, Software, Escherichia coli/genetics, Escherichia coli/metabolism, Green Fluorescent Proteins/genetics, Green Fluorescent Proteins/metabolism
ABSTRACT
Yeast expression of human G-protein-coupled receptors (GPCRs) can be used as a biosensor platform for the detection of pharmaceuticals. Cannabinoid receptor type 1 (CB1R) is of particular interest, given the cornucopia of natural and synthetic cannabinoids being explored as therapeutics. We show for the first time that engineering the N-terminus of CB1R allows for efficient signal transduction in yeast, and that engineering the sterol composition of the yeast membrane modulates its performance. Using an engineered cannabinoid biosensor, we demonstrate that large libraries of synthetic cannabinoids and terpenes can be quickly screened to elucidate known and novel structure-activity relationships. The biosensor strains offer a ready platform for evaluating the activity of new synthetic cannabinoids, monitoring drugs of abuse, and developing therapeutic molecules.
Subjects
Biosensing Techniques, Cannabinoids, Receptor, Cannabinoid, CB1, Saccharomyces cerevisiae, Biosensing Techniques/methods, Humans, Cannabinoids/chemistry, Cannabinoids/pharmacology, Cannabinoids/metabolism, Receptor, Cannabinoid, CB1/metabolism, Receptor, Cannabinoid, CB1/genetics, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae/genetics, Structure-Activity Relationship, Signal Transduction/drug effects
ABSTRACT
The introduction of noncanonical amino acids into proteins has enabled researchers to modify fundamental physicochemical and functional properties of proteins. While the alteration of the genetic code, via the introduction of orthogonal aminoacyl-tRNA synthetase:tRNA pairs, has driven many of these efforts, the various components involved in the process of translation are important for the development of new genetic codes. In this review, we will focus on recent advances in engineering ribosomal machinery for noncanonical amino acid incorporation and genetic code modification. The engineering of the ribosome itself will be considered, as well as the many factors that interact closely with the ribosome, including both tRNAs and accessory factors, such as the all-important EF-Tu. Given the success of genome re-engineering efforts, future paths for radical alterations of the genetic code will require more expansive alterations in the translation machinery.
Subjects
Amino Acids, Genetic Code, RNA, Transfer, Ribosomes, Amino Acids/metabolism, Amino Acids/chemistry, Ribosomes/metabolism, RNA, Transfer/metabolism, RNA, Transfer/genetics, RNA, Transfer/chemistry, Protein Biosynthesis, Protein Engineering, Amino Acyl-tRNA Synthetases/metabolism, Amino Acyl-tRNA Synthetases/genetics
ABSTRACT
The fundamental goal of small molecule discovery is to generate chemicals with target functionality. While this often proceeds through structure-based methods, we set out to investigate the practicality of methods that leverage the extensive corpus of chemical literature. We hypothesize that a sufficiently large text-derived chemical function dataset would mirror the actual landscape of chemical functionality. Such a landscape would implicitly capture complex physical and biological interactions given that chemical function arises from both a molecule's structure and its interacting partners. To evaluate this hypothesis, we built a Chemical Function (CheF) dataset of patent-derived functional labels. This dataset, comprising 631 K molecule-function pairs, was created using an LLM- and embedding-based method to obtain 1.5 K unique functional labels for approximately 100 K randomly selected molecules from their corresponding 188 K unique patents. We carry out a series of analyses demonstrating that the CheF dataset contains a semantically coherent textual representation of the functional landscape congruent with chemical structural relationships, thus approximating the actual chemical function landscape. We then demonstrate through several examples that this text-based functional landscape can be leveraged to identify drugs with target functionality using a model able to predict functional profiles from structure alone. We believe that functional label-guided molecular discovery may serve as an alternative approach to traditional structure-based methods in the pursuit of designing novel functional molecules.
ABSTRACT
Photoenzymatic intermolecular hydroalkylations of olefins are highly enantioselective for chiral centers formed during radical termination but poorly selective for centers set in the C-C bond-forming event. Here, we report the evolution of a flavin-dependent "ene"-reductase to catalyze the coupling of α,α-dichloroamides with alkenes to afford α-chloroamides in good yield with excellent chemo- and stereoselectivity. These products can serve as linchpins in the synthesis of pharmaceutically valuable motifs. Mechanistic studies indicate that radical formation occurs by exciting a charge-transfer complex templated by the protein. Precise control over the orientation of molecules within the charge-transfer complex potentially accounts for the observed stereoselectivity. The work expands the types of motifs that can be prepared using photoenzymatic catalysis.
Subjects
Alkenes, Catalysis
ABSTRACT
A major challenge to achieving industry-scale biomanufacturing of therapeutic alkaloids is the slow process of biocatalyst engineering. Amaryllidaceae alkaloids, such as the Alzheimer's medication galantamine, are complex plant secondary metabolites with recognized therapeutic value. Due to their difficult synthesis they are regularly sourced by extraction and purification from the low-yielding daffodil Narcissus pseudonarcissus. Here, we propose an efficient biosensor-machine learning technology stack for biocatalyst development, which we apply to engineer an Amaryllidaceae enzyme in Escherichia coli. Directed evolution is used to develop a highly sensitive (EC50 = 20 µM) and specific biosensor for the key Amaryllidaceae alkaloid branchpoint 4'-O-methylnorbelladine. A structure-based residual neural network (MutComputeX) is subsequently developed and used to generate activity-enriched variants of a plant methyltransferase, which are rapidly screened with the biosensor. Functional enzyme variants are identified that yield a 60% improvement in product titer, 2-fold higher catalytic activity, and 3-fold lower off-product regioisomer formation. A solved crystal structure elucidates the mechanism behind key beneficial mutations.
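To make the reported sensitivity concrete, the sketch below models a biosensor dose-response with a Hill function, in which EC50 is the ligand concentration that gives a half-maximal response. Only the EC50 of 20 µM comes from the abstract; the Hill coefficient, the response bounds, and the function name are illustrative assumptions, and the authors' actual fitting procedure is not described in the text.

```python
def hill_response(conc_uM, ec50_uM=20.0, hill_n=1.0, low=0.0, high=1.0):
    """Return the modelled sensor output at a given ligand concentration.

    ec50_uM is the value reported in the abstract; hill_n, low and high are
    assumed placeholders, not measured parameters.
    """
    # Fraction of maximal response according to the Hill equation.
    frac = conc_uM ** hill_n / (ec50_uM ** hill_n + conc_uM ** hill_n)
    return low + (high - low) * frac


# By definition the response at the EC50 is halfway between low and high.
half_max = hill_response(20.0)
```

The half-maximal property holds for any Hill coefficient, so the assumed value of `hill_n` changes the steepness of the curve but not the meaning of the EC50.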
Subjects
Alkaloids, Amaryllidaceae Alkaloids, Amaryllidaceae, Narcissus, Amaryllidaceae/metabolism, Alkaloids/chemistry, Amaryllidaceae Alkaloids/chemistry, Amaryllidaceae Alkaloids/metabolism, Narcissus/chemistry, Narcissus/genetics, Narcissus/metabolism, Methyltransferases/metabolism, Plants/metabolism, Hydrolases/metabolism
ABSTRACT
P2X receptors are a family of ligand gated ion channels found in a range of eukaryotic species including humans but are not naturally present in the yeast Saccharomyces cerevisiae. We demonstrate the first recombinant expression and functional gating of the P2X2 receptor in baker's yeast. We leverage the yeast host for facile genetic screens of mutant P2X2 by performing site saturation mutagenesis at residues of interest, including SNPs implicated in deafness and at residues involved in native binding. Deep mutational analysis and rounds of genetic engineering yield mutant P2X2 F303Y A304W, which has altered ligand selectivity toward the ATP analog AMP-PNP. The F303Y A304W variant shows over 100-fold increased intracellular calcium amplitudes with AMP-PNP compared to the WT receptor and has a much lower desensitization rate. Since AMP-PNP does not naturally activate P2X receptors, the F303Y A304W P2X2 may be a starting point for downstream applications in chemogenetic cellular control. Interestingly, the A304W mutation selectively destabilizes the desensitized state, which may provide a mechanistic basis for receptor opening with suboptimal agonists. The yeast system represents an inexpensive, scalable platform for ion channel characterization and engineering by circumventing the more expensive and time-consuming methodologies involving mammalian hosts.
Subjects
Receptors, Purinergic P2X2, Saccharomyces cerevisiae, Humans, Amino Acid Substitution, Ligands, Protein Engineering/methods, Receptors, Purinergic P2X2/metabolism, Receptors, Purinergic P2X2/genetics, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae/genetics, Models, Molecular, Protein Structure, Tertiary, Protein Structure, Quaternary, Structural Homology, Protein, Mutation
ABSTRACT
Bioengineers increasingly rely on ligand-inducible transcription regulators for chemical-responsive control of gene expression, yet the number of regulators available is limited. Novel regulators can be mined from genomes, but an inadequate understanding of their DNA specificity complicates genetic design. Here we present Snowprint, a simple yet powerful bioinformatic tool for predicting regulator:operator interactions. Benchmarking results demonstrate that Snowprint predictions are significantly similar for >45% of experimentally validated regulator:operator pairs from organisms across nine phyla and for regulators that span five distinct structural families. We then use Snowprint to design promoters for 33 previously uncharacterized regulators sourced from diverse phylogenies, of which 28 are shown to influence gene expression and 24 produce a >20-fold dynamic range. A panel of the newly repurposed regulators is then screened for response to biomanufacturing-relevant compounds, yielding new sensors for a polyketide (olivetolic acid), terpene (geraniol), steroid (ursodiol), and alkaloid (tetrahydropapaverine) with induction ratios up to 10.7-fold. Snowprint represents a unique, protein-agnostic tool that greatly facilitates the discovery of ligand-inducible transcriptional regulators for bioengineering applications. A web-accessible version of Snowprint is available at https://snowprint.groov.bio.
Subjects
Biosensing Techniques, Computational Biology, Humans, Ligands, Promoter Regions, Genetic, DNA
ABSTRACT
A growing interest in aptamer research, as evidenced by the increase in aptamer publications over the years, has led to calls for a go-to site for aptamer information. A comprehensive, publicly available aptamer dataset, which may be a repository for aptamer data, standardize aptamer reporting, and generate opportunities to expand current research in the field, could meet such a demand. There have been several attempts to create aptamer databases; however, most have been abandoned or removed entirely from public view. Inspired by previous efforts, we have published the UTexas Aptamer Database, https://sites.utexas.edu/aptamerdatabase, which includes a publicly available aptamer dataset and a searchable database containing a subset of all aptamer data collected to date (1990-2022). The dataset contains aptamer sequences, binding and selection information. The information is regularly reviewed internally to ensure accuracy and consistency across all entries. To support the continued curation and review of aptamer sequence information, we have implemented sustaining mechanisms, including researcher training protocols, an aptamer submission form, data stored separately from the database platform, and a growing team of researchers committed to updating the database. Currently, the UTexas Aptamer Database is the largest in terms of the number of aptamer sequences with 1,443 internally reviewed aptamer records.
Subjects
Aptamers, Nucleotide, Databases, Nucleic Acid, Datasets as Topic
ABSTRACT
The incorporation of unnatural amino acids is an attractive method for improving existing functions or introducing novel functions into peptides and proteins. Cell-free protein synthesis using the Protein Synthesis Using Recombinant Elements (PURE) system is an attractive platform for efficient unnatural amino acid incorporation. In this work, we further adapted and modified the One Pot PURE to obtain a robust and modular system for enzymatic single-site-specific incorporation of an unnatural amino acid. We demonstrated the flexibility of this system through the introduction of two different orthogonal aminoacyl tRNA synthetase:tRNA pairs that suppressed two distinct stop codons in separate reaction mixtures.
Subjects
Amino Acids, Amino Acyl-tRNA Synthetases, Amino Acids/metabolism, RNA, Transfer/genetics, RNA, Transfer/metabolism, Proteins/genetics, Amino Acyl-tRNA Synthetases/metabolism, Codon, Terminator/genetics
ABSTRACT
The ongoing evolution of SARS-CoV-2 into more easily transmissible and infectious variants has provided unprecedented insight into mutations enabling immune escape. Understanding how these mutations affect the dynamics of antibody-antigen interactions is crucial to the development of broadly protective antibodies and vaccines. Here we report the characterization of a potent neutralizing antibody (N3-1) identified from a COVID-19 patient during the first disease wave. Cryogenic electron microscopy revealed a quaternary binding mode that enables direct interactions with all three receptor-binding domains of the spike protein trimer, resulting in extraordinary avidity and potent neutralization of all major variants of concern until the emergence of Omicron. Structure-based rational design of N3-1 mutants improved binding to all Omicron variants but only partially restored neutralization of the conformationally distinct Omicron BA.1. This study provides new insights into immune evasion through changes in spike protein dynamics and highlights considerations for future conformationally biased multivalent vaccine designs.
Subjects
COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, Spike Glycoprotein, Coronavirus/genetics, Antibodies, Neutralizing
ABSTRACT
Chemical similarity searches are a widely used family of in silico methods for identifying pharmaceutical leads. These methods historically relied on structure-based comparisons to compute similarity. Here, we use a chemical language model to create a vector-based chemical search. We extend previous implementations by creating a prompt engineering strategy that utilizes two different chemical string representation algorithms: one for the query and the other for the database. We explore this method by reviewing search results from nine queries with diverse targets. We find that the method identifies molecules with similar patent-derived functionality to the query, as determined by our validated LLM-assisted patent summarization pipeline. Further, many of these functionally similar molecules have different structures and scaffolds from the query, making them unlikely to be found with traditional chemical similarity searches. This method may serve as a new tool for the discovery of novel molecular structural classes that achieve target functionality.
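As a rough illustration of what a vector-based search involves, the sketch below ranks database entries by cosine similarity to a query vector. It assumes the embeddings have already been produced by some chemical language model; the embedding step, the two string representations mentioned in the abstract, and the toy three-dimensional vectors and molecule names are all placeholders rather than the authors' pipeline.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def search(query_vec, database, top_k=2):
    """Return the names of the top_k database entries most similar to the query."""
    scored = [(cosine(query_vec, vec), name) for name, vec in database.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical embeddings; real ones would come from the language model.
toy_db = {
    "mol_a": [1.0, 0.0, 0.0],
    "mol_b": [0.9, 0.1, 0.0],
    "mol_c": [0.0, 1.0, 0.0],
}
hits = search([1.0, 0.0, 0.0], toy_db)
```

Because similarity is computed in embedding space rather than on substructures, two molecules with different scaffolds can still rank close together, which is the behaviour the abstract highlights.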
ABSTRACT
DNA is an incredibly dense storage medium for digital data. However, computing on the stored information is expensive and slow, requiring rounds of sequencing, in silico computation, and DNA synthesis. Prior work on accessing and modifying data using DNA hybridization or enzymatic reactions had limited computation capabilities. Inspired by the computational power of "DNA strand displacement," we augment DNA storage with "in-memory" molecular computation using strand displacement reactions to algorithmically modify data in a parallel manner. We show programs for binary counting and Turing universal cellular automaton Rule 110, the latter of which is, in principle, capable of implementing any computer algorithm. Information is stored in the nicks of DNA, and a secondary sequence-level encoding allows high-throughput sequencing-based readout. We conducted multiple rounds of computation on 4-bit data registers, as well as random access of data (selective access and erasure). We demonstrate that large strand displacement cascades with 244 distinct strand exchanges (sequential and in parallel) can use naturally occurring DNA sequence from M13 bacteriophage without stringent sequence design, which has the potential to improve the scale of computation and decrease cost. Our work merges DNA storage and DNA computing, setting the foundation of entirely molecular algorithms for parallel manipulation of digital information preserved in DNA.
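For readers unfamiliar with Rule 110, the following sketch shows the update rule in software on a 4-bit register. It illustrates only the abstract automaton, not the strand displacement chemistry; the fixed zero boundary cells and the starting register are assumptions made for the example.

```python
RULE = 110  # binary 01101110: bit k gives the next state for neighbourhood k


def rule110_step(cells):
    """Apply one Rule 110 update to a list of 0/1 cells.

    Assumption: cells beyond both ends of the register are fixed at 0.
    """
    padded = [0] + list(cells) + [0]
    nxt = []
    for i in range(1, len(padded) - 1):
        # Encode (left, centre, right) as a 3-bit index into the rule number.
        idx = (padded[i - 1] << 2) | (padded[i] << 1) | padded[i + 1]
        nxt.append((RULE >> idx) & 1)
    return nxt


# Run three rounds of computation on a 4-bit register.
register = [0, 0, 0, 1]
history = [register]
for _ in range(3):
    register = rule110_step(register)
    history.append(register)
```

Each cell depends only on itself and its two neighbours, so all cells can be updated at once, which is what makes the rule a natural fit for parallel molecular implementation.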
Subjects
Computers, Molecular, DNA, DNA Replication, Algorithms, Bacteriophage M13
ABSTRACT
The SARS-CoV-2 pandemic has highlighted the need for devices capable of carrying out rapid differential detection of viruses that may manifest similar physiological symptoms yet demand tailored treatment plans. Seasonal influenza may be exacerbated by COVID-19 infections, increasing the burden on healthcare systems. In this work, we demonstrate a technology based on liquid-gated graphene field-effect transistors (GFETs), for rapid and ultraprecise sensing and differentiation of influenza and SARS-CoV-2 surface protein. Most distinctively, the device consists of 4 onboard GFETs arranged in a quadruple architecture, where each quarter is functionalized individually (with either antibodies or chemically passivated control) but measured jointly. The sensor platform was tested against a range of concentrations of viral surface proteins from both viruses with the lowest tested and detected concentration at ~50 ag/mL, or 88 zM for COVID-19 and 227 zM for Flu, which is 5-fold lower than the values reported previously on a similar platform. Unlike the classic real-time polymerase chain reaction test, which has a turnaround time of a few hours, the graphene technology presents an ultrafast response time of ~10 s even in complex and clinically relevant media such as saliva. Thus, we have developed a multianalyte, highly sensitive, and fault-tolerant technology for rapid diagnostic of contemporary, emerging, and future pandemics.
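The abstract quotes the same detection limit both as a mass concentration (about 50 ag/mL) and as molar concentrations (88 zM and 227 zM). The two are related by molar concentration = mass concentration / molar mass. The sketch below back-calculates the molar masses those figures imply; these values are derived here for illustration and are not stated in the abstract.

```python
AG_PER_ML_TO_G_PER_L = 1e-18 * 1e3  # 1 ag = 1e-18 g, 1 L = 1e3 mL
ZEPTO = 1e-21                       # 1 zM = 1e-21 mol/L


def implied_molar_mass(mass_g_per_l, molar_mol_per_l):
    """Molar mass (g/mol) consistent with a mass and a molar concentration."""
    return mass_g_per_l / molar_mol_per_l


mass_conc = 50 * AG_PER_ML_TO_G_PER_L  # 50 ag/mL expressed in g/L

# Back-calculated values, not figures from the source.
covid_mw = implied_molar_mass(mass_conc, 88 * ZEPTO)
flu_mw = implied_molar_mass(mass_conc, 227 * ZEPTO)
```

With these inputs the implied molar masses come out near 568 kDa for the SARS-CoV-2 protein and near 220 kDa for the influenza protein, which is why the same mass concentration maps to two different molar values.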
Subjects
COVID-19, Graphite, Influenza, Human, Humans, SARS-CoV-2, COVID-19/diagnosis, Antibodies
ABSTRACT
Deep learning models are seeing increased use as methods to predict mutational effects or allowed mutations in proteins. The models commonly used for these purposes include large language models (LLMs) and 3D Convolutional Neural Networks (CNNs). These two model types have very different architectures and are commonly trained on different representations of proteins. LLMs make use of the transformer architecture and are trained purely on protein sequences whereas 3D CNNs are trained on voxelized representations of local protein structure. While comparable overall prediction accuracies have been reported for both types of models, it is not known to what extent these models make comparable specific predictions and/or generalize protein biochemistry in similar ways. Here, we perform a systematic comparison of two LLMs and two structure-based models (CNNs) and show that the different model types have distinct strengths and weaknesses. The overall prediction accuracies are largely uncorrelated between the sequence- and structure-based models. Overall, the two structure-based models are better at predicting buried aliphatic and hydrophobic residues whereas the two LLMs are better at predicting solvent-exposed polar and charged amino acids. Finally, we find that a combined model that takes the individual model predictions as input can leverage these individual model strengths and results in significantly improved overall prediction accuracy.
Subjects
Amino Acids, Antifibrinolytic Agents, Amino Acid Sequence, Electric Power Supplies, Language
ABSTRACT
The creation of complementary products via templating is a hallmark feature of nucleic acid replication. Outside of nucleic acid-like molecules, the templated synthesis of a hetero-complementary copy is still rare. Herein we describe one cycle of templated synthesis that creates homomeric macrocyclic peptides guided by linear instructing strands. This strategy utilizes hydrazone formation to pre-organize peptide oligomeric monomers along the template on a solid support resin, and microwave-assisted peptide synthesis to couple monomers and cyclize the strands. With a flexible templating strand, we can alter the size of the complementary macrocycle products by increasing the length and number of the binding peptide oligomers, showing the potential to precisely tune the size of macrocyclic products. For the smaller macrocyclic peptides, the products can be released via hydrolysis and characterized by ESI-MS.