Results 1 - 10 of 10
1.
Nat Biotechnol ; 42(2): 216-228, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38361074

ABSTRACT

Recent breakthroughs in AI coupled with the rapid accumulation of protein sequence and structure data have radically transformed computational protein design. New methods promise to escape the constraints of natural and laboratory evolution, accelerating the generation of proteins for applications in biotechnology and medicine. To make sense of the exploding diversity of machine learning approaches, we introduce a unifying framework that classifies models on the basis of their use of three core data modalities: sequences, structures and functional labels. We discuss the new capabilities and outstanding challenges for the practical design of enzymes, antibodies, vaccines, nanomachines and more. We then highlight trends shaping the future of this field, from large-scale assays to more robust benchmarks, multimodal foundation models, enhanced sampling strategies and laboratory automation.


Subject(s)
Machine Learning , Proteins , Biotechnology , Amino Acid Sequence , Antibodies
2.
Nat Commun ; 15(1): 1639, 2024 Feb 22.
Article in English | MEDLINE | ID: mdl-38388493

ABSTRACT

Recent developments in protein design rely on large neural networks with up to hundreds of millions of parameters, yet it is unclear which residue dependencies are critical for determining protein function. Here, we show that amino acid preferences at individual residues, without accounting for mutation interactions, explain much and sometimes virtually all of the combinatorial mutation effects across 8 datasets (R² ~ 78-98%). Hence, few observations (~100 times the number of mutated residues) enable accurate prediction of held-out variant effects (Pearson r > 0.80). We hypothesized that the local structural contexts around a residue could be sufficient to predict mutation preferences, and developed an unsupervised approach termed CoVES (Combinatorial Variant Effects from Structure). Our results suggest that CoVES not only outperforms model-free methods but also performs similarly to complex models for creating functional and diverse protein variants. CoVES offers an effective alternative to complicated models for identifying functional protein mutations.


Subject(s)
Neural Networks, Computer , Proteins , Proteins/metabolism , Amino Acids/chemistry , Mutation
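
The abstract of entry 2 states that site-wise amino acid preferences, with no interaction terms, explain most of the combinatorial mutation effects. The sketch below illustrates that kind of additive model on invented toy data. It is not CoVES and not the authors' code: the function names, the four toy variants and their fitness values are all hypothetical, and the per-site averaging used here matches a least-squares fit only when every combination of mutations has been measured.

```python
from collections import defaultdict


def fit_site_preferences(variants):
    """Estimate additive per-site preferences from {sequence: fitness}.

    Each (position, amino acid) preference is the mean deviation from the
    grand mean over all variants carrying that residue. No interaction
    terms are modelled (illustrative assumption, not the published method).
    """
    grand_mean = sum(variants.values()) / len(variants)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, fitness in variants.items():
        for pos, aa in enumerate(seq):
            sums[(pos, aa)] += fitness - grand_mean
            counts[(pos, aa)] += 1
    prefs = {key: sums[key] / counts[key] for key in sums}
    return grand_mean, prefs


def predict(seq, grand_mean, prefs):
    """Additive prediction: grand mean plus the preference of each residue."""
    return grand_mean + sum(prefs.get((pos, aa), 0.0) for pos, aa in enumerate(seq))


def r_squared(observed, predicted):
    """Fraction of variance in the observations explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical two-site library with purely additive fitness values.
variants = {"AL": 2.0, "AK": 1.5, "VL": 3.0, "VK": 2.5}
grand_mean, prefs = fit_site_preferences(variants)
predicted = [predict(seq, grand_mean, prefs) for seq in variants]
r2 = r_squared(list(variants.values()), predicted)
print(round(r2, 6))
```

Because the toy fitness values are exactly additive, the site-wise model reproduces them perfectly. With real, sparsely sampled data a regression fit and a held-out test set would be needed to obtain figures comparable to those quoted in the abstract.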
3.
bioRxiv ; 2023 Dec 08.
Article in English | MEDLINE | ID: mdl-38106144

ABSTRACT

Predicting the effects of mutations in proteins is critical to many applications, from understanding genetic disease to designing novel proteins that can address our most pressing challenges in climate, agriculture and healthcare. Despite a surge in machine learning-based protein models to tackle these questions, an assessment of their respective benefits is challenging due to the use of distinct, often contrived, experimental datasets, and the variable performance of models across different protein families. Addressing these challenges requires scale. To that end we introduce ProteinGym, a large-scale and holistic set of benchmarks specifically designed for protein fitness prediction and design. It encompasses both a broad collection of over 250 standardized deep mutational scanning assays, spanning millions of mutated sequences, and curated clinical datasets providing high-quality expert annotations about mutation effects. We devise a robust evaluation framework that combines metrics for both fitness prediction and design, factors in known limitations of the underlying experimental methods, and covers both zero-shot and supervised settings. We report the performance of a diverse set of over 70 high-performing models from various subfields (e.g., alignment-based, inverse folding) in a unified benchmark suite. We open source the corresponding codebase, datasets, MSAs, structures and model predictions, and develop a user-friendly website that facilitates data access and analysis.

4.
Nature ; 622(7984): 818-825, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37821700

ABSTRACT

Effective pandemic preparedness relies on anticipating viral mutations that are able to evade host immune responses to facilitate vaccine and therapeutic design. However, current strategies for viral evolution prediction are not available early in a pandemic: experimental approaches require host polyclonal antibodies to test against, and existing computational methods draw heavily from current strain prevalence to make reliable predictions of variants of concern. To address this, we developed EVEscape, a generalizable modular framework that combines fitness predictions from a deep learning model of historical sequences with biophysical and structural information. EVEscape quantifies the viral escape potential of mutations at scale and has the advantage of being applicable before surveillance sequencing, experimental scans or three-dimensional structures of antibody complexes are available. We demonstrate that EVEscape, trained on sequences available before 2020, is as accurate as high-throughput experimental scans at anticipating pandemic variation for SARS-CoV-2 and is generalizable to other viruses including influenza, HIV and understudied viruses with pandemic potential such as Lassa and Nipah. We provide continually revised escape scores for all current strains of SARS-CoV-2 and predict probable further mutations to forecast emerging strains as a tool for continuing vaccine development (evescape.org).


Subject(s)
Evolution, Molecular , Forecasting , Immune Evasion , Mutation , Pandemics , Viruses , Humans , Drug Design , HIV Infections , Immune Evasion/genetics , Immune Evasion/immunology , Influenza, Human , Lassa virus , Nipah Virus , SARS-CoV-2/genetics , SARS-CoV-2/immunology , Viral Vaccines/immunology , Viruses/genetics , Viruses/immunology
5.
mSystems ; 7(2): e0146621, 2022 04 26.
Article in English | MEDLINE | ID: mdl-35319251

ABSTRACT

Suppression of the host innate immune response is a critical aspect of viral replication. Upon infection, viruses may introduce one or more proteins that inhibit key immune pathways, such as the type I interferon pathway. However, the ability to predict and evaluate viral protein bioactivity on targeted pathways remains challenging and is typically done on a single-virus or -gene basis. Here, we present a medium-throughput high-content cell-based assay to reveal the immunosuppressive effects of viral proteins. To test the predictive power of our approach, we developed a library of 800 genes encoding known, predicted, and uncharacterized human virus genes. We found that previously known immune suppressors from numerous viral families such as Picornaviridae and Flaviviridae recorded positive responses. These include a number of viral proteases for which we further confirmed that innate immune suppression depends on protease activity. A class of predicted inhibitors encoded by Rhabdoviridae viruses was demonstrated to block nuclear transport, and several previously uncharacterized proteins from uncultivated viruses were shown to inhibit nuclear transport of the transcription factors NF-κB and interferon regulatory factor 3 (IRF3). We propose that this pathway-based assay, together with early sequencing, gene synthesis, and viral infection studies, could partly serve as the basis for rapid in vitro characterization of novel viral proteins.

IMPORTANCE Infectious diseases caused by viral pathogens exacerbate health care and economic burdens. Numerous viral biomolecules suppress the human innate immune system, enabling viruses to evade an immune response from the host. Despite our current understanding of viral replication and immune evasion, new viral proteins, including those encoded by uncultivated viruses or emerging viruses, are being unearthed at a rapid pace from large-scale sequencing and surveillance projects. The use of medium- and high-throughput functional assays to characterize immunosuppressive functions of viral proteins can advance our understanding of viral replication and possibly treatment of infections. In this study, we assembled a large viral-gene library from diverse viral families and developed a high-content assay to test for inhibition of innate immunity pathways. Our work expands the tools that can rapidly link sequence and protein function, representing a practical step toward early-stage evaluation of emerging and understudied viruses.


Subject(s)
Immunity, Innate , Viruses , Humans , NF-kappa B , Immune Evasion , Viruses/genetics , Viral Proteins/genetics , Genes, Viral
6.
ACS Synth Biol ; 11(3): 1292-1302, 2022 03 18.
Article in English | MEDLINE | ID: mdl-35176859

ABSTRACT

Many organisms can survive extreme conditions and successfully recover to normal life. This extremotolerant behavior has been attributed in part to repetitive, amphipathic, and intrinsically disordered proteins that are upregulated in the protected state. Here, we assemble a library of approximately 300 naturally occurring and designed extremotolerance-associated proteins to assess their ability to protect human cells from chemically induced apoptosis. We show that several proteins from tardigrades, nematodes, and the Chinese giant salamander are apoptosis-protective. Notably, we identify a region of the human ApoE protein with similarity to extremotolerance-associated proteins that also protects against apoptosis. This region mirrors the phase separation behavior seen with such proteins, like the tardigrade protein CAHS2. Moreover, we identify a synthetic protein, DHR81, that shares this combination of elevated phase separation propensity and apoptosis protection. Finally, we demonstrate that driving protective proteins into the condensate state increases apoptosis protection, and highlight the ability of DHR81 condensates to sequester caspase-7. Taken together, this work draws a link between extremotolerance-associated proteins, condensate formation, and designing human cellular protection.


Subject(s)
Intrinsically Disordered Proteins , Tardigrada , Animals , Apoptosis , Humans , Intrinsically Disordered Proteins/chemistry , Intrinsically Disordered Proteins/metabolism , Tardigrada/metabolism
7.
Sci Rep ; 11(1): 4951, 2021 03 02.
Article in English | MEDLINE | ID: mdl-33654191

ABSTRACT

Encapsulins are recently discovered protein compartments able to specifically encapsulate cargo proteins in vivo. Encapsulation is dependent on C-terminal targeting peptides (TPs). Here, we characterize and engineer TP-shell interactions in the Thermotoga maritima and Myxococcus xanthus encapsulin systems. Using force-field modeling and particle fluorescence measurements we show that TPs vary in native specificity and binding strength, and that TP-shell interactions are determined by hydrophobic and ionic interactions as well as TP flexibility. We design a set of TPs with a variety of predicted binding strengths and experimentally characterize these designs. This yields a set of TPs with novel binding characteristics representing a potentially useful toolbox for future nanoreactor engineering aimed at controlling cargo loading efficiency and the relative stoichiometry of multiple concurrently loaded cargo proteins.


Subject(s)
Bacterial Proteins/chemistry , Models, Molecular , Myxococcus xanthus/chemistry , Nanostructures/chemistry , Peptides/chemistry , Thermotoga maritima/chemistry
8.
Elife ; 9, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32870157

ABSTRACT

Vitamin K epoxide reductase (VKOR) drives the vitamin K cycle, activating vitamin K-dependent blood clotting factors. VKOR is also the target of the widely used anticoagulant drug, warfarin. Despite VKOR's pivotal role in coagulation, its structure and active site remain poorly understood. In addition, VKOR variants can cause vitamin K-dependent clotting factor deficiency or alter warfarin response. Here, we used multiplexed, sequencing-based assays to measure the effects of 2,695 VKOR missense variants on abundance and 697 variants on activity in cultured human cells. The large-scale functional data, along with an evolutionary coupling analysis, support a four transmembrane domain topology, with variants in transmembrane domains exhibiting strongly deleterious effects on abundance and activity. Functionally constrained regions of the protein define the active site, and we find that, of four conserved cysteines putatively critical for function, only three are absolutely required. Finally, 25% of human VKOR missense variants show reduced abundance or activity, possibly conferring warfarin sensitivity or causing disease.


Subject(s)
Catalytic Domain , Genetic Variation , Mutation, Missense , Vitamin K Epoxide Reductases/chemistry , Vitamin K Epoxide Reductases/genetics , Cysteine/chemistry , Drug Resistance , HEK293 Cells , Humans , Metabolism, Inborn Errors , Models, Molecular , Sequence Analysis, DNA , Warfarin/pharmacology
9.
Nat Genet ; 51(7): 1170-1176, 2019 07.
Article in English | MEDLINE | ID: mdl-31209393

ABSTRACT

We describe an experimental method of three-dimensional (3D) structure determination that exploits the increasing ease of high-throughput mutational scans. Inspired by the success of using natural, evolutionary sequence covariation to compute protein and RNA folds, we explored whether 'laboratory', synthetic sequence variation might also yield 3D structures. We analyzed five large-scale mutational scans and discovered that the pairs of residues with the largest positive epistasis in the experiments are sufficient to determine the 3D fold. We show that the strongest epistatic pairings from genetic screens of three proteins, a ribozyme and a protein interaction reveal 3D contacts within and between macromolecules. Using these experimental epistatic pairs, we compute ab initio folds for a GB1 domain (within 1.8 Å of the crystal structure) and a WW domain (2.1 Å). We propose strategies that reduce the number of mutants needed for contact prediction, suggesting that genomics-based techniques can efficiently predict 3D structure.


Subject(s)
Adaptor Proteins, Signal Transducing/chemistry , Bacterial Proteins/chemistry , Epistasis, Genetic , Mutation , Poly(A)-Binding Proteins/chemistry , Protein Conformation , RNA, Catalytic/chemistry , Saccharomyces cerevisiae Proteins/chemistry , Transcription Factors/chemistry , Adaptor Proteins, Signal Transducing/genetics , Bacterial Proteins/genetics , Humans , Poly(A)-Binding Proteins/genetics , Protein Domains , Protein Folding , RNA, Catalytic/genetics , Saccharomyces cerevisiae Proteins/genetics , Transcription Factors/genetics , YAP-Signaling Proteins
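
The abstract of entry 9 reports that the residue pairs with the largest positive epistasis in mutational scans point to 3D contacts. The sketch below shows one common way to score pairwise epistasis from single- and double-mutant measurements. It assumes fitness values on a log scale and an additive null model (epsilon = f_double - f_single1 - f_single2 + f_wildtype); the abstract does not give the authors' exact definition, and the function names, positions and fitness values here are hypothetical.

```python
def pairwise_epistasis(wt_fitness, singles, doubles):
    """Score each position pair by its largest observed epistasis.

    singles: {(position, amino_acid): log-scale fitness}
    doubles: {((pos1, aa1), (pos2, aa2)): log-scale fitness}
    Epistasis is the deviation of the double mutant from the additive
    expectation built from the two single mutants (assumed null model).
    """
    scores = {}
    for (mut1, mut2), f_double in doubles.items():
        if mut1 not in singles or mut2 not in singles:
            continue  # cannot compute epistasis without both single mutants
        eps = f_double - singles[mut1] - singles[mut2] + wt_fitness
        pair = tuple(sorted((mut1[0], mut2[0])))
        scores[pair] = max(scores.get(pair, float("-inf")), eps)
    return scores


def top_pairs(scores, k):
    """Return the k position pairs with the largest positive-leaning epistasis."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical measurements (log-scale fitness, wild type set to 0).
wt = 0.0
singles = {(3, "A"): -1.0, (10, "G"): -1.5, (7, "L"): -0.5}
doubles = {
    ((3, "A"), (10, "G")): -0.5,   # much fitter than the additive expectation
    ((3, "A"), (7, "L")): -1.5,    # exactly additive
    ((7, "L"), (10, "G")): -2.5,   # worse than additive
}
scores = pairwise_epistasis(wt, singles, doubles)
candidate_contacts = top_pairs(scores, 1)
print(candidate_contacts)
```

In this toy example the pair (3, 10) has the strongest positive epistasis and would be the first candidate contact. Turning ranked pairs into a 3D fold requires a separate structure-calculation step that is outside the scope of this sketch.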
10.
Appl Environ Microbiol ; 84(10), 2018 05 15.
Article in English | MEDLINE | ID: mdl-29549102

ABSTRACT

Medium-chain fatty acids are commodity chemicals. Increasing and modifying the activity of thioesterases (TEs) on medium-chain fatty acyl-acyl carrier protein (acyl-ACP) esters may enable a high-yield microbial production of these molecules. The plant Cuphea palustris harbors two distinct TEs: C. palustris FatB1 (CpFatB1) (C8 specificity, lower activity) and CpFatB2 (C14 specificity, higher activity) with 78% sequence identity. We combined structural features from these two enzymes to create several chimeric TEs, some of which showed nonnatural fatty acid production as measured by an enzymatic assay and gas chromatography-mass spectrometry (GC-MS). Notably, chimera 4 exhibited an increased C8 fatty acid production in correlation with improved microbial expression. This chimera led us to identify CpFatB2-specific amino acids between positions 219 and 272 that lead to higher protein levels. Chimera 7 produced a broad range of fatty acids and appeared to combine a fatty acid binding pocket with long-chain specificity and an ACP interaction site that may activate fatty acid extrusion. Using homology modeling and in silico docking with ACP, we identified a "positive patch" within amino acids 162 to 218, which may direct the ACP interaction and regulate access to short-chain fatty acids. On the basis of this modeling, we transplanted putative ACP interaction sequences from CpFatB1 into CpFatB2 and created a chimeric thioesterase that produced medium-chain as well as long-chain fatty acids. Thus, the engineering of chimeric enzymes and characterizing their microbial activity and chain-length specificity suggested mechanistic insights into TE functions and also generated thioesterases with potentially useful properties. These observations may inform a rational engineering of TEs to allow alkyl chain length control.

IMPORTANCE Medium-chain fatty acids are important commodity chemicals. These molecules are used as plastic precursors and in shampoos and other detergents and could be used as biofuel precursors if production economics were favorable. Hydrocarbon-based liquid fuels must be optimized to have a desired boiling point, low freezing point, low viscosity, and other physical characteristics. Similarly, the solubility and harshness of detergents and the flexibility of plastic polymers can be modulated. The length and distribution of the carbon chains in the hydrophobic tails determine these properties. The biological synthesis of cell membranes and fatty acids produces chains of primarily 16 to 18 carbons, which give rise to current biofuels. The ultimate goal of the work presented here is to engineer metabolic pathways to produce designer molecules with the correct number of carbons in a chain, so that such molecules could be used directly as specialty commodity chemicals or as fuels after minimal processing.


Subject(s)
Cuphea/enzymology , Fatty Acids/metabolism , Plant Proteins/chemistry , Thiolester Hydrolases/chemistry , Thiolester Hydrolases/genetics , Cuphea/genetics , Fatty Acids/chemistry , Gas Chromatography-Mass Spectrometry , Plant Proteins/genetics , Plant Proteins/metabolism , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/metabolism , Substrate Specificity , Thiolester Hydrolases/metabolism
...