1.
Nature ; 623(7989): 1070-1078, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37968394

ABSTRACT

Three billion years of evolution has produced a tremendous diversity of protein molecules, but the full potential of proteins is likely to be much greater. Accessing this potential has been challenging for both computation and experiments because the space of possible protein molecules is much larger than the space of those likely to have functions. Here we introduce Chroma, a generative model for proteins and protein complexes that can directly sample novel protein structures and sequences, and that can be conditioned to steer the generative process towards desired properties and functions. To enable this, we introduce a diffusion process that respects the conformational statistics of polymer ensembles, an efficient neural architecture for molecular systems that enables long-range reasoning with sub-quadratic scaling, layers for efficiently synthesizing three-dimensional structures of proteins from predicted inter-residue geometries and a general low-temperature sampling algorithm for diffusion models. Chroma achieves protein design as Bayesian inference under external constraints, which can involve symmetries, substructure, shape, semantics and even natural-language prompts. The experimental characterization of 310 proteins shows that sampling from Chroma results in proteins that are highly expressed, fold and have favourable biophysical properties. The crystal structures of two designed proteins exhibit atomistic agreement with Chroma samples (a backbone root-mean-square deviation of around 1.0 Å). With this unified approach to protein design, we hope to accelerate the programming of protein matter to benefit human health, materials science and synthetic biology.


Subjects
Algorithms, Computer Simulation, Protein Conformation, Proteins, Humans, Bayes Theorem, Directed Molecular Evolution, Machine Learning, Molecular Models, Protein Folding, Proteins/chemistry, Proteins/metabolism, Semantics, Synthetic Biology/methods, Synthetic Biology/trends
2.
Annu Rev Biochem ; 80: 211-37, 2011.
Article in English | MEDLINE | ID: mdl-21548783

ABSTRACT

Signal transduction across biological membranes is central to life. This process generally happens through communication between different domains and hierarchical coupling of information. Here, we review structural and thermodynamic principles behind transmembrane (TM) signal transduction and discuss common themes. Communication between signaling domains can be understood in terms of thermodynamic and kinetic principles, and complex signaling patterns can arise from simple wiring of thermodynamically coupled domains. We relate this to functions of several signal transduction systems: the M2 proton channel from influenza A virus, potassium channels, integrin receptors, and bacterial kinases. We also discuss key features in the structural rearrangements responsible for signal transduction in these systems.


Subjects
Cell Communication/physiology, Cell Membrane/physiology, Integrins, Potassium Channels, Viral Matrix Proteins, Integrins/chemistry, Integrins/metabolism, Ions/chemistry, Ions/metabolism, Ligands, Molecular Models, Potassium Channels/chemistry, Potassium Channels/metabolism, Protein Conformation, Signal Transduction/physiology, Thermodynamics, Viral Matrix Proteins/chemistry, Viral Matrix Proteins/metabolism
3.
Proc Natl Acad Sci U S A ; 120(23): e2215195120, 2023 06 06.
Article in English | MEDLINE | ID: mdl-37253004

ABSTRACT

The gaseous hormone ethylene is perceived in plants by membrane-bound receptors, the best studied of these being ETR1 from Arabidopsis. Ethylene receptors can mediate a response to ethylene concentrations at less than one part per billion; however, the mechanistic basis for such high-affinity ligand binding has remained elusive. Here we identify an Asp residue within the ETR1 transmembrane domain that plays a critical role in ethylene binding. Site-directed mutation of the Asp to Asn results in a functional receptor that has a reduced affinity for ethylene, but still mediates ethylene responses in planta. The Asp residue is highly conserved among ethylene receptor-like proteins in plants and bacteria, but Asn variants exist, pointing to the physiological relevance of modulating ethylene-binding kinetics. Our results also support a bifunctional role for the Asp residue in forming a polar bridge to a conserved Lys residue in the receptor to mediate changes in signaling output. We propose a new structural model for the mechanism of ethylene binding and signal transduction, one with similarities to that found in a mammalian olfactory receptor.


Subjects
Arabidopsis Proteins, Arabidopsis, Arabidopsis/genetics, Arabidopsis/metabolism, Arabidopsis Proteins/metabolism, Cell Surface Receptors/metabolism, Ethylenes/metabolism, Signal Transduction/physiology
4.
Proc Natl Acad Sci U S A ; 117(2): 1059-1068, 2020 01 14.
Article in English | MEDLINE | ID: mdl-31892539

ABSTRACT

Current state-of-the-art approaches to computational protein design (CPD) aim to capture the determinants of structure from physical principles. While this has led to many successful designs, it does have strong limitations associated with inaccuracies in physical modeling, such that a reliable general solution to CPD has yet to be found. Here, we propose a design framework, one based on identifying and applying patterns of sequence-structure compatibility found in known proteins, rather than approximating them from models of interatomic interactions. We carry out extensive computational analyses and an experimental validation for our method. Our results strongly argue that the Protein Data Bank is now sufficiently large to enable proteins to be designed by using only examples of structural motifs from unrelated proteins. Because our method is likely to have orthogonal strengths relative to existing techniques, it could represent an important step toward removing remaining barriers to robust CPD.


Subjects
Amino Acid Motifs, Computational Biology/methods, Protein Engineering/methods, Protein Tertiary Structure, Proteins/chemistry, Amino Acid Substitution, Computer-Aided Design, Protein Databases, Molecular Models
5.
Int J Audiol ; 62(5): 383-392, 2023 05.
Article in English | MEDLINE | ID: mdl-35521916

ABSTRACT

OBJECTIVE: This study's objective was to determine whether gap detection deficits are present in a longstanding cohort of people living with HIV (PLWH) compared to those living without HIV (PLWOH) using a new gap detection modelling technique (i.e. fitting gap responses using the Hill equation and analysing the resulting individual gap detection curves with non-linear statistics). This approach provides a measure of both gap threshold and the steepness of the gap length/correct detection relationship. DESIGN: The relationship between gap length and the correct identification rate was modelled using the Hill equation. Results were analysed using a nonlinear mixed-effect regression model. STUDY SAMPLE: 45 PLWH (age range 41-78) and 39 PLWOH (age range 38-79) were enrolled and completed gap detection testing. RESULTS: The likelihood ratio statistic comparing the full regression model with the HIV effects to the null model, assuming one population curve for both groups, was highly significant (p < 0.001), suggesting a less precise relationship between gap length and correct detection in PLWH. CONCLUSIONS: PLWH showed degraded gap detection ability compared to PLWOH, likely due to central nervous system effects of HIV infection or treatment. The Hill equation provided a new approach for modelling gap detection ability.
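
As a minimal illustrative sketch (not the authors' code), a Hill equation relating gap length to the correct detection rate could be written as follows. The function name `hill`, the parameter names `threshold`, `slope` and `top`, and the example values are assumptions made for illustration; the abstract does not state the exact parameterization that was fitted.

```python
def hill(gap_ms, threshold, slope, top=1.0):
    """Hill function: detection rate as a function of gap length.

    threshold: gap length at which the rate reaches half of `top`
               (assumed interpretation of the "gap threshold").
    slope:     Hill coefficient; larger values give a steeper curve.
    top:       maximum detection rate (assumed to be 1.0 by default).
    """
    if gap_ms <= 0:
        return 0.0
    return top * gap_ms**slope / (threshold**slope + gap_ms**slope)


# Hypothetical values: a 5 ms threshold with two different slopes.
shallow = hill(10, threshold=5, slope=2)
steep = hill(10, threshold=5, slope=4)
```

With this form, the two quantities mentioned in the abstract map onto two parameters: `threshold` shifts the curve along the gap-length axis, and `slope` controls how sharply detection improves around it.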


Subjects
HIV Infections, Humans, Adult, Middle Aged, Aged, HIV Infections/epidemiology, Nonlinear Dynamics, Surveys and Questionnaires
6.
Proc Natl Acad Sci U S A ; 113(47): E7438-E7447, 2016 11 22.
Article in English | MEDLINE | ID: mdl-27810958

ABSTRACT

Here, we systematically decompose the known protein structural universe into its basic elements, which we dub tertiary structural motifs (TERMs). A TERM is a compact backbone fragment that captures the secondary, tertiary, and quaternary environments around a given residue, comprising one or more disjoint segments (three on average). We seek the set of universal TERMs that capture all structure in the Protein Data Bank (PDB), finding remarkable degeneracy. Only ∼600 TERMs are sufficient to describe 50% of the PDB at sub-Angstrom resolution. However, rarer geometries also exist, and the overall structural coverage grows logarithmically with the number of TERMs. We go on to show that universal TERMs provide an effective mapping between sequence and structure. We demonstrate that TERM-based statistics alone are sufficient to recapitulate close-to-native sequences given either NMR or X-ray backbones. Furthermore, sequence variability predicted from TERM data agrees closely with evolutionary variation. Finally, locations of TERMs in protein chains can be predicted from sequence alone based on sequence signatures emergent from TERM instances in the PDB. For multisegment motifs, this method identifies spatially adjacent fragments that are not contiguous in sequence, a major bottleneck in structure prediction. Although all TERMs recur in diverse proteins, some appear specialized for certain functions, such as interface formation, metal coordination, or even water binding. Structural biology has benefited greatly from previously observed degeneracies in structure. The decomposition of the known structural universe into a finite set of compact TERMs offers exciting opportunities toward better understanding, design, and prediction of protein structure.


Subjects
Proteins/chemistry, Proteins/genetics, Algorithms, Amino Acid Sequence, Computational Biology/methods, Protein Databases, Magnetic Resonance Spectroscopy, Molecular Models, Protein Conformation
7.
Biophys J ; 110(11): 2507-2516, 2016 06 07.
Article in English | MEDLINE | ID: mdl-27276268

ABSTRACT

We present a strategy for designed self-assembly of peptides into two-dimensional monolayer crystals on the surface of graphene and graphite. As predicted by computation, designed peptides assemble on the surface of graphene to form very long, parallel, in-register β-sheets, which we call β-tapes. Peptides extend perpendicularly to the long axis of each β-tape, defining its width, with hydrogen bonds running along the axis. Tapes align on the surface to create highly regular microdomains containing 4-nm pitch striations. Moreover, in agreement with calculations, the atomic structure of the underlying graphene dictates the arrangement of the β-tapes, as they orient along one of six directions defined by graphene's sixfold symmetry. A cationic-assembled peptide surface is shown here to strongly adhere to DNA, preferentially orienting the double helix along β-tape axes. This orientational preference is well anticipated from calculations, given the underlying peptide layer structure. These studies illustrate how designed peptides can amplify the Ångstrom-level atomic symmetry of a surface onto the micrometer scale, further imparting long-range directional order onto the next level of assembly. The remarkably stable nature of these assemblies under various environmental conditions suggests applications in enzymelike catalysis, biological interfaces for cellular recognition, and two-dimensional platforms for studying DNA-peptide interactions.


Subjects
Graphite/chemistry, Molecular Dynamics Simulation, Peptides/metabolism, Protein Multimerization, Cations/metabolism, DNA/metabolism, Endopeptidase K/metabolism, Kinetics, Atomic Force Microscopy, Protein Binding, Protein Stability, Protein Secondary Structure, Static Electricity, Water/chemistry
8.
Am J Physiol Gastrointest Liver Physiol ; 310(8): G586-98, 2016 04 15.
Article in English | MEDLINE | ID: mdl-26867566

ABSTRACT

The Na(+)/H(+) exchanger regulatory factor (NHERF) family of proteins are scaffolds that orchestrate interaction of receptors and cellular proteins. Previous studies have shown that NHERF1 functions as a tumor suppressor. The goal of this study is to determine whether the loss of NHERF2 alters colorectal cancer (CRC) progression. We found that NHERF2 expression is elevated in advanced-stage CRC. Knockdown of NHERF2 decreased cancer cell proliferation in vitro and in a mouse xenograft tumor model. In addition, deletion of NHERF2 in Apc(Min/+) mice resulted in decreased tumor growth and increased lifespan. Blocking NHERF2 interaction with a small peptide designed to bind the second PDZ domain of NHERF2 attenuated cancer cell proliferation. Although NHERF2 is known to facilitate the effects of lysophosphatidic acid receptor 2 (LPA2), transcriptome analysis of xenograft tumors revealed that NHERF2-dependent genes largely differ from LPA2-regulated genes. Activation of β-catenin and ERK1/2 was mitigated in Apc(Min/+);Nherf2(-/-) adenomas. Moreover, Stat3 phosphorylation and CD24 expression levels were suppressed in Apc(Min/+);Nherf2(-/-) adenomas. Consistently, NHERF2 knockdown attenuated Stat3 activation and CD24 expression in colon cancer cells. Interestingly, CD24 was important in the maintenance of Stat3 phosphorylation, whereas NHERF2-dependent increase in CD24 expression was blocked by inhibition of Stat3, suggesting that NHERF2 regulates Stat3 phosphorylation through a positive feedback mechanism between Stat3 and CD24. In summary, this study identifies NHERF2 as a novel oncogenic protein and a potential target for cancer treatment. NHERF2 potentiates the oncogenic effects in part by regulation of Stat3 and CD24.


Subjects
Adenoma/metabolism, CD24 Antigen/metabolism, Colonic Neoplasms/metabolism, Gene Deletion, Phosphoproteins/metabolism, STAT3 Transcription Factor/metabolism, Sodium-Hydrogen Exchangers/metabolism, Adenoma/genetics, Adenoma/pathology, Adenomatous Polyposis Coli Protein/genetics, Animals, CD24 Antigen/genetics, Cell Proliferation, Colonic Neoplasms/genetics, Colonic Neoplasms/pathology, Physiological Feedback, Female, HCT116 Cells, HT29 Cells, Humans, Mice, Nude Mice, Mitogen-Activated Protein Kinase 1/metabolism, Mitogen-Activated Protein Kinase 3/metabolism, Phosphoproteins/genetics, STAT3 Transcription Factor/genetics, Sodium-Hydrogen Exchangers/genetics, Transcriptome
9.
Nature ; 458(7240): 859-64, 2009 Apr 16.
Article in English | MEDLINE | ID: mdl-19370028

ABSTRACT

Interaction specificity is a required feature of biological networks and a necessary characteristic of protein or small-molecule reagents and therapeutics. The ability to alter or inhibit protein interactions selectively would advance basic and applied molecular science. Assessing or modelling interaction specificity requires treating multiple competing complexes, which presents computational and experimental challenges. Here we present a computational framework for designing protein-interaction specificity and use it to identify specific peptide partners for human basic-region leucine zipper (bZIP) transcription factors. Protein microarrays were used to characterize designed, synthetic ligands for all but one of 20 bZIP families. The bZIP proteins share strong sequence and structural similarities and thus are challenging targets to bind specifically. Nevertheless, many of the designs, including examples that bind the oncoproteins c-Jun, c-Fos and c-Maf (also called JUN, FOS and MAF, respectively), were selective for their targets over all 19 other families. Collectively, the designs exhibit a wide range of interaction profiles and demonstrate that human bZIPs have only sparsely sampled the possible interaction space accessible to them. Our computational method provides a way to systematically analyse trade-offs between stability and specificity and is suitable for use with many types of structure-scoring functions; thus, it may prove broadly useful as a tool for protein design.


Subjects
Basic-Leucine Zipper Transcription Factors/chemistry, Basic-Leucine Zipper Transcription Factors/metabolism, Computational Biology/methods, Protein Engineering/methods, Amino Acid Motifs, Basic-Leucine Zipper Transcription Factors/classification, Drug Design, Humans, Leucine Zippers, Protein Array Analysis, Protein Binding, Reproducibility of Results, Substrate Specificity
10.
Proc Natl Acad Sci U S A ; 108(52): 20992-7, 2011 Dec 27.
Article in English | MEDLINE | ID: mdl-22178759

ABSTRACT

During cell entry, enveloped viruses fuse their viral membrane with a cellular membrane in a process driven by energetically favorable, large-scale conformational rearrangements of their fusion proteins. Structures of the pre- and postfusion states of the fusion proteins including paramyxovirus PIV5 F and influenza virus hemagglutinin suggest that this occurs via two intermediates. Following formation of an initial complex, the proteins structurally elongate, driving a hydrophobic N-terminal "fusion peptide" away from the protein surface into the target membrane. Paradoxically, this first conformation change moves the viral and cellular bilayers further apart. Next, the fusion proteins form a hairpin that drives the two membranes into close opposition. While the pre- and postfusion hairpin forms have been characterized crystallographically, the transiently extended prehairpin intermediate has not been visualized. To provide evidence for this extended intermediate we measured the interbilayer spacing of a paramyxovirus trapped in the process of fusing with solid-supported bilayers. A gold-labeled peptide that binds the prehairpin intermediate was used to stabilize and specifically image F-proteins in the prehairpin intermediate. The interbilayer spacing is precisely that predicted from a computational model of the prehairpin, providing strong evidence for its structure and functional role. Moreover, the F-proteins in the prehairpin conformation preferentially localize to a patch between the target and viral membranes, consistent with the fact that the formation of the prehairpin is triggered by local contacts between F- and neighboring viral receptor-binding proteins (HN) only when HN binds lipids in its target membrane.


Subjects
Biological Models, Paramyxoviridae/metabolism, Protein Conformation, Viral Fusion Proteins/metabolism, Virus Attachment, Cell Membrane/metabolism, High Pressure Liquid Chromatography, Immunohistochemistry, Transmission Electron Microscopy, Protein Folding, Ultracentrifugation
11.
Protein Sci ; 33(2): e4853, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38078680

ABSTRACT

Comparing accuracies of structural protein-protein interaction (PPI) models for different complexes on an absolute scale is a challenge, requiring normalization of scores across structures of different sizes and shapes. To help address this challenge, we have developed a statistical significance metric for docking models, called random-docking (RD) p-value. This score evaluates a PPI model based on how likely a random docking process is to produce a model of better or equal accuracy. The binding partners are randomly docked against each other a large number of times, and the probability of sampling a model of equal or greater accuracy from this reference distribution is the RD p-value. Using a subset of top predicted models from CAPRI (Critical Assessment of PRediction of Interactions) rounds over 2017-2020, we find that the ease of achieving a given root mean squared deviation or DOCKQ score varies considerably by target; achieving the same relative metric can be thousands of times easier for one complex compared to another. In contrast, RD p-values inherently normalize scores for models of different complexes, making them globally comparable. Furthermore, one can calculate RD p-values after generating a reference distribution that accounts for prior information about the interface geometry, such as residues involved in binding, by giving the random-docking process access to the same information. Thus, one can decouple improvements in prediction accuracy that arise solely from basic modeling constraints from those due to the rest of the method. We provide efficient code for computing RD p-values at https://github.com/Grigoryanlab/RDP.
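
The definition above (the probability that a randomly docked model is at least as accurate as the model being evaluated) can be sketched as an empirical fraction over a reference distribution. This is only an illustration of the idea, not the code in the authors' repository; the function name `rd_p_value`, its arguments, and the example scores are hypothetical.

```python
def rd_p_value(model_score, random_scores, higher_is_better=True):
    """Empirical fraction of random-docking scores at least as good as the model's.

    model_score:      accuracy score of the model being evaluated.
    random_scores:    scores of randomly docked models (the reference distribution).
    higher_is_better: True for a score where larger means more accurate,
                      False for an RMSD-like score where smaller is better.
    """
    if not random_scores:
        raise ValueError("reference distribution is empty")
    if higher_is_better:
        hits = sum(1 for s in random_scores if s >= model_score)
    else:
        hits = sum(1 for s in random_scores if s <= model_score)
    return hits / len(random_scores)


# Hypothetical RMSD values (in Å) from four random docking runs.
p = rd_p_value(2.0, [1.0, 3.0, 4.0, 5.0], higher_is_better=False)
```

Because the result is a probability rather than a raw score, values computed for different complexes sit on the same scale, which is the normalization property the abstract describes.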


Subjects
Protein Interaction Mapping, Proteins, Proteins/chemistry, Protein Interaction Mapping/methods, Molecular Docking Simulation, Protein Conformation, Protein Binding, Software, Algorithms, Computational Biology/methods, Binding Sites
12.
Protein Sci ; 33(9): e5127, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39167052

ABSTRACT

The ability to accurately predict antibody-antigen complex structures from their sequences could greatly advance our understanding of the immune system and would aid in the development of novel antibody therapeutics. There have been considerable recent advancements in predicting protein-protein interactions (PPIs) fueled by progress in machine learning (ML). To understand the current state of the field, we compare six representative methods for predicting antibody-antigen complexes from sequence, including two deep learning approaches trained to predict PPIs in general (AlphaFold-Multimer and RoseTTAFold), two composite methods that initially predict antibody and antigen structures separately and dock them (using antibody-mode ClusPro), local refinement in Rosetta (SnugDock) of globally docked poses from ClusPro, and a pipeline combining homology modeling with rigid-body docking informed by ML-based epitope and paratope prediction (AbAdapt). We find that AlphaFold-Multimer outperformed other methods, although the absolute performance leaves considerable room for improvement. AlphaFold-Multimer models of lower quality display significant structural biases at the level of tertiary motifs (TERMs) toward having fewer structural matches in non-antibody-containing structures from the Protein Data Bank (PDB). Specifically, better models exhibit more common PDB-like TERMs at the antibody-antigen interface than worse ones. Importantly, the clear relationship between performance and the commonness of interfacial TERMs suggests that the scarcity of interfacial geometry data in the structural database may currently limit the application of ML to the prediction of antibody-antigen interactions.


Subjects
Antigen-Antibody Complex, Antigen-Antibody Complex/chemistry, Protein Conformation, Antibodies/chemistry, Antibodies/immunology, Molecular Docking Simulation, Molecular Models, Humans
13.
J Comput Chem ; 34(31): 2726-41, 2013 Dec 05.
Article in English | MEDLINE | ID: mdl-24132787

ABSTRACT

Computing the absolute free energy of a macromolecule's structural state, F, is a challenging problem of high relevance. This study presents a method that computes F using only information from an unperturbed simulation of the macromolecule in the relevant conformational state, ensemble, and environment. Absolute free energies produced by this method, dubbed Valuation of Local Configuration Integral with Dynamics (VALOCIDY), enable comparison of alternative states. For example, comparing explicitly solvated and vaporous states of amino acid side-chain analogs produces solvation free energies in good agreement with experiments. Also, comparisons between alternative conformational states of model heptapeptides (including the unfolded state) produce free energy differences in agreement with data from µs molecular-dynamics simulations and experimental propensities. The potential of using VALOCIDY in computational protein design is explored via a small design problem of stabilizing a β-turn structure. When VALOCIDY-based estimation of folding free energy is used as the design metric, the resulting sequence folds into the desired structure within the atomistic force field used in design. The VALOCIDY-based approach also recognizes the distinct status of the native sequence regardless of minor details of the starting template structure, in stark contrast with a traditional fixed-backbone approach.


Subjects
Amino Acids/chemistry, Peptides/chemistry, Molecular Dynamics Simulation, Protein Secondary Structure, Thermodynamics
14.
Protein Sci ; 32(4): e4607, 2023 04.
Article in English | MEDLINE | ID: mdl-36823715

ABSTRACT

We propose a high-throughput method for quantitatively measuring hundreds of protein-peptide binding affinities in parallel. In this assay, a solution of protein is dialyzed into a buffer containing a pool of potential binding peptides, such that upon equilibration the relative abundance of a peptide species is mathematically related to that peptide's dissociation constant, Kd. We use isobaric multiplexed quantitative proteomics to simultaneously determine the relative abundance, and hence the Kd and its associated error, for an entire peptide library. We apply this technique, which we call PEDAL (parallel equilibrium dialysis for affinity learning), to determine accurate Kd values between a PDZ domain and hundreds of peptides, spanning an affinity range of multiple orders of magnitude in a single experiment. PEDAL is a convenient, fast, and low-cost method for measuring large numbers of protein-peptide affinities in parallel, providing a rare combination of true in-solution binding equilibria with the ability to multiplex.


Subjects
Peptides, Renal Dialysis, Peptides/metabolism, Proteins, Mass Spectrometry, Peptide Library
15.
Protein Sci ; 32(2): e4554, 2023 02.
Article in English | MEDLINE | ID: mdl-36564857

ABSTRACT

Designing novel proteins to perform desired functions, such as binding or catalysis, is a major goal in synthetic biology. A variety of computational approaches can aid in this task. An energy-based framework rooted in the sequence-structure statistics of tertiary motifs (TERMs) can be used for sequence design on predefined backbones. Neural network models that use backbone coordinate-derived features provide another way to design new proteins. In this work, we combine the two methods to make neural structure-based models more suitable for protein design. Specifically, we supplement backbone-coordinate features with TERM-derived data, as inputs, and we generate energy functions as outputs. We present two architectures that generate Potts models over the sequence space: TERMinator, which uses both TERM-based and coordinate-based information, and COORDinator, which uses only coordinate-based information. Using these two models, we demonstrate that TERMs can be utilized to improve native sequence recovery performance of neural models. Furthermore, we demonstrate that sequences designed by TERMinator are predicted to fold to their target structures by AlphaFold. Finally, we show that both TERMinator and COORDinator learn notions of energetics, and these methods can be fine-tuned on experimental data to improve predictions. Our results suggest that using TERM-based and coordinate-based features together may be beneficial for protein design and that structure-based neural models that produce Potts energy tables have utility for flexible applications in protein science.
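
For readers unfamiliar with Potts models over sequence space, the sketch below shows how such an energy table scores a sequence: a sum of per-position self energies plus pairwise coupling energies. It illustrates the general Potts form only and is not the TERMinator or COORDinator implementation; the function name `potts_energy`, the table layout, and the numeric values are assumptions made for illustration.

```python
def potts_energy(seq, self_e, pair_e):
    """Energy of `seq` under a Potts model.

    self_e[i][a]:           self energy of amino acid `a` at position `i`.
    pair_e[(i, j)][(a, b)]: coupling energy for `a` at `i` and `b` at `j`, i < j.
    Couplings missing from a table are treated as zero.
    """
    total = sum(self_e[i][a] for i, a in enumerate(seq))
    for (i, j), table in pair_e.items():
        total += table.get((seq[i], seq[j]), 0.0)
    return total


# Hypothetical two-position model over the alphabet {A, G}.
self_e = [{"A": -1.0, "G": 0.5}, {"A": 0.0, "G": -0.5}]
pair_e = {(0, 1): {("A", "G"): -0.25}}
energy_ag = potts_energy("AG", self_e, pair_e)
energy_ga = potts_energy("GA", self_e, pair_e)
```

Because the energy is a simple table lookup, any sequence can be scored or optimized without rerunning the network, which is one reason an energy-table output is convenient for flexible downstream use.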


Subjects
Neural Networks (Computer), Proteins, Amino Acid Sequence, Proteins/chemistry
16.
J Comput Chem ; 33(20): 1645-61, 2012 Jul 30.
Article in English | MEDLINE | ID: mdl-22565567

ABSTRACT

We present the Molecular Software Library (MSL), a C++ library for molecular modeling. MSL is a set of tools that supports a large variety of algorithms for the design, modeling, and analysis of macromolecules. Among the main features supported by the library are methods for applying geometric transformations and alignments, the implementation of a rich set of energy functions, side chain optimization, backbone manipulation, calculation of solvent accessible surface area, and other tools. MSL has a number of unique features, such as the ability to store alternative atomic coordinates (for modeling) and multiple amino acid identities at the same backbone position (for design). It has a straightforward mechanism for extending its energy functions and can work with any type of molecule. Although the code base is large, MSL was created with ease of development in mind. It allows the rapid implementation of simple tasks while fully supporting the creation of complex applications. Some of the capabilities of the software are demonstrated here with examples that show how to program complex and essential modeling tasks with few lines of code. MSL is an ongoing and evolving project, with new features and improvements being introduced regularly, but it is mature and suitable for production and has been used in numerous protein modeling and design projects. MSL is open-source software, freely downloadable at http://msl-libraries.org. We propose it as a common platform for the development of new molecular algorithms and to promote the distribution, sharing, and reutilization of computational methods.


Subjects
Computational Biology/methods, Proteins/chemistry, Software, Algorithms, Protein Databases, Molecular Models, Protein Conformation, Thermodynamics
17.
Protein Sci ; 31(4): 900-917, 2022 04.
Article in English | MEDLINE | ID: mdl-35060221

ABSTRACT

Relating a protein's sequence to its conformation is a central challenge for both structure prediction and sequence design. Statistical contact potentials, as well as their more descriptive versions that account for side-chain orientation and other geometric descriptors, have served as simplistic but useful means of representing second-order contributions in sequence-structure relationships. Here we ask what happens when a pairwise potential is conditioned on the fully defined geometry of interacting backbone fragments. We show that the resulting structure-conditioned coupling energies more accurately reflect pair preferences as a function of structural contexts. These structure-conditioned energies more reliably encode native sequence information and more highly correlate with experimentally determined coupling energies. Clustering a database of interaction motifs by structure results in ensembles of similar energies and clustering them by energy results in ensembles of similar structures. By comparing many pairs of interaction motifs and showing that structural similarity and energetic similarity go hand-in-hand, we provide a tangible link between modular sequence and structure elements. This link is applicable to structural modeling, and we show that scoring CASP models with structure-conditioned energies results in substantially higher correlation with structural quality than scoring the same models with a contact potential. We conclude that structure-conditioned coupling energies are a good way to model the impact of interaction geometry on second-order sequence preferences.
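
For context, the plain statistical contact potential that serves as the baseline here is commonly built with an inverse-Boltzmann form, e(a, b) = -ln(observed / expected). The sketch below illustrates that generic baseline only; it is not the structure-conditioned method of this paper, and the function name `contact_potential`, the random-pairing reference state, and the toy contact list are assumptions made for illustration.

```python
import math
from collections import Counter


def contact_potential(contacts):
    """Return {pair: -ln(observed / expected)} for each observed residue pair.

    contacts: list of (a, b) residue-type pairs observed in contact.
    Expected frequencies assume residues pair at random, given their
    overall frequencies among contacting residues (assumed reference state).
    """
    n = len(contacts)
    pair_counts = Counter(tuple(sorted(c)) for c in contacts)
    res_counts = Counter(r for c in contacts for r in c)
    energies = {}
    for (a, b), count in pair_counts.items():
        f_a = res_counts[a] / (2 * n)
        f_b = res_counts[b] / (2 * n)
        # An unordered pair of two different types can arise in two orders.
        expected = f_a * f_b * (1 if a == b else 2)
        energies[(a, b)] = -math.log((count / n) / expected)
    return energies


# Toy data: types "A" and "B" only ever contact each other.
toy = contact_potential([("A", "B")] * 4)
```

A negative energy marks a pair seen more often than random pairing would predict. Conditioning on backbone geometry, as the abstract describes, would amount to collecting such statistics separately for each structural context rather than pooling all contacts.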


Subjects
Amino Acids, Amino Acids/chemistry, Molecular Models, Protein Conformation
18.
Protein Sci ; 31(6): e4322, 2022 06.
Article in English | MEDLINE | ID: mdl-35634780

ABSTRACT

Despite advances in protein engineering, the de novo design of small proteins or peptides that bind to a desired target remains a difficult task. Most computational methods search for binder structures in a library of candidate scaffolds, which can lead to designs with poor target complementarity and low success rates. Instead of choosing from pre-defined scaffolds, we propose that custom peptide structures can be constructed to complement a target surface. Our method mines tertiary motifs (TERMs) from known structures to identify surface-complementing fragments or "seeds." We combine seeds that satisfy geometric overlap criteria to generate peptide backbones and score the backbones to identify the most likely binding structures. We found that TERM-based seeds can describe known binding structures with high resolution: the vast majority of peptide binders from 486 peptide-protein complexes can be covered by seeds generated from single-chain structures. Furthermore, we demonstrate that known peptide structures can be reconstructed with high accuracy from peptide-covering seeds. As a proof of concept, we used our method to design 100 peptide binders of TRAF6, seven of which were predicted by Rosetta to form higher-quality interfaces than a native binder. The designed peptides interact with distinct sites on TRAF6, including the native peptide-binding site. These results demonstrate that known peptide-binding structures can be constructed from TERMs in single-chain structures and suggest that TERM information can be applied to efficiently design novel target-complementing binders.


Subjects
Peptides; TNF Receptor-Associated Factor 6; Binding Sites; Peptides/chemistry; Protein Binding; Protein Engineering; TNF Receptor-Associated Factor 6/metabolism
19.
Front Immunol ; 13: 1016179, 2022.
Article in English | MEDLINE | ID: mdl-36569945

ABSTRACT

The optimal use of many biotherapeutics is restricted by anti-drug antibodies (ADAs) and hypersensitivity responses, which can affect potency and the ability to administer a treatment. Here we demonstrate that re-surfacing can be utilized as a generalizable approach to engineer proteins with extensive surface residue modifications in order to avoid binding by pre-existing ADAs. This technique was applied to E. coli asparaginase (ASN) to produce functional mutants with up to 58 substitutions, resulting in direct modification of 35% of surface residues. Re-surfaced ASNs exhibited significantly reduced binding to murine, rabbit and human polyclonal ADAs, with a negative correlation observed between binding and mutational distance from the native protein. Reductions in ADA binding correlated with diminished hypersensitivity responses in an in vivo mouse model. By using computational design approaches to traverse extended distances in mutational space while maintaining function, protein re-surfacing may provide a means to generate novel or second-line therapies for life-saving drugs with limited therapeutic alternatives.


Subjects
Asparaginase; Escherichia coli; Humans; Animals; Mice; Rabbits; Asparaginase/genetics; Asparaginase/therapeutic use; Escherichia coli/genetics; Antibodies; Membrane Proteins
20.
PLoS One ; 17(3): e0265020, 2022.
Article in English | MEDLINE | ID: mdl-35286324

ABSTRACT

Engineered proteins generally must possess a stable structure in order to achieve their designed function. Stable designs, however, are astronomically rare within the space of all possible amino acid sequences. As a consequence, many designs must be tested computationally and experimentally in order to find stable ones, which is expensive in terms of time and resources. Here we use a high-throughput, low-fidelity assay to experimentally evaluate the stability of approximately 200,000 novel proteins. These include a wide range of sequence perturbations, providing a baseline for future work in the field. We build a neural network model that predicts protein stability given only sequences of amino acids, and compare its performance to the assayed values. We also report another network model that is able to generate the amino acid sequences of novel stable proteins given requested secondary sequences. Finally, we show that the predictive model, despite weaknesses including a noisy data set, can be used to substantially increase the stability of both expert-designed and model-generated proteins.


Subjects
Neural Networks, Computer; Proteins; Amino Acid Sequence; Amino Acids; Protein Stability; Proteins/chemistry