Results 1 - 20 of 67
1.
Protein Eng Des Sel ; 37, 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38757573

ABSTRACT

With over 270 unique occurrences in the human genome, peptide-recognizing PDZ domains play a central role in modulating polarization, signaling, and trafficking pathways. Mutations in PDZ domains lead to diseases such as cancer and cystic fibrosis, making PDZ domains attractive targets for therapeutic intervention. D-peptide inhibitors offer unique advantages as therapeutics, including increased metabolic stability and low immunogenicity. Here, we introduce DexDesign, a novel OSPREY-based algorithm for computationally designing de novo D-peptide inhibitors. DexDesign leverages three novel techniques that are broadly applicable to computational protein design: the Minimum Flexible Set, K*-based Mutational Scan, and Inverse Alanine Scan. We apply these techniques and DexDesign to generate novel D-peptide inhibitors of two biomedically important PDZ domain targets: CAL and MAST2. We introduce a framework for analyzing de novo peptides (evaluation along a replication/restitution axis) and apply it to the DexDesign-generated D-peptides. Notably, the peptides we generated are predicted to bind their targets more tightly than their targets' endogenous ligands, validating the peptides' potential as lead inhibitors. We also provide an implementation of DexDesign in the free and open-source computational protein design software OSPREY.


Subject(s)
Algorithms , Peptides , Peptides/chemistry , Peptides/pharmacology , Humans , Drug Design , PDZ Domains
2.
bioRxiv ; 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38405797

ABSTRACT

With over 270 unique occurrences in the human genome, peptide-recognizing PDZ domains play a central role in modulating polarization, signaling, and trafficking pathways. Mutations in PDZ domains lead to diseases such as cancer and cystic fibrosis, making PDZ domains attractive targets for therapeutic intervention. D-peptide inhibitors offer unique advantages as therapeutics, including increased metabolic stability and low immunogenicity. Here, we introduce DexDesign, a novel OSPREY-based algorithm for computationally designing de novo D-peptide inhibitors. DexDesign leverages three novel techniques that are broadly applicable to computational protein design: the Minimum Flexible Set, K*-based Mutational Scan, and Inverse Alanine Scan, which enable exponential reductions in the size of the peptide sequence search space. We apply these techniques and DexDesign to generate novel D-peptide inhibitors of two biomedically important PDZ domain targets: CAL and MAST2. We introduce a new framework for analyzing de novo peptides (evaluation along a replication/restitution axis) and apply it to the DexDesign-generated D-peptides. Notably, the peptides we generated are predicted to bind their targets more tightly than their targets' endogenous ligands, validating the peptides' potential as lead therapeutic candidates. We provide an implementation of DexDesign in the free and open-source computational protein design software OSPREY.

3.
bioRxiv ; 2023 Nov 18.
Article in English | MEDLINE | ID: mdl-38014181

ABSTRACT

Accurate binding affinity prediction is crucial to structure-based drug design. Recent work used computational topology to obtain an effective representation of protein-ligand interactions. Although persistent homology encodes geometric features, previous works on binding affinity prediction using persistent homology employed uninterpretable machine learning models and failed to explain the underlying geometric and topological features that drive accurate binding affinity prediction. In this work, we propose a novel, interpretable algorithm for protein-ligand binding affinity prediction. Our algorithm achieves interpretability through an effective embedding of distances across bipartite matchings of the protein and ligand atoms into real-valued functions by summing Gaussians centered at features constructed by persistent homology. We name these functions internuclear persistent contours (IPCs). Next, we introduce persistence fingerprints, a vector with 10 components that sketches the distances of different bipartite matchings between protein and ligand atoms, refined from IPCs. Let the number of protein atoms in the protein-ligand complex be $n$, the number of ligand atoms be $m$, and $\omega \approx 2.4$ be the matrix multiplication exponent. We show that for any $0 < \varepsilon < 1$, after an $O(mn \log(mn))$ preprocessing procedure, we can compute an $\varepsilon$-accurate approximation to the persistence fingerprint in $O(m \log^{6\omega}(m/\varepsilon))$ time, independent of protein size. This is an improvement in time complexity by a factor of $O((m+n)^3)$ over any previous binding affinity prediction that uses persistent homology. We show that the representational power of persistence fingerprint generalizes to protein-ligand binding datasets beyond the training dataset. Then, we introduce PATH, Predicting Affinity Through Homology, an interpretable, small ensemble of shallow regression trees for binding affinity prediction from persistence fingerprints.
We show that despite using 1,400-fold fewer features, PATH has comparable performance to a previous state-of-the-art binding affinity prediction algorithm that uses persistent homology features. Moreover, PATH has the advantage of being interpretable. Finally, we visualize the features captured by persistence fingerprint for variant HIV-1 protease complexes and show that persistence fingerprint captures binding-relevant structural mutations. The source code for PATH is released open-source as part of the osprey protein design software package.
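
The abstract describes embedding features into real-valued functions "by summing Gaussians centered at features". A minimal sketch of that one idea follows. It is not the PATH implementation: the feature values, the bandwidth `sigma`, and the fixed-grid sampling in `sample_vector` are all assumptions made here for illustration, and the paper's persistence fingerprint is refined from IPCs by a procedure the abstract does not spell out.

```python
import math

def gaussian_sum(centers, sigma=0.5):
    """Return f(x) = sum_i exp(-(x - c_i)^2 / (2 * sigma^2)).

    `centers` stands in for features produced by persistent homology;
    `sigma` is an assumed bandwidth, not a value from the paper.
    """
    def f(x):
        return sum(math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers)
    return f

def sample_vector(f, lo, hi, n):
    """Evaluate f at n >= 2 evenly spaced points in [lo, hi] (illustrative only)."""
    step = (hi - lo) / (n - 1)
    return [f(lo + i * step) for i in range(n)]

# Hypothetical feature values (for example, distances in angstroms).
features = [2.0, 2.1, 3.5]
contour = gaussian_sum(features, sigma=0.5)

# A 10-component vector obtained by naive sampling; an assumption, not the
# paper's definition of a persistence fingerprint.
fingerprint = sample_vector(contour, 0.0, 5.0, 10)
```

Because the function is a smooth sum of bumps, nearby features reinforce one another, so the value at a point reflects how many features lie close to it.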

4.
Commun Biol ; 6(1): 720, 2023 07 13.
Article in English | MEDLINE | ID: mdl-37443295

ABSTRACT

We report an Osprey-based computational protocol to prospectively identify oncogenic mutations that act via disruption of molecular interactions. It is applicable to analyse both protein-protein and protein-DNA interfaces and it is validated on a dataset of clinically relevant mutations. In addition, it is used to predict previously uncharacterised patient mutations in CDK6 and p16 genes, which are experimentally confirmed to impair complex formation.


Subject(s)
DNA , Proteins , Humans , Proteins/genetics , Mutation , DNA/genetics
5.
Cell Rep ; 42(7): 112711, 2023 07 25.
Article in English | MEDLINE | ID: mdl-37436900

ABSTRACT

Broadly neutralizing antibodies (bNAbs) against HIV can reduce viral transmission in humans, but an effective therapeutic will require unusually high breadth and potency of neutralization. We employ the OSPREY computational protein design software to engineer variants of two apex-directed bNAbs, PGT145 and PG9RSH, resulting in increases in potency of over 100-fold against some viruses. The top designed variants improve neutralization breadth from 39% to 54% at clinically relevant concentrations (IC80 < 1 µg/mL) and improve median potency (IC80) by up to 4-fold over a cross-clade panel of 208 strains. To investigate the mechanisms of improvement, we determine cryoelectron microscopy structures of each variant in complex with the HIV envelope trimer. Surprisingly, we find the largest increases in breadth to be a result of optimizing side-chain interactions with highly variable epitope residues. These results provide insight into mechanisms of neutralization breadth and inform strategies for antibody design and improvement.


Subject(s)
HIV Infections , HIV Seropositivity , HIV-1 , Humans , HIV Antibodies , Antibodies, Neutralizing , Broadly Neutralizing Antibodies , Cryoelectron Microscopy , Neutralization Tests
6.
J Struct Biol X ; 8: 100091, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37416832

ABSTRACT

Podisus maculiventris thanatin has been reported as a potent antimicrobial peptide with antibacterial and antifungal activity. Its antibiotic activity has been most thoroughly characterized against E. coli and shown to interfere with multiple pathways, such as the lipopolysaccharide transport (LPT) pathway comprised of seven different Lpt proteins. Thanatin binds to E. coli LptA and LptD, thus disrupting the LPT complex formation and inhibiting cell wall synthesis and microbial growth. Here, we performed a genomic database search to uncover novel thanatin orthologs, characterized their binding to E. coli LptA using bio-layer interferometry, and assessed their antimicrobial activity against E. coli. We found that thanatins from Chinavia ubica and Murgantia histrionica bound tighter (by 3.6- and 2.2-fold respectively) to LptA and exhibited more potent antibiotic activity (by 2.1- and 2.8-fold respectively) than the canonical thanatin from P. maculiventris. We crystallized and determined the LptA-bound complex structures of thanatins from C. ubica (1.90 Å resolution), M. histrionica (1.80 Å resolution), and P. maculiventris (2.43 Å resolution) to better understand their mechanism of action. Our structural analysis revealed that residues A10 and I21 in C. ubica and M. histrionica thanatin are important for improving the binding interface with LptA, thus overall improving the potency of thanatin against E. coli. We also designed a stapled variant of thanatin that removes the need for a disulfide bond but retains the ability to bind LptA and antibiotic activity. Our discovery presents a library of novel thanatin sequences to serve as starting scaffolds for designing more potent antimicrobial therapeutics.

7.
STAR Protoc ; 4(2): 102170, 2023 Apr 27.
Article in English | MEDLINE | ID: mdl-37115667

ABSTRACT

Prospective predictions of drug-resistant protein mutants could improve the design of therapeutics less prone to resistance. Here, we describe RESISTOR, an algorithm that uses structure- and sequence-based criteria to predict resistance mutations. We demonstrate the process of using RESISTOR to predict ERK2 mutants that are likely to arise in melanoma and ablate the efficacy of the ERK1/2 inhibitor SCH779284. RESISTOR is included in the free and open-source computational protein design software OSPREY. For complete details on the use and execution of this protocol, please refer to Guerin et al. [1].

8.
Cell Syst ; 13(10): 830-843.e3, 2022 10 19.
Article in English | MEDLINE | ID: mdl-36265469

ABSTRACT

Resistance to pharmacological treatments is a major public health challenge. Here, we introduce Resistor, a structure- and sequence-based algorithm that prospectively predicts resistance mutations for drug design. Resistor computes the Pareto frontier of four resistance-causing criteria: the change in binding affinity (ΔKa) of the (1) drug and (2) endogenous ligand upon a protein's mutation; (3) the probability a mutation will occur based on empirically derived mutational signatures; and (4) the cardinality of mutations comprising a hotspot. For validation, we applied Resistor to EGFR and BRAF kinase inhibitors treating lung adenocarcinoma and melanoma. Resistor correctly identified eight clinically significant EGFR resistance mutations, including the erlotinib and gefitinib "gatekeeper" T790M mutation and five known osimertinib resistance mutations. Furthermore, Resistor predictions are consistent with BRAF inhibitor sensitivity data from both retrospective and prospective experiments using KinCon biosensors. Resistor is available in the open-source protein design software OSPREY.
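
The Pareto frontier over several criteria mentioned in this abstract can be illustrated with a small generic sketch. This is not Resistor's code: the mutant names and scores are hypothetical, and the sketch assumes every criterion has already been oriented so that a higher value means "more resistance-causing".

```python
from typing import Dict, List, Tuple

def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is at least as good as b on every criterion and strictly
    better on at least one (higher is better by assumption)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontier(scores: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of all candidates that no other candidate dominates."""
    return sorted(
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    )

# Hypothetical mutants scored on four criteria, in the order: loss of drug
# binding, gain of endogenous-ligand binding, mutational probability,
# hotspot cardinality.
candidates = {
    "mutA": (0.9, 0.8, 0.3, 2.0),
    "mutB": (0.5, 0.9, 0.4, 1.0),
    "mutC": (0.4, 0.7, 0.2, 1.0),  # dominated by both mutA and mutB
}

frontier = pareto_frontier(candidates)
```

A frontier avoids collapsing the four criteria into one weighted score, so no relative weights need to be chosen in advance.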


Subject(s)
Antineoplastic Agents , Lung Neoplasms , Humans , Erlotinib Hydrochloride , Gefitinib/therapeutic use , ErbB Receptors/genetics , ErbB Receptors/metabolism , Proto-Oncogene Proteins B-raf/genetics , Protein Kinase Inhibitors/pharmacology , Mutation/genetics , Lung Neoplasms/drug therapy , Lung Neoplasms/genetics , Lung Neoplasms/pathology , Retrospective Studies , Ligands , Prospective Studies , Drug Resistance, Neoplasm/genetics , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Algorithms
9.
J Comput Biol ; 29(12): 1346-1352, 2022 12.
Article in English | MEDLINE | ID: mdl-36099194

ABSTRACT

Computational, in silico prediction of resistance-conferring escape mutations could accelerate the design of therapeutics less prone to resistance. This article describes how to use the Resistor algorithm to predict escape mutations. Resistor employs Pareto optimization on four resistance-conferring criteria (positive and negative design, mutational probability, and hotspot cardinality) to assign a Pareto rank to each prospective mutant. It also predicts the mechanism of resistance, that is, whether a mutant ablates binding to a drug, strengthens binding to the endogenous ligand, or a combination of these two factors, and provides structural models of the mutants. Resistor is part of the free and open-source computational protein design software OSPREY.


Subject(s)
Algorithms , Proteins , Prospective Studies , Proteins/chemistry , Mutation , Ligands
10.
PLoS Comput Biol ; 18(2): e1009855, 2022 02.
Article in English | MEDLINE | ID: mdl-35143481

ABSTRACT

Antimicrobial resistance presents a significant health care crisis. The mutation F98Y in Staphylococcus aureus dihydrofolate reductase (SaDHFR) confers resistance to the clinically important antifolate trimethoprim (TMP). Propargyl-linked antifolates (PLAs), next generation DHFR inhibitors, are much more resilient than TMP against this F98Y variant, yet this F98Y substitution still reduces efficacy of these agents. Surprisingly, differences in the enantiomeric configuration at the stereogenic center of PLAs influence the isomeric state of the NADPH cofactor. To understand the molecular basis of F98Y-mediated resistance and how PLAs' inhibition drives NADPH isomeric states, we used protein design algorithms in the osprey protein design software suite to analyze a comprehensive suite of structural, biophysical, biochemical, and computational data. Here, we present a model showing how F98Y SaDHFR exploits a different anomeric configuration of NADPH to evade certain PLAs' inhibition, while other PLAs remain unaffected by this resistance mechanism.


Subject(s)
Folic Acid Antagonists , Staphylococcal Infections , Drug Resistance, Bacterial/genetics , Folic Acid Antagonists/chemistry , Folic Acid Antagonists/pharmacology , Humans , NADP/metabolism , Staphylococcus aureus/genetics , Staphylococcus aureus/metabolism , Tetrahydrofolate Dehydrogenase/chemistry , Tetrahydrofolate Dehydrogenase/genetics , Tetrahydrofolate Dehydrogenase/metabolism , Trimethoprim/chemistry , Trimethoprim/metabolism , Trimethoprim/pharmacology
11.
PLoS Comput Biol ; 16(6): e1007447, 2020 06.
Article in English | MEDLINE | ID: mdl-32511232

ABSTRACT

The K* algorithm provably approximates partition functions for a set of states (e.g., protein, ligand, and protein-ligand complex) to a user-specified accuracy ε. Often, reaching an ε-approximation for a particular set of partition functions takes a prohibitive amount of time and space. To alleviate some of this cost, we introduce two new algorithms into the osprey suite for protein design: fries, a Fast Removal of Inadequately Energied Sequences, and EWAK*, an Energy Window Approximation to K*. fries pre-processes the sequence space to limit a design to only the most stable, energetically favorable sequence possibilities. EWAK* then takes this pruned sequence space as input and, using a user-specified energy window, calculates K* scores using the lowest energy conformations. We expect fries/EWAK* to be most useful in cases where there are many unstable sequences in the design sequence space and when users are satisfied with enumerating the low-energy ensemble of conformations. In combination, these algorithms provably retain calculational accuracy while limiting the input sequence space and the conformations included in each partition function calculation to only the most energetically favorable, effectively reducing runtime while still enriching for desirable sequences. This combined approach led to significant speed-ups compared to the previous state-of-the-art multi-sequence algorithm, BBK*, while maintaining its efficiency and accuracy, which we show across 40 different protein systems and a total of 2,826 protein design problems. Additionally, as a proof of concept, we used these new algorithms to redesign the protein-protein interface (PPI) of the c-Raf-RBD:KRas complex. The Ras-binding domain of the protein kinase c-Raf (c-Raf-RBD) is the tightest known binder of KRas, a protein implicated in difficult-to-treat cancers. fries/EWAK* accurately retrospectively predicted the effect of 41 different sets of mutations in the PPI of the c-Raf-RBD:KRas complex. 
Notably, these mutations include mutations whose effect had previously been incorrectly predicted using other computational methods. Next, we used fries/EWAK* for prospective design and discovered a novel point mutation that improves binding of c-Raf-RBD to KRas in its active, GTP-bound state (KRasGTP). We combined this new mutation with two previously reported mutations (which were highly-ranked by osprey) to create a new variant of c-Raf-RBD, c-Raf-RBD(RKY). fries/EWAK* in osprey computationally predicted that this new variant binds even more tightly than the previous best-binding variant, c-Raf-RBD(RK). We measured the binding affinity of c-Raf-RBD(RKY) using a bio-layer interferometry (BLI) assay, and found that this new variant exhibits single-digit nanomolar affinity for KRasGTP, confirming the computational predictions made with fries/EWAK*. This new variant binds roughly five times more tightly than the previous best known binder and roughly 36 times more tightly than the design starting point (wild-type c-Raf-RBD). This study steps through the advancement and development of computational protein design by presenting theory, new algorithms, accurate retrospective designs, new prospective designs, and biochemical validation.
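
The abstract refers to partition functions, K* scores, and a user-specified energy window. The sketch below shows only the underlying arithmetic: a Boltzmann-weighted sum over conformation energies, optionally restricted to conformations within a window of the lowest energy, and the K* score as the ratio of the complex partition function to the product of the unbound ones. It is not the OSPREY or EWAK* implementation. The function names, the example energies, and the choice of RT at roughly 298 K are assumptions made here.

```python
import math

RT = 0.592  # kcal/mol at about 298 K (assumed temperature)

def partition_function(energies, window=None):
    """Boltzmann sum over conformation energies (kcal/mol).

    If `window` is given, only conformations within `window` kcal/mol of the
    lowest energy are included, which mimics an energy-window truncation.
    """
    if not energies:
        return 0.0
    e_min = min(energies)
    kept = [e for e in energies if window is None or e - e_min <= window]
    return sum(math.exp(-e / RT) for e in kept)

def kstar_score(complex_e, protein_e, ligand_e, window=None):
    """Ratio Z_complex / (Z_protein * Z_ligand); assumes non-empty inputs."""
    return partition_function(complex_e, window) / (
        partition_function(protein_e, window) * partition_function(ligand_e, window)
    )

# Hypothetical conformation energies for one sequence.
complex_energies = [-10.0, -9.5, -3.0]
protein_energies = [-4.0, -3.8]
ligand_energies = [-2.0]

z_full = partition_function(complex_energies)
z_window = partition_function(complex_energies, window=1.0)
score = kstar_score(complex_energies, protein_energies, ligand_energies, window=1.0)
```

In this toy case the conformation at -3.0 kcal/mol contributes a negligible Boltzmann weight, so dropping it barely changes the sum. This is the intuition behind restricting the calculation to the low-energy ensemble.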


Subject(s)
Computational Biology , Protein Engineering/methods , Proto-Oncogene Proteins c-raf/chemistry , Proto-Oncogene Proteins p21(ras)/chemistry , Algorithms , Computers , Humans , Interferometry , Lectins/chemistry , Ligands , Models, Statistical , Programming Languages , Protein Binding , Protein Domains , Software
12.
J Comput Biol ; 27(4): 550-564, 2020 04.
Article in English | MEDLINE | ID: mdl-31855059

ABSTRACT

Protein design algorithms that model continuous sidechain flexibility and conformational ensembles better approximate the in vitro and in vivo behavior of proteins. The previous state of the art, iMinDEE-A*-K*, computes provable ɛ-approximations to partition functions of protein states (e.g., bound vs. unbound) by computing provable, admissible pairwise-minimized energy lower bounds on protein conformations, and using the A* enumeration algorithm to return a gap-free list of lowest-energy conformations. iMinDEE-A*-K* runs in time sublinear in the number of conformations, but can be trapped in loosely-bounded, low-energy conformational wells containing many conformations with highly similar energies. That is, iMinDEE-A*-K* is unable to exploit the correlation between protein conformation and energy: similar conformations often have similar energy. We introduce two new concepts that exploit this correlation: Minimization-Aware Enumeration and Recursive K*. We combine these two insights into a novel algorithm, Minimization-Aware Recursive K* (MARK*), which tightens bounds not on single conformations, but instead on distinct regions of the conformation space. We compare the performance of iMinDEE-A*-K* versus MARK* by running the Branch and Bound over K* (BBK*) algorithm, which provably returns sequences in order of decreasing K* score, using either iMinDEE-A*-K* or MARK* to approximate partition functions. We show on 200 design problems that MARK* not only enumerates and minimizes vastly fewer conformations than the previous state of the art, but also runs up to 2 orders of magnitude faster. Finally, we show that MARK* not only efficiently approximates the partition function, but also provably approximates the energy landscape. To our knowledge, MARK* is the first algorithm to do so. 
We use MARK* to analyze the change in energy landscape of the bound and unbound states of an HIV-1 capsid protein C-terminal domain in complex with a camelid VHH, and measure the change in conformational entropy induced by binding. Thus, MARK* both accelerates existing designs and offers new capabilities not possible with previous algorithms.


Subject(s)
Computational Biology , Protein Conformation , Proteins/genetics , Software , Algorithms , Amino Acid Sequence/genetics , Entropy , Models, Molecular , Protein Domains/genetics , Proteins/ultrastructure , Thermodynamics
13.
J Phys Chem B ; 123(49): 10441-10455, 2019 12 12.
Article in English | MEDLINE | ID: mdl-31697075

ABSTRACT

The CFTR-associated ligand PDZ domain (CALP) binds to the cystic fibrosis transmembrane conductance regulator (CFTR) and mediates lysosomal degradation of mature CFTR. Inhibition of this interaction has been explored as a therapeutic avenue for cystic fibrosis. Previously, we reported the ensemble-based computational design of a novel peptide inhibitor of CALP, which resulted in the most binding-efficient inhibitor to date. This inhibitor, kCAL01, was designed using osprey and evinced significant biological activity in in vitro cell-based assays. Here, we report a crystal structure of kCAL01 bound to CALP and compare structural features against iCAL36, a previously developed inhibitor of CALP. We compute side-chain energy landscapes for each structure to not only enable approximation of binding thermodynamics but also reveal ensemble features that contribute to the comparatively efficient binding of kCAL01. Finally, we compare the previously reported design ensemble for kCAL01 vs the new crystal structure and show that, despite small differences between the design model and crystal structure, significant biophysical features that enhance inhibitor binding are captured in the design ensemble. This suggests not only that ensemble-based design captured thermodynamically significant features observed in vitro, but also that a design eschewing ensembles would miss the kCAL01 sequence entirely.


Subject(s)
Cystic Fibrosis Transmembrane Conductance Regulator/antagonists & inhibitors , Peptides/pharmacology , Thermodynamics , Binding Sites/drug effects , Cystic Fibrosis/drug therapy , Cystic Fibrosis/metabolism , Cystic Fibrosis Transmembrane Conductance Regulator/metabolism , Humans , Ligands , Models, Molecular , Peptides/chemical synthesis , Peptides/chemistry
14.
ACS Infect Dis ; 5(11): 1896-1906, 2019 11 08.
Article in English | MEDLINE | ID: mdl-31565920

ABSTRACT

The spread of plasmid borne resistance enzymes in clinical Staphylococcus aureus isolates is rendering trimethoprim and iclaprim, both inhibitors of dihydrofolate reductase (DHFR), ineffective. Continued exploitation of these targets will require compounds that can broadly inhibit these resistance-conferring isoforms. Using a structure-based approach, we have developed a novel class of ionized nonclassical antifolates (INCAs) that capture the molecular interactions that have been exclusive to classical antifolates. These modifications allow for a greatly expanded spectrum of activity across these pathogenic DHFR isoforms, while maintaining the ability to penetrate the bacterial cell wall. Using biochemical, structural, and computational methods, we are able to optimize these inhibitors to the conserved active sites of the endogenous and trimethoprim resistant DHFR enzymes. Here, we report a series of INCA compounds that exhibit low nanomolar enzymatic activity and potent cellular activity with human selectivity against a panel of clinically relevant TMP resistant (TMPR) and methicillin resistant Staphylococcus aureus (MRSA) isolates.


Subject(s)
Anti-Bacterial Agents/pharmacology , Bacterial Proteins/antagonists & inhibitors , Folic Acid Antagonists/chemistry , Methicillin-Resistant Staphylococcus aureus/enzymology , Staphylococcal Infections/microbiology , Tetrahydrofolate Dehydrogenase/chemistry , Trimethoprim/pharmacology , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Catalytic Domain , Folic Acid Antagonists/pharmacology , Humans , Methicillin-Resistant Staphylococcus aureus/drug effects , Methicillin-Resistant Staphylococcus aureus/genetics , Microbial Sensitivity Tests , Tetrahydrofolate Dehydrogenase/genetics , Tetrahydrofolate Dehydrogenase/metabolism
15.
Commun ACM ; 62(10): 76-84, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31607753
16.
J Comput Chem ; 39(30): 2494-2507, 2018 11 15.
Article in English | MEDLINE | ID: mdl-30368845

ABSTRACT

We present osprey 3.0, a new and greatly improved release of the osprey protein design software. Osprey 3.0 features a convenient new Python interface, which greatly improves its ease of use. It is over two orders of magnitude faster than previous versions of osprey when running the same algorithms on the same hardware. Moreover, osprey 3.0 includes several new algorithms, which introduce substantial speedups as well as improved biophysical modeling. It also includes GPU support, which provides an additional speedup of over an order of magnitude. Like previous versions of osprey, osprey 3.0 offers a unique package of advantages over other design software, including provable design algorithms that account for continuous flexibility during design and model conformational entropy. Finally, we show here empirically that osprey 3.0 accurately predicts the effect of mutations on protein-protein binding. Osprey 3.0 is available at http://www.cs.duke.edu/donaldlab/osprey.php as free and open-source software. © 2018 Wiley Periodicals, Inc.


Subject(s)
Protein Conformation , Proteins/chemistry , Software , Algorithms , Models, Molecular , Protein Binding
17.
J Mol Biol ; 430(18 Pt B): 3412-3426, 2018 09 14.
Article in English | MEDLINE | ID: mdl-29924964

ABSTRACT

The flexibility of biological macromolecules is an important structural determinant of function. Unfortunately, the correlations between different motional modes are poorly captured by discrete ensemble representations. Here, we present new ways to both represent and visualize correlated interdomain motions. Interdomain motions are determined directly from residual dipolar couplings, represented as a continuous conformational distribution, and visualized using the disk-on-sphere representation. Using the disk-on-sphere representation, features of interdomain motions, including correlations, are intuitively visualized. The representation works especially well for multidomain systems with broad conformational distributions. This analysis can also be extended to multiple probability density modes, using a Bingham mixture model. We use this new paradigm to study the interdomain motions of staphylococcal protein A, which is a key virulence factor contributing to the pathogenicity of Staphylococcus aureus. We capture the smooth transitions between important states and demonstrate the utility of continuous distribution functions for computing the reorientational components of binding thermodynamics. Such insights allow for the dissection of the dynamic structural components of functionally important intermolecular interactions.


Subject(s)
Models, Molecular , Protein Conformation , Protein Interaction Domains and Motifs , Proteins/chemistry , Thermodynamics , Nuclear Magnetic Resonance, Biomolecular , Staphylococcal Protein A/chemistry
18.
J Comput Biol ; 25(7): 726-739, 2018 07.
Article in English | MEDLINE | ID: mdl-29641249

ABSTRACT

Computational protein design (CPD) algorithms that compute binding affinity, Ka, search for sequences with an energetically favorable free energy of binding. Recent work shows that three principles improve the biological accuracy of CPD: ensemble-based design, continuous flexibility of backbone and side-chain conformations, and provable guarantees of accuracy with respect to the input. However, previous methods that use all three design principles are single-sequence (SS) algorithms, which are very costly: linear in the number of sequences and thus exponential in the number of simultaneously mutable residues. To address this computational challenge, we introduce BBK*, a new CPD algorithm whose key innovation is the multisequence (MS) bound: BBK* efficiently computes a single provable upper bound to approximate Ka for a combinatorial number of sequences, and avoids SS computation for all provably suboptimal sequences. Thus, to our knowledge, BBK* is the first provable, ensemble-based CPD algorithm to run in time sublinear in the number of sequences. Computational experiments on 204 protein design problems show that BBK* finds the tightest binding sequences while approximating Ka for up to 10^5-fold fewer sequences than the previous state-of-the-art algorithms, which require exhaustive enumeration of sequences. Furthermore, for 51 protein-ligand design problems, BBK* provably approximates Ka up to 1982-fold faster than the previous state-of-the-art iMinDEE/A*/K* algorithm. Therefore, BBK* not only accelerates protein designs that are possible with previous provable algorithms, but also efficiently performs designs that are too large for previous methods.


Subject(s)
Computational Biology/methods , Protein Conformation , Proteins/chemistry , Software , Algorithms , Amino Acid Sequence/genetics , Entropy , Humans , Models, Molecular
19.
Bioinformatics ; 33(14): i5-i12, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28882005

ABSTRACT

MOTIVATION: When proteins mutate or bind to ligands, their backbones often move significantly, especially in loop regions. Computational protein design algorithms must model these motions in order to accurately optimize protein stability and binding affinity. However, methods for backbone conformational search in design have been much more limited than for sidechain conformational search. This is especially true for combinatorial protein design algorithms, which aim to search a large sequence space efficiently and thus cannot rely on temporal simulation of each candidate sequence. RESULTS: We alleviate this difficulty with a new parameterization of backbone conformational space, which represents all degrees of freedom of a specified segment of protein chain that maintain valid bonding geometry (by maintaining the original bond lengths and angles and ω dihedrals). In order to search this space, we present an efficient algorithm, CATS, for computing atomic coordinates as a function of our new continuous backbone internal coordinates. CATS generalizes the iMinDEE and EPIC protein design algorithms, which model continuous flexibility in sidechain dihedrals, to model continuous, appropriately localized flexibility in the backbone dihedrals ϕ and ψ as well. We show using 81 test cases based on 29 different protein structures that CATS finds sequences and conformations that are significantly lower in energy than methods with less or no backbone flexibility do. In particular, we show that CATS can model the viability of an antibody mutation known experimentally to increase affinity, but that appears sterically infeasible when modeled with less or no backbone flexibility. AVAILABILITY AND IMPLEMENTATION: Our code is available as free software at https://github.com/donaldlab/OSPREY_refactor . CONTACT: mhallen@ttic.edu or brd+ismb17@cs.duke.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Software , Algorithms , Models, Molecular , Mutation , Protein Conformation , Proteins/genetics
20.
PLoS Comput Biol ; 13(3): e1005346, 2017 03.
Article in English | MEDLINE | ID: mdl-28358804

ABSTRACT

Protein design algorithms enumerate a combinatorial number of candidate structures to compute the Global Minimum Energy Conformation (GMEC). To efficiently find the GMEC, protein design algorithms must methodically reduce the conformational search space. By applying distance and energy cutoffs, the protein system to be designed can thus be represented using a sparse residue interaction graph, where the number of interacting residue pairs is less than all pairs of mutable residues, and the corresponding GMEC is called the sparse GMEC. However, ignoring some pairwise residue interactions can lead to a change in the energy, conformation, or sequence of the sparse GMEC vs. the original or the full GMEC. Despite the widespread use of sparse residue interaction graphs in protein design, the above mentioned effects of their use have not been previously analyzed. To analyze the costs and benefits of designing with sparse residue interaction graphs, we computed the GMECs for 136 different protein design problems both with and without distance and energy cutoffs, and compared their energies, conformations, and sequences. Our analysis shows that the differences between the GMECs depend critically on whether or not the design includes core, boundary, or surface residues. Moreover, neglecting long-range interactions can alter local interactions and introduce large sequence differences, both of which can result in significant structural and functional changes. Designs on proteins with experimentally measured thermostability show it is beneficial to compute both the full and the sparse GMEC accurately and efficiently. To this end, we show that a provable, ensemble-based algorithm can efficiently compute both GMECs by enumerating a small number of conformations, usually fewer than 1000. 
This provides a novel way to combine sparse residue interaction graphs with provable, ensemble-based algorithms to reap the benefits of sparse residue interaction graphs while avoiding their potential inaccuracies.
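
A sparse residue interaction graph built with a distance cutoff, as described in this abstract, can be sketched as follows. The sketch is a deliberate simplification and not the authors' method: it places one representative point per residue, uses hypothetical residue labels and coordinates, and applies only a distance cutoff, whereas the abstract mentions both distance and energy cutoffs.

```python
import math
from itertools import combinations

def sparse_interaction_graph(coords, distance_cutoff):
    """Return the set of residue pairs whose representative points lie
    within `distance_cutoff` (same units as the coordinates)."""
    edges = set()
    for (a, pa), (b, pb) in combinations(sorted(coords.items()), 2):
        if math.dist(pa, pb) <= distance_cutoff:
            edges.add((a, b))
    return edges

# Hypothetical residues with one representative coordinate each (angstroms).
residues = {
    "A10": (0.0, 0.0, 0.0),
    "B22": (3.0, 4.0, 0.0),   # 5 angstroms from A10
    "C35": (20.0, 0.0, 0.0),  # far from both others
}

# All pairs interact in the full graph; the cutoff keeps only nearby pairs.
full_pairs = len(list(combinations(residues, 2)))
sparse_edges = sparse_interaction_graph(residues, distance_cutoff=8.0)
```

Every dropped edge is a pairwise energy term that the sparse model ignores. This is why the sparse GMEC can differ from the full GMEC in energy, conformation, or sequence.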


Subject(s)
Algorithms , Proteins/chemistry , Amino Acid Sequence , Animals , Computational Biology , Computer Graphics , Humans , Models, Molecular , Protein Conformation , Protein Engineering , Proteins/genetics , Software , Thermodynamics