Results 1 - 20 of 66
1.
bioRxiv ; 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38405797

ABSTRACT

With over 270 unique occurrences in the human genome, peptide-recognizing PDZ domains play a central role in modulating polarization, signaling, and trafficking pathways. Mutations in PDZ domains lead to diseases such as cancer and cystic fibrosis, making PDZ domains attractive targets for therapeutic intervention. D-peptide inhibitors offer unique advantages as therapeutics, including increased metabolic stability and low immunogenicity. Here, we introduce DexDesign, a novel OSPREY-based algorithm for computationally designing de novo D-peptide inhibitors. DexDesign leverages three novel techniques that are broadly applicable to computational protein design: the Minimum Flexible Set, K*-based Mutational Scan, and Inverse Alanine Scan, which enable exponential reductions in the size of the peptide sequence search space. We apply these techniques and DexDesign to generate novel D-peptide inhibitors of two biomedically important PDZ domain targets: CAL and MAST2. We introduce a new framework for analyzing de novo peptides (evaluation along a replication/restitution axis) and apply it to the DexDesign-generated D-peptides. Notably, the peptides we generated are predicted to bind their targets more tightly than their targets' endogenous ligands, validating the peptides' potential as lead therapeutic candidates. We provide an implementation of DexDesign in the free and open source computational protein design software OSPREY.

2.
bioRxiv ; 2023 Nov 18.
Article in English | MEDLINE | ID: mdl-38014181

ABSTRACT

Accurate binding affinity prediction is crucial to structure-based drug design. Recent work used computational topology to obtain an effective representation of protein-ligand interactions. Although persistent homology encodes geometric features, previous works on binding affinity prediction using persistent homology employed uninterpretable machine learning models and failed to explain the underlying geometric and topological features that drive accurate binding affinity prediction. In this work, we propose a novel, interpretable algorithm for protein-ligand binding affinity prediction. Our algorithm achieves interpretability through an effective embedding of distances across bipartite matchings of the protein and ligand atoms into real-valued functions by summing Gaussians centered at features constructed by persistent homology. We name these functions internuclear persistent contours (IPCs). Next, we introduce persistence fingerprints, a vector with 10 components that sketches the distances of different bipartite matchings between protein and ligand atoms, refined from IPCs. Let the number of protein atoms in the protein-ligand complex be $n$, the number of ligand atoms be $m$, and $\omega \approx 2.4$ be the matrix multiplication exponent. We show that for any $0 < \varepsilon < 1$, after an $\mathcal{O}(mn \log(mn))$ preprocessing procedure, we can compute an $\varepsilon$-accurate approximation to the persistence fingerprint in $\mathcal{O}(m \log^{6\omega}(m/\varepsilon))$ time, independent of protein size. This is an improvement in time complexity by a factor of $\mathcal{O}((m+n)^3)$ over any previous binding affinity prediction that uses persistent homology. We show that the representational power of persistence fingerprint generalizes to protein-ligand binding datasets beyond the training dataset. Then, we introduce PATH, Predicting Affinity Through Homology, an interpretable, small ensemble of shallow regression trees for binding affinity prediction from persistence fingerprints. 
We show that despite using 1,400-fold fewer features, PATH has comparable performance to a previous state-of-the-art binding affinity prediction algorithm that uses persistent homology features. Moreover, PATH has the advantage of being interpretable. Finally, we visualize the features captured by persistence fingerprint for variant HIV-1 protease complexes and show that persistence fingerprint captures binding-relevant structural mutations. The source code for PATH is released open-source as part of the osprey protein design software package.
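
Illustrative note: the abstract describes building real-valued functions by summing Gaussians centered at feature values. The following is a minimal sketch of that one step only, not the PATH implementation. The function name `gaussian_sum`, the bandwidth `sigma`, and the feature values are assumptions chosen for illustration; how PATH derives its features from persistent homology is not shown.

```python
import math

def gaussian_sum(centers, sigma=1.0):
    """Return f(x) = sum over c in centers of exp(-(x - c)^2 / (2 * sigma^2))."""
    def f(x):
        return sum(math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers)
    return f

# Hypothetical feature values (for example, distances in angstroms); not from the paper.
contour = gaussian_sum([2.0, 3.5, 3.6], sigma=0.5)

# Sampling the function on a fixed grid turns it into a fixed-length vector.
samples = [contour(x) for x in (2.0, 3.55, 6.0)]
```

The function is largest where several feature values cluster (here near 3.55) and decays toward zero far from every center, which is what makes each peak traceable to specific features.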

3.
Commun Biol ; 6(1): 720, 2023 07 13.
Article in English | MEDLINE | ID: mdl-37443295

ABSTRACT

We report an Osprey-based computational protocol to prospectively identify oncogenic mutations that act via disruption of molecular interactions. It is applicable to analyse both protein-protein and protein-DNA interfaces and it is validated on a dataset of clinically relevant mutations. In addition, it is used to predict previously uncharacterised patient mutations in CDK6 and p16 genes, which are experimentally confirmed to impair complex formation.


Subjects
DNA; Proteins; Humans; Proteins/genetics; Mutation; DNA/genetics
4.
Cell Rep ; 42(7): 112711, 2023 07 25.
Article in English | MEDLINE | ID: mdl-37436900

ABSTRACT

Broadly neutralizing antibodies (bNAbs) against HIV can reduce viral transmission in humans, but an effective therapeutic will require unusually high breadth and potency of neutralization. We employ the OSPREY computational protein design software to engineer variants of two apex-directed bNAbs, PGT145 and PG9RSH, resulting in increases in potency of over 100-fold against some viruses. The top designed variants improve neutralization breadth from 39% to 54% at clinically relevant concentrations (IC80 < 1 µg/mL) and improve median potency (IC80) by up to 4-fold over a cross-clade panel of 208 strains. To investigate the mechanisms of improvement, we determine cryoelectron microscopy structures of each variant in complex with the HIV envelope trimer. Surprisingly, we find the largest increases in breadth to be a result of optimizing side-chain interactions with highly variable epitope residues. These results provide insight into mechanisms of neutralization breadth and inform strategies for antibody design and improvement.


Subjects
HIV Infections; HIV Seropositivity; HIV-1; Humans; HIV Antibodies; Neutralizing Antibodies; Broadly Neutralizing Antibodies; Cryoelectron Microscopy; Neutralization Tests
5.
J Struct Biol X ; 8: 100091, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37416832

ABSTRACT

Podisus maculiventris thanatin has been reported as a potent antimicrobial peptide with antibacterial and antifungal activity. Its antibiotic activity has been most thoroughly characterized against E. coli and shown to interfere with multiple pathways, such as the lipopolysaccharide transport (LPT) pathway comprised of seven different Lpt proteins. Thanatin binds to E. coli LptA and LptD, thus disrupting the LPT complex formation and inhibiting cell wall synthesis and microbial growth. Here, we performed a genomic database search to uncover novel thanatin orthologs, characterized their binding to E. coli LptA using bio-layer interferometry, and assessed their antimicrobial activity against E. coli. We found that thanatins from Chinavia ubica and Murgantia histrionica bound tighter (by 3.6- and 2.2-fold respectively) to LptA and exhibited more potent antibiotic activity (by 2.1- and 2.8-fold respectively) than the canonical thanatin from P. maculiventris. We crystallized and determined the LptA-bound complex structures of thanatins from C. ubica (1.90 Å resolution), M. histrionica (1.80 Å resolution), and P. maculiventris (2.43 Å resolution) to better understand their mechanism of action. Our structural analysis revealed that residues A10 and I21 in C. ubica and M. histrionica thanatin are important for improving the binding interface with LptA, thus overall improving the potency of thanatin against E. coli. We also designed a stapled variant of thanatin that removes the need for a disulfide bond but retains the ability to bind LptA and antibiotic activity. Our discovery presents a library of novel thanatin sequences to serve as starting scaffolds for designing more potent antimicrobial therapeutics.

6.
STAR Protoc ; 4(2): 102170, 2023 Apr 27.
Article in English | MEDLINE | ID: mdl-37115667

ABSTRACT

Prospective predictions of drug-resistant protein mutants could improve the design of therapeutics less prone to resistance. Here, we describe RESISTOR, an algorithm that uses structure- and sequence-based criteria to predict resistance mutations. We demonstrate the process of using RESISTOR to predict ERK2 mutants likely to arise in melanoma ablating the efficacy of the ERK1/2 inhibitor SCH779284. RESISTOR is included in the free and open-source computational protein design software OSPREY. For complete details on the use and execution of this protocol, please refer to Guerin et al. (1).

7.
Cell Syst ; 13(10): 830-843.e3, 2022 10 19.
Article in English | MEDLINE | ID: mdl-36265469

ABSTRACT

Resistance to pharmacological treatments is a major public health challenge. Here, we introduce Resistor, a structure- and sequence-based algorithm that prospectively predicts resistance mutations for drug design. Resistor computes the Pareto frontier of four resistance-causing criteria: the change in binding affinity (ΔKa) of the (1) drug and (2) endogenous ligand upon a protein's mutation; (3) the probability a mutation will occur based on empirically derived mutational signatures; and (4) the cardinality of mutations comprising a hotspot. For validation, we applied Resistor to EGFR and BRAF kinase inhibitors treating lung adenocarcinoma and melanoma. Resistor correctly identified eight clinically significant EGFR resistance mutations, including the erlotinib and gefitinib "gatekeeper" T790M mutation and five known osimertinib resistance mutations. Furthermore, Resistor predictions are consistent with BRAF inhibitor sensitivity data from both retrospective and prospective experiments using KinCon biosensors. Resistor is available in the open-source protein design software OSPREY.
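
Illustrative note: a Pareto frontier over several criteria is the set of candidates that no other candidate dominates. The sketch below shows the general technique only and is not Resistor's code. The mutant names and the four scores are invented, and the assumption that every criterion is oriented so that larger means "more resistance-like" is made purely for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every criterion and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontier(candidates):
    """candidates: dict mapping a name to a tuple of scores (higher is better).
    Returns the sorted names that are not dominated by any other candidate."""
    return sorted(
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in candidates.items()
            if other != name
        )
    )

# Hypothetical mutants with four made-up criterion scores each.
mutants = {
    "M1": (0.9, 0.2, 0.5, 3),
    "M2": (0.4, 0.8, 0.5, 1),
    "M3": (0.3, 0.1, 0.4, 1),
    "M4": (0.9, 0.2, 0.6, 3),
}
frontier = pareto_frontier(mutants)
```

This quadratic-time comparison is adequate for small candidate sets; no weighting between criteria is needed, which is the main appeal of a Pareto formulation.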


Subjects
Antineoplastic Agents; Lung Neoplasms; Humans; Erlotinib Hydrochloride; Gefitinib/therapeutic use; ErbB Receptors/genetics; ErbB Receptors/metabolism; Proto-Oncogene Proteins B-raf/genetics; Protein Kinase Inhibitors/pharmacology; Mutation/genetics; Lung Neoplasms/drug therapy; Lung Neoplasms/genetics; Lung Neoplasms/pathology; Retrospective Studies; Ligands; Prospective Studies; Drug Resistance, Neoplasm/genetics; Antineoplastic Agents/pharmacology; Antineoplastic Agents/therapeutic use; Algorithms
8.
J Comput Biol ; 29(12): 1346-1352, 2022 12.
Article in English | MEDLINE | ID: mdl-36099194

ABSTRACT

Computational, in silico prediction of resistance-conferring escape mutations could accelerate the design of therapeutics less prone to resistance. This article describes how to use the Resistor algorithm to predict escape mutations. Resistor employs Pareto optimization on four resistance-conferring criteria (positive and negative design, mutational probability, and hotspot cardinality) to assign a Pareto rank to each prospective mutant. It also predicts the mechanism of resistance, that is, whether a mutant ablates binding to a drug, strengthens binding to the endogenous ligand, or a combination of these two factors, and provides structural models of the mutants. Resistor is part of the free and open-source computational protein design software OSPREY.


Subjects
Algorithms; Proteins; Prospective Studies; Proteins/chemistry; Mutation; Ligands
9.
PLoS Comput Biol ; 18(2): e1009855, 2022 02.
Article in English | MEDLINE | ID: mdl-35143481

ABSTRACT

Antimicrobial resistance presents a significant health care crisis. The mutation F98Y in Staphylococcus aureus dihydrofolate reductase (SaDHFR) confers resistance to the clinically important antifolate trimethoprim (TMP). Propargyl-linked antifolates (PLAs), next generation DHFR inhibitors, are much more resilient than TMP against this F98Y variant, yet this F98Y substitution still reduces efficacy of these agents. Surprisingly, differences in the enantiomeric configuration at the stereogenic center of PLAs influence the isomeric state of the NADPH cofactor. To understand the molecular basis of F98Y-mediated resistance and how PLAs' inhibition drives NADPH isomeric states, we used protein design algorithms in the osprey protein design software suite to analyze a comprehensive suite of structural, biophysical, biochemical, and computational data. Here, we present a model showing how F98Y SaDHFR exploits a different anomeric configuration of NADPH to evade certain PLAs' inhibition, while other PLAs remain unaffected by this resistance mechanism.


Subjects
Folic Acid Antagonists; Staphylococcal Infections; Drug Resistance, Bacterial/genetics; Folic Acid Antagonists/chemistry; Folic Acid Antagonists/pharmacology; Humans; NADP/metabolism; Staphylococcus aureus/genetics; Staphylococcus aureus/metabolism; Tetrahydrofolate Dehydrogenase/chemistry; Tetrahydrofolate Dehydrogenase/genetics; Tetrahydrofolate Dehydrogenase/metabolism; Trimethoprim/chemistry; Trimethoprim/metabolism; Trimethoprim/pharmacology
10.
PLoS Comput Biol ; 16(6): e1007447, 2020 06.
Article in English | MEDLINE | ID: mdl-32511232

ABSTRACT

The K* algorithm provably approximates partition functions for a set of states (e.g., protein, ligand, and protein-ligand complex) to a user-specified accuracy ε. Often, reaching an ε-approximation for a particular set of partition functions takes a prohibitive amount of time and space. To alleviate some of this cost, we introduce two new algorithms into the osprey suite for protein design: fries, a Fast Removal of Inadequately Energied Sequences, and EWAK*, an Energy Window Approximation to K*. fries pre-processes the sequence space to limit a design to only the most stable, energetically favorable sequence possibilities. EWAK* then takes this pruned sequence space as input and, using a user-specified energy window, calculates K* scores using the lowest energy conformations. We expect fries/EWAK* to be most useful in cases where there are many unstable sequences in the design sequence space and when users are satisfied with enumerating the low-energy ensemble of conformations. In combination, these algorithms provably retain calculational accuracy while limiting the input sequence space and the conformations included in each partition function calculation to only the most energetically favorable, effectively reducing runtime while still enriching for desirable sequences. This combined approach led to significant speed-ups compared to the previous state-of-the-art multi-sequence algorithm, BBK*, while maintaining its efficiency and accuracy, which we show across 40 different protein systems and a total of 2,826 protein design problems. Additionally, as a proof of concept, we used these new algorithms to redesign the protein-protein interface (PPI) of the c-Raf-RBD:KRas complex. The Ras-binding domain of the protein kinase c-Raf (c-Raf-RBD) is the tightest known binder of KRas, a protein implicated in difficult-to-treat cancers. fries/EWAK* accurately retrospectively predicted the effect of 41 different sets of mutations in the PPI of the c-Raf-RBD:KRas complex. 
Notably, these mutations include mutations whose effect had previously been incorrectly predicted using other computational methods. Next, we used fries/EWAK* for prospective design and discovered a novel point mutation that improves binding of c-Raf-RBD to KRas in its active, GTP-bound state (KRasGTP). We combined this new mutation with two previously reported mutations (which were highly-ranked by osprey) to create a new variant of c-Raf-RBD, c-Raf-RBD(RKY). fries/EWAK* in osprey computationally predicted that this new variant binds even more tightly than the previous best-binding variant, c-Raf-RBD(RK). We measured the binding affinity of c-Raf-RBD(RKY) using a bio-layer interferometry (BLI) assay, and found that this new variant exhibits single-digit nanomolar affinity for KRasGTP, confirming the computational predictions made with fries/EWAK*. This new variant binds roughly five times more tightly than the previous best known binder and roughly 36 times more tightly than the design starting point (wild-type c-Raf-RBD). This study steps through the advancement and development of computational protein design by presenting theory, new algorithms, accurate retrospective designs, new prospective designs, and biochemical validation.
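
Illustrative note: a K* score is a ratio of partition functions for the complex and the unbound states, and an energy window restricts each sum to the lowest-energy conformations. The sketch below shows only that arithmetic under simplifying assumptions; it is not the fries/EWAK* implementation. The function names, the value of `RT`, the unit (kcal/mol), and the rule of keeping conformations within `window` of the lowest energy are all assumptions made for illustration.

```python
import math

RT = 0.592  # kcal/mol near 298 K; assumed units for this sketch

def partition_function(energies, window=None):
    """Boltzmann-weighted sum over conformation energies.
    If window is given, keep only conformations within window of the lowest energy."""
    e_min = min(energies)
    kept = [e for e in energies if window is None or e - e_min <= window]
    return sum(math.exp(-e / RT) for e in kept)

def kstar_score(complex_e, protein_e, ligand_e, window=None):
    """Ratio q_complex / (q_protein * q_ligand), an approximation to binding affinity."""
    return partition_function(complex_e, window) / (
        partition_function(protein_e, window) * partition_function(ligand_e, window)
    )

# Hypothetical conformation energies for one sequence (not from the paper).
complex_e = [-10.0, -9.0, -2.0]
protein_e = [-4.0, -3.5]
ligand_e = [-3.0, -1.0]

full = kstar_score(complex_e, protein_e, ligand_e)
windowed = kstar_score(complex_e, protein_e, ligand_e, window=1.5)
```

Because Boltzmann weights fall off exponentially with energy, dropping conformations far above the minimum changes each partition function very little, which is why a windowed ensemble can stand in for the full one in this toy setting.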


Subjects
Computational Biology; Protein Engineering/methods; Proto-Oncogene Proteins c-raf/chemistry; Proto-Oncogene Proteins p21(ras)/chemistry; Algorithms; Computers; Humans; Interferometry; Lectins/chemistry; Ligands; Models, Statistical; Programming Languages; Protein Binding; Protein Domains; Software
11.
J Comput Biol ; 27(4): 550-564, 2020 04.
Article in English | MEDLINE | ID: mdl-31855059

ABSTRACT

Protein design algorithms that model continuous sidechain flexibility and conformational ensembles better approximate the in vitro and in vivo behavior of proteins. The previous state of the art, iMinDEE-A*-K*, computes provable ɛ-approximations to partition functions of protein states (e.g., bound vs. unbound) by computing provable, admissible pairwise-minimized energy lower bounds on protein conformations, and using the A* enumeration algorithm to return a gap-free list of lowest-energy conformations. iMinDEE-A*-K* runs in time sublinear in the number of conformations, but can be trapped in loosely-bounded, low-energy conformational wells containing many conformations with highly similar energies. That is, iMinDEE-A*-K* is unable to exploit the correlation between protein conformation and energy: similar conformations often have similar energy. We introduce two new concepts that exploit this correlation: Minimization-Aware Enumeration and Recursive K*. We combine these two insights into a novel algorithm, Minimization-Aware Recursive K* (MARK*), which tightens bounds not on single conformations, but instead on distinct regions of the conformation space. We compare the performance of iMinDEE-A*-K* versus MARK* by running the Branch and Bound over K* (BBK*) algorithm, which provably returns sequences in order of decreasing K* score, using either iMinDEE-A*-K* or MARK* to approximate partition functions. We show on 200 design problems that MARK* not only enumerates and minimizes vastly fewer conformations than the previous state of the art, but also runs up to 2 orders of magnitude faster. Finally, we show that MARK* not only efficiently approximates the partition function, but also provably approximates the energy landscape. To our knowledge, MARK* is the first algorithm to do so. 
We use MARK* to analyze the change in energy landscape of the bound and unbound states of an HIV-1 capsid protein C-terminal domain in complex with a camelid VHH, and measure the change in conformational entropy induced by binding. Thus, MARK* both accelerates existing designs and offers new capabilities not possible with previous algorithms.
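
Illustrative note: the abstract refers to provable ε-approximations of a partition function obtained from energy lower bounds and an ordered enumeration of conformations. The sketch below illustrates one simple stopping rule of that kind; it is neither MARK* nor the published iMinDEE-A*-K* code. The function name, the `(lower_bound, minimized_energy)` input format, and the value of `RT` are assumptions. The rule relies on each remaining conformation having energy no lower than the current lower bound, so its Boltzmann weight is at most `exp(-bound / RT)`.

```python
import math

RT = 0.592  # kcal/mol near 298 K; assumed units for this sketch

def epsilon_approx_partition_function(confs, epsilon):
    """confs: list of (lower_bound, minimized_energy), lower_bound <= minimized_energy.
    Returns (q_star, number_minimized) with q_star >= (1 - epsilon) * q,
    where q is the full partition function over all conformations."""
    ordered = sorted(confs)  # increasing lower bound, mimicking ordered enumeration
    q_star = 0.0
    for i, (bound, energy) in enumerate(ordered):
        remaining = len(ordered) - i
        # Upper bound on the total weight of all conformations not yet minimized.
        p = remaining * math.exp(-bound / RT)
        # q <= q_star + p, so this test implies q_star >= (1 - epsilon) * q.
        if epsilon * q_star >= (1.0 - epsilon) * p:
            return q_star, i
        q_star += math.exp(-energy / RT)
    return q_star, len(ordered)

# Hypothetical (lower bound, minimized energy) pairs; not from the paper.
confs = [(-10.0, -9.5), (-9.0, -8.8), (-5.0, -4.0), (-4.0, -3.9), (-1.0, -0.5)]
q_star, n_minimized = epsilon_approx_partition_function(confs, epsilon=0.1)
```

In this toy input only a few conformations need minimizing before the guarantee holds. The difficulty the abstract describes arises when many conformations have nearly equal energies, so the bound on the remainder shrinks slowly.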


Subjects
Computational Biology; Protein Conformation; Proteins/genetics; Software; Algorithms; Amino Acid Sequence/genetics; Entropy; Models, Molecular; Protein Domains/genetics; Proteins/ultrastructure; Thermodynamics
12.
J Phys Chem B ; 123(49): 10441-10455, 2019 12 12.
Article in English | MEDLINE | ID: mdl-31697075

ABSTRACT

The CFTR-associated ligand PDZ domain (CALP) binds to the cystic fibrosis transmembrane conductance regulator (CFTR) and mediates lysosomal degradation of mature CFTR. Inhibition of this interaction has been explored as a therapeutic avenue for cystic fibrosis. Previously, we reported the ensemble-based computational design of a novel peptide inhibitor of CALP, which resulted in the most binding-efficient inhibitor to date. This inhibitor, kCAL01, was designed using osprey and evinced significant biological activity in in vitro cell-based assays. Here, we report a crystal structure of kCAL01 bound to CALP and compare structural features against iCAL36, a previously developed inhibitor of CALP. We compute side-chain energy landscapes for each structure to not only enable approximation of binding thermodynamics but also reveal ensemble features that contribute to the comparatively efficient binding of kCAL01. Finally, we compare the previously reported design ensemble for kCAL01 vs the new crystal structure and show that, despite small differences between the design model and crystal structure, significant biophysical features that enhance inhibitor binding are captured in the design ensemble. This suggests not only that ensemble-based design captured thermodynamically significant features observed in vitro, but also that a design eschewing ensembles would miss the kCAL01 sequence entirely.


Subjects
Cystic Fibrosis Transmembrane Conductance Regulator/antagonists & inhibitors; Peptides/pharmacology; Thermodynamics; Binding Sites/drug effects; Cystic Fibrosis/drug therapy; Cystic Fibrosis/metabolism; Cystic Fibrosis Transmembrane Conductance Regulator/metabolism; Humans; Ligands; Models, Molecular; Peptides/chemical synthesis; Peptides/chemistry
13.
ACS Infect Dis ; 5(11): 1896-1906, 2019 11 08.
Article in English | MEDLINE | ID: mdl-31565920

ABSTRACT

The spread of plasmid borne resistance enzymes in clinical Staphylococcus aureus isolates is rendering trimethoprim and iclaprim, both inhibitors of dihydrofolate reductase (DHFR), ineffective. Continued exploitation of these targets will require compounds that can broadly inhibit these resistance-conferring isoforms. Using a structure-based approach, we have developed a novel class of ionized nonclassical antifolates (INCAs) that capture the molecular interactions that have been exclusive to classical antifolates. These modifications allow for a greatly expanded spectrum of activity across these pathogenic DHFR isoforms, while maintaining the ability to penetrate the bacterial cell wall. Using biochemical, structural, and computational methods, we are able to optimize these inhibitors to the conserved active sites of the endogenous and trimethoprim resistant DHFR enzymes. Here, we report a series of INCA compounds that exhibit low nanomolar enzymatic activity and potent cellular activity with human selectivity against a panel of clinically relevant TMP resistant (TMPR) and methicillin resistant Staphylococcus aureus (MRSA) isolates.


Subjects
Anti-Bacterial Agents/pharmacology; Bacterial Proteins/antagonists & inhibitors; Folic Acid Antagonists/chemistry; Methicillin-Resistant Staphylococcus aureus/enzymology; Staphylococcal Infections/microbiology; Tetrahydrofolate Dehydrogenase/chemistry; Trimethoprim/pharmacology; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Catalytic Domain; Folic Acid Antagonists/pharmacology; Humans; Methicillin-Resistant Staphylococcus aureus/drug effects; Methicillin-Resistant Staphylococcus aureus/genetics; Microbial Sensitivity Tests; Tetrahydrofolate Dehydrogenase/genetics; Tetrahydrofolate Dehydrogenase/metabolism
14.
Commun ACM ; 62(10): 76-84, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31607753
15.
J Comput Chem ; 39(30): 2494-2507, 2018 11 15.
Article in English | MEDLINE | ID: mdl-30368845

ABSTRACT

We present osprey 3.0, a new and greatly improved release of the osprey protein design software. Osprey 3.0 features a convenient new Python interface, which greatly improves its ease of use. It is over two orders of magnitude faster than previous versions of osprey when running the same algorithms on the same hardware. Moreover, osprey 3.0 includes several new algorithms, which introduce substantial speedups as well as improved biophysical modeling. It also includes GPU support, which provides an additional speedup of over an order of magnitude. Like previous versions of osprey, osprey 3.0 offers a unique package of advantages over other design software, including provable design algorithms that account for continuous flexibility during design and model conformational entropy. Finally, we show here empirically that osprey 3.0 accurately predicts the effect of mutations on protein-protein binding. Osprey 3.0 is available at http://www.cs.duke.edu/donaldlab/osprey.php as free and open-source software. © 2018 Wiley Periodicals, Inc.


Subjects
Protein Conformation; Proteins/chemistry; Software; Algorithms; Models, Molecular; Protein Binding
16.
J Mol Biol ; 430(18 Pt B): 3412-3426, 2018 09 14.
Article in English | MEDLINE | ID: mdl-29924964

ABSTRACT

The flexibility of biological macromolecules is an important structural determinant of function. Unfortunately, the correlations between different motional modes are poorly captured by discrete ensemble representations. Here, we present new ways to both represent and visualize correlated interdomain motions. Interdomain motions are determined directly from residual dipolar couplings, represented as a continuous conformational distribution, and visualized using the disk-on-sphere representation. Using the disk-on-sphere representation, features of interdomain motions, including correlations, are intuitively visualized. The representation works especially well for multidomain systems with broad conformational distributions. This analysis also can be extended to multiple probability density modes, using a Bingham mixture model. We use this new paradigm to study the interdomain motions of staphylococcal protein A, which is a key virulence factor contributing to the pathogenicity of Staphylococcus aureus. We capture the smooth transitions between important states and demonstrate the utility of continuous distribution functions for computing the reorientational components of binding thermodynamics. Such insights allow for the dissection of the dynamic structural components of functionally important intermolecular interactions.


Subjects
Models, Molecular; Protein Conformation; Protein Interaction Domains and Motifs; Proteins/chemistry; Thermodynamics; Nuclear Magnetic Resonance, Biomolecular; Staphylococcal Protein A/chemistry
17.
J Comput Biol ; 25(7): 726-739, 2018 07.
Article in English | MEDLINE | ID: mdl-29641249

ABSTRACT

Computational protein design (CPD) algorithms that compute binding affinity, Ka, search for sequences with an energetically favorable free energy of binding. Recent work shows that three principles improve the biological accuracy of CPD: ensemble-based design, continuous flexibility of backbone and side-chain conformations, and provable guarantees of accuracy with respect to the input. However, previous methods that use all three design principles are single-sequence (SS) algorithms, which are very costly: linear in the number of sequences and thus exponential in the number of simultaneously mutable residues. To address this computational challenge, we introduce BBK*, a new CPD algorithm whose key innovation is the multisequence (MS) bound: BBK* efficiently computes a single provable upper bound to approximate Ka for a combinatorial number of sequences, and avoids SS computation for all provably suboptimal sequences. Thus, to our knowledge, BBK* is the first provable, ensemble-based CPD algorithm to run in time sublinear in the number of sequences. Computational experiments on 204 protein design problems show that BBK* finds the tightest binding sequences while approximating Ka for up to $10^5$-fold fewer sequences than the previous state-of-the-art algorithms, which require exhaustive enumeration of sequences. Furthermore, for 51 protein-ligand design problems, BBK* provably approximates Ka up to 1982-fold faster than the previous state-of-the-art iMinDEE/A*/K* algorithm. Therefore, BBK* not only accelerates protein designs that are possible with previous provable algorithms, but also efficiently performs designs that are too large for previous methods.


Subjects
Computational Biology/methods; Protein Conformation; Proteins/chemistry; Software; Algorithms; Amino Acid Sequence/genetics; Entropy; Humans; Models, Molecular
18.
Bioinformatics ; 33(14): i5-i12, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28882005

ABSTRACT

MOTIVATION: When proteins mutate or bind to ligands, their backbones often move significantly, especially in loop regions. Computational protein design algorithms must model these motions in order to accurately optimize protein stability and binding affinity. However, methods for backbone conformational search in design have been much more limited than for sidechain conformational search. This is especially true for combinatorial protein design algorithms, which aim to search a large sequence space efficiently and thus cannot rely on temporal simulation of each candidate sequence. RESULTS: We alleviate this difficulty with a new parameterization of backbone conformational space, which represents all degrees of freedom of a specified segment of protein chain that maintain valid bonding geometry (by maintaining the original bond lengths and angles and ω dihedrals). In order to search this space, we present an efficient algorithm, CATS, for computing atomic coordinates as a function of our new continuous backbone internal coordinates. CATS generalizes the iMinDEE and EPIC protein design algorithms, which model continuous flexibility in sidechain dihedrals, to model continuous, appropriately localized flexibility in the backbone dihedrals ϕ and ψ as well. We show using 81 test cases based on 29 different protein structures that CATS finds sequences and conformations that are significantly lower in energy than methods with less or no backbone flexibility do. In particular, we show that CATS can model the viability of an antibody mutation known experimentally to increase affinity, but that appears sterically infeasible when modeled with less or no backbone flexibility. AVAILABILITY AND IMPLEMENTATION: Our code is available as free software at https://github.com/donaldlab/OSPREY_refactor . CONTACT: mhallen@ttic.edu or brd+ismb17@cs.duke.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods; Proteins/chemistry; Software; Algorithms; Models, Molecular; Mutation; Protein Conformation; Proteins/genetics
19.
PLoS Comput Biol ; 13(3): e1005346, 2017 03.
Article in English | MEDLINE | ID: mdl-28358804

ABSTRACT

Protein design algorithms enumerate a combinatorial number of candidate structures to compute the Global Minimum Energy Conformation (GMEC). To efficiently find the GMEC, protein design algorithms must methodically reduce the conformational search space. By applying distance and energy cutoffs, the protein system to be designed can thus be represented using a sparse residue interaction graph, where the number of interacting residue pairs is less than all pairs of mutable residues, and the corresponding GMEC is called the sparse GMEC. However, ignoring some pairwise residue interactions can lead to a change in the energy, conformation, or sequence of the sparse GMEC vs. the original or the full GMEC. Despite the widespread use of sparse residue interaction graphs in protein design, the above mentioned effects of their use have not been previously analyzed. To analyze the costs and benefits of designing with sparse residue interaction graphs, we computed the GMECs for 136 different protein design problems both with and without distance and energy cutoffs, and compared their energies, conformations, and sequences. Our analysis shows that the differences between the GMECs depend critically on whether or not the design includes core, boundary, or surface residues. Moreover, neglecting long-range interactions can alter local interactions and introduce large sequence differences, both of which can result in significant structural and functional changes. Designs on proteins with experimentally measured thermostability show it is beneficial to compute both the full and the sparse GMEC accurately and efficiently. To this end, we show that a provable, ensemble-based algorithm can efficiently compute both GMECs by enumerating a small number of conformations, usually fewer than 1000. 
This provides a novel way to combine sparse residue interaction graphs with provable, ensemble-based algorithms to reap the benefits of sparse residue interaction graphs while avoiding their potential inaccuracies.
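
Illustrative note: a sparse residue interaction graph keeps only the residue pairs that pass distance and energy cutoffs. The sketch below shows one plausible way to apply such cutoffs and is not taken from the paper. The function name, the input format, the cutoff values, and the choice to require both conditions at once are assumptions made for illustration.

```python
def sparse_interaction_graph(pairs, distance_cutoff, energy_cutoff):
    """pairs: dict mapping (i, j) to (closest_distance, largest_abs_energy).
    Keep a pair only if the residues are close enough AND interact strongly enough."""
    return {
        pair
        for pair, (distance, energy) in pairs.items()
        if distance <= distance_cutoff and abs(energy) >= energy_cutoff
    }

# Hypothetical residue pairs: (distance in angstroms, energy in kcal/mol).
pairs = {
    (0, 1): (3.5, 2.0),   # close and strongly interacting
    (0, 2): (9.0, 0.4),   # too far apart
    (1, 2): (4.0, 0.05),  # close but energetically negligible
}
edges = sparse_interaction_graph(pairs, distance_cutoff=8.0, energy_cutoff=0.1)
dropped = set(pairs) - edges
```

Every dropped pair is an interaction the sparse model ignores, which is the source of the energy, conformation, and sequence differences the abstract analyzes.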


Subjects
Algorithms; Proteins/chemistry; Amino Acid Sequence; Animals; Computational Biology; Computer Graphics; Humans; Models, Molecular; Protein Conformation; Protein Engineering; Proteins/genetics; Software; Thermodynamics
20.
Methods Mol Biol ; 1529: 265-277, 2017.
Article in English | MEDLINE | ID: mdl-27914056

ABSTRACT

Computational structure-based protein design (CSPD) is an important problem in computational biology, which aims to design or improve a prescribed protein function based on a protein structure template. It provides a practical tool for real-world protein engineering applications. A popular CSPD method that is guaranteed to find the global minimum energy conformation (GMEC) combines the dead-end elimination (DEE) and A* tree search algorithms. However, in this framework, the A* search algorithm can run in exponential time in the worst case, which may become the computational bottleneck of a large-scale computational protein design process. To address this issue, we extend and add a new module to the OSPREY program that was previously developed in the Donald lab (Gainza et al., Methods Enzymol 523:87, 2013) to implement a GPU-based massively parallel A* algorithm for improving the protein design pipeline. By exploiting the modern GPU computational framework and optimizing the computation of the heuristic function for A* search, our new program, called gOSPREY, can provide up to four orders of magnitude speedups in large protein design cases with a small memory overhead compared to the traditional A* search algorithm implementation, while still guaranteeing optimality. In addition, gOSPREY can be configured to run in a bounded-memory mode to tackle problems in which the conformation space is too large and the global optimal solution could not previously be computed. Furthermore, the GPU-based A* algorithm implemented in the gOSPREY program can be combined with state-of-the-art rotamer pruning algorithms such as iMinDEE (Gainza et al., PLoS Comput Biol 8:e1002335, 2012) and DEEPer (Hallen et al., Proteins 81:18-39, 2013) to also consider continuous backbone and side-chain flexibility.
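
Illustrative note: the abstract relies on A* tree search over rotamer assignments to find the GMEC. The sketch below is a small serial, CPU-only A* over a pairwise-decomposable energy function; it is not gOSPREY and omits DEE pruning and GPU parallelism. The function name `astar_gmec`, the data layout, and the toy energies are assumptions for illustration. The heuristic is a standard admissible lower bound: for each unassigned position, the best rotamer's self energy, plus its pairwise energies with assigned positions, plus the minimum pairwise energy with each later unassigned position.

```python
import heapq

def astar_gmec(self_e, pair_e):
    """self_e[i][r]: self energy of rotamer r at position i.
    pair_e[(i, j)][r][s]: pairwise energy of rotamers r, s at positions i < j.
    Returns (energy, assignment) for the global minimum energy conformation."""
    n = len(self_e)

    def g(assign):
        # Exact energy of the partial assignment (positions 0 .. len(assign) - 1).
        k = len(assign)
        total = sum(self_e[i][assign[i]] for i in range(k))
        total += sum(pair_e[(i, j)][assign[i]][assign[j]]
                     for i in range(k) for j in range(i + 1, k))
        return total

    def h(assign):
        # Admissible lower bound on the energy contributed by unassigned positions.
        k = len(assign)
        total = 0.0
        for j in range(k, n):
            total += min(
                self_e[j][s]
                + sum(pair_e[(i, j)][assign[i]][s] for i in range(k))
                + sum(min(pair_e[(j, m)][s][t] for t in range(len(self_e[m])))
                      for m in range(j + 1, n))
                for s in range(len(self_e[j])))
        return total

    heap = [(h(()), ())]
    while heap:
        f, assign = heapq.heappop(heap)
        if len(assign) == n:
            return f, assign  # first complete node popped is optimal
        j = len(assign)
        for s in range(len(self_e[j])):
            child = assign + (s,)
            heapq.heappush(heap, (g(child) + h(child), child))

# Toy problem: 3 positions with 2 rotamers each (made-up energies).
self_e = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
pair_e = {
    (0, 1): [[0.0, 2.0], [-1.5, 0.0]],
    (0, 2): [[0.1, 0.0], [0.0, -0.2]],
    (1, 2): [[-0.5, 0.3], [0.4, 0.0]],
}
energy, assignment = astar_gmec(self_e, pair_e)
```

The priority queue can grow exponentially in the worst case, which is the bottleneck the abstract targets with parallel expansion and a bounded-memory mode.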


Subjects
Computational Biology/methods; Protein Engineering/methods; Proteins; Algorithms; Amino Acid Sequence; Computer Simulation; Models, Molecular; Protein Conformation; Proteins/chemistry; Proteins/genetics; Reproducibility of Results; Software; Structure-Activity Relationship