Results 1 - 10 of 10
1.
Protein Sci ; 33(2): e4862, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38148272

ABSTRACT

Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and reranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, for example, structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multitrack iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments, GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. On the Database of Interacting Protein Structures (DIPS) test set, GeoDock achieves a 43% top-1 success rate, outperforming all other tested methods. However, in the standard DIPS train/test splits, we discovered contamination of close homologs in the training set. After decontaminating the training set, the success rate is 31%. On the DB5.5 test set and a benchmark dataset of antibody-antigen complexes, GeoDock outperforms the deep learning models trained using the same dataset but falls behind most of the conventional methods and AlphaFold-Multimer. GeoDock attains an average inference speed of under 1 s on a single GPU, enabling its application in large-scale structure screening. 
Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.


Subjects
Algorithms , Proteins , Salicylates , Protein Conformation , Protein Binding , Proteins/chemistry , Molecular Docking Simulation
2.
Proteins ; 91(12): 1658-1683, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37905971

ABSTRACT

We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homo-trimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integrating advanced modeling tools enabled top performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.
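The abstract above refers to the DockQ score, which consolidates the CAPRI quality measures into a single number. The following is a minimal Python sketch of that consolidation, assuming the commonly cited DockQ definition (the mean of Fnat and two scaled RMSD terms, with scaling constants 1.5 Å and 8.5 Å) and the usual quality bands at 0.23, 0.49 and 0.80. The function names are illustrative and are not taken from the paper or from any DockQ software.

```python
def dockq(fnat: float, irmsd: float, lrmsd: float) -> float:
    """Combine Fnat, interface RMSD and ligand RMSD (both in angstroms) into one score.

    Assumes the commonly cited DockQ form: each RMSD is mapped to (0, 1]
    with an inverse-square scaling, then the three terms are averaged.
    """
    irmsd_term = 1.0 / (1.0 + (irmsd / 1.5) ** 2)
    lrmsd_term = 1.0 / (1.0 + (lrmsd / 8.5) ** 2)
    return (fnat + irmsd_term + lrmsd_term) / 3.0


def quality_from_dockq(score: float) -> str:
    """Map a DockQ score to a CAPRI-style quality label (assumed cutoffs)."""
    if score >= 0.80:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"


if __name__ == "__main__":
    # Made-up values for illustration only.
    s = dockq(fnat=0.45, irmsd=2.1, lrmsd=6.0)
    print(round(s, 3), quality_from_dockq(s))
```

A perfect model (Fnat of 1 and both RMSDs of 0) scores exactly 1.0 under this form, which makes it a convenient sanity check.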


Subjects
Algorithms , Protein Interaction Mapping , Protein Interaction Mapping/methods , Protein Conformation , Protein Binding , Molecular Docking Simulation , Computational Biology/methods , Software
3.
bioRxiv ; 2023 Nov 25.
Article in English | MEDLINE | ID: mdl-37546760

ABSTRACT

Despite the recent breakthrough of AlphaFold (AF) in the field of protein sequence-to-structure prediction, modeling protein interfaces and predicting protein complex structures remains challenging, especially when there is a significant conformational change in one or both binding partners. Prior studies have demonstrated that AF-multimer (AFm) can predict accurate protein complexes in only up to 43% of cases [1]. In this work, we combine AlphaFold as a structural template generator with a physics-based replica exchange docking algorithm. Using a curated collection of 254 available protein targets with both unbound and bound structures, we first demonstrate that AlphaFold confidence measures (pLDDT) can be repurposed for estimating protein flexibility and docking accuracy for multimers. We incorporate these metrics within our ReplicaDock 2.0 protocol [2] to complete a robust in-silico pipeline for accurate protein complex structure prediction. AlphaRED (AlphaFold-initiated Replica Exchange Docking) successfully docks failed AF predictions including 97 failure cases in Docking Benchmark Set 5.5. AlphaRED generates CAPRI acceptable-quality or better predictions for 66% of benchmark targets. Further, on a subset of antigen-antibody targets, which is challenging for AFm (19% success rate), AlphaRED demonstrates a success rate of 51%. This new strategy demonstrates the success possible by integrating deep-learning-based architectures trained on evolutionary information with physics-based enhanced sampling. The pipeline is available at github.com/Graylab/AlphaRED.

4.
bioRxiv ; 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37425754

ABSTRACT

Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and re-ranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, e.g., structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multi-track iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments (MSAs), GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. For a benchmark set of rigid targets, GeoDock obtains a 41% success rate, outperforming all the other tested methods. For a more challenging benchmark set of flexible targets, GeoDock achieves a similar number of top-model successes as the traditional method ClusPro [1], but fewer than ReplicaDock2 [2]. GeoDock attains an average inference speed of under one second on a single GPU, enabling its application in large-scale structure screening. Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.

5.
MAbs ; 15(1): 2163584, 2023.
Article in English | MEDLINE | ID: mdl-36683173

ABSTRACT

Over the last three decades, the appeal of monoclonal antibodies (mAbs) as therapeutics has been steadily increasing, as evidenced by the FDA's recent landmark approval of the 100th mAb. Unlike mAbs that bind to single targets, multispecific biologics (msAbs) have garnered particular interest owing to the advantage of engaging distinct targets. One important modular component of msAbs is the single-chain variable fragment (scFv). Despite the exquisite specificity and affinity of these scFv modules, their relatively poor thermostability often hampers their development as a potential therapeutic drug. In recent years, engineering antibody sequences to enhance their stability by mutations has gained considerable momentum. As experimental methods for antibody engineering are time-intensive, laborious and expensive, computational methods serve as a fast and inexpensive alternative to conventional routes. In this work, we show two machine learning approaches to better classify thermostable scFv variants from sequence: one using pre-trained language models (PTLM) that capture functional effects of sequence variation, and a second using a supervised convolutional neural network (CNN) trained with Rosetta energetic features. Both of these models are trained over temperature-specific data (TS50 measurements) derived from multiple libraries of scFv sequences. On out-of-distribution sequences (that is, sequences the algorithm is blind to), we show that a sufficiently simple CNN model performs better than general pre-trained language models trained on diverse protein sequences (average Spearman correlation coefficient, ρ, of 0.4 as opposed to 0.15). On the other hand, an antibody-specific language model performs comparatively better than the CNN model on the same task (ρ = 0.52). 
Further, we demonstrate that for an independent mAb with available thermal melting temperatures for 20 experimentally characterized thermostable mutations, these models trained on TS50 data could identify 18 residue positions and 5 identical amino-acid mutations showing remarkable generalizability. Our results suggest that such models can be broadly applicable for improving the biological characteristics of antibodies. Further, transferring such models for alternative physicochemical properties of scFvs can have potential applications in optimizing large-scale production and delivery of mAbs or bsAbs.
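The abstract above compares models by their Spearman correlation coefficient ρ between predicted and measured thermostability. As a minimal sketch of how that metric can be computed, the Python below ranks both lists (giving tied values their average rank) and takes the Pearson correlation of the ranks. The TS50 values and model scores in the usage section are made up for illustration and are not data from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    measured_ts50 = [58.1, 61.4, 55.0, 66.2, 60.3]  # hypothetical values, degrees C
    model_scores = [0.31, 0.52, 0.35, 0.90, 0.47]   # hypothetical model outputs
    print(spearman_rho(measured_ts50, model_scores))
```

Because ρ depends only on rank order, it rewards a model for sorting variants correctly even when its raw scores are not on the TS50 scale.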


Subjects
Monoclonal Antibodies , Single-Chain Antibodies , Amino Acid Sequence , Machine Learning , Algorithms
6.
PLoS Comput Biol ; 18(6): e1010124, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35658008

ABSTRACT

Despite the progress in prediction of protein complexes over the last decade, recent blind protein complex structure prediction challenges revealed limited success rates (less than 20% models with DockQ score > 0.4) on targets that exhibit significant conformational change upon binding. To overcome limitations in capturing backbone motions, we developed a new, aggressive sampling method that incorporates temperature replica exchange Monte Carlo (T-REMC) and conformational sampling techniques within docking protocols in Rosetta. Our method, ReplicaDock 2.0, mimics the induced-fit mechanism of protein binding to sample backbone motions across putative interface residues on-the-fly, thereby recapitulating binding-partner induced conformational changes. Furthermore, ReplicaDock 2.0 clocks in at 150-500 CPU hours per target (protein-size dependent), a runtime that is significantly faster than molecular dynamics-based approaches. For a benchmark set of 88 proteins with moderate to high flexibility (unbound-to-bound iRMSD over 1.2 Å), ReplicaDock 2.0 successfully docks 61% of moderately flexible complexes and 35% of highly flexible complexes. Additionally, we demonstrate that by biasing backbone sampling particularly towards residues comprising flexible loops or hinge domains, highly flexible targets can be predicted to under 2 Å accuracy. This indicates that additional gains are possible when mobile protein segments are known.
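The abstract above builds on temperature replica exchange Monte Carlo (T-REMC). The Python sketch below illustrates the general technique on a toy one-dimensional energy function: each replica runs Metropolis moves at its own temperature, and neighbouring replicas occasionally swap states with probability min(1, exp((β_i − β_j)(E_i − E_j))). This is a generic textbook illustration, not the ReplicaDock 2.0 protocol; the energy function, temperatures, step size and function names are all assumptions made for the example.

```python
import math
import random


def energy(x: float) -> float:
    """Toy double-well energy; a stand-in for a docking score, not a Rosetta energy."""
    return (x * x - 1.0) ** 2


def swap_probability(beta_i: float, beta_j: float, e_i: float, e_j: float) -> float:
    """Metropolis acceptance probability for exchanging the states of two replicas."""
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def replica_exchange(temperatures, n_sweeps=2000, step=0.5, seed=0):
    """Run a toy T-REMC simulation and return the final coordinate of each replica."""
    rng = random.Random(seed)
    betas = [1.0 / t for t in temperatures]
    xs = [-1.0] * len(temperatures)  # every replica starts in the left well
    for _ in range(n_sweeps):
        # One Metropolis move per replica at that replica's temperature.
        for r, beta in enumerate(betas):
            trial = xs[r] + rng.uniform(-step, step)
            d_e = energy(trial) - energy(xs[r])
            if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
                xs[r] = trial
        # Attempt one swap between a randomly chosen pair of neighbouring replicas.
        r = rng.randrange(len(betas) - 1)
        p = swap_probability(betas[r], betas[r + 1], energy(xs[r]), energy(xs[r + 1]))
        if rng.random() < p:
            xs[r], xs[r + 1] = xs[r + 1], xs[r]
    return xs


if __name__ == "__main__":
    print(replica_exchange([0.05, 0.2, 1.0]))
```

The hot replicas cross the barrier between wells easily, and swaps let those barrier-crossing states migrate down to the cold replica, which is the sampling benefit that replica exchange is designed to provide.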


Subjects
Benchmarking , Proteins , Monte Carlo Method , Protein Binding , Protein Conformation , Proteins/chemistry
7.
Nat Commun ; 12(1): 6947, 2021 Nov 29.
Article in English | MEDLINE | ID: mdl-34845212

ABSTRACT

Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.


Subjects
Macromolecular Substances/chemistry , Molecular Docking Simulation , Proteins/chemistry , Software/standards , Benchmarking , Binding Sites , Humans , Ligands , Macromolecular Substances/metabolism , Protein Binding , Proteins/metabolism , Reproducibility of Results
8.
mBio ; 12(5): e0178721, 2021 Oct 26.
Article in English | MEDLINE | ID: mdl-34544275

ABSTRACT

Colicins are protein antibiotics deployed by Escherichia coli to eliminate competing strains. Colicins frequently exploit outer membrane (OM) nutrient transporters to penetrate the selectively permeable bacterial cell envelope. Here, by applying live-cell fluorescence imaging, we were able to monitor the entry of the pore-forming toxin colicin B (ColB) into E. coli and localize it within the periplasm. We further demonstrate that single-stranded DNA coupled to ColB can also be transported to the periplasm, emphasizing that the import routes of colicins can be exploited to carry large cargo molecules into bacteria. Moreover, we characterize the molecular mechanism of ColB association with its OM receptor FepA by applying a combination of photoactivated cross-linking, mass spectrometry, and structural modeling. We demonstrate that complex formation is coincident with large-scale conformational changes in the colicin. Thereafter, active transport of ColB through FepA involves the colicin taking the place of the N-terminal half of the plug domain that normally occludes this iron transporter. IMPORTANCE Decades of excessive use of readily available antibiotics have generated a global problem of antibiotic resistance and, hence, an urgent need for novel antibiotic solutions. Bacteriocins are protein-based antibiotics produced by bacteria to eliminate closely related competing bacterial strains. Bacteriocin toxins have evolved to bypass the complex cell envelope in order to kill bacterial cells. Here, we uncover the cellular penetration mechanism of a well-known but poorly understood bacteriocin called colicin B that is active against Escherichia coli. Moreover, we demonstrate that the colicin B-import pathway can be exploited to deliver conjugated DNA cargo into bacterial cells. 
Our work leads to a better understanding of the way bacteriocins, as potential alternative antibiotics, execute their mode of action as well as highlighting how they might even be exploited in the genomic manipulation of Gram-negative bacteria.


Subjects
Bacterial Outer Membrane Proteins/metabolism , Biological Transport/drug effects , Carrier Proteins/metabolism , Colicins/pharmacology , DNA/metabolism , Iron/metabolism , Cell Surface Receptors/metabolism , Anti-Bacterial Agents/metabolism , Bacterial Outer Membrane Proteins/genetics , Bacteriocins/genetics , Carrier Proteins/genetics , Cell Membrane/metabolism , Colicins/chemistry , Colicins/genetics , Escherichia coli/genetics , Escherichia coli Proteins/metabolism , Membrane Transport Proteins/metabolism , Molecular Models , Periplasm/metabolism , Periplasmic Proteins/metabolism , Protein Conformation , Protein Transport , Cell Surface Receptors/genetics
9.
Proteins ; 89(12): 1800-1823, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34453465

ABSTRACT

We present the results for CAPRI Round 50, the fourth joint CASP-CAPRI protein assembly prediction challenge. The Round comprised a total of twelve targets, including six dimers, three trimers, and three higher-order oligomers. Four of these were easy targets, for which good structural templates were available either for the full assembly, or for the main interfaces (of the higher-order oligomers). Eight were difficult targets for which only distantly related templates were found for the individual subunits. Twenty-five CAPRI groups, including eight automatic servers, submitted ~1250 models per target. Twenty groups, including six servers, participated in the CAPRI scoring challenge and submitted ~190 models per target. The accuracy of the predicted models was evaluated using the classical CAPRI criteria. The prediction performance was measured by a weighted scoring scheme that takes into account the number of models of acceptable quality or higher submitted by each group as part of their five top-ranking models. Compared to the previous CASP-CAPRI challenge, top performing groups submitted such models for a larger fraction (70-75%) of the targets in this Round, but fewer of these models were of high accuracy. Scorer groups achieved stronger performance, with more groups submitting correct models for 70-80% of the targets or achieving high accuracy predictions. Servers performed less well in general, except for the MDOCKPP and LZERD servers, which performed on par with human groups. In addition to these results, major advances in methodology are discussed, providing an informative overview of where the prediction of protein assemblies currently stands.
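The abstract above evaluates models with the classical CAPRI criteria and counts, per group, the targets with an acceptable-or-better model among the five top-ranking submissions. The Python sketch below illustrates both ideas under stated assumptions: the thresholds on Fnat, ligand RMSD and interface RMSD are the commonly quoted protein-protein values, and the per-target count is a deliberate simplification that does not reproduce the weighted scoring scheme, whose weights the abstract does not give. All function names are hypothetical.

```python
def capri_quality(fnat: float, lrmsd: float, irmsd: float) -> str:
    """Classify one model using commonly quoted CAPRI thresholds (RMSDs in angstroms)."""
    if fnat >= 0.5 and (lrmsd <= 1.0 or irmsd <= 1.0):
        return "high"
    if fnat >= 0.3 and (lrmsd <= 5.0 or irmsd <= 2.0):
        return "medium"
    if fnat >= 0.1 and (lrmsd <= 10.0 or irmsd <= 4.0):
        return "acceptable"
    return "incorrect"


def targets_with_success(models_by_target: dict, top_n: int = 5) -> int:
    """Count targets whose first top_n models include an acceptable-or-better one.

    models_by_target maps a target name to a ranked list of
    (fnat, lrmsd, irmsd) tuples, best-ranked model first.
    """
    count = 0
    for models in models_by_target.values():
        labels = [capri_quality(*m) for m in models[:top_n]]
        if any(label != "incorrect" for label in labels):
            count += 1
    return count


if __name__ == "__main__":
    # Made-up submissions for two hypothetical targets.
    submissions = {
        "target_A": [(0.05, 15.0, 8.0), (0.42, 4.1, 1.8)],
        "target_B": [(0.02, 22.0, 11.0)],
    }
    print(targets_with_success(submissions))
```

Checking the quality tiers from strictest to loosest means a model that misses the high-quality RMSD cutoffs falls through to the medium or acceptable tier rather than being rejected outright.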


Subjects
Computational Biology/methods , Molecular Models , Proteins , Software , Binding Sites , Molecular Docking Simulation , Protein Interaction Domains and Motifs , Proteins/chemistry , Proteins/metabolism , Protein Sequence Analysis
10.
Curr Opin Struct Biol ; 67: 178-186, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33360497

ABSTRACT

Computational docking methods can provide structural models of protein-protein complexes, but protein backbone flexibility upon association often thwarts accurate predictions. In recent blind challenges, medium or high accuracy models were submitted in less than 20% of the 'difficult' targets (with significant backbone change or uncertainty). Here, we describe recent developments in protein-protein docking and highlight advances that tackle backbone flexibility. In molecular dynamics and Monte Carlo approaches, enhanced sampling techniques have reduced time-scale limitations. Internal coordinate formulations can now capture realistic motions of monomers and complexes using harmonic dynamics. And machine learning approaches adaptively guide docking trajectories or generate novel binding site predictions from deep neural networks trained on protein interfaces. These tools poise the field to break through the longstanding challenge of correctly predicting complex structures with significant conformational change.


Subjects
Algorithms , Protein Binding , Proteins , Binding Sites , Molecular Docking Simulation , Monte Carlo Method , Protein Conformation , Proteins/metabolism , Thermodynamics