Search | Secretaria de Estado da Saúde

1.

De novo design of protein structure and function with RFdiffusion.

Watson, Joseph L; Juergens, David; Bennett, Nathaniel R; Trippe, Brian L; Yim, Jason; Eisenach, Helen E; Ahern, Woody; Borst, Andrew J; Ragotte, Robert J; Milles, Lukas F; Wicky, Basile I M; Hanikel, Nikita; Pellock, Samuel J; Courbet, Alexis; Sheffler, William; Wang, Jue; Venkatesh, Preetham; Sappington, Isaac; Torres, Susana Vázquez; Lauko, Anna; De Bortoli, Valentin; Mathieu, Emile; Ovchinnikov, Sergey; Barzilay, Regina; Jaakkola, Tommi S; DiMaio, Frank; Baek, Minkyung; Baker, David.

Nature ; 620(7976): 1089-1100, 2023 Aug.

Article in English | MEDLINE | ID: mdl-37433327

ABSTRACT

There has been considerable recent progress in designing new proteins using deep-learning methods [1-9]. Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models [10,11] have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications.

Subjects

Deep Learning; Proteins; Catalytic Domain; Cryoelectron Microscopy; Hemagglutinin Glycoproteins, Influenza Virus/chemistry; Hemagglutinin Glycoproteins, Influenza Virus/metabolism; Hemagglutinin Glycoproteins, Influenza Virus/ultrastructure; Protein Binding; Proteins/chemistry; Proteins/metabolism; Proteins/ultrastructure

2.

Accurate prediction of protein-nucleic acid complexes using RoseTTAFoldNA.

Baek, Minkyung; McHugh, Ryan; Anishchenko, Ivan; Jiang, Hanlun; Baker, David; DiMaio, Frank.

Nat Methods ; 21(1): 117-121, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37996753

ABSTRACT

Protein-RNA and protein-DNA complexes play critical roles in biology. Despite considerable recent advances in protein structure prediction, the prediction of the structures of protein-nucleic acid complexes without homology to known complexes is a largely unsolved problem. Here we extend the RoseTTAFold machine learning protein-structure-prediction approach to additionally predict nucleic acid and protein-nucleic acid complexes. We develop a single trained network, RoseTTAFoldNA, that rapidly produces three-dimensional structure models with confidence estimates for protein-DNA and protein-RNA complexes. Here we show that confident predictions have considerably higher accuracy than current state-of-the-art methods. RoseTTAFoldNA should be broadly useful for modeling the structure of naturally occurring protein-nucleic acid complexes, and for designing sequence-specific RNA and DNA-binding proteins.

Subjects

Nucleic Acids; RNA/chemistry; DNA-Binding Proteins/chemistry; DNA/chemistry

3.

Democratizing protein language models with parameter-efficient fine-tuning.

Sledzieski, Samuel; Kshirsagar, Meghana; Baek, Minkyung; Dodhia, Rahul; Lavista Ferres, Juan; Berger, Bonnie.

Proc Natl Acad Sci U S A ; 121(26): e2405840121, 2024 Jun 25.

Article in English | MEDLINE | ID: mdl-38900798

ABSTRACT

Proteomics has been revolutionized by large protein language models (PLMs), which learn unsupervised representations from large corpora of sequences. These models are typically fine-tuned in a supervised setting to adapt the model to specific downstream tasks. However, the computational and memory footprint of fine-tuning (FT) large PLMs presents a barrier for many research groups with limited computational resources. Natural language processing has seen a similar explosion in the size of models, where these challenges have been addressed by methods for parameter-efficient fine-tuning (PEFT). In this work, we introduce this paradigm to proteomics through leveraging the parameter-efficient method LoRA and training new models for two important tasks: predicting protein-protein interactions (PPIs) and predicting the symmetry of homooligomer quaternary structures. We show that these approaches are competitive with traditional FT while requiring reduced memory and substantially fewer parameters. We additionally show that for the PPI prediction task, training only the classification head also remains competitive with full FT, using five orders of magnitude fewer parameters, and that each of these methods outperform state-of-the-art PPI prediction methods with substantially reduced compute. We further perform a comprehensive evaluation of the hyperparameter space, demonstrate that PEFT of PLMs is robust to variations in these hyperparameters, and elucidate where best practices for PEFT in proteomics differ from those in natural language processing. All our model adaptation and evaluation code is available open-source at https://github.com/microsoft/peft_proteomics. Thus, we provide a blueprint to democratize the power of PLM adaptation to groups with limited computational resources.

Subjects

Proteomics; Proteomics/methods; Proteins/chemistry; Proteins/metabolism; Natural Language Processing; Protein Interaction Mapping/methods; Computational Biology/methods; Humans; Algorithms

4.

Peptide-binding specificity prediction using fine-tuned protein structure prediction networks.

Motmaen, Amir; Dauparas, Justas; Baek, Minkyung; Abedi, Mohamad H; Baker, David; Bradley, Philip.

Proc Natl Acad Sci U S A ; 120(9): e2216697120, 2023 02 28.

Article in English | MEDLINE | ID: mdl-36802421

ABSTRACT

Peptide-binding proteins play key roles in biology, and predicting their binding specificity is a long-standing challenge. While considerable protein structural information is available, the most successful current methods use sequence information alone, in part because it has been a challenge to model the subtle structural changes accompanying sequence substitutions. Protein structure prediction networks such as AlphaFold model sequence-structure relationships very accurately, and we reasoned that if it were possible to specifically train such networks on binding data, more generalizable models could be created. We show that placing a classifier on top of the AlphaFold network and fine-tuning the combined network parameters for both classification and structure prediction accuracy leads to a model with strong generalizable performance on a wide range of Class I and Class II peptide-MHC interactions that approaches the overall performance of the state-of-the-art NetMHCpan sequence-based method. The peptide-MHC optimized model shows excellent performance in distinguishing binding and non-binding peptides to SH3 and PDZ domains. This ability to generalize well beyond the training set far exceeds that of sequence-only models and should be particularly powerful for systems where less experimental data are available.

Subjects

Histocompatibility Antigens Class II; Peptides; Protein Binding; Peptides/chemistry; Histocompatibility Antigens Class II/metabolism; Genes, MHC Class II; PDZ Domains

5.

Protein-protein interactions in the Mla lipid transport system probed by computational structure prediction and deep mutational scanning.

MacRae, Mark R; Puvanendran, Dhenesh; Haase, Max A B; Coudray, Nicolas; Kolich, Ljuvica; Lam, Cherry; Baek, Minkyung; Bhabha, Gira; Ekiert, Damian C.

J Biol Chem ; 299(6): 104744, 2023 06.

Article in English | MEDLINE | ID: mdl-37100290

ABSTRACT

The outer membrane (OM) of Gram-negative bacteria is an asymmetric bilayer that protects the cell from external stressors, such as antibiotics. The Mla transport system is implicated in the Maintenance of OM Lipid Asymmetry by mediating retrograde phospholipid transport across the cell envelope. Mla uses a shuttle-like mechanism to move lipids between the MlaFEDB inner membrane complex and the MlaA-OmpF/C OM complex, via a periplasmic lipid-binding protein, MlaC. MlaC binds to MlaD and MlaA, but the underlying protein-protein interactions that facilitate lipid transfer are not well understood. Here, we take an unbiased deep mutational scanning approach to map the fitness landscape of MlaC from Escherichia coli, which provides insights into important functional sites. Combining this analysis with AlphaFold2 structure predictions and binding experiments, we map the MlaC-MlaA and MlaC-MlaD protein-protein interfaces. Our results suggest that the MlaD and MlaA binding surfaces on MlaC overlap to a large extent, leading to a model in which MlaC can only bind one of these proteins at a time. Low-resolution cryo-electron microscopy (cryo-EM) maps of MlaC bound to MlaFEDB suggest that at least two MlaC molecules can bind to MlaD at once, in a conformation consistent with AlphaFold2 predictions. These data lead us to a model for MlaC interaction with its binding partners and insights into lipid transfer steps that underlie phospholipid transport between the bacterial inner and OMs.

Subjects

Escherichia coli Proteins; Lipid Metabolism; Membrane Transport Proteins; Bacterial Outer Membrane Proteins/genetics; Bacterial Outer Membrane Proteins/metabolism; Biological Transport; Cell Membrane/metabolism; Cryoelectron Microscopy; Escherichia coli/metabolism; Escherichia coli Proteins/chemistry; Membrane Lipids/metabolism; Phospholipids/metabolism; Membrane Transport Proteins/chemistry; Membrane Transport Proteins/metabolism

6.

GalaxyHeteromer: protein heterodimer structure prediction by template-based and ab initio docking.

Park, Taeyong; Won, Jonghun; Baek, Minkyung; Seok, Chaok.

Nucleic Acids Res ; 49(W1): W237-W241, 2021 07 02.

Article in English | MEDLINE | ID: mdl-34048578

ABSTRACT

Protein-protein interactions play crucial roles in diverse biological processes, including various disease progressions. Atomistic structural details of protein-protein interactions may provide important information that can facilitate the design of therapeutic agents. GalaxyHeteromer is a freely available automatic web server (http://galaxy.seoklab.org/heteromer) that predicts protein heterodimer complex structures from two subunit protein sequences or structures. When subunit structures are unavailable, they are predicted by template- or distance-prediction-based modelling methods. Heterodimer complex structures can be predicted by both template-based and ab initio docking, depending on the template's availability. Structural templates are detected from the protein structure database based on both the sequence and structure similarities. The templates for heterodimers may be selected from monomer and homo-oligomer structures, as well as from hetero-oligomers, owing to the evolutionary relationships of heterodimers with domains of monomers or subunits of homo-oligomers. In addition, the server employs one of the best ab initio docking methods when heterodimer templates are unavailable. The multiple heterodimer structure models and the associated scores, which are provided by the web server, may be further examined by the user to test or develop functional hypotheses or to design new functional molecules.

Subjects

Molecular Docking Simulation; Protein Multimerization; Software; Protein Subunits/chemistry; Sequence Analysis, Protein

7.

Towards the prediction of general biomolecular interactions with AI.

Baek, Minkyung.

Nat Methods ; 21(8): 1382-1383, 2024 Aug.

Article in English | MEDLINE | ID: mdl-39122945

Subjects

Artificial Intelligence; Computational Biology/methods; Humans; Protein Interaction Mapping/methods; Algorithms

8.

Protein oligomer modeling guided by predicted interchain contacts in CASP14.

Baek, Minkyung; Anishchenko, Ivan; Park, Hahnbeom; Humphreys, Ian R; Baker, David.

Proteins ; 89(12): 1824-1833, 2021 12.

Article in English | MEDLINE | ID: mdl-34324224

ABSTRACT

For CASP14, we developed deep learning-based methods for predicting homo-oligomeric and hetero-oligomeric contacts and used them for oligomer modeling. To build structure models, we developed an oligomer structure generation method that utilizes predicted interchain contacts to guide iterative restrained minimization from random backbone structures. We supplemented this gradient-based fold-and-dock method with template-based and ab initio docking approaches using deep learning-based subunit predictions on 29 assembly targets. These methods produced oligomer models with summed Z-scores 5.5 units higher than the next best group, with the fold-and-dock method having the best relative performance. Over the eight targets for which this method was used, the best of the five submitted models had average oligomer TM-score of 0.71 (average oligomer TM-score of the next best group: 0.64), and explicit modeling of inter-subunit interactions improved modeling of six out of 40 individual domains (ΔGDT-TS > 2.0).

Subjects

Models, Molecular; Protein Conformation; Proteins; Software; Computational Biology; Databases, Protein; Deep Learning; Protein Binding; Protein Subunits/chemistry; Protein Subunits/metabolism; Proteins/chemistry; Proteins/metabolism; Sequence Analysis, Protein

9.

Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14.

Anishchenko, Ivan; Baek, Minkyung; Park, Hahnbeom; Hiranuma, Naozumi; Kim, David E; Dauparas, Justas; Mansoor, Sanaa; Humphreys, Ian R; Baker, David.

Proteins ; 89(12): 1722-1733, 2021 12.

Article in English | MEDLINE | ID: mdl-34331359

ABSTRACT

The trRosetta structure prediction method employs deep learning to generate predicted residue-residue distance and orientation distributions from which 3D models are built. We sought to improve the method by incorporating as inputs (in addition to sequence information) both language model embeddings and template information weighted by sequence similarity to the target. We also developed a refinement pipeline that recombines models generated by template-free and template utilizing versions of trRosetta guided by the DeepAccNet accuracy predictor. Both benchmark tests and CASP results show that the new pipeline is a considerable improvement over the original trRosetta, and it is faster and requires less computing resources, completing the entire modeling process in a median < 3 h in CASP14. Our human group improved results with this pipeline primarily by identifying additional homologous sequences for input into the network. We also used the DeepAccNet accuracy predictor to guide Rosetta high-resolution refinement for submissions in the regular and refinement categories; although performance was quite good on a CASP relative scale, the overall improvements were rather modest in part due to missing inter-domain or inter-chain contacts.

Subjects

Computational Biology/methods; Deep Learning; Protein Structure, Tertiary; Proteins; Software; Humans; Metagenome/genetics; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Sequence Analysis, Protein

10.

Structure prediction of biological assemblies using GALAXY in CAPRI rounds 38-45.

Park, Taeyong; Woo, Hyeonuk; Baek, Minkyung; Yang, Jinsol; Seok, Chaok.

Proteins ; 88(8): 1009-1017, 2020 08.

Article in English | MEDLINE | ID: mdl-31774573

ABSTRACT

We participated in CAPRI rounds 38-45 both as a server predictor and a human predictor. These CAPRI rounds provided excellent opportunities for testing prediction methods for three classes of protein interactions, that is, protein-protein, protein-peptide, and protein-oligosaccharide interactions. Both template-based methods (GalaxyTBM for monomer protein, GalaxyHomomer for homo-oligomer protein, GalaxyPepDock for protein-peptide complex) and ab initio docking methods (GalaxyTongDock and GalaxyPPDock for protein oligomer, GalaxyPepDock-ab-initio for protein-peptide complex, GalaxyDock2 and Galaxy7TM for protein-oligosaccharide complex) have been tested. Template-based methods depend heavily on the availability of proper templates and template-target similarity, and template-target difference is responsible for inaccuracy of template-based models. Inaccurate template-based models could be improved by our structure refinement and loop modeling methods based on physics-based energy optimization (GalaxyRefineComplex and GalaxyLoop) for several CAPRI targets. Current ab initio docking methods require accurate protein structures as input. Small conformational changes from input structure could be accounted for by our docking methods, producing one of the best models for several CAPRI targets. However, predicting large conformational changes involving protein backbone is still challenging, and full exploration of physics-based methods for such problems is still to come.

Subjects

Molecular Docking Simulation; Oligosaccharides/chemistry; Peptides/chemistry; Proteins/chemistry; Software; Amino Acid Sequence; Binding Sites; Humans; Ligands; Oligosaccharides/metabolism; Peptides/metabolism; Protein Binding; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Protein Multimerization; Proteins/metabolism; Research Design; Structural Homology, Protein; Thermodynamics

11.

Deep learning and protein structure modeling.

Baek, Minkyung; Baker, David.

Nat Methods ; 19(1): 13-14, 2022 01.

Article in English | MEDLINE | ID: mdl-35017724

Subjects

Computational Biology/methods; Deep Learning; Proteins/chemistry; Algorithms; Models, Molecular; Multiprotein Complexes/chemistry; Protein Conformation; Sequence Alignment

12.

Prediction of protein oligomer structures using GALAXY in CASP13.

Baek, Minkyung; Park, Taeyong; Woo, Hyeonuk; Seok, Chaok.

Proteins ; 87(12): 1233-1240, 2019 12.

Article in English | MEDLINE | ID: mdl-31509276

ABSTRACT

Many proteins need to form oligomers to be functional, so oligomer structures provide important clues to biological roles of proteins. Prediction of oligomer structures therefore can be a useful tool in the absence of experimentally resolved structures. In this article, we describe the server and human methods that we used to predict oligomer structures in the CASP13 experiment. Performances of the methods on the 42 CASP13 oligomer targets consisting of 30 homo-oligomers and 12 hetero-oligomers are discussed. Our server method, Seok-assembly, generated models with interface contact similarity measure greater than 0.2 as model 1 for 11 homo-oligomer targets when proper templates existed in the database. Model refinement methods such as loop modeling and molecular dynamics (MD)-based overall refinement failed to improve model qualities when target proteins have domains not covered by templates or when chains have very small interfaces. In human predictions, additional experimental data such as low-resolution electron microscopy (EM) map were utilized. EM data could assist oligomer structure prediction by providing a global shape of the complex structure.

Subjects

Computational Biology; Protein Conformation; Proteins/ultrastructure; Software; Algorithms; Humans; Molecular Dynamics Simulation; Protein Multimerization/genetics; Proteins/chemistry; Proteins/genetics

13.

Assessment of protein model structure accuracy estimation in CASP13: Challenges in the era of deep learning.

Won, Jonghun; Baek, Minkyung; Monastyrskyy, Bohdan; Kryshtafovych, Andriy; Seok, Chaok.

Proteins ; 87(12): 1351-1360, 2019 12.

Article in English | MEDLINE | ID: mdl-31436360

ABSTRACT

Scoring model structure is an essential component of protein structure prediction that can affect the prediction accuracy tremendously. Users of protein structure prediction results also need to score models to select the best models for their application studies. In Critical Assessment of techniques for protein Structure Prediction (CASP), model accuracy estimation methods have been tested in a blind fashion by providing models submitted by the tertiary structure prediction servers for scoring. In CASP13, model accuracy estimation results were evaluated in terms of both global and local structure accuracy. Global structure accuracy estimation was evaluated by the quality of the models selected by the global structure scores and by the absolute estimates of the global scores. Residue-wise, local structure accuracy estimations were evaluated by three different measures. A new measure introduced in CASP13 evaluates the ability to predict inaccurately modeled regions that may be improved by refinement. An intensive comparative analysis on CASP13 and the previous CASPs revealed that the tertiary structure models generated by the CASP13 servers show very distinct features. Higher consensus toward models of higher global accuracy appeared even for free modeling targets, and many models of high global accuracy were not well optimized at the atomic level. This is related to the new technology in CASP13, deep learning for tertiary contact prediction. The tertiary model structures generated by deep learning pose a new challenge for EMA (estimation of model accuracy) method developers. Model accuracy estimation itself is also an area where deep learning can potentially have an impact, although current EMA methods have not fully explored that direction.

Subjects

Computational Biology; Models, Molecular; Protein Conformation; Proteins/ultrastructure; Algorithms; Databases, Protein; Deep Learning; Proteins/chemistry; Proteins/genetics; Sequence Analysis, Protein; Software

14.

GalaxyDock3: Protein-ligand docking that considers the full ligand conformational flexibility.

Yang, Jinsol; Baek, Minkyung; Seok, Chaok.

J Comput Chem ; 40(31): 2739-2748, 2019 12 05.

Article in English | MEDLINE | ID: mdl-31423613

ABSTRACT

Predicting conformational changes of both the protein and the ligand is a major challenge when a protein-ligand complex structure is predicted from the unbound protein and ligand structures. Herein, we introduce a new protein-ligand docking program called GalaxyDock3 that considers the full ligand conformational flexibility by explicitly sampling the ligand ring conformation and allowing the relaxation of the full ligand degrees of freedom, including bond angles and lengths. This method is based on the previous version (GalaxyDock2) which performs the global optimization of a designed score function. Ligand ring conformation is sampled from a ring conformation library constructed from structure databases. The GalaxyDock3 score function was trained with an additional bonded energy term for the ligand on a large set of complex structures. The performance of GalaxyDock3 was improved compared to GalaxyDock2 when predicted ligand conformation was used as the input for docking, especially when the input ligand conformation differs significantly from the crystal conformation. GalaxyDock3 also compared favorably with other available docking programs on two benchmark tests that contained diverse ligand rings. The program is freely available at http://galaxy.seoklab.org/softwares/galaxydock.html. © 2019 Wiley Periodicals, Inc.

Subjects

Ligands; Proteins/chemistry; Software; Molecular Conformation; Molecular Docking Simulation

15.

GalaxyTongDock: Symmetric and asymmetric ab initio protein-protein docking web server with improved energy parameters.

Park, Taeyong; Baek, Minkyung; Lee, Hasup; Seok, Chaok.

J Comput Chem ; 40(27): 2413-2417, 2019 10 15.

Article in English | MEDLINE | ID: mdl-31173387

ABSTRACT

Protein-protein docking methods are spotlighted for their roles in providing insights into protein-protein interactions in the absence of full structural information by experiment. GalaxyTongDock is an ab initio protein-protein docking web server that performs rigid-body docking just like ZDOCK but with improved energy parameters. The energy parameters were trained by iterative docking and parameter search so that more native-like structures are selected as top rankers. GalaxyTongDock performs asymmetric docking of two different proteins (GalaxyTongDock_A) and symmetric docking of homo-oligomeric proteins with Cn and Dn symmetries (GalaxyTongDock_C and GalaxyTongDock_D). Performance tests on an unbound docking benchmark set for asymmetric docking and a model docking benchmark set for symmetric docking showed that GalaxyTongDock is better or comparable to other state-of-the-art methods. Experimental and/or evolutionary information on binding interfaces can be easily incorporated by using block and interface options. GalaxyTongDock web server is freely available at http://galaxy.seoklab.org/tongdock. © 2019 Wiley Periodicals, Inc.

Subjects

Molecular Docking Simulation; Proteins/chemistry; Quantum Theory; Software

16.

GalaxyHomomer: a web server for protein homo-oligomer structure prediction from a monomer sequence or structure.

Baek, Minkyung; Park, Taeyong; Heo, Lim; Park, Chiwook; Seok, Chaok.

Nucleic Acids Res ; 45(W1): W320-W324, 2017 07 03.

Article in English | MEDLINE | ID: mdl-28387820

ABSTRACT

Homo-oligomerization of proteins is abundant in nature, and is often intimately related with the physiological functions of proteins, such as in metabolism, signal transduction or immunity. Information on the homo-oligomer structure is therefore important to obtain a molecular-level understanding of protein functions and their regulation. Currently available web servers predict protein homo-oligomer structures either by template-based modeling using homo-oligomer templates selected from the protein structure database or by ab initio docking of monomer structures resolved by experiment or predicted by computation. The GalaxyHomomer server, freely accessible at http://galaxy.seoklab.org/homomer, carries out template-based modeling, ab initio docking or both depending on the availability of proper oligomer templates. It also incorporates recently developed model refinement methods that can consistently improve model quality. Moreover, the server provides additional options that can be chosen by the user depending on the availability of information on the monomer structure, oligomeric state and locations of unreliable/flexible loops or termini. The performance of the server was better than or comparable to that of other available methods when tested on benchmark sets and in a recent CASP performed in a blind fashion.

Subjects

Models, Statistical; Molecular Docking Simulation; Protein Multimerization; Proteins/chemistry; Software; Amino Acid Sequence; Benchmarking; Databases, Protein; Humans; Internet; Molecular Dynamics Simulation; Protein Structure, Secondary

17.

The challenge of modeling protein assemblies: the CASP12-CAPRI experiment.

Lensink, Marc F; Velankar, Sameer; Baek, Minkyung; Heo, Lim; Seok, Chaok; Wodak, Shoshana J.

Proteins ; 86 Suppl 1: 257-273, 2018 03.

Article in English | MEDLINE | ID: mdl-29127686

ABSTRACT

We present the quality assessment of 5613 models submitted by predictor groups from both CAPRI and CASP for the total of 15 most tractable targets from the second joint CASP-CAPRI protein assembly prediction experiment. These targets comprised 12 homo-oligomers and 3 hetero-complexes. The bulk of the analysis focuses on 10 targets (of CAPRI Round 37), which included all 3 hetero-complexes, and whose protein chains or the full assembly could be readily modeled from structural templates in the PDB. On average, 28 CAPRI groups and 10 CASP groups (including automatic servers), submitted models for each of these 10 targets. Additionally, about 16 groups participated in the CAPRI scoring experiments. A range of acceptable to high quality models were obtained for 6 of the 10 Round 37 targets, for which templates were available for the full assembly. Poorer results were achieved for the remaining targets due to the lower quality of the templates available for the full complex or the individual protein chains, highlighting the unmet challenge of modeling the structural adjustments of the protein components that occur upon binding or which must be accounted for in template-based modeling. On the other hand, our analysis indicated that residues in binding interfaces were correctly predicted in a sizable fraction of otherwise poorly modeled assemblies and this with higher accuracy than published methods that do not use information on the binding partner. Lastly, the strengths and weaknesses of the assessment methods are evaluated and improvements suggested.

Subjects

Computational Biology/methods; Databases, Protein; Models, Molecular; Protein Conformation; Protein Interaction Mapping/methods; Protein Multimerization; Proteins/chemistry; Algorithms; Humans; Sequence Analysis, Protein

18.

Template-based modeling and ab initio refinement of protein oligomer structures using GALAXY in CAPRI round 30.

Lee, Hasup; Baek, Minkyung; Lee, Gyu Rie; Park, Sangwoo; Seok, Chaok.

Proteins ; 85(3): 399-407, 2017 03.

Article in English | MEDLINE | ID: mdl-27770545

ABSTRACT

Many proteins function as homo- or hetero-oligomers; therefore, attempts to understand and regulate protein functions require knowledge of protein oligomer structures. The number of available experimental protein structures is increasing, and oligomer structures can be predicted using the experimental structures of related proteins as templates. However, template-based models may have errors due to sequence differences between the target and template proteins, which can lead to functional differences. Such structural differences may be predicted by loop modeling of local regions or refinement of the overall structure. In CAPRI (Critical Assessment of PRotein Interactions) round 30, we used recently developed features of the GALAXY protein modeling package, including template-based structure prediction, loop modeling, model refinement, and protein-protein docking to predict protein complex structures from amino acid sequences. Out of the 25 CAPRI targets, medium and acceptable quality models were obtained for 14 and 1 target(s), respectively, for which proper oligomer or monomer templates could be detected. Symmetric interface loop modeling on oligomer model structures successfully improved model quality, while loop modeling on monomer model structures failed. Overall refinement of the predicted oligomer structures consistently improved the model quality, in particular in interface contacts. Proteins 2017; 85:399-407. © 2016 Wiley Periodicals, Inc.

Subjects

Algorithms; Computational Biology/methods; Molecular Docking Simulation/methods; Proteins/chemistry; Amino Acid Sequence; Benchmarking; Binding Sites; Protein Binding; Protein Conformation; Protein Multimerization; Research Design; Software; Structural Homology, Protein

19.

GalaxyDock BP2 score: a hybrid scoring function for accurate protein-ligand docking.

Baek, Minkyung; Shin, Woong-Hee; Chung, Hwan Won; Seok, Chaok.

J Comput Aided Mol Des ; 31(7): 653-666, 2017 Jul.

Article in English | MEDLINE | ID: mdl-28623486

ABSTRACT

Protein-ligand docking is a useful tool for providing atomic-level understanding of protein functions in nature and design principles for artificial ligands or proteins with desired properties. The ability to identify the true binding pose of a ligand to a target protein among numerous possible candidate poses is an essential requirement for successful protein-ligand docking. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were also used for scoring binding poses. However, in this study, we developed a new docking scoring function, called GalaxyDock BP2 Score, by directly training the scoring power of binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The performance of the new scoring function exhibits significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein-ligand docking program, it outperformed other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html .

Subjects

Molecular Docking Simulation; Proteins/chemistry; Binding Sites; Databases, Protein; Ligands; Protein Binding; Protein Conformation; Research Design; Software

20.

Absolute binding free energies for octa-acids and guests in SAMPL5: Evaluating binding free energies for octa-acid and guest complexes in the SAMPL5 blind challenge.

Tofoleanu, Florentina; Lee, Juyong; Pickard IV, Frank C; König, Gerhard; Huang, Jing; Baek, Minkyung; Seok, Chaok; Brooks, Bernard R.

J Comput Aided Mol Des ; 31(1): 107-118, 2017 01.

Article in English | MEDLINE | ID: mdl-27696242

ABSTRACT

As part of the SAMPL5 blind prediction challenge, we calculate the absolute binding free energies of six guest molecules to an octa-acid (OAH) and to a methylated octa-acid (OAMe). We use the double decoupling method via thermodynamic integration (TI) or Hamiltonian replica exchange in connection with the Bennett acceptance ratio (HREM-BAR). We produce the binding poses either through manual docking or by using GalaxyDock-HG, a docking software developed specifically for this study. The root mean square deviations for our most accurate predictions are 1.4 kcal mol⁻¹ for OAH with TI and 1.9 kcal mol⁻¹ for OAMe with HREM-BAR. Our best results for OAMe were obtained for systems with ionic concentrations corresponding to the ionic strength of the experimental solution. The most problematic system contains a halogenated guest. Our attempt to model the σ-hole of the bromine using a constrained off-site point charge does not improve results. We use results from molecular dynamics simulations to argue that the distinct binding affinities of this guest to OAH and OAMe are due to a difference in the flexibility of the host. We believe that the results of this extensive analysis of host-guest complexes will help improve the protocol used in predicting binding affinities for larger systems, such as protein-substrate compounds.

Subjects

Ligands; Molecular Dynamics Simulation; Proteins/chemistry; Thermodynamics; Binding Sites; Molecular Conformation; Molecular Structure; Protein Binding; Quantum Theory; Software; Solvents/chemistry
