Results 1 - 20 of 41
1.
Nature ; 620(7976): 1089-1100, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37433327

ABSTRACT

There has been considerable recent progress in designing new proteins using deep-learning methods [1-9]. Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models [10,11] have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence-structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications.


Subject(s)
Deep Learning , Proteins , Catalytic Domain , Cryoelectron Microscopy , Hemagglutinin Glycoproteins, Influenza Virus/chemistry , Hemagglutinin Glycoproteins, Influenza Virus/metabolism , Hemagglutinin Glycoproteins, Influenza Virus/ultrastructure , Protein Binding , Proteins/chemistry , Proteins/metabolism , Proteins/ultrastructure
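
The abstract above describes training on "protein structure denoising tasks". As a rough illustration of what a denoising task looks like, here is a minimal sketch of the standard Gaussian forward-noising step used by diffusion models, applied to a set of 3D coordinates. It is not the RFdiffusion implementation: the linear beta schedule, the number of steps, and the random stand-in coordinates are all assumptions made for illustration, and residue orientations are not handled at all.

```python
import numpy as np

def alpha_bar_schedule(n_steps=200, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for a linear beta schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, n_steps)
    return np.cumprod(1.0 - betas)

def noise_coordinates(x0, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(alpha_bar[t]) * x0, (1 - alpha_bar[t]) * I)."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((64, 3))   # hypothetical stand-in for centred C-alpha coordinates
alpha_bar = alpha_bar_schedule()
x_t, eps = noise_coordinates(x0, t=150, alpha_bar=alpha_bar, rng=rng)
```

A denoising network would then be trained to recover `x0` (or `eps`) from `x_t` and `t`; generation runs the process in reverse, starting from pure noise.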
2.
Nat Methods ; 21(1): 117-121, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37996753

ABSTRACT

Protein-RNA and protein-DNA complexes play critical roles in biology. Despite considerable recent advances in protein structure prediction, the prediction of the structures of protein-nucleic acid complexes without homology to known complexes is a largely unsolved problem. Here we extend the RoseTTAFold machine learning protein-structure-prediction approach to additionally predict nucleic acid and protein-nucleic acid complexes. We develop a single trained network, RoseTTAFoldNA, that rapidly produces three-dimensional structure models with confidence estimates for protein-DNA and protein-RNA complexes. Here we show that confident predictions have considerably higher accuracy than current state-of-the-art methods. RoseTTAFoldNA should be broadly useful for modeling the structure of naturally occurring protein-nucleic acid complexes, and for designing sequence-specific RNA and DNA-binding proteins.


Subject(s)
Nucleic Acids , RNA/chemistry , DNA-Binding Proteins/chemistry , DNA/chemistry
3.
Proc Natl Acad Sci U S A ; 121(26): e2405840121, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38900798

ABSTRACT

Proteomics has been revolutionized by large protein language models (PLMs), which learn unsupervised representations from large corpora of sequences. These models are typically fine-tuned in a supervised setting to adapt the model to specific downstream tasks. However, the computational and memory footprint of fine-tuning (FT) large PLMs presents a barrier for many research groups with limited computational resources. Natural language processing has seen a similar explosion in the size of models, where these challenges have been addressed by methods for parameter-efficient fine-tuning (PEFT). In this work, we introduce this paradigm to proteomics through leveraging the parameter-efficient method LoRA and training new models for two important tasks: predicting protein-protein interactions (PPIs) and predicting the symmetry of homooligomer quaternary structures. We show that these approaches are competitive with traditional FT while requiring reduced memory and substantially fewer parameters. We additionally show that for the PPI prediction task, training only the classification head also remains competitive with full FT, using five orders of magnitude fewer parameters, and that each of these methods outperforms state-of-the-art PPI prediction methods with substantially reduced compute. We further perform a comprehensive evaluation of the hyperparameter space, demonstrate that PEFT of PLMs is robust to variations in these hyperparameters, and elucidate where best practices for PEFT in proteomics differ from those in natural language processing. All our model adaptation and evaluation code is available open-source at https://github.com/microsoft/peft_proteomics. Thus, we provide a blueprint to democratize the power of PLM adaptation to groups with limited computational resources.


Subject(s)
Proteomics , Proteomics/methods , Proteins/chemistry , Proteins/metabolism , Natural Language Processing , Protein Interaction Mapping/methods , Computational Biology/methods , Humans , Algorithms
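
The LoRA method named in the abstract above keeps a pretrained weight matrix frozen and trains only a low-rank additive update. The following NumPy sketch shows the idea and the resulting parameter count. It is not the authors' code (that is in the repository linked in the abstract); the layer size, the rank `r`, and the scaling `alpha` are illustrative assumptions.

```python
import numpy as np

class LoRALinear:
    """Frozen dense layer plus a trainable low-rank update."""

    def __init__(self, weight, r=8, alpha=16.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight                        # frozen pretrained weights
        self.A = rng.normal(0.0, 0.01, (r, d_in))   # trainable, small random init
        self.B = np.zeros((d_out, r))               # trainable, zero init
        self.scale = alpha / r

    def __call__(self, x):
        # y = x W^T + scale * (x A^T) B^T; the update B @ A has rank at most r.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_parameters(self):
        return self.A.size + self.B.size

rng = np.random.default_rng(1)
W = rng.normal(size=(1280, 1280))   # hypothetical layer size
layer = LoRALinear(W, r=8)
x = rng.normal(size=(4, 1280))
full_count = W.size
lora_count = layer.trainable_parameters()
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and training moves only the `r * (d_in + d_out)` adapter entries rather than all `d_in * d_out` weights.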
4.
Proc Natl Acad Sci U S A ; 120(9): e2216697120, 2023 02 28.
Article in English | MEDLINE | ID: mdl-36802421

ABSTRACT

Peptide-binding proteins play key roles in biology, and predicting their binding specificity is a long-standing challenge. While considerable protein structural information is available, the most successful current methods use sequence information alone, in part because it has been a challenge to model the subtle structural changes accompanying sequence substitutions. Protein structure prediction networks such as AlphaFold model sequence-structure relationships very accurately, and we reasoned that if it were possible to specifically train such networks on binding data, more generalizable models could be created. We show that placing a classifier on top of the AlphaFold network and fine-tuning the combined network parameters for both classification and structure prediction accuracy leads to a model with strong generalizable performance on a wide range of Class I and Class II peptide-MHC interactions that approaches the overall performance of the state-of-the-art NetMHCpan sequence-based method. The peptide-MHC optimized model shows excellent performance in distinguishing binding and non-binding peptides to SH3 and PDZ domains. This ability to generalize well beyond the training set far exceeds that of sequence-only models and should be particularly powerful for systems where less experimental data are available.


Subject(s)
Histocompatibility Antigens Class II , Peptides , Protein Binding , Peptides/chemistry , Histocompatibility Antigens Class II/metabolism , Genes, MHC Class II , PDZ Domains
5.
J Biol Chem ; 299(6): 104744, 2023 06.
Article in English | MEDLINE | ID: mdl-37100290

ABSTRACT

The outer membrane (OM) of Gram-negative bacteria is an asymmetric bilayer that protects the cell from external stressors, such as antibiotics. The Mla transport system is implicated in the Maintenance of OM Lipid Asymmetry by mediating retrograde phospholipid transport across the cell envelope. Mla uses a shuttle-like mechanism to move lipids between the MlaFEDB inner membrane complex and the MlaA-OmpF/C OM complex, via a periplasmic lipid-binding protein, MlaC. MlaC binds to MlaD and MlaA, but the underlying protein-protein interactions that facilitate lipid transfer are not well understood. Here, we take an unbiased deep mutational scanning approach to map the fitness landscape of MlaC from Escherichia coli, which provides insights into important functional sites. Combining this analysis with AlphaFold2 structure predictions and binding experiments, we map the MlaC-MlaA and MlaC-MlaD protein-protein interfaces. Our results suggest that the MlaD and MlaA binding surfaces on MlaC overlap to a large extent, leading to a model in which MlaC can only bind one of these proteins at a time. Low-resolution cryo-electron microscopy (cryo-EM) maps of MlaC bound to MlaFEDB suggest that at least two MlaC molecules can bind to MlaD at once, in a conformation consistent with AlphaFold2 predictions. These data lead us to a model for MlaC interaction with its binding partners and insights into lipid transfer steps that underlie phospholipid transport between the bacterial inner and OMs.


Subject(s)
Escherichia coli Proteins , Lipid Metabolism , Membrane Transport Proteins , Bacterial Outer Membrane Proteins/genetics , Bacterial Outer Membrane Proteins/metabolism , Biological Transport , Cell Membrane/metabolism , Cryoelectron Microscopy , Escherichia coli/metabolism , Escherichia coli Proteins/chemistry , Membrane Lipids/metabolism , Phospholipids/metabolism , Membrane Transport Proteins/chemistry , Membrane Transport Proteins/metabolism
6.
Nucleic Acids Res ; 49(W1): W237-W241, 2021 07 02.
Article in English | MEDLINE | ID: mdl-34048578

ABSTRACT

Protein-protein interactions play crucial roles in diverse biological processes, including various disease progressions. Atomistic structural details of protein-protein interactions may provide important information that can facilitate the design of therapeutic agents. GalaxyHeteromer is a freely available automatic web server (http://galaxy.seoklab.org/heteromer) that predicts protein heterodimer complex structures from two subunit protein sequences or structures. When subunit structures are unavailable, they are predicted by template- or distance-prediction-based modelling methods. Heterodimer complex structures can be predicted by both template-based and ab initio docking, depending on the template's availability. Structural templates are detected from the protein structure database based on both the sequence and structure similarities. The templates for heterodimers may be selected from monomer and homo-oligomer structures, as well as from hetero-oligomers, owing to the evolutionary relationships of heterodimers with domains of monomers or subunits of homo-oligomers. In addition, the server employs one of the best ab initio docking methods when heterodimer templates are unavailable. The multiple heterodimer structure models and the associated scores, which are provided by the web server, may be further examined by the user to test or develop functional hypotheses or to design new functional molecules.


Subject(s)
Molecular Docking Simulation , Protein Multimerization , Software , Protein Subunits/chemistry , Sequence Analysis, Protein
8.
Proteins ; 89(12): 1824-1833, 2021 12.
Article in English | MEDLINE | ID: mdl-34324224

ABSTRACT

For CASP14, we developed deep learning-based methods for predicting homo-oligomeric and hetero-oligomeric contacts and used them for oligomer modeling. To build structure models, we developed an oligomer structure generation method that utilizes predicted interchain contacts to guide iterative restrained minimization from random backbone structures. We supplemented this gradient-based fold-and-dock method with template-based and ab initio docking approaches using deep learning-based subunit predictions on 29 assembly targets. These methods produced oligomer models with summed Z-scores 5.5 units higher than the next best group, with the fold-and-dock method having the best relative performance. Over the eight targets for which this method was used, the best of the five submitted models had average oligomer TM-score of 0.71 (average oligomer TM-score of the next best group: 0.64), and explicit modeling of inter-subunit interactions improved modeling of six out of 40 individual domains (ΔGDT-TS > 2.0).


Subject(s)
Models, Molecular , Protein Conformation , Proteins , Software , Computational Biology , Databases, Protein , Deep Learning , Protein Binding , Protein Subunits/chemistry , Protein Subunits/metabolism , Proteins/chemistry , Proteins/metabolism , Sequence Analysis, Protein
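
The abstract above reports results as TM-scores (0.71 versus 0.64). For readers unfamiliar with the measure, this is a minimal sketch of the TM-score sum for two coordinate sets that are already superimposed and matched residue by residue. The full measure maximizes this sum over superpositions, which is not done here, and how CASP assessors handle chain mapping for oligomers is not covered; the function name and arguments are illustrative.

```python
import numpy as np

def tm_score(model_xyz, native_xyz, l_target=None):
    """TM-score sum for pre-superimposed, residue-matched C-alpha coordinates."""
    model_xyz = np.asarray(model_xyz, dtype=float)
    native_xyz = np.asarray(native_xyz, dtype=float)
    if l_target is None:
        l_target = len(native_xyz)
    # Length-dependent distance scale; floored at 0.5 A for very short targets.
    d0 = max(1.24 * np.cbrt(l_target - 15) - 1.8, 0.5)
    d = np.linalg.norm(model_xyz - native_xyz, axis=1)
    return float(np.sum(1.0 / (1.0 + (d / d0) ** 2)) / l_target)

rng = np.random.default_rng(0)
native = rng.normal(size=(100, 3)) * 10.0      # made-up coordinates
identical = tm_score(native, native)
d0_100 = 1.24 * np.cbrt(100 - 15) - 1.8
shifted = tm_score(native + np.array([d0_100, 0.0, 0.0]), native)
```

A perfect match scores 1, and a model whose every residue is displaced by exactly `d0` scores 0.5, which is why the score is less sensitive to a few badly placed residues than a plain RMSD.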
9.
Proteins ; 89(12): 1722-1733, 2021 12.
Article in English | MEDLINE | ID: mdl-34331359

ABSTRACT

The trRosetta structure prediction method employs deep learning to generate predicted residue-residue distance and orientation distributions from which 3D models are built. We sought to improve the method by incorporating as inputs (in addition to sequence information) both language model embeddings and template information weighted by sequence similarity to the target. We also developed a refinement pipeline that recombines models generated by template-free and template utilizing versions of trRosetta guided by the DeepAccNet accuracy predictor. Both benchmark tests and CASP results show that the new pipeline is a considerable improvement over the original trRosetta, and it is faster and requires less computing resources, completing the entire modeling process in a median < 3 h in CASP14. Our human group improved results with this pipeline primarily by identifying additional homologous sequences for input into the network. We also used the DeepAccNet accuracy predictor to guide Rosetta high-resolution refinement for submissions in the regular and refinement categories; although performance was quite good on a CASP relative scale, the overall improvements were rather modest in part due to missing inter-domain or inter-chain contacts.


Subject(s)
Computational Biology/methods , Deep Learning , Protein Structure, Tertiary , Proteins , Software , Humans , Metagenome/genetics , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Sequence Analysis, Protein
10.
Proteins ; 88(8): 1009-1017, 2020 08.
Article in English | MEDLINE | ID: mdl-31774573

ABSTRACT

We participated in CAPRI rounds 38-45 both as a server predictor and a human predictor. These CAPRI rounds provided excellent opportunities for testing prediction methods for three classes of protein interactions, that is, protein-protein, protein-peptide, and protein-oligosaccharide interactions. Both template-based methods (GalaxyTBM for monomer protein, GalaxyHomomer for homo-oligomer protein, GalaxyPepDock for protein-peptide complex) and ab initio docking methods (GalaxyTongDock and GalaxyPPDock for protein oligomer, GalaxyPepDock-ab-initio for protein-peptide complex, GalaxyDock2 and Galaxy7TM for protein-oligosaccharide complex) have been tested. Template-based methods depend heavily on the availability of proper templates and template-target similarity, and template-target difference is responsible for inaccuracy of template-based models. Inaccurate template-based models could be improved by our structure refinement and loop modeling methods based on physics-based energy optimization (GalaxyRefineComplex and GalaxyLoop) for several CAPRI targets. Current ab initio docking methods require accurate protein structures as input. Small conformational changes from input structure could be accounted for by our docking methods, producing one of the best models for several CAPRI targets. However, predicting large conformational changes involving protein backbone is still challenging, and full exploration of physics-based methods for such problems is still to come.


Subject(s)
Molecular Docking Simulation , Oligosaccharides/chemistry , Peptides/chemistry , Proteins/chemistry , Software , Amino Acid Sequence , Binding Sites , Humans , Ligands , Oligosaccharides/metabolism , Peptides/metabolism , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Protein Multimerization , Proteins/metabolism , Research Design , Structural Homology, Protein , Thermodynamics
12.
Proteins ; 87(12): 1233-1240, 2019 12.
Article in English | MEDLINE | ID: mdl-31509276

ABSTRACT

Many proteins need to form oligomers to be functional, so oligomer structures provide important clues to biological roles of proteins. Prediction of oligomer structures therefore can be a useful tool in the absence of experimentally resolved structures. In this article, we describe the server and human methods that we used to predict oligomer structures in the CASP13 experiment. Performances of the methods on the 42 CASP13 oligomer targets consisting of 30 homo-oligomers and 12 hetero-oligomers are discussed. Our server method, Seok-assembly, generated models with interface contact similarity measure greater than 0.2 as model 1 for 11 homo-oligomer targets when proper templates existed in the database. Model refinement methods such as loop modeling and molecular dynamics (MD)-based overall refinement failed to improve model qualities when target proteins have domains not covered by templates or when chains have very small interfaces. In human predictions, additional experimental data such as low-resolution electron microscopy (EM) map were utilized. EM data could assist oligomer structure prediction by providing a global shape of the complex structure.


Subject(s)
Computational Biology , Protein Conformation , Proteins/ultrastructure , Software , Algorithms , Humans , Molecular Dynamics Simulation , Protein Multimerization/genetics , Proteins/chemistry , Proteins/genetics
13.
Proteins ; 87(12): 1351-1360, 2019 12.
Article in English | MEDLINE | ID: mdl-31436360

ABSTRACT

Scoring model structure is an essential component of protein structure prediction that can affect the prediction accuracy tremendously. Users of protein structure prediction results also need to score models to select the best models for their application studies. In Critical Assessment of techniques for protein Structure Prediction (CASP), model accuracy estimation methods have been tested in a blind fashion by providing models submitted by the tertiary structure prediction servers for scoring. In CASP13, model accuracy estimation results were evaluated in terms of both global and local structure accuracy. Global structure accuracy estimation was evaluated by the quality of the models selected by the global structure scores and by the absolute estimates of the global scores. Residue-wise, local structure accuracy estimations were evaluated by three different measures. A new measure introduced in CASP13 evaluates the ability to predict inaccurately modeled regions that may be improved by refinement. An intensive comparative analysis on CASP13 and the previous CASPs revealed that the tertiary structure models generated by the CASP13 servers show very distinct features. Higher consensus toward models of higher global accuracy appeared even for free modeling targets, and many models of high global accuracy were not well optimized at the atomic level. This is related to the new technology in CASP13, deep learning for tertiary contact prediction. The tertiary model structures generated by deep learning pose a new challenge for EMA (estimation of model accuracy) method developers. Model accuracy estimation itself is also an area where deep learning can potentially have an impact, although current EMA methods have not fully explored that direction.


Subject(s)
Computational Biology , Models, Molecular , Protein Conformation , Proteins/ultrastructure , Algorithms , Databases, Protein , Deep Learning , Proteins/chemistry , Proteins/genetics , Sequence Analysis, Protein , Software
14.
J Comput Chem ; 40(31): 2739-2748, 2019 12 05.
Article in English | MEDLINE | ID: mdl-31423613

ABSTRACT

Predicting conformational changes of both the protein and the ligand is a major challenge when a protein-ligand complex structure is predicted from the unbound protein and ligand structures. Herein, we introduce a new protein-ligand docking program called GalaxyDock3 that considers the full ligand conformational flexibility by explicitly sampling the ligand ring conformation and allowing the relaxation of the full ligand degrees of freedom, including bond angles and lengths. This method is based on the previous version (GalaxyDock2) which performs the global optimization of a designed score function. Ligand ring conformation is sampled from a ring conformation library constructed from structure databases. The GalaxyDock3 score function was trained with an additional bonded energy term for the ligand on a large set of complex structures. The performance of GalaxyDock3 was improved compared to GalaxyDock2 when predicted ligand conformation was used as the input for docking, especially when the input ligand conformation differs significantly from the crystal conformation. GalaxyDock3 also compared favorably with other available docking programs on two benchmark tests that contained diverse ligand rings. The program is freely available at http://galaxy.seoklab.org/softwares/galaxydock.html. © 2019 Wiley Periodicals, Inc.


Subject(s)
Ligands , Proteins/chemistry , Software , Molecular Conformation , Molecular Docking Simulation
15.
J Comput Chem ; 40(27): 2413-2417, 2019 10 15.
Article in English | MEDLINE | ID: mdl-31173387

ABSTRACT

Protein-protein docking methods are spotlighted for their roles in providing insights into protein-protein interactions in the absence of full structural information by experiment. GalaxyTongDock is an ab initio protein-protein docking web server that performs rigid-body docking just like ZDOCK but with improved energy parameters. The energy parameters were trained by iterative docking and parameter search so that more native-like structures are selected as top rankers. GalaxyTongDock performs asymmetric docking of two different proteins (GalaxyTongDock_A) and symmetric docking of homo-oligomeric proteins with Cn and Dn symmetries (GalaxyTongDock_C and GalaxyTongDock_D). Performance tests on an unbound docking benchmark set for asymmetric docking and a model docking benchmark set for symmetric docking showed that GalaxyTongDock is better than or comparable to other state-of-the-art methods. Experimental and/or evolutionary information on binding interfaces can be easily incorporated by using block and interface options. GalaxyTongDock web server is freely available at http://galaxy.seoklab.org/tongdock. © 2019 Wiley Periodicals, Inc.


Subject(s)
Molecular Docking Simulation , Proteins/chemistry , Quantum Theory , Software
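
The abstract above compares the server to ZDOCK-style rigid-body docking. Grid-based rigid-body docking of that kind is commonly implemented by scoring all translations of one molecule against the other with a fast Fourier transform. The sketch below shows only that translational scan for a single orientation on toy occupancy grids. It is not GalaxyTongDock's code: the grid size, the grid values, and the single-term score are assumptions, and a real program would also loop over rotations, pad the grids to avoid wraparound, and use trained energy terms.

```python
import numpy as np

def fft_translation_scan(receptor_grid, ligand_grid):
    """Score every cyclic translation t of the ligand grid.

    scores[t] = sum over x of receptor_grid[x + t] * ligand_grid[x]
    """
    r_hat = np.fft.fftn(receptor_grid)
    l_hat = np.fft.fftn(ligand_grid)
    # Cross-correlation theorem: one inverse FFT yields all translations at once.
    scores = np.fft.ifftn(r_hat * np.conj(l_hat)).real
    best = np.unravel_index(np.argmax(scores), scores.shape)
    return scores, tuple(int(i) for i in best)

n = 16
receptor = np.zeros((n, n, n))
receptor[5:8, 6, 7] = 1.0   # hypothetical favourable voxels on the receptor
ligand = np.zeros((n, n, n))
ligand[0:3, 0, 0] = 1.0     # ligand shape placed at the grid origin
scores, shift = fft_translation_scan(receptor, ligand)
```

In this toy case the best translation moves the three ligand voxels onto the three favourable receptor voxels. The FFT route costs on the order of N log N per orientation for N grid points, instead of N squared for an explicit loop over translations.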
16.
Nucleic Acids Res ; 45(W1): W320-W324, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28387820

ABSTRACT

Homo-oligomerization of proteins is abundant in nature, and is often intimately related with the physiological functions of proteins, such as in metabolism, signal transduction or immunity. Information on the homo-oligomer structure is therefore important to obtain a molecular-level understanding of protein functions and their regulation. Currently available web servers predict protein homo-oligomer structures either by template-based modeling using homo-oligomer templates selected from the protein structure database or by ab initio docking of monomer structures resolved by experiment or predicted by computation. The GalaxyHomomer server, freely accessible at http://galaxy.seoklab.org/homomer, carries out template-based modeling, ab initio docking or both depending on the availability of proper oligomer templates. It also incorporates recently developed model refinement methods that can consistently improve model quality. Moreover, the server provides additional options that can be chosen by the user depending on the availability of information on the monomer structure, oligomeric state and locations of unreliable/flexible loops or termini. The performance of the server was better than or comparable to that of other available methods when tested on benchmark sets and in a recent CASP performed in a blind fashion.


Subject(s)
Models, Statistical , Molecular Docking Simulation , Protein Multimerization , Proteins/chemistry , Software , Amino Acid Sequence , Benchmarking , Databases, Protein , Humans , Internet , Molecular Dynamics Simulation , Protein Structure, Secondary
17.
Proteins ; 86 Suppl 1: 257-273, 2018 03.
Article in English | MEDLINE | ID: mdl-29127686

ABSTRACT

We present the quality assessment of 5613 models submitted by predictor groups from both CAPRI and CASP for the total of 15 most tractable targets from the second joint CASP-CAPRI protein assembly prediction experiment. These targets comprised 12 homo-oligomers and 3 hetero-complexes. The bulk of the analysis focuses on 10 targets (of CAPRI Round 37), which included all 3 hetero-complexes, and whose protein chains or the full assembly could be readily modeled from structural templates in the PDB. On average, 28 CAPRI groups and 10 CASP groups (including automatic servers), submitted models for each of these 10 targets. Additionally, about 16 groups participated in the CAPRI scoring experiments. A range of acceptable to high quality models were obtained for 6 of the 10 Round 37 targets, for which templates were available for the full assembly. Poorer results were achieved for the remaining targets due to the lower quality of the templates available for the full complex or the individual protein chains, highlighting the unmet challenge of modeling the structural adjustments of the protein components that occur upon binding or which must be accounted for in template-based modeling. On the other hand, our analysis indicated that residues in binding interfaces were correctly predicted in a sizable fraction of otherwise poorly modeled assemblies and this with higher accuracy than published methods that do not use information on the binding partner. Lastly, the strengths and weaknesses of the assessment methods are evaluated and improvements suggested.


Subject(s)
Computational Biology/methods , Databases, Protein , Models, Molecular , Protein Conformation , Protein Interaction Mapping/methods , Protein Multimerization , Proteins/chemistry , Algorithms , Humans , Sequence Analysis, Protein
18.
Proteins ; 85(3): 399-407, 2017 03.
Article in English | MEDLINE | ID: mdl-27770545

ABSTRACT

Many proteins function as homo- or hetero-oligomers; therefore, attempts to understand and regulate protein functions require knowledge of protein oligomer structures. The number of available experimental protein structures is increasing, and oligomer structures can be predicted using the experimental structures of related proteins as templates. However, template-based models may have errors due to sequence differences between the target and template proteins, which can lead to functional differences. Such structural differences may be predicted by loop modeling of local regions or refinement of the overall structure. In CAPRI (Critical Assessment of PRotein Interactions) round 30, we used recently developed features of the GALAXY protein modeling package, including template-based structure prediction, loop modeling, model refinement, and protein-protein docking to predict protein complex structures from amino acid sequences. Out of the 25 CAPRI targets, medium and acceptable quality models were obtained for 14 and 1 target(s), respectively, for which proper oligomer or monomer templates could be detected. Symmetric interface loop modeling on oligomer model structures successfully improved model quality, while loop modeling on monomer model structures failed. Overall refinement of the predicted oligomer structures consistently improved the model quality, in particular in interface contacts. Proteins 2017; 85:399-407. © 2016 Wiley Periodicals, Inc.


Subject(s)
Algorithms , Computational Biology/methods , Molecular Docking Simulation/methods , Proteins/chemistry , Amino Acid Sequence , Benchmarking , Binding Sites , Protein Binding , Protein Conformation , Protein Multimerization , Research Design , Software , Structural Homology, Protein
19.
J Comput Aided Mol Des ; 31(7): 653-666, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28623486

ABSTRACT

Protein-ligand docking is a useful tool for providing atomic-level understanding of protein functions in nature and design principles for artificial ligands or proteins with desired properties. The ability to identify the true binding pose of a ligand to a target protein among numerous possible candidate poses is an essential requirement for successful protein-ligand docking. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were also used for scoring binding poses. However, in this study, we developed a new docking scoring function, called GalaxyDock BP2 Score, by directly training the scoring power of binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The performance of the new scoring function exhibits significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein-ligand docking program, it outperformed other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html.


Subject(s)
Molecular Docking Simulation , Proteins/chemistry , Binding Sites , Databases, Protein , Ligands , Protein Binding , Protein Conformation , Research Design , Software
20.
J Comput Aided Mol Des ; 31(1): 107-118, 2017 01.
Article in English | MEDLINE | ID: mdl-27696242

ABSTRACT

As part of the SAMPL5 blind prediction challenge, we calculate the absolute binding free energies of six guest molecules to an octa-acid (OAH) and to a methylated octa-acid (OAMe). We use the double decoupling method via thermodynamic integration (TI) or Hamiltonian replica exchange in connection with the Bennett acceptance ratio (HREM-BAR). We produce the binding poses either through manual docking or by using GalaxyDock-HG, a docking software developed specifically for this study. The root mean square deviations for our most accurate predictions are 1.4 kcal mol⁻¹ for OAH with TI and 1.9 kcal mol⁻¹ for OAMe with HREM-BAR. Our best results for OAMe were obtained for systems with ionic concentrations corresponding to the ionic strength of the experimental solution. The most problematic system contains a halogenated guest. Our attempt to model the σ-hole of the bromine using a constrained off-site point charge does not improve results. We use results from molecular dynamics simulations to argue that the distinct binding affinities of this guest to OAH and OAMe are due to a difference in the flexibility of the host. We believe that the results of this extensive analysis of host-guest complexes will help improve the protocol used in predicting binding affinities for larger systems, such as protein-substrate compounds.


Subject(s)
Ligands , Molecular Dynamics Simulation , Proteins/chemistry , Thermodynamics , Binding Sites , Molecular Conformation , Molecular Structure , Protein Binding , Quantum Theory , Software , Solvents/chemistry
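
The abstract above uses thermodynamic integration (TI) and reports accuracy as a root mean square deviation. In TI, the free energy change of one decoupling leg is the integral over the coupling parameter lambda of the ensemble-averaged derivative of the potential energy. The sketch below evaluates that integral with the trapezoidal rule and also shows the RMSD formula. The lambda grid and the derivative profile are made-up numbers, not data from the study, and the standard-state and restraint corrections that a double decoupling calculation needs are omitted.

```python
import numpy as np

def ti_free_energy(lambdas, mean_dudl):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda."""
    lambdas = np.asarray(lambdas, dtype=float)
    mean_dudl = np.asarray(mean_dudl, dtype=float)
    widths = np.diff(lambdas)
    return float(np.sum(widths * 0.5 * (mean_dudl[1:] + mean_dudl[:-1])))

def rmsd(predicted, experimental):
    """Root mean square deviation between predicted and reference values."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(experimental, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

lambdas = np.linspace(0.0, 1.0, 11)        # hypothetical lambda windows
mean_dudl = 3.0 * lambdas ** 2 - 5.0       # made-up smooth profile, kcal/mol
dg_leg = ti_free_energy(lambdas, mean_dudl)
```

For this made-up profile the exact integral is -4 kcal/mol, and the 11-window trapezoidal estimate lands within 0.01 kcal/mol of it. In practice each `mean_dudl` entry comes from a separate simulation at that lambda, so window spacing matters most where the profile curves sharply.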