ABSTRACT
Ribonucleic acid (RNA)-ligand interactions play a pivotal role in a wide spectrum of biological processes, ranging from protein biosynthesis to cellular reproduction. This recognition has prompted the broader acceptance of RNA as a viable candidate for drug targets. Delving into the atomic-scale understanding of RNA-ligand interactions holds paramount importance in unraveling intricate molecular mechanisms and further contributing to RNA-based drug discovery. Computational approaches, particularly molecular docking, offer an efficient way of predicting the interactions between RNA and small molecules. However, the accuracy and reliability of these predictions heavily depend on the performance of scoring functions (SFs). In contrast to the majority of SFs used in RNA-ligand docking, the end-point binding free energy calculation methods, such as molecular mechanics/generalized Born surface area (MM/GBSA) and molecular mechanics/Poisson Boltzmann surface area (MM/PBSA), stand as theoretically more rigorous approaches. Yet, the evaluation of their effectiveness in predicting both binding affinities and binding poses within RNA-ligand systems remains unexplored. This study first reported the performance of MM/PBSA and MM/GBSA with diverse solvation models, interior dielectric constants (εin) and force fields in the context of binding affinity prediction for 29 RNA-ligand complexes. MM/GBSA is based on short (5 ns) molecular dynamics (MD) simulations in an explicit solvent with the YIL force field; the GBGBn2 model with higher interior dielectric constant (εin = 12, 16 or 20) yields the best correlation (Rp = -0.513), which outperforms the best correlation (Rp = -0.317, rDock) offered by various docking programs. Then, the efficacy of MM/GBSA in identifying the near-native binding poses from the decoys was assessed based on 56 RNA-ligand complexes. 
However, it is evident that MM/GBSA has limitations in accurately predicting binding poses for RNA-ligand systems, particularly compared with notably proficient docking programs like rDock and PLANTS. The best top-1 success rate achieved by MM/GBSA rescoring is 39.3%, which falls below the best result given by docking programs (50%, PLANTS). This study represents the first evaluation of MM/PBSA and MM/GBSA for RNA-ligand systems and is expected to provide valuable insights into their successful application to RNA targets.
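The two metrics quoted in this abstract, the Pearson correlation (Rp) between predicted and experimental affinities and the top-1 success rate of pose rescoring, can be illustrated with a minimal sketch. The 2.0 Å RMSD cutoff for a "near-native" pose and the data layout are assumptions made for illustration; the abstract does not state them.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def top1_success_rate(complexes, rmsd_cutoff=2.0):
    """Fraction of complexes whose best-scored pose is near-native.

    `complexes` is a list with one entry per complex; each entry is a list of
    (score, rmsd) tuples for its poses. Lower score means better (as with a
    binding free energy). `rmsd_cutoff` (in angstroms) is an assumed threshold.
    """
    hits = 0
    for poses in complexes:
        best_score, best_rmsd = min(poses, key=lambda pose: pose[0])
        if best_rmsd <= rmsd_cutoff:
            hits += 1
    return hits / len(complexes)


# Toy data, not taken from the study.
toy_complexes = [
    [(-30.1, 1.2), (-25.0, 6.4)],  # best-scored pose is near-native
    [(-22.3, 7.9), (-20.8, 1.5)],  # best-scored pose is a decoy
]
print(top1_success_rate(toy_complexes))
```

Whether a good correlation is positive or negative depends on which experimental quantity the predicted energies are compared with, so the sign of Rp should be read in that context.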
Subjects
Molecular Dynamics Simulation , RNA , Molecular Docking Simulation , Ligands , Reproducibility of Results , Protein Binding , Thermodynamics , Binding Sites
ABSTRACT
Machine-learning (ML)-based scoring functions (MLSFs) have gradually emerged as a promising alternative for protein-ligand binding affinity prediction and structure-based virtual screening. However, clouds of doubts have still been raised against the benefits of this novel type of scoring functions (SFs). In this study, to benchmark the performance of target-specific MLSFs on a relatively unbiased dataset, the MLSFs trained from three representative protein-ligand interaction representations were assessed on the LIT-PCBA dataset, and the classical Glide SP SF and three types of ligand-based quantitative structure-activity relationship (QSAR) models were also utilized for comparison. Two major aspects in virtual screening campaigns, including prediction accuracy and hit novelty, were systematically explored. The calculation results illustrate that the tested target-specific MLSFs yielded generally superior performance over the classical Glide SP SF, but they could hardly outperform the 2D fingerprint-based QSAR models. Although substantial improvements could be achieved by integrating multiple types of protein-ligand interaction features, the MLSFs were still not sufficient to exceed MACCS-based QSAR models. In terms of the correlations between the hit ranks or the structures of the top-ranked hits, the MLSFs developed by different featurization strategies would have the ability to identify quite different hits. Nevertheless, it seems that target-specific MLSFs do not have the intrinsic attributes of a traditional SF and may not be a substitute for classical SFs. In contrast, MLSFs can be regarded as a new derivative of ligand-based QSAR models. It is expected that our study may provide valuable guidance for the assessment and further development of target-specific MLSFs.
Subjects
Databases, Protein , Machine Learning , Molecular Docking Simulation , Proteins/chemistry , Ligands , Quantitative Structure-Activity Relationship
ABSTRACT
Protein kinases have been regarded as important therapeutic targets for many diseases. Currently, a total of 41 kinase inhibitors have been approved by the Food and Drug Administration, along with a large number of kinase inhibitors being evaluated in clinical and preclinical trials. Among all, allosteric inhibitors, such as type II kinase inhibitors, have attracted extensive attention owing to their potential high selectivity. Nowadays, molecular docking has become a powerful tool to search for novel kinase inhibitors. However, as for type II kinase inhibitors, their allosteric characteristics may exert a deep influence on docking accuracy. In this study, a comprehensive assessment was conducted to evaluate the effectiveness of nine docking algorithms towards type II kinase inhibitors. The calculation results showed that most tested docking programs, especially Glide with XP scoring, LeDock and Surflex-Dock, succeeded in the accurate identification of near-native binding poses, with the success rates ranging from 0.80 to 0.90, and the scoring functions in GOLD and LeDock outperformed the others in the prediction of relative binding affinities. In terms of the P-values, areas under the curve and enrichment factors, Glide with XP scoring, Surflex-Dock, GOLD with Astex Statistical Potential scoring and LeDock had better screening power to discriminate between active compounds and decoys. However, the screening power is sensitive to different initial conformations of the same target. It is expected that our study can provide some guidance for docking-based virtual screening to discover novel type II kinase inhibitors, as well as other allosteric inhibitors.
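Screening power in the abstract above is summarized by areas under the curve and enrichment factors. A minimal sketch of both quantities follows, assuming that lower docking scores are better and that labels are 1 for actives and 0 for decoys; the function names and the default top fraction are illustrative choices, not taken from the study.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor in the top `fraction` of a ranked list.

    EF = (actives in top / size of top) / (all actives / all compounds).
    Lower score is assumed to mean a better docking rank.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0])
    n_total = len(ranked)
    n_top = max(1, int(n_total * fraction))
    actives_total = sum(labels)
    actives_top = sum(label for _, label in ranked[:n_top])
    return (actives_top / n_top) / (actives_total / n_total)


def roc_auc(scores, labels):
    """ROC AUC as the probability that an active outranks a decoy.

    Ties between an active and a decoy count as half a win.
    """
    actives = [s for s, l in zip(scores, labels) if l == 1]
    decoys = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Toy example: 2 actives ranked first among 10 compounds.
toy_scores = [-9.0, -8.5, -7.0, -6.5, -6.0, -5.5, -5.0, -4.5, -4.0, -3.5]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(enrichment_factor(toy_scores, toy_labels, fraction=0.2))
print(roc_auc(toy_scores, toy_labels))
```

The pairwise AUC is quadratic in the number of compounds, which is fine for a sketch but would normally be replaced by a rank-based computation for large screening libraries.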
ABSTRACT
DNA methyltransferase 3A (DNMT3A) has been regarded as a potential epigenetic target for the development of cancer therapeutics. A number of DNMT3A inhibitors have been reported, but most of them do not have good potency, high selectivity and/or low cytotoxicity. It has been suggested that a non-conserved region around the target recognition domain (TRD) loop is implicated in the DNMT3A activity under the allosteric regulation of the ATRX-DNMT3-DNMT3L (ADD) domain, but the molecular mechanism of the regulation of the TRD loop on the DNMT3A activity needs to be elucidated. In this study, based on the reported crystal structures, the dynamics of the TRD loop in different multimerization with/without the bound guest molecule, namely the ADD domain or the DNA molecule, was investigated using conventional molecular dynamics (MD) and umbrella sampling simulations. The simulation results illustrate that the TRD loop exhibits relatively higher flexibility than the other components in the whole catalytic domain (CD), which could be well stabilized into different local minima through the binding with either the ADD domain or the DNA molecule by forming tight hydrogen-bond and salt-bridge networks involving distinct residues. Moreover, the movement of the TRD loop away from the catalytic loop upon activation could be triggered simply by the detachment of the ADD domain, but not necessarily induced by the ADD domain relocation on the CD. All these dynamic structural details could be a supplement to the previously reported crystal structure, which underlines the importance of the structural flexibility for the critical residues in the TRD loop, arousing more interest in the rational design of novel DNMT3A inhibitors targeting this region.
Subjects
DNA (Cytosine-5-)-Methyltransferases , Molecular Dynamics Simulation , Catalytic Domain , DNA/metabolism , DNA Methylation , DNA Methyltransferase 3A
ABSTRACT
The molecular mechanics/generalized Born surface area (MM/GBSA) approach has been widely used in end-point binding free energy prediction in structure-based drug design (SBDD). However, in practice, it is often regarded as a disputed method, mostly because of its system dependence. Here, by combining it with machine-learning optimization, we developed a novel version of MM/GBSA, named variable atomic dielectric MM/GBSA (VAD-MM/GBSA), by assigning variable dielectric constants directly to the protein/ligand atoms. The new strategy exhibits markedly improved accuracy in binding affinity calculations for various protein-ligand systems and is promising for use in the postprocessing of structure-based virtual screening. Moreover, VAD-MM/GBSA outperformed Prime MM/GBSA in the Schrödinger software and showed remarkable predictive performance for specific protein targets, such as POL polyprotein, human immunodeficiency virus type 1 (HIV-1) protease, etc. Our study showed that the VAD-MM/GBSA method, with little extra computational overhead, provides a potential replacement for the MM/GBSA in the AMBER software. An online web server of VAD-MM/GBSA has been developed and is now available at http://cadd.zju.edu.cn/vdgb.
Subjects
Molecular Dynamics Simulation , Proteins , Entropy , Humans , Ligands , Protein Binding , Proteins/metabolism , Thermodynamics
ABSTRACT
Molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics generalized Born surface area (MM/GBSA) are arguably very popular methods for binding free energy prediction since they are more accurate than most scoring functions of molecular docking and less computationally demanding than alchemical free energy methods. MM/PBSA and MM/GBSA have been widely used in biomolecular studies such as protein folding, protein-ligand binding, protein-protein interaction, etc. In this review, methods to adjust the polar solvation energy and to improve the performance of MM/PBSA and MM/GBSA calculations are reviewed and discussed. The latest applications of MM/GBSA and MM/PBSA in drug design are also presented. This review intends to provide readers with guidance for practically applying MM/PBSA and MM/GBSA in drug design and related research fields.
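For orientation, the end-point decomposition that MM/PBSA and MM/GBSA are generally understood to use can be written as follows. The symbols are the conventional ones and are not defined in the abstract itself; the linear surface-area term with coefficients γ and b is the common form of the nonpolar contribution and is shown here as an assumption about the usual setup.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= G_{\mathrm{complex}} - \left( G_{\mathrm{receptor}} + G_{\mathrm{ligand}} \right) \\
\Delta G_{\mathrm{bind}} &= \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solv}} - T\Delta S \\
\Delta E_{\mathrm{MM}} &= \Delta E_{\mathrm{int}} + \Delta E_{\mathrm{ele}} + \Delta E_{\mathrm{vdW}} \\
\Delta G_{\mathrm{solv}} &= \Delta G_{\mathrm{PB/GB}} + \Delta G_{\mathrm{SA}}, \qquad
\Delta G_{\mathrm{SA}} = \gamma \cdot \Delta\mathrm{SASA} + b
\end{align}
```

The polar solvation term, ΔG_PB/GB, is the one that depends on the solvation model and on the solute (interior) dielectric constant, which is why adjusting it is a recurring theme in the abstracts collected here.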
Subjects
Drug Design , Pharmaceutical Preparations/chemistry , Humans , Models, Molecular , Molecular Docking Simulation , Pharmacology , Surface Properties , Thermodynamics
ABSTRACT
Protein-protein interactions (PPIs) play an important role in the different functions of cells, but accurate prediction of the three-dimensional structures for PPIs is still a notoriously difficult task. In this study, HawkDock, a free and openly accessible web server, was developed to predict and analyze the structures of PPIs. In the HawkDock server, the ATTRACT docking algorithm, the HawkRank scoring function developed in our group and the MM/GBSA free energy decomposition analysis were seamlessly integrated into a multi-functional platform. The structures of PPIs were predicted by combining the ATTRACT docking and the HawkRank re-scoring, and the key residues for PPIs were highlighted by the MM/GBSA free energy decomposition. The molecular visualization was supported by 3Dmol.js. For the structural modeling of PPIs, HawkDock could achieve a better performance than ZDOCK 3.0.2 in the benchmark testing. For the prediction of key residues, the important residues that play an essential role in PPIs could be identified in the top 10 residues for ~81.4% of predicted models and ~95.4% of crystal structures in the benchmark dataset. To sum up, the HawkDock server is a powerful tool to predict the binding structures and identify the key residues of PPIs. The HawkDock server is accessible free of charge at http://cadd.zju.edu.cn/hawkdock/.
Subjects
Algorithms , Molecular Docking Simulation/methods , Proteins/chemistry , Software , Amino Acid Sequence , Benchmarking , Binding Sites , Crystallography, X-Ray , Humans , Internet , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Proteins/metabolism , Sequence Alignment , Structural Homology, Protein , Thermodynamics
ABSTRACT
SUMMARY: Protein-protein interactions (PPIs) have been regarded as an attractive emerging class of therapeutic targets for the development of new treatments. Computational approaches, especially molecular docking, have been extensively employed to predict the binding structures of PPI-inhibitors or discover novel small molecule PPI inhibitors. However, due to the relatively 'undruggable' features of PPI interfaces, accurate predictions of the binding structures for ligands towards PPI targets are quite challenging for most docking algorithms. Here, we constructed a non-redundant pose ranking benchmark dataset for small-molecule PPI inhibitors, which contains 900 binding poses for 184 protein-ligand complexes. Then, we evaluated the performance of MM/PB(GB)SA approaches to identify the correct binding poses for PPI inhibitors, including two Prime MM/GBSA procedures from the Schrödinger suite and seven different MM/PB(GB)SA procedures from the Amber package. Our results showed that MM/PBSA outperformed the Glide SP scoring function (success rate of 58.6%) and MM/GBSA in most cases, especially the PB3 procedure which could achieve an overall success rate of ~74%. Moreover, the GB6 procedure (success rate of 68.9%) performed much better than the other MM/GBSA procedures, highlighting the excellent potential of the GBNSR6 implicit solvation model for pose ranking. Finally, we developed the webserver of Fast Amber Rescoring for PPI Inhibitors (farPPI), which offers a freely available service to rescore the docking poses for PPI inhibitors by using the MM/PB(GB)SA methods. AVAILABILITY AND IMPLEMENTATION: farPPI web server is freely available at http://cadd.zju.edu.cn/farppi/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Proteins/chemistry , Algorithms , Binding Sites , Ligands , Molecular Docking Simulation , Protein Binding , Software
ABSTRACT
In structure-based drug design (SBDD), the molecular mechanics generalized Born surface area (MM/GBSA) approach has been widely used in ranking the binding affinity of small molecule ligands. However, an accurate estimation of protein-ligand binding affinity still remains a challenge due to the intrinsic limitation of the standard generalized Born (GB) model used in MM/GBSA. In this study, we proposed and evaluated the MM/GBSA approach based on a variable dielectric generalized Born (VDGB) model using residue-type-based dielectric constants. In the VDGB model, different dielectric values were assigned for the three types of protein residues, and the magnitude of the dielectric constants for residue types follows this order: charged ≥ polar ≥ nonpolar. We found that MM/GBSA based on a VDGB model (MM/GBSAVDGB) with an optimal dielectric constant of 4.0 for the charged residues and 1.0 for the noncharged residues together with a net-charge-dependent dielectric value for ligands achieved better predictions as judged by Pearson's correlation coefficient than the standard MM/GBSA with a uniform solute dielectric constant of 4.0 for the training set of 130 protein-ligand complexes. The prediction on the test set with 165 protein-ligand complexes also validated the better performance of MM/GBSAVDGB. Moreover, this method exhibited potential in predicting the relative binding free energies for multiple ligands against the same target. Furthermore, we found that rational truncation of protein residues far from the binding site can significantly speed up the MM/GBSAVDGB calculations, while it almost does not influence the prediction accuracy. Therefore, it is feasible to implement the system-truncated MM/GBSAVDGB as a scoring function for SBDD.
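The residue-type-based dielectric assignment described above can be sketched as a simple lookup. The values 4.0 (charged) and 1.0 (noncharged) come from the abstract; which residue names count as charged is an assumption made here (histidine protonation states are ignored), and the ligand value is left as a caller-supplied input because the abstract does not give the net-charge-dependent rule.

```python
# Assumed set of charged residue names (three-letter codes); the abstract
# does not list the classification it used.
CHARGED_RESIDUES = {"ASP", "GLU", "LYS", "ARG"}


def residue_dielectric(resname, eps_charged=4.0, eps_noncharged=1.0):
    """Return the dielectric constant assigned to one protein residue."""
    if resname.upper() in CHARGED_RESIDUES:
        return eps_charged
    return eps_noncharged


def assign_dielectrics(residue_names, ligand_eps):
    """Map each residue (by index) to a dielectric value.

    `ligand_eps` must be supplied by the caller; it stands in for the
    net-charge-dependent ligand value mentioned in the abstract.
    """
    table = {i: residue_dielectric(name) for i, name in enumerate(residue_names)}
    table["ligand"] = ligand_eps
    return table


example = assign_dielectrics(["ALA", "ASP", "SER", "LYS"], ligand_eps=2.0)
print(example)
```

The `ligand_eps=2.0` in the usage line is a placeholder value for the demonstration only.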
Subjects
Molecular Dynamics Simulation , Proteins , Binding Sites , Entropy , Ligands , Protein Binding , Proteins/metabolism , Thermodynamics
ABSTRACT
The villin headpiece subdomain (HP35) is a fast-folding protein with 35 residues and its folding pathways have been extensively studied experimentally and theoretically but remain controversial. While experiments showed that HP35 might have multiple folding pathways, most theoretical studies only found one major pathway, although a few theoretical studies revealed two. Here we report our results of molecular dynamics simulations of HP35 folding by using the newest AMBER ff14SB force field and show that HP35 has a novel folding pathway in addition to the two pathways shown previously. We also study the mechanism of determining the folding pathways and found that the dynamics of Helix2 may play a special role in the folding of HP35. Our results may be helpful to understand the folding mechanism of HP35 further.
Subjects
Microfilament Proteins/chemistry , Molecular Dynamics Simulation , Peptides/chemistry , Protein Folding , Protein Domains , Thermodynamics
ABSTRACT
A significant number of protein-protein interactions (PPIs) are mediated through the interactions between proteins and peptide segments, and therefore determination of protein-peptide interactions (PpIs) is critical to gain an in-depth understanding of the PPI network and even design peptides or small molecules capable of modulating PPIs. Computational approaches, especially molecular docking, provide an efficient way to model PpIs, and a reliable scoring function that can recognize the correct binding conformations for protein-peptide complexes is one of the most important components in protein-peptide docking. The end-point binding free energy calculation methods, such as MM/GBSA and MM/PBSA, are theoretically more rigorous than most empirical and semi-empirical scoring functions designed for protein-peptide docking, but their performance in predicting binding affinities and binding poses for protein-peptide systems has not been systematically assessed. In this study, we first evaluated the capability of MM/GBSA and MM/PBSA with different solvation models, interior dielectric constants (εin) and force fields to predict the binding affinities for 53 protein-peptide complexes. For the 19 short peptides with 5-12 residues, MM/PBSA based on the minimized structures in explicit solvent with the ff99 force field and εin = 2 yields the best correlation between the predicted binding affinities and the experimental data (rp = 0.748), while for the 34 medium-size peptides with 20-25 residues, MM/GBSA based on 1 ns of molecular dynamics (MD) simulations in implicit solvent with the ff03 force field, the GBOBC1 model and a low interior dielectric constant (εin = 1) yields the best accuracy (rp = 0.735). Then, we assessed the rescoring capability of MM/PBSA and MM/GBSA to distinguish the correct binding conformations from the decoys for 112 protein-peptide systems. 
The results illustrate that MM/PBSA based on the minimized structures with the ff99 or ff14SB force field and MM/GBSA based on the minimized structures with the ff03 force field show excellent capability to recognize the near-native binding poses for the short and medium-size peptides, respectively, and they outperform the predictions given by two popular protein-peptide docking algorithms (pepATTRACT and HPEPDOCK). Therefore, MM/PBSA and MM/GBSA are powerful tools to predict the binding affinities and identify the correct binding poses for protein-peptide systems.
Subjects
Molecular Dynamics Simulation , Peptides/chemistry , Proteins/chemistry , Quantum Theory , Algorithms , Protein Binding
ABSTRACT
Enhanced sampling has been extensively used to capture the conformational transitions in protein folding, but it has attracted much less attention in studies of protein-protein recognition. In this study, we evaluated the impact of enhanced sampling methods and solute dielectric constants on the overall accuracy of the molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics/generalized Born surface area (MM/GBSA) approaches for the protein-protein binding free energy calculations. Here, two widely used enhanced sampling methods, aMD and GaMD, and conventional molecular dynamics (cMD) simulations with two AMBER force fields (ff03 and ff14SB) were used to sample the conformations for 21 protein-protein complexes. The MM/PBSA and MM/GBSA calculation results illustrate that the standard MM/GBSA based on the cMD simulations yields the best Pearson correlation (rp = -0.523) between the predicted binding affinities and the experimental data, which is much stronger than that given by MM/PBSA (rp = -0.212). The two enhanced sampling methods (aMD and GaMD) are indeed more efficient for conformational sampling, but they did not improve the binding affinity predictions for protein-protein systems, suggesting that the aMD or GaMD sampling (at least in short timescale simulations) may not be a good choice for the MM/PBSA and MM/GBSA predictions of protein-protein complexes. A solute dielectric constant of 1.0 is recommended for MM/GBSA, but a higher solute dielectric constant is recommended for MM/PBSA, especially for the systems with higher polarity on the protein-protein binding interfaces. Then, a preliminary assessment of the MM/GBSA calculations based on a variable dielectric generalized Born (VDGB) model was conducted. The results highlight the potential power of VDGB in the free energy predictions for protein-protein systems, but more thorough studies should be done in the future.
Subjects
Chemistry Techniques, Analytical/methods , Models, Chemical , Proteins/chemistry , Chemistry Techniques, Analytical/standards , Molecular Dynamics Simulation , Protein Binding , Protein Conformation , Reproducibility of Results
ABSTRACT
DNA methyltransferases (DNMTs), responsible for the regulation of DNA methylation, have been regarded as promising drug targets for cancer therapy. However, high structural conservation of the catalytic domains of DNMTs poses a big challenge to design selective inhibitors for a specific DNMT isoform. In this study, molecular dynamics (MD) simulations, end-point free energy calculations and umbrella sampling (US) simulations were performed to reveal the molecular basis of the binding selectivity of three representative DNMT inhibitors towards DNMT1 and DNMT3A, including SFG (DNMT1 and DNMT3A dual inhibitors), DC-05 (DNMT1 selective inhibitor) and GSKex1 (DNMT3A selective inhibitor). The binding selectivity of the studied inhibitors reported in previous experiments is reproduced by the MD simulation and binding free energy prediction. The simulation results also suggest that the driving force to determine the binding selectivity of the studied inhibitors stems from the difference in the protein-inhibitor van der Waals interactions. Meanwhile, the per-residue free energy decomposition reveals that the contributions from several non-conserved residues in the binding pocket of DNMT1/DNMT3A, especially Val1580/Trp893, Asn1578/Arg891 and Met1169/Val665, are the key factors responsible for the binding selectivity of DNMT inhibitors. In addition, the binding preference of the studied inhibitors was further validated by the potentials of mean force predicted by the US simulations. This study will provide valuable information for the rational design of novel selective inhibitors targeting DNMT1 and DNMT3A.
Subjects
DNA (Cytosine-5-)-Methyltransferase 1/antagonists & inhibitors , DNA (Cytosine-5-)-Methyltransferases/antagonists & inhibitors , Enzyme Inhibitors/chemistry , Molecular Dynamics Simulation , Binding Sites , Catalytic Domain , DNA (Cytosine-5-)-Methyltransferase 1/chemistry , DNA (Cytosine-5-)-Methyltransferases/chemistry , DNA Methylation , DNA Methyltransferase 3A , Protein Binding , Protein Conformation , Thermodynamics
ABSTRACT
Target-aware drug discovery has greatly accelerated the drug discovery process to design small-molecule ligands with high binding affinity to disease-related protein targets. Conditioned on targeted proteins, previous works utilize various kinds of deep generative models and have shown great potential in generating molecules with strong protein-ligand binding interactions. However, beyond binding affinity, effective drug molecules must manifest other essential properties such as high drug-likeness, which are not explicitly addressed by current target-aware generative methods. In this article, aiming to bridge the gap of multi-objective target-aware molecule generation in the field of deep learning-based drug discovery, we propose ParetoDrug, a Pareto Monte Carlo Tree Search (MCTS) generation algorithm. ParetoDrug searches molecules on the Pareto Front in chemical space using MCTS to enable synchronous optimization of multiple properties. Specifically, ParetoDrug utilizes pretrained atom-by-atom autoregressive generative models for the exploration guidance to desired molecules during MCTS searching. Besides, when selecting the next atom symbol, a scheme named ParetoPUCT is proposed to balance exploration and exploitation. Benchmark experiments and case studies demonstrate that ParetoDrug is highly effective in traversing the large and complex chemical space to discover novel compounds with satisfactory binding affinities and drug-like properties for various multi-objective target-aware drug discovery tasks.
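The notion of searching "on the Pareto Front" rests on Pareto dominance between candidate molecules scored on several objectives. The sketch below shows only that generic building block, assuming every objective is to be maximized; it is not the ParetoPUCT scheme, whose details the abstract does not give, and the objective names in the toy data are illustrative.

```python
def dominates(a, b):
    """True if score vector `a` Pareto-dominates `b` (all objectives maximized).

    `a` dominates `b` when it is at least as good on every objective and
    strictly better on at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points):
    """Return the non-dominated subset of `points`, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy candidates scored as (binding-affinity score, drug-likeness), both
# rescaled so that larger is better. Values are made up for illustration.
candidates = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
print(pareto_front(candidates))
```

In the toy data, (0.4, 0.4) is dominated by (0.5, 0.5) and drops out, while the other three candidates represent different trade-offs and remain on the front.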
Subjects
Drug Discovery , Monte Carlo Method , Awareness , Drug Discovery/methods , Humans , Machine Learning
ABSTRACT
Androgen receptor (AR) antagonists are widely used for the treatment of prostate cancer (PCa), but their therapeutic efficacy is usually compromised by the rapid emergence of drug resistance. However, the lack of the detailed interaction between AR and its antagonists poses a major obstacle to the design of novel AR antagonists. Here, funnel metadynamics is employed to elucidate the inherent regulation mechanisms of three AR antagonists (hydroxyflutamide, enzalutamide, and darolutamide) on AR. For the first time it is observed that the binding of antagonists significantly disturbed the C-terminus of AR helix-11, thereby disrupting the specific internal hydrophobic contacts of AR-LBD and correspondingly the communication between AR ligand binding pocket (AR-LBP), activation function 2 (AF2), and binding function 3 (BF3). The subsequent bioassays verified the necessity of the hydrophobic contacts for AR function. Furthermore, it is found that darolutamide, a newly approved AR antagonist capable of fighting almost all reported drug resistant AR mutants, can induce antagonistic binding structure. Subsequently, docking-based virtual screening toward the dominant binding conformation of AR for darolutamide is conducted, and three novel AR antagonists with favorable binding affinity and strong capability to combat drug resistance are identified by in vitro bioassays. This work provides a novel rational strategy for the development of anti-resistant AR antagonists.
Subjects
Androgen Receptor Antagonists , Benzamides , Androgen Receptor Antagonists/pharmacology , Androgen Receptor Antagonists/chemistry , Humans , Benzamides/pharmacology , Phenylthiohydantoin/pharmacology , Phenylthiohydantoin/analogs & derivatives , Male , Receptors, Androgen/metabolism , Receptors, Androgen/chemistry , Receptors, Androgen/genetics , Nitriles/pharmacology , Molecular Dynamics Simulation , Drug Resistance, Neoplasm/drug effects , Drug Resistance, Neoplasm/genetics , Prostatic Neoplasms/drug therapy , Prostatic Neoplasms/genetics , Pyrazoles/pharmacology , Pyrazoles/chemistry , Molecular Docking Simulation/methods , Amides/pharmacology , Amides/chemistry , Flutamide/analogs & derivatives
ABSTRACT
Protein loop modeling is a challenging and highly nontrivial task in protein structure prediction. Despite recent progress, existing methods including knowledge-based, ab initio, hybrid, and deep learning (DL) methods fall substantially short of either atomic accuracy or computational efficiency. To overcome these limitations, we present KarmaLoop, a novel paradigm that distinguishes itself as the first DL method centered on full-atom (encompassing both backbone and side-chain heavy atoms) protein loop modeling. Our results demonstrate that KarmaLoop considerably outperforms conventional and DL-based methods of loop modeling in terms of both accuracy and efficiency, with average RMSDs of 1.77 and 1.95 Å for the CASP13+14 and CASP15 benchmark datasets, respectively, and manifests a speedup of at least 2 orders of magnitude in general compared with other methods. Consequently, our comprehensive evaluations indicate that KarmaLoop provides a state-of-the-art DL solution for protein loop modeling, with the potential to hasten the advancement of protein engineering, antibody-antigen recognition, and drug design.
ABSTRACT
Protein-protein interactions play an important role in studying the mechanisms of protein function from a structural perspective. Molecular docking is a powerful computational approach for predicting protein-protein complexes, given the high cost and long turnaround of traditional experimental methods. Among existing technologies, the template-based method utilizes the structural information of known homologous 3D complexes as available and reliable templates to achieve high accuracy and low computational complexity. However, the performance of the template-based method depends on the quality and quantity of templates. When templates are insufficient or absent, the ab initio docking method is necessary and largely enriches the docking conformations. Therefore, it is a feasible strategy to fuse the effectiveness of the template-based model with the universality of the ab initio model to improve docking performance. In this study, we construct a new, diverse, comprehensive template library derived from PDB, containing 77,685 complexes. We propose a template-based method (named TemDock), which retrieves the evolutionary relationship between the target sequence and samples in the template library and transfers similar structural information. Then, the target structure is built by superposing on the homologous template complex with TM-align. Moreover, we develop a consensus-based method (named ComDock) to integrate our TemDock and an existing ab initio method (ZDOCK). On 105 targets with templates from Benchmark 5.0, TemDock and ComDock achieve success rates of 68.57% and 71.43% in the top 10 conformations, respectively. Compared with HDOCK, ComDock obtains better I-RMSD of hit configurations on 9 targets and more hit models in the top 100 conformations. 
As an efficient method for protein-protein docking, ComDock is expected to help study protein-protein recognition and reveal the various biological pathways that are critical for drug discovery. The final results are stored at https://github.com/guofei-tju/mqz_ComDock_docking.
Subjects
Algorithms , Software , Molecular Docking Simulation , Computational Biology/methods , Databases, Protein , Proteins/chemistry , Protein Binding
ABSTRACT
Compound-protein interactions (CPI) play significant roles in drug development. To avoid side effects, it is also crucial to evaluate drug selectivity when binding to different targets. However, most selectivity prediction models are constructed for specific targets with limited data. In this study, we present a pretrained multi-functional model for compound-protein interaction prediction (PMF-CPI) and fine-tune it to assess drug selectivity. This model uses recurrent neural networks to process the protein embedding based on the pretrained language model TAPE, extracts molecular information from a graph encoder, and produces the output from dense layers. PMF-CPI obtained the best performance compared to outstanding approaches on both the binding affinity regression and CPI classification tasks. Meanwhile, we apply the model to analyzing drug selectivity after fine-tuning it on three datasets related to specific targets, including human cytochrome P450s. The study shows that PMF-CPI can accurately predict different drug affinities or opposite interactions toward similar targets, recognizing selective drugs for precise therapeutics.
ABSTRACT
Artificial intelligence has deeply revolutionized the field of medicinal chemistry with many impressive applications, but the success of these applications requires a massive amount of training samples with high-quality annotations, which seriously limits the wide usage of data-driven methods. In this paper, we focus on the reaction yield prediction problem, which assists chemists in selecting high-yield reactions in a new chemical space only with a few experimental trials. To attack this challenge, we first put forth MetaRF, an attention-based random forest model specially designed for the few-shot yield prediction, where the attention weight of a random forest is automatically optimized by the meta-learning framework and can be quickly adapted to predict the performance of new reagents while given a few additional samples. To improve the few-shot learning performance, we further introduce a dimension-reduction based sampling method to determine valuable samples to be experimentally tested and then learned. Our methodology is evaluated on three different datasets and acquires satisfactory performance on few-shot prediction. In high-throughput experimentation (HTE) datasets, the average yield of our methodology's top 10 high-yield reactions is relatively close to the results of ideal yield selection.
ABSTRACT
Inverse Protein Folding (IPF) is an important task of protein design, which aims to design sequences compatible with a given backbone structure. Despite the prosperous development of algorithms for this task, existing methods tend to rely on noisy predicted residues located in the local neighborhood when generating sequences. To address this limitation, we propose an entropy-based residue selection method to remove noise in the input residue context. Additionally, we introduce ProRefiner, a memory-efficient global graph attention model to fully utilize the denoised context. Our proposed method achieves state-of-the-art performance on multiple sequence design benchmarks in different design settings. Furthermore, we demonstrate the applicability of ProRefiner in redesigning Transposon-associated transposase B, where six out of the 20 variants we propose exhibit improved gene editing activity.
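One plausible reading of "entropy-based residue selection" is to keep only those context residues whose predicted amino-acid distribution has low Shannon entropy, treating high-entropy predictions as noise. The sketch below illustrates that reading under stated assumptions: the threshold `max_entropy`, the function names, and the data layout are hypothetical, and the abstract does not describe how ProRefiner actually applies the criterion.

```python
from math import log


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * log(p) for p in probs if p > 0)


def select_confident_residues(residue_probs, max_entropy):
    """Return indices of residues whose prediction entropy is <= max_entropy.

    `residue_probs` holds one probability vector over amino-acid types per
    position. `max_entropy` is a hypothetical cutoff chosen by the caller.
    """
    return [
        index
        for index, probs in enumerate(residue_probs)
        if shannon_entropy(probs) <= max_entropy
    ]


# Toy predictions over a 4-letter alphabet (a real model would use 20 types).
toy_predictions = [
    [1.0, 0.0, 0.0, 0.0],      # fully confident, entropy 0
    [0.25, 0.25, 0.25, 0.25],  # uniform, entropy ln(4)
    [0.9, 0.1, 0.0, 0.0],      # fairly confident
]
print(select_confident_residues(toy_predictions, max_entropy=0.5))
```

With the toy cutoff of 0.5 nats, the uniform prediction at position 1 is dropped from the context while positions 0 and 2 are kept.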