2.
J Cheminform ; 16(1): 38, 2024 Mar 31.
Article in English | MEDLINE | ID: mdl-38556873

ABSTRACT

Accurate prediction of Enzyme Commission (EC) numbers for chemical reactions is essential for the understanding and manipulation of enzyme functions, biocatalytic processes and biosynthetic planning. A number of machine learning (ML)-based models have been developed to classify enzymatic reactions, showing great advantages over costly and long-winded experimental verifications. However, the prediction accuracy for most available models trained on the records of chemical reactions without specifying the enzymatic catalysts is rather limited. In this study, we introduced BEC-Pred, a BERT-based multiclassification model, for predicting EC numbers associated with reactions. Leveraging transfer learning, our approach achieves precise forecasting across a wide variety of EC numbers solely through analysis of the SMILES sequences of substrates and products. The BEC-Pred model outperformed other sequence and graph-based ML methods, attaining a higher accuracy of 91.6%, surpassing them by 5.5%, and exhibiting superior F1 scores with improvements of 6.6% and 6.0%, respectively. The enhanced performance highlights the potential of BEC-Pred to serve as a reliable foundational tool to accelerate cutting-edge research in synthetic biology and drug metabolism. Moreover, we discussed a few examples of how BEC-Pred could accurately predict the enzymatic classification for Novozym 435-induced hydrolysis and efficient lipase-catalyzed synthesis. We anticipate that BEC-Pred will have a positive impact on the progression of enzymatic research.
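
The accuracy and F1 figures quoted above are standard multiclass classification metrics. The following is a minimal sketch of how such scores could be computed for predicted EC labels; the abstract does not say which F1 averaging was used, so macro averaging is an assumption here, and the labels are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of reactions whose predicted label matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class that appears."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for c in set(y_true) | set(y_pred):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical EC sub-subclass labels, for illustration only.
y_true = ["3.1.1", "3.1.1", "2.3.1", "1.1.1"]
y_pred = ["3.1.1", "2.3.1", "2.3.1", "1.1.1"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```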

3.
Comput Biol Med ; 174: 108397, 2024 May.
Article in English | MEDLINE | ID: mdl-38603896

ABSTRACT

The equilibrium of cellular protein levels is pivotal for maintaining normal physiological functions. USP5 belongs to the deubiquitinating enzyme (DUB) family, controlling protein degradation and preserving cellular protein homeostasis. Aberrant expression of USP5 is implicated in a variety of diseases, including cancer, neurodegenerative diseases, and inflammatory diseases. In this paper, a multi-level virtual screening (VS) approach was employed to target the zinc finger ubiquitin-binding domain (ZnF-UBD) of USP5, leading to the identification of a highly promising candidate compound 0456-0049. Molecular dynamics (MD) simulations were then employed to assess the stability of complex binding and predict hotspot residues in interactions. The results indicated that the candidate stably binds to the ZnF-UBD of USP5 through crucial interactions with residues ARG221, TRP209, GLY220, ASN207, TYR261, TYR259, and MET266. Binding free energy calculations, along with umbrella sampling (US) simulations, underscored a superior binding affinity of the candidate relative to known inhibitors. Moreover, US simulations revealed conformational changes of USP5 during ligand dissociation. These insights provide a valuable foundation for the development of novel inhibitors targeting USP5.
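
Umbrella sampling, as mentioned above, restrains the system in a series of windows along a reaction coordinate (here, presumably a ligand dissociation distance). The sketch below shows the usual harmonic bias and window layout. The window range, spacing, and force constant are hypothetical values, not taken from the paper, and some simulation packages define the bias without the factor of 1/2.

```python
def window_centers(start, stop, spacing):
    """Evenly spaced restraint centers along the reaction coordinate."""
    n = int(round((stop - start) / spacing)) + 1
    return [start + i * spacing for i in range(n)]


def harmonic_bias(xi, center, k):
    """Bias energy 0.5 * k * (xi - center)^2 applied within one window."""
    return 0.5 * k * (xi - center) ** 2


# Hypothetical setup: windows every 0.5 (distance units) from 0.0 to 2.0.
centers = window_centers(0.0, 2.0, 0.5)
for c in centers:
    # Bias felt by a configuration sitting 0.2 units away from the center.
    print(c, harmonic_bias(c + 0.2, c, k=10.0))
```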


Subject(s)
Endopeptidases , Zinc Fingers , Humans , Endopeptidases/chemistry , Endopeptidases/metabolism , Molecular Dynamics Simulation , Protein Binding , Protein Domains
4.
J Chem Theory Comput ; 20(3): 1465-1478, 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38300792

ABSTRACT

Multisite λ-dynamics (MSLD) is a highly efficient binding free energy calculation method that samples multiple ligands in a single round by assigning different λ values to the alchemical part of each ligand. This method holds great promise for lead optimization (LO) in drug discovery. However, the complex data preparation and simulation process limits its widespread application in diverse protein-ligand systems. To address this challenge, we developed a comprehensive, open-source, and automated workflow for MSLD calculations based on the BLaDE dynamics engine. This workflow incorporates the Ligand Internal and Cartesian coordinate reconstruction-based alignment algorithm (LIC-align) and an optimized maximum common substructure (MCS) search algorithm to accurately generate MSLD multiple topologies with ideal perturbation patterns. Furthermore, our workflow is highly modularized, allowing straightforward integration and extension of various simulation techniques, and is highly accessible to nonexperts. This workflow was validated by calculating the relative binding free energies of large-scale congeneric ligands, many of which have large perturbing groups. The agreement between the calculations and experiments was excellent, with an average unsigned error of 1.08 ± 0.47 kcal/mol. More than 57.1% of the ligands had an error of less than 1.0 kcal/mol, and the perturbations of 6 targets were fully connected via the calculations, while those of 2 targets were connected via both calculations and experimental data. The Pearson correlation coefficient reached 0.88, indicating that the MSLD workflow provides accurate predictions that can guide lead optimization in drug discovery. We also examined the impact of single-site versus multisite perturbations, ligand grouping by perturbing group size, and the position of the anchor atom on the MSLD performance. 
By integrating our proposed LIC-align and optimized MCS search algorithm along with the coping strategies to handle challenging molecular substructures, our workflow can handle many realistic scenarios more reasonably than all previously published methods. Moreover, we observed that our MSLD workflow achieved accuracy similar to that of free energy perturbation (FEP) while improving computational efficiency by more than an order of magnitude. These findings provide valuable insights and strategies for further MSLD development, making MSLD a competitive tool for lead optimization.
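
The validation statistics reported in this abstract (average unsigned error, fraction of ligands within 1.0 kcal/mol, Pearson correlation) can be computed from paired calculated and experimental binding free energies. The following is a minimal sketch. The energies are made-up illustrative values, not data from the study.

```python
import math


def average_unsigned_error(calc, expt):
    """Mean absolute deviation between calculated and experimental values."""
    return sum(abs(c - e) for c, e in zip(calc, expt)) / len(calc)


def fraction_within(calc, expt, tol=1.0):
    """Fraction of ligands whose absolute error is below tol (kcal/mol)."""
    return sum(abs(c - e) < tol for c, e in zip(calc, expt)) / len(calc)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical binding free energies in kcal/mol.
calc = [-9.0, -8.0, -7.0, -10.5]
expt = [-9.5, -8.2, -8.5, -10.0]
print(average_unsigned_error(calc, expt))
print(fraction_within(calc, expt))
print(pearson_r(calc, expt))
```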


Subject(s)
Molecular Dynamics Simulation , Proteins , Thermodynamics , Ligands , Workflow , Proteins/chemistry , Protein Binding
5.
J Chem Inf Model ; 64(4): 1213-1228, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38302422

ABSTRACT

Deep learning-based de novo molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score. To address these limitations, we proposed a genetic algorithm-guided generative model called GARel (genetic algorithm-based receptor-ligand interaction generator), a novel framework for training a DL-based generative model to produce drug-like molecules with novel scaffolds. To efficiently train the GARel model, we utilized dense net to update the parameters based on molecules with novel scaffolds and drug-like features. To demonstrate the capability of the GARel model, we used it to design inhibitors for three targets: AA2AR, EGFR, and SARS-CoV-2. The results indicate that GARel-generated molecules feature more diverse and novel scaffolds and possess more desirable physicochemical properties and favorable docking scores. Compared with other generative models, GARel makes significant progress in balancing novelty and drug-likeness, providing a promising direction for the further development of DL-based de novo design methodology with potential impacts on drug discovery.


Subject(s)
Drug Design , RNA, Viral , Ligands , Algorithms , Drug Discovery
6.
Chem Sci ; 15(4): 1449-1471, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38274053

ABSTRACT

The expertise accumulated in deep neural network-based structure prediction has been widely transferred to the field of protein-ligand binding pose prediction, thus leading to the emergence of a variety of deep learning-guided docking models for predicting protein-ligand binding poses without relying on heavy sampling. However, their prediction accuracy and applicability are still far from satisfactory, partially due to the lack of protein-ligand binding complex data. To this end, we create a large-scale complex dataset containing ∼9 M protein-ligand docking complexes for pre-training, and propose CarsiDock, the first deep learning-guided docking approach that leverages pre-training of millions of predicted protein-ligand complexes. CarsiDock contains two main stages, i.e., a deep learning model for the prediction of protein-ligand atomic distance matrices, and a translation, rotation and torsion-guided geometry optimization procedure to reconstruct the matrices into a credible binding pose. The pre-training and multiple innovative architectural designs facilitate the dramatically improved docking accuracy of our approach over the baselines in terms of multiple docking scenarios, thereby contributing to its outstanding early recognition performance in several retrospective virtual screening campaigns. Further explorations demonstrate that CarsiDock can not only guarantee the topological reliability of the binding poses but also successfully reproduce the crucial interactions in crystallized structures, highlighting its superior applicability.

7.
J Med Chem ; 67(1): 138-151, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38153295

ABSTRACT

Androgen receptor (AR) is the primary target for treating prostate cancer (PCa), which inevitably progresses due to drug-resistant mutations. Bromodomain-containing protein 4 (BRD4) has emerged as a new potential drug target for PCa treatment. Herein, we report the rational design and discovery of novel BRD4 inhibitors through computer-aided drug design (CADD), and a hit compound SQ-1 (IC50 = 676 nM) was identified by structure-based virtual screening (SBVS) with the conserved water network. To optimize the structure of SQ-1, the free energy landscape was constructed, and the binding mechanism was explored by characterizing the water profile and the dissociation mechanism. Finally, the compound SQ-17 with improved inhibitory activity (IC50 < 100 nM) was discovered, which showed potent antiproliferative activity against LNCaP. These data highlighted a successful attempt to identify and optimize a small molecule by comprehensive CADD application and provided essential clues for developing novel therapeutics for PCa treatment.
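
Potencies such as the IC50 values quoted above are often compared on a logarithmic scale as pIC50, the negative base-10 logarithm of the IC50 expressed in mol/L. A minimal sketch of that conversion follows. The function name is illustrative and is not taken from the paper.

```python
import math


def pic50_from_nanomolar(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L)."""
    return -math.log10(ic50_nm * 1e-9)


# The hit compound's reported IC50 of 676 nM, and the 100 nM bound
# quoted for the optimized compound (a lower IC50 gives a higher pIC50).
print(round(pic50_from_nanomolar(676), 2))
print(round(pic50_from_nanomolar(100), 2))
```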


Subject(s)
Antineoplastic Agents , Prostatic Neoplasms , Male , Humans , Transcription Factors , Nuclear Proteins , Water/chemistry , Early Detection of Cancer , Drug Design , Cell Cycle Proteins/metabolism , Prostatic Neoplasms/drug therapy , Prostatic Neoplasms/metabolism , Structure-Activity Relationship , Antineoplastic Agents/chemistry , Bromodomain Containing Proteins
8.
Comput Biol Med ; 169: 107815, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38128254

ABSTRACT

Anaplastic lymphoma kinase (ALK) is implicated in the genesis of multiple malignant tumors. Lorlatinib stands out as the most advanced and effective inhibitor currently used in the clinic for the treatment of ALK-positive non-small cell lung cancer. However, resistance to lorlatinib has inevitably manifested over time, with double/triple mutations of G1202, L1196, L1198, C1156 and I1171 frequently observed in clinical practice, and tumors regrow within a short time after treatment with lorlatinib. Therefore, elucidating the mechanism of resistance to lorlatinib is paramount in paving the way for innovative therapeutic strategies and the development of next-generation drugs. In this study, we leveraged multiple computational methodologies to delve into the resistance mechanisms of three specific ALK double mutants, G1202R/L1196M, G1202R/L1198F and I1171N/L1198F, to lorlatinib. We analyzed these mechanisms through qualitative (PCA, DCCM) and quantitative (MM/GBSA, US) kinetic analyses. The qualitative analysis shows that these mutations exert minimal perturbations on the conformational dynamics of the structural domains of ALK. The energetic and structural assessments show that the van der Waals interactions, formed by the conserved residue Leu1256 within the ATP-binding site and the residues Glu1197 and Met1199 in the hinge domain with lorlatinib, play integral roles in the occurrence of drug resistance. Furthermore, the US simulation results elucidate that the pathways through which lorlatinib dissociates vary across mutant systems, and the distinct environments during the dissociation process culminate in diverse resistance mechanisms. Collectively, these insights provide important clues for the design of novel inhibitors to combat resistance.


Subject(s)
Aminopyridines , Carcinoma, Non-Small-Cell Lung , Lactams , Lung Neoplasms , Pyrazoles , Humans , Aminopyridines/pharmacology , Aminopyridines/therapeutic use , Anaplastic Lymphoma Kinase/genetics , Anaplastic Lymphoma Kinase/metabolism , Drug Resistance, Neoplasm , Lactams/pharmacology , Lactams/therapeutic use , Lactams, Macrocyclic/pharmacology , Lactams, Macrocyclic/therapeutic use , Lung Neoplasms/genetics , Mutation , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Pyrazoles/pharmacology , Pyrazoles/therapeutic use
9.
Chem Sci ; 14(43): 12166-12181, 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37969589

ABSTRACT

Contemporary structure-based molecular generative methods have demonstrated their potential to model the geometric and energetic complementarity between ligands and receptors, thereby facilitating the design of molecules with favorable binding affinity and target specificity. Despite the introduction of deep generative models for molecular generation, the atom-wise generation paradigm that partially contradicts chemical intuition limits the validity and synthetic accessibility of the generated molecules. Additionally, the dependence of deep learning models on large-scale structural data has hindered their adaptability across different targets. To overcome these challenges, we present a novel search-based framework, 3D-MCTS, for structure-based de novo drug design. Distinct from prevailing atom-centric methods, 3D-MCTS employs a fragment-based molecular editing strategy. The fragments decomposed from small-molecule drugs are recombined under predefined retrosynthetic rules, offering improved drug-likeness and synthesizability, overcoming the inherent limitations of atom-based approaches. Leveraging multi-threaded parallel simulations combined with a real-time energy constraint-based pruning strategy, 3D-MCTS achieves remarkable efficiency. At a fixed computational cost, it outperforms other state-of-the-art (SOTA) methods by producing molecules with enhanced binding affinity. Furthermore, its fragment-based approach ensures the generation of more dependable binding conformations, exhibiting a success rate 43.6% higher than that of other SOTAs. This advantage becomes even more pronounced when handling targets that significantly deviate from the training dataset. 3D-MCTS is capable of achieving thirty times more hits with high binding affinity than traditional virtual screening methods, which demonstrates the superior ability of 3D-MCTS to explore chemical space. 
Moreover, the flexibility of our framework makes it easy to incorporate domain knowledge during the process, thereby enabling the generation of molecules with desirable pharmacophores and enhanced binding affinity. The adaptability of 3D-MCTS is further showcased in metalloprotein applications, highlighting its potential across various drug design scenarios.
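
The name 3D-MCTS suggests a Monte Carlo tree search, in which each step picks the next node to expand by trading off its average reward against how rarely it has been visited. The abstract does not state the selection rule, so the sketch below uses the textbook UCT score as an assumption. The fragment names, rewards, and exploration constant are hypothetical.

```python
import math


def uct_score(total_reward, visits, parent_visits, c=1.4):
    """Upper-confidence score: mean reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children, parent_visits, c=1.4):
    """Return the child (a dict with 'reward' and 'visits') to expand next."""
    return max(
        children,
        key=lambda ch: uct_score(ch["reward"], ch["visits"], parent_visits, c),
    )


# Hypothetical candidate fragment additions with accumulated rewards.
children = [
    {"name": "fragment_A", "reward": 9.0, "visits": 10},
    {"name": "fragment_B", "reward": 2.0, "visits": 3},
]
print(select_child(children, parent_visits=13)["name"])
```

With the exploration constant set to zero the rule reduces to greedy selection on mean reward, while a larger constant favors less-visited branches.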

10.
J Chem Inf Model ; 63(21): 6525-6536, 2023 Nov 13.
Article in English | MEDLINE | ID: mdl-37883143

ABSTRACT

Small-molecule conformer generation (SMCG) is an extremely important task in both ligand- and structure-based computer-aided drug design, especially during the hit discovery phase. Recently, a multitude of artificial intelligence (AI) models tailored for SMCG have emerged. Despite developers typically furnishing performance evaluation data upon releasing their AI models, a comprehensive and equitable performance comparison between AI models and conventional methods is still lacking. In this study, we curated a new benchmarking data set comprising 3354 high-quality ligand bioactive conformations. Subsequently, we conducted a systematic assessment of the performance of four widely adopted traditional methods (i.e., ConfGenX, Conformator, OMEGA, and RDKit ETKDG) and five AI models (i.e., ConfGF, DMCG, GeoDiff, GeoMol, and torsional diffusion) in the tasks of reproducing bioactive and low-energy conformations of small molecules. In the former task, the AI models have no advantage, particularly with a maximum ensemble size of 1. Even the best-performing AI model GeoMol is still worse than any of the tested traditional methods. Conversely, in the latter task, the torsional diffusion model shows obvious advantages, surpassing the best-performing traditional method ConfGenX by 26.09 and 12.97% on the COV-R and COV-P metrics, respectively. Furthermore, the influence of force field-based fine-tuning on the quality of the generated conformers was also discussed. Finally, a user-friendly Web server called fastSMCG was developed to enable researchers to rapidly and flexibly generate small-molecule conformers using both traditional and AI methods. We anticipate that our work will offer valuable practical assistance to the scientific community in this field.
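
COV-R and COV-P, as commonly defined in the conformer-generation literature, measure coverage in the recall and precision directions: the fraction of reference conformers matched by at least one generated conformer, and the fraction of generated conformers that match at least one reference. The sketch below assumes a precomputed RMSD matrix. The RMSD threshold and the matrix values are hypothetical, since the abstract does not state them.

```python
def coverage(rmsd, threshold):
    """Return (COV-R, COV-P) from rmsd[i][j] = RMSD(reference i, generated j)."""
    n_ref, n_gen = len(rmsd), len(rmsd[0])
    # Recall: references that have some generated conformer within threshold.
    cov_r = sum(min(row) < threshold for row in rmsd) / n_ref
    # Precision: generated conformers within threshold of some reference.
    cov_p = sum(
        min(rmsd[i][j] for i in range(n_ref)) < threshold for j in range(n_gen)
    ) / n_gen
    return cov_r, cov_p


# Hypothetical RMSD values in angstroms: 2 references x 3 generated.
rmsd = [
    [0.4, 1.8, 2.0],
    [1.9, 2.2, 1.6],
]
print(coverage(rmsd, threshold=1.25))
```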


Subject(s)
Artificial Intelligence , Drug Design , Models, Molecular , Ligands , Molecular Conformation
12.
Brief Bioinform ; 24(5), 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37738401

ABSTRACT

Cracking the entangling code of protein-ligand interaction (PLI) is of great importance to structure-based drug design and discovery. Different physical and biochemical representations can be used to describe PLI such as energy terms and interaction fingerprints, which can be analyzed by machine learning (ML) algorithms to create ML-based scoring functions (MLSFs). Here, we propose the ML-based PLI capturer (ML-PLIC), a web platform that automatically characterizes PLI and generates MLSFs to identify the potential binders of a specific protein target through virtual screening (VS). ML-PLIC comprises five modules, including Docking for ligand docking, Descriptors for PLI generation, Modeling for MLSF training, Screening for VS and Pipeline for the integration of the aforementioned functions. We validated the MLSFs constructed by ML-PLIC in three benchmark datasets (Directory of Useful Decoys-Enhanced, Active as Decoys and TocoDecoy), demonstrating accuracy outperforming traditional docking tools and competitive performance to the deep learning-based SF, and provided a case study of the Serine/threonine-protein kinase WEE1 in which MLSFs were developed by using the ML-based VS pipeline in ML-PLIC. Underpinning the latest version of ML-PLIC is a powerful platform that incorporates physical and biological knowledge about PLI, leveraging PLI characterization and MLSF generation into the design of structure-based VS pipeline. The ML-PLIC web platform is now freely available at http://cadd.zju.edu.cn/plic/.


Subject(s)
Algorithms , Benchmarking , Ligands , Drug Design , Machine Learning
13.
Chem Sci ; 14(30): 8129-8146, 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37538816

ABSTRACT

Applying machine learning algorithms to protein-ligand scoring functions has aroused widespread attention in recent years due to the high predictive accuracy and affordable computational cost. Nevertheless, most machine learning-based scoring functions are only applicable to a specific task, e.g., binding affinity prediction, binding pose prediction or virtual screening, suggesting that the development of a scoring function with balanced performance in all critical tasks remains a grand challenge. To this end, we propose a novel parameterization strategy by introducing an adjustable binding affinity term that represents the correlation between the predicted outcomes and experimental data into the training of a mixture density network. The resulting residue-atom distance likelihood potential not only retains the superior docking and screening power over all the other state-of-the-art approaches, but also achieves a remarkable improvement in scoring and ranking performance. We emphatically explore the impacts of several key elements on prediction accuracy as well as the task preference, and demonstrate that the performance of scoring/ranking and docking/screening tasks of a certain model could be well balanced in an appropriate manner. Overall, our study highlights the potential utility of our innovative parameterization strategy as well as the resulting scoring framework in future structure-based drug design.
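
A mixture density network outputs the parameters of a probability mixture rather than a single value, so a distance likelihood potential can be read as the negative log-likelihood of an observed residue-atom distance under that mixture. The sketch below assumes a one-dimensional Gaussian mixture, which the abstract does not specify. The weights, means, and widths are hypothetical, and the adjustable binding affinity term described in the abstract is not modeled.

```python
import math


def mixture_density(d, weights, means, sigmas):
    """Probability density of distance d under a 1-D Gaussian mixture."""
    return sum(
        w * math.exp(-0.5 * ((d - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in zip(weights, means, sigmas)
    )


def negative_log_likelihood(d, weights, means, sigmas):
    """Lower values mean the distance is more probable under the mixture."""
    return -math.log(mixture_density(d, weights, means, sigmas))


# Hypothetical two-component mixture for one residue-atom pair (angstroms).
weights, means, sigmas = [0.7, 0.3], [3.5, 6.0], [0.5, 1.0]
print(negative_log_likelihood(3.6, weights, means, sigmas))
print(negative_log_likelihood(9.0, weights, means, sigmas))
```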

14.
J Med Chem ; 66(15): 10808-10823, 2023 Aug 10.
Article in English | MEDLINE | ID: mdl-37471134

ABSTRACT

Recently, deep generative models have been regarded as promising tools in fragment-based drug design (FBDD). Despite the growing interest in these models, they still face challenges in generating molecules with desired properties in low data regimes. In this study, we propose a novel flow-based autoregressive model named FFLOM for linker and R-group design. In a large-scale benchmark evaluation on ZINC, CASF, and PDBbind test sets, FFLOM achieves state-of-the-art performance in terms of validity, uniqueness, novelty, and recovery of the generated molecules and can recover over 92% of the original molecules in the PDBbind test set (with at least five atoms). FFLOM also exhibits excellent potential applicability in several practical scenarios encompassing fragment linking, PROTAC design, R-group growing, and R-group optimization. In all four cases, FFLOM can perfectly reconstruct the ground-truth compounds and generate over 74% of molecules with novel fragments, some of which have higher binding affinity than the ground truth.


Subject(s)
Drug Design , Ligands , Thiazoles/chemistry
15.
J Cheminform ; 15(1): 63, 2023 Jul 04.
Article in English | MEDLINE | ID: mdl-37403155

ABSTRACT

Machine learning-based scoring functions (MLSFs) have shown potential for improving virtual screening capabilities over classical scoring functions (SFs). Due to the high computational cost in the process of feature generation, the numbers of descriptors used in MLSFs and the characterization of protein-ligand interactions are always limited, which may affect the overall accuracy and efficiency. Here, we propose a new SF called TB-IECS (theory-based interaction energy component score), which combines energy terms from Smina and NNScore version 2, and utilizes the eXtreme Gradient Boosting (XGBoost) algorithm for model training. In this study, the energy terms decomposed from 15 traditional SFs were first categorized based on their formulas and physicochemical principles, and 324 feature combinations were generated accordingly. The five best feature combinations were selected for further evaluation of the model performance with regard to the selection of feature vectors of various lengths, interaction types and ML algorithms. The virtual screening power of TB-IECS was assessed on the datasets of DUD-E and LIT-PCBA, as well as seven target-specific datasets from the ChemDiv database. The results showed that TB-IECS outperformed classical SFs including Glide SP and Dock, and effectively balanced the efficiency and accuracy for practical virtual screening.

16.
J Med Chem ; 66(13): 9174-9183, 2023 Jul 13.
Article in English | MEDLINE | ID: mdl-37317043

ABSTRACT

Machine-learning-based scoring functions (MLSFs) have gained attention for their potential to improve accuracy in binding affinity prediction and structure-based virtual screening (SBVS) compared to classical SFs. Developing accurate MLSFs for SBVS requires a large and unbiased dataset that includes structurally diverse actives and decoys. Unfortunately, most datasets suffer from hidden biases and data insufficiency. Here, we developed the topology-based and conformation-based decoys database (ToCoDDB). The biological targets and active ligands in ToCoDDB were collected from scientific literature and established datasets. The decoys were generated and debiased by using conditional recurrent neural networks and molecular docking. ToCoDDB is presently the largest unbiased database with 2.4 million decoys encompassing 155 targets. The detailed information and performance benchmark for each target are provided, which are beneficial for training and evaluating MLSFs. Moreover, the online decoy generation function of ToCoDDB further expands its application range to any target. ToCoDDB is freely available at http://cadd.zju.edu.cn/tocodecoy/.


Subject(s)
Benchmarking , Machine Learning , Molecular Docking Simulation , Molecular Conformation , Databases, Factual , Ligands , Protein Binding
17.
Nat Commun ; 14(1): 2585, 2023 May 04.
Article in English | MEDLINE | ID: mdl-37142585

ABSTRACT

Graph neural networks (GNNs) have been widely used in molecular property prediction, but explaining their black-box predictions is still a challenge. Most existing explanation methods for GNNs in chemistry focus on attributing model predictions to individual nodes, edges or fragments that are not necessarily derived from a chemically meaningful segmentation of molecules. To address this challenge, we propose a method named substructure mask explanation (SME). SME is based on well-established molecular segmentation methods and provides an interpretation that aligns with the understanding of chemists. We apply SME to elucidate how GNNs learn to predict aqueous solubility, genotoxicity, cardiotoxicity and blood-brain barrier permeation for small molecules. SME provides interpretation that is consistent with the understanding of chemists, alerts them to unreliable performance, and guides them in structural optimization for target properties. Hence, we believe that SME empowers chemists to confidently mine structure-activity relationship (SAR) from reliable GNNs through a transparent inspection on how GNNs pick up useful signals when learning from data.


Subject(s)
Blood-Brain Barrier , Cardiotoxicity , Humans , DNA Damage , Neural Networks, Computer , Records
18.
Nat Comput Sci ; 3(10): 849-859, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38177756

ABSTRACT

Highly effective de novo design is a grand challenge of computer-aided drug discovery. Practical structure-specific three-dimensional molecule generations have started to emerge in recent years, but most approaches treat the target structure as a conditional input to bias the molecule generation and do not fully learn the detailed atomic interactions that govern the molecular conformation and stability of the binding complexes. The omission of these fine details leads to many models having difficulty in outputting reasonable molecules for a variety of therapeutic targets. Here, to address this challenge, we formulate a model, called SurfGen, that designs molecules in a fashion closely resembling the figurative key-and-lock principle. SurfGen comprises two equivariant neural networks, Geodesic-GNN and Geoatom-GNN, which capture the topological interactions on the pocket surface and the spatial interaction between ligand atoms and surface nodes, respectively. SurfGen outperforms other methods in a number of benchmarks, and its high sensitivity on the pocket structures enables an effective generative-model-based solution to the thorny issue of mutation-induced drug resistance.


Subject(s)
Drug Discovery , Neural Networks, Computer , Drug Discovery/methods , Molecular Conformation
19.
Nat Comput Sci ; 3(9): 789-804, 2023 Sep.
Article in English | MEDLINE | ID: mdl-38177786

ABSTRACT

Ligand docking is one of the core technologies in structure-based virtual screening for drug discovery. However, conventional docking tools and existing deep learning tools may suffer from limited performance in terms of speed, pose quality and binding affinity accuracy. Here we propose KarmaDock, a deep learning approach for ligand docking that integrates the functions of docking acceleration, binding pose generation and correction, and binding strength estimation. The three-stage model consists of the following components: (1) encoders for the protein and ligand to learn the representations of intramolecular interactions; (2) E(n) equivariant graph neural networks with self-attention to update the ligand pose based on both protein-ligand and intramolecular interactions, followed by post-processing to ensure chemically plausible structures; (3) a mixture density network for scoring the binding strength. KarmaDock was validated on four benchmark datasets and tested in a real-world virtual screening project that successfully identified experiment-validated active inhibitors of leukocyte tyrosine kinase (LTK).


Subject(s)
Neural Networks, Computer , Proteins , Protein Binding , Ligands , Molecular Docking Simulation , Proteins/chemistry
20.
Brief Bioinform ; 25(1), 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38171930

ABSTRACT

Protein loops play a critical role in the dynamics of proteins and are essential for numerous biological functions, and various computational approaches to loop modeling have been proposed over the past decades. However, a comprehensive understanding of the strengths and weaknesses of each method is lacking. In this work, we constructed two high-quality datasets (i.e. the General dataset and the CASP dataset) and systematically evaluated the accuracy and efficiency of 13 commonly used loop modeling approaches from the perspective of loop lengths, protein classes and residue types. The results indicate that the knowledge-based method FREAD generally outperforms the other tested programs in most cases, but encountered challenges when predicting loops longer than 15 and 30 residues on the CASP and General datasets, respectively. The ab initio method Rosetta NGK demonstrated exceptional modeling accuracy for short loops with four to eight residues and achieved the highest success rate on the CASP dataset. The well-known AlphaFold2 and RoseTTAFold require more resources for better performance, but they exhibit promise for predicting loops longer than 16 and 30 residues in the CASP and General datasets. These observations can provide valuable insights for selecting suitable methods for specific loop modeling tasks and contribute to future advancements in the field.


Subject(s)
Proteins , Protein Conformation , Proteins/chemistry