Search | BVS Bolivia

1.

Design and synthesis of 7-membered lactam fused hydroxypyridinones as potent metal binding pharmacophores (MBPs) for inhibiting influenza virus PA_N endonuclease.

Zhang, Lei; Ke, Di; Li, Yuting; Zhang, Hui; Zhang, Xi; Wang, Sihan; Ni, Shaokai; Peng, Bo; Zeng, Huixuan; Hou, Tingjun; Du, Yushen; Pan, Peichen; Yu, Yongping; Chen, Wenteng.

Eur J Med Chem ; 276: 116639, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38964259

ABSTRACT

Since influenza virus RNA polymerase subunit PA_N is a dinuclear Mn2+-dependent endonuclease, the use of metal-binding pharmacophores (MBPs) that coordinate Mn2+ has been identified as a promising strategy for developing PA_N inhibitors for influenza treatment. However, little attention has been paid to the relationship between the optimal arrangement of the donor atoms in MBPs and anti-influenza A virus (IAV) efficacy. Accordingly, privileged hydroxypyridinones fused to a seven-membered lactam ring and bearing diverse side chains, chiral centers or cyclic systems were designed and synthesized. A structure-activity relationship study yielded a hit compound, 16l (IC50 = 2.868 ± 0.063 µM against IAV polymerase), in which the seven-membered lactam ring is fused to a pyrrolidine ring. Further optimization of the hydrophobic binding groups on 16l afforded a lead compound, (R, S)-16s, which exhibited 64-fold more potent inhibitory activity (IC50 = 0.045 ± 0.002 µM) toward IAV polymerase. Moreover, (R, S)-16s demonstrated potent anti-IAV efficacy (EC50 = 0.134 ± 0.093 µM) and weak cytotoxicity (CC50 = 15.35 µM), indicating its high selectivity. Although (R, S)-16s was slightly less active than baloxavir, these findings illustrate the utility of a metal coordination-based strategy in generating novel MBPs with potent anti-influenza activity.
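The potency and selectivity figures quoted in this abstract can be checked with simple ratios. The sketch below recomputes the fold improvement from the two reported IC50 values and a selectivity index taken as CC50/EC50. That definition is a common convention and is assumed here, since the abstract does not state how selectivity was quantified. The function and variable names are illustrative only.

```python
# Values quoted in the abstract, all in micromolar (µM).
ic50_hit = 2.868    # compound 16l, IAV polymerase inhibition
ic50_lead = 0.045   # compound (R, S)-16s, IAV polymerase inhibition
ec50_lead = 0.134   # compound (R, S)-16s, anti-IAV efficacy
cc50_lead = 15.35   # compound (R, S)-16s, cytotoxicity


def fold_improvement(ic50_before, ic50_after):
    """Ratio of IC50 values; larger means the second compound is more potent."""
    return ic50_before / ic50_after


def selectivity_index(cc50, ec50):
    """Assumed definition: SI = CC50 / EC50 (not stated in the abstract)."""
    return cc50 / ec50


fold = fold_improvement(ic50_hit, ic50_lead)
si = selectivity_index(cc50_lead, ec50_lead)
print(f"fold improvement: {fold:.1f}")
print(f"selectivity index: {si:.1f}")
```

Under these assumptions the ratio of IC50 values rounds to the reported 64-fold gain, and the selectivity index comes out a little above 100.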

2.

Diverging Trends in Left Without Being Seen Rates During the Pandemic Era: Emergency Department Length of Stay May Be a Key Factor.

Toy, Stanley; Chiu, Wen-Ta; Chon, John; Aflakian, Kaveh; Lin, Wan-Yi; Pan, Pei-Chen; Lin, Yu-Tien; Toy, Jessica; Wu, Su-Yen; Wu, Jonathan.

J Emerg Med ; 66(4): e544-e546, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38580416

Subjects

Emergency Service, Hospital; Pandemics; Humans; Length of Stay; Time Factors; Retrospective Studies

3.

A general model for predicting enzyme functions based on enzymatic reactions.

Qian, Wenjia; Wang, Xiaorui; Kang, Yu; Pan, Peichen; Hou, Tingjun; Hsieh, Chang-Yu.

J Cheminform ; 16(1): 38, 2024 Mar 31.

Article in English | MEDLINE | ID: mdl-38556873

ABSTRACT

Accurate prediction of Enzyme Commission (EC) numbers for chemical reactions is essential for the understanding and manipulation of enzyme functions, biocatalytic processes and biosynthetic planning. A number of machine learning (ML)-based models have been developed to classify enzymatic reactions, showing great advantages over costly and lengthy experimental verification. However, the prediction accuracy of most available models, which are trained on records of chemical reactions without specifying the enzymatic catalysts, is rather limited. In this study, we introduced BEC-Pred, a BERT-based multiclassification model, for predicting EC numbers associated with reactions. Leveraging transfer learning, our approach achieves precise forecasting across a wide variety of EC numbers solely through analysis of the SMILES sequences of substrates and products. The BEC-Pred model outperformed other sequence and graph-based ML methods, attaining a higher accuracy of 91.6%, surpassing them by 5.5%, and exhibiting superior F1 scores with improvements of 6.6% and 6.0%, respectively. The enhanced performance highlights the potential of BEC-Pred to serve as a reliable foundational tool to accelerate cutting-edge research in synthetic biology and drug metabolism. Moreover, we discussed a few examples of how BEC-Pred could accurately predict the enzymatic classification for the Novozym 435-induced hydrolysis and lipase efficient catalytic synthesis. We anticipate that BEC-Pred will have a positive impact on the progression of enzymatic research.
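The abstract reports accuracy and F1 scores for a multiclass EC-number classifier. As a rough illustration of how such metrics are computed, the sketch below evaluates a handful of made-up predictions. The labels and predictions are hypothetical, and macro-averaging of F1 is an assumption, since the abstract does not say which averaging scheme was used. This is not the BEC-Pred evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of reactions whose predicted EC number matches the true one."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: true and predicted EC numbers for four reactions.
y_true = ["1.1.1.1", "3.1.1.3", "3.1.1.3", "2.7.1.1"]
y_pred = ["1.1.1.1", "3.1.1.3", "2.7.1.1", "2.7.1.1"]

print(f"accuracy: {accuracy(y_true, y_pred):.3f}")
print(f"macro F1: {macro_f1(y_true, y_pred):.3f}")
```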

4.

Structure-based virtual screening of novel USP5 inhibitors targeting the zinc finger ubiquitin-binding domain.

Wang, Tianhao; Tong, Jianbo; Zhang, Xing; Wang, Zhe; Xu, Lei; Pan, Peichen; Hou, Tingjun.

Comput Biol Med ; 174: 108397, 2024 May.

Article in English | MEDLINE | ID: mdl-38603896

ABSTRACT

The equilibrium of cellular protein levels is pivotal for maintaining normal physiological functions. USP5 belongs to the deubiquitinating enzyme (DUB) family, controlling protein degradation and preserving cellular protein homeostasis. Aberrant expression of USP5 is implicated in a variety of diseases, including cancer, neurodegenerative diseases, and inflammatory diseases. In this paper, a multi-level virtual screening (VS) approach was employed to target the zinc finger ubiquitin-binding domain (ZnF-UBD) of USP5, leading to the identification of a highly promising candidate compound 0456-0049. Molecular dynamics (MD) simulations were then employed to assess the stability of complex binding and predict hotspot residues in interactions. The results indicated that the candidate stably binds to the ZnF-UBD of USP5 through crucial interactions with residues ARG221, TRP209, GLY220, ASN207, TYR261, TYR259, and MET266. Binding free energy calculations, along with umbrella sampling (US) simulations, underscored a superior binding affinity of the candidate relative to known inhibitors. Moreover, US simulations revealed conformational changes of USP5 during ligand dissociation. These insights provide a valuable foundation for the development of novel inhibitors targeting USP5.

Subjects

Endopeptidases; Zinc Fingers; Humans; Endopeptidases/chemistry; Endopeptidases/metabolism; Molecular Dynamics Simulation; Protein Binding; Protein Domains

5.

Genetic Algorithm-Based Receptor Ligand: A Genetic Algorithm-Guided Generative Model to Boost the Novelty and Drug-Likeness of Molecules in a Sampling Chemical Space.

Wang, Mingyang; Wu, Zhengjian; Wang, Jike; Weng, Gaoqi; Kang, Yu; Pan, Peichen; Li, Dan; Deng, Yafeng; Yao, Xiaojun; Bing, Zhitong; Hsieh, Chang-Yu; Hou, Tingjun.

J Chem Inf Model ; 64(4): 1213-1228, 2024 Feb 26.

Article in English | MEDLINE | ID: mdl-38302422

ABSTRACT

Deep learning-based de novo molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score. To address these limitations, we proposed a genetic algorithm-guided generative model called GARel (genetic algorithm-based receptor-ligand interaction generator), a novel framework for training a DL-based generative model to produce drug-like molecules with novel scaffolds. To efficiently train the GARel model, we utilized dense net to update the parameters based on molecules with novel scaffolds and drug-like features. To demonstrate the capability of the GARel model, we used it to design inhibitors for three targets: AA2AR, EGFR, and SARS-CoV-2. The results indicate that GARel-generated molecules feature more diverse and novel scaffolds and possess more desirable physicochemical properties and favorable docking scores. Compared with other generative models, GARel makes significant progress in balancing novelty and drug-likeness, providing a promising direction for the further development of DL-based de novo design methodology with potential impacts on drug discovery.

Subjects

Drug Design; RNA, Viral; Ligands; Algorithms; Drug Discovery

6.

Comprehensive, Open-Source, and Automated Workflow for Multisite λ-Dynamics in Lead Optimization.

Hu, Renling; Zhang, Jintu; Kang, Yu; Wang, Zhe; Pan, Peichen; Deng, Yafeng; Hsieh, Chang-Yu; Hou, Tingjun.

J Chem Theory Comput ; 20(3): 1465-1478, 2024 Feb 13.

Article in English | MEDLINE | ID: mdl-38300792

ABSTRACT

Multisite λ-dynamics (MSLD) is a highly efficient binding free energy calculation method that samples multiple ligands in a single round by assigning different λ values to the alchemical part of each ligand. This method holds great promise for lead optimization (LO) in drug discovery. However, the complex data preparation and simulation process limits its widespread application in diverse protein-ligand systems. To address this challenge, we developed a comprehensive, open-source, and automated workflow for MSLD calculations based on the BLaDE dynamics engine. This workflow incorporates the Ligand Internal and Cartesian coordinate reconstruction-based alignment algorithm (LIC-align) and an optimized maximum common substructure (MCS) search algorithm to accurately generate MSLD multiple topologies with ideal perturbation patterns. Furthermore, our workflow is highly modularized, allowing straightforward integration and extension of various simulation techniques, and is highly accessible to nonexperts. This workflow was validated by calculating the relative binding free energies of large-scale congeneric ligands, many of which have large perturbing groups. The agreement between the calculations and experiments was excellent, with an average unsigned error of 1.08 ± 0.47 kcal/mol. More than 57.1% of the ligands had an error of less than 1.0 kcal/mol, and the perturbations of 6 targets were fully connected via the calculations, while those of 2 targets were connected via both calculations and experimental data. The Pearson correlation coefficient reached 0.88, indicating that the MSLD workflow provides accurate predictions that can guide lead optimization in drug discovery. We also examined the impact of single-site versus multisite perturbations, ligand grouping by perturbing group size, and the position of the anchor atom on the MSLD performance. 
By integrating our proposed LIC-align and optimized MCS search algorithm along with the coping strategies to handle challenging molecular substructures, our workflow can handle many realistic scenarios more reasonably than all previously published methods. Moreover, we observed that our MSLD workflow achieved accuracy similar to that of free energy perturbation (FEP) while improving computational efficiency by more than an order of magnitude. These findings provide valuable insights and strategies for further MSLD development, making MSLD a competitive tool for lead optimization.
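The validation statistics quoted in this abstract (average unsigned error, the share of ligands within 1.0 kcal/mol, and the Pearson correlation coefficient) can be computed from paired predicted and experimental binding free energies. The following is a minimal sketch on made-up numbers. The data and function names are hypothetical and do not come from the paper.

```python
import math


def mean_unsigned_error(pred, expt):
    """Average absolute difference between predicted and experimental values."""
    return sum(abs(p - e) for p, e in zip(pred, expt)) / len(pred)


def fraction_within(pred, expt, tol=1.0):
    """Share of ligands whose absolute error is strictly below tol (kcal/mol)."""
    return sum(abs(p - e) < tol for p, e in zip(pred, expt)) / len(pred)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical binding free energies in kcal/mol for four ligands.
expt = [-9.0, -8.0, -7.0, -10.0]
pred = [-8.5, -8.5, -6.0, -10.5]

print(f"AUE: {mean_unsigned_error(pred, expt):.3f} kcal/mol")
print(f"within 1.0 kcal/mol: {fraction_within(pred, expt):.0%}")
print(f"Pearson r: {pearson_r(pred, expt):.3f}")
```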

Subjects

Molecular Dynamics Simulation; Proteins; Thermodynamics; Ligands; Workflow; Proteins/chemistry; Protein Binding

7.

CarsiDock: a deep learning paradigm for accurate protein-ligand docking and screening based on large-scale pre-training.

Cai, Heng; Shen, Chao; Jian, Tianye; Zhang, Xujun; Chen, Tong; Han, Xiaoqi; Yang, Zhuo; Dang, Wei; Hsieh, Chang-Yu; Kang, Yu; Pan, Peichen; Ji, Xiangyang; Song, Jianfei; Hou, Tingjun; Deng, Yafeng.

Chem Sci ; 15(4): 1449-1471, 2024 Jan 24.

Article in English | MEDLINE | ID: mdl-38274053

ABSTRACT

The expertise accumulated in deep neural network-based structure prediction has been widely transferred to the field of protein-ligand binding pose prediction, thus leading to the emergence of a variety of deep learning-guided docking models for predicting protein-ligand binding poses without relying on heavy sampling. However, their prediction accuracy and applicability are still far from satisfactory, partially due to the lack of protein-ligand binding complex data. To this end, we create a large-scale complex dataset containing ∼9 M protein-ligand docking complexes for pre-training, and propose CarsiDock, the first deep learning-guided docking approach that leverages pre-training of millions of predicted protein-ligand complexes. CarsiDock contains two main stages, i.e., a deep learning model for the prediction of protein-ligand atomic distance matrices, and a translation, rotation and torsion-guided geometry optimization procedure to reconstruct the matrices into a credible binding pose. The pre-training and multiple innovative architectural designs facilitate the dramatically improved docking accuracy of our approach over the baselines in terms of multiple docking scenarios, thereby contributing to its outstanding early recognition performance in several retrospective virtual screening campaigns. Further explorations demonstrate that CarsiDock can not only guarantee the topological reliability of the binding poses but also successfully reproduce the crucial interactions in crystallized structures, highlighting its superior applicability.

8.

Dissecting the role of ALK double mutations in drug resistance to lorlatinib with in-depth theoretical modeling and analysis.

Zhang, Xing; Tong, Jianbo; Wang, Tianhao; Wang, Tianyue; Xu, Lei; Wang, Zhe; Hou, Tingjun; Pan, Peichen.

Comput Biol Med ; 169: 107815, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38128254

ABSTRACT

Anaplastic lymphoma kinase (ALK) is implicated in the genesis of multiple malignant tumors. Lorlatinib stands out as the most advanced and effective inhibitor currently used in the clinic for the treatment of ALK-positive non-small cell lung cancer. However, resistance to lorlatinib has inevitably manifested over time, with double/triple mutations of G1202, L1196, L1198, C1156 and I1171 frequently observed in clinical practice, and tumors regrow within a short time after treatment with lorlatinib. Therefore, elucidating the mechanism of resistance to lorlatinib is paramount in paving the way for innovative therapeutic strategies and the development of next-generation drugs. In this study, we leveraged multiple computational methodologies to delve into the resistance mechanisms of three specific double mutations of ALK^G1202R/L1196M, ALK^G1202R/L1198F and ALK^I1171N/L1198F to lorlatinib. We analyzed these mechanisms through qualitative (PCA, DCCM) and quantitative (MM/GBSA, US) kinetic analyses. The qualitative analysis shows that these mutations exert minimal perturbations on the conformational dynamics of the structural domains of ALK. The energetic and structural assessments show that the van der Waals interactions, formed by the conserved residue Leu1256 within the ATP-binding site and the residues Glu1197 and Met1199 in the hinge domain with lorlatinib, play integral roles in the occurrence of drug resistance. Furthermore, the US simulation results elucidate that the pathways through which lorlatinib dissociates vary across mutant systems, and the distinct environments during the dissociation process culminate in diverse resistance mechanisms. Collectively, these insights provide important clues for the design of novel inhibitors to combat resistance.

Subjects

Aminopyridines; Carcinoma, Non-Small-Cell Lung; Lactams; Lung Neoplasms; Pyrazoles; Humans; Aminopyridines/pharmacology; Aminopyridines/therapeutic use; Anaplastic Lymphoma Kinase/genetics; Anaplastic Lymphoma Kinase/metabolism; Drug Resistance, Neoplasm; Lactams/pharmacology; Lactams/therapeutic use; Lactams, Macrocyclic/pharmacology; Lactams, Macrocyclic/therapeutic use; Lung Neoplasms/genetics; Mutation; Protein Kinase Inhibitors/pharmacology; Protein Kinase Inhibitors/therapeutic use; Pyrazoles/pharmacology; Pyrazoles/therapeutic use

9.

Discovery of Novel Inhibitors of BRD4 for Treating Prostate Cancer: A Comprehensive Case Study for Considering Water Networks in Virtual Screening and Drug Design.

Zhong, Haiyang; Wang, Xinyue; Chen, Shicheng; Wang, Zhe; Wang, Huating; Xu, Lei; Hou, Tingjun; Yao, Xiaojun; Li, Dan; Pan, Peichen.

J Med Chem ; 67(1): 138-151, 2024 Jan 11.

Article in English | MEDLINE | ID: mdl-38153295

ABSTRACT

Androgen receptor (AR) is the primary target for treating prostate cancer (PCa), which inevitably progresses due to drug-resistant mutations. Bromodomain-containing protein 4 (BRD4) has emerged as a new potential drug target for PCa treatment. Herein, we report the rational design and discovery of novel BRD4 inhibitors through computer-aided drug design (CADD), and a hit compound SQ-1 (IC50 = 676 nM) was identified by structure-based virtual screening (SBVS) with the conserved water network. To optimize the structure of SQ-1, the free energy landscape was constructed, and the binding mechanism was explored by characterizing the water profile and the dissociation mechanism. Finally, the compound SQ-17 with improved inhibitory activity (IC50 < 100 nM) was discovered, which showed potent antiproliferative activity against LNCaP. These data highlight a successful attempt to identify and optimize a small molecule by comprehensive CADD application and provide essential clues for developing novel therapeutics for PCa treatment.

Subjects

Antineoplastic Agents; Prostatic Neoplasms; Male; Humans; Transcription Factors; Nuclear Proteins; Water/chemistry; Early Detection of Cancer; Drug Design; Cell Cycle Proteins/metabolism; Prostatic Neoplasms/drug therapy; Prostatic Neoplasms/metabolism; Structure-Activity Relationship; Antineoplastic Agents/chemistry; Bromodomain Containing Proteins

10.

A flexible data-free framework for structure-based de novo drug design with reinforcement learning.

Du, Hongyan; Jiang, Dejun; Zhang, Odin; Wu, Zhenxing; Gao, Junbo; Zhang, Xujun; Wang, Xiaorui; Deng, Yafeng; Kang, Yu; Li, Dan; Pan, Peichen; Hsieh, Chang-Yu; Hou, Tingjun.

Chem Sci ; 14(43): 12166-12181, 2023 Nov 08.

Article in English | MEDLINE | ID: mdl-37969589

ABSTRACT

Contemporary structure-based molecular generative methods have demonstrated their potential to model the geometric and energetic complementarity between ligands and receptors, thereby facilitating the design of molecules with favorable binding affinity and target specificity. Despite the introduction of deep generative models for molecular generation, the atom-wise generation paradigm that partially contradicts chemical intuition limits the validity and synthetic accessibility of the generated molecules. Additionally, the dependence of deep learning models on large-scale structural data has hindered their adaptability across different targets. To overcome these challenges, we present a novel search-based framework, 3D-MCTS, for structure-based de novo drug design. Distinct from prevailing atom-centric methods, 3D-MCTS employs a fragment-based molecular editing strategy. The fragments decomposed from small-molecule drugs are recombined under predefined retrosynthetic rules, offering improved drug-likeness and synthesizability, overcoming the inherent limitations of atom-based approaches. Leveraging multi-threaded parallel simulations combined with a real-time energy constraint-based pruning strategy, 3D-MCTS achieves remarkable efficiency. At a fixed computational cost, it outperforms other state-of-the-art (SOTA) methods by producing molecules with enhanced binding affinity. Furthermore, its fragment-based approach ensures the generation of more dependable binding conformations, exhibiting a success rate 43.6% higher than that of other SOTAs. This advantage becomes even more pronounced when handling targets that significantly deviate from the training dataset. 3D-MCTS is capable of achieving thirty times more hits with high binding affinity than traditional virtual screening methods, which demonstrates the superior ability of 3D-MCTS to explore chemical space. 
Moreover, the flexibility of our framework makes it easy to incorporate domain knowledge during the process, thereby enabling the generation of molecules with desirable pharmacophores and enhanced binding affinity. The adaptability of 3D-MCTS is further showcased in metalloprotein applications, highlighting its potential across various drug design scenarios.

11.

Small-Molecule Conformer Generators: Evaluation of Traditional Methods and AI Models on High-Quality Data Sets.

Wang, Zhe; Zhong, Haiyang; Zhang, Jintu; Pan, Peichen; Wang, Dong; Liu, Huanxiang; Yao, Xiaojun; Hou, Tingjun; Kang, Yu.

J Chem Inf Model ; 63(21): 6525-6536, 2023 Nov 13.

Article in English | MEDLINE | ID: mdl-37883143

ABSTRACT

Small-molecule conformer generation (SMCG) is an extremely important task in both ligand- and structure-based computer-aided drug design, especially during the hit discovery phase. Recently, a multitude of artificial intelligence (AI) models tailored for SMCG have emerged. Despite developers typically furnishing performance evaluation data upon releasing their AI models, a comprehensive and equitable performance comparison between AI models and conventional methods is still lacking. In this study, we curated a new benchmarking data set comprising 3354 high-quality ligand bioactive conformations. Subsequently, we conducted a systematic assessment of the performance of four widely adopted traditional methods (i.e., ConfGenX, Conformator, OMEGA, and RDKit ETKDG) and five AI models (i.e., ConfGF, DMCG, GeoDiff, GeoMol, and torsional diffusion) in the tasks of reproducing bioactive and low-energy conformations of small molecules. In the former task, the AI models have no advantage, particularly with a maximum ensemble size of 1. Even the best-performing AI model GeoMol is still worse than any of the tested traditional methods. Conversely, in the latter task, the torsional diffusion model shows obvious advantages, surpassing the best-performing traditional method ConfGenX by 26.09 and 12.97% on the COV-R and COV-P metrics, respectively. Furthermore, the influence of force field-based fine-tuning on the quality of the generated conformers was also discussed. Finally, a user-friendly Web server called fastSMCG was developed to enable researchers to rapidly and flexibly generate small-molecule conformers using both traditional and AI methods. We anticipate that our work will offer valuable practical assistance to the scientific community in this field.
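The COV-R and COV-P metrics mentioned in this abstract are coverage scores over pairwise RMSD values. The sketch below uses the definition commonly adopted in the conformer-generation literature, which is assumed here because the abstract does not spell it out. COV-R is the fraction of reference conformers matched by at least one generated conformer within an RMSD threshold, and COV-P is the fraction of generated conformers that match at least one reference conformer. The RMSD matrix and threshold are hypothetical.

```python
def coverage(rmsd, threshold):
    """Return (COV-R, COV-P) for a matrix with rmsd[i][j] = RMSD between
    reference conformer i and generated conformer j, in angstroms."""
    n_ref = len(rmsd)
    n_gen = len(rmsd[0])
    # COV-R (recall): references covered by at least one generated conformer.
    covered_refs = sum(min(row) < threshold for row in rmsd)
    # COV-P (precision): generated conformers close to at least one reference.
    matched_gens = sum(
        min(rmsd[i][j] for i in range(n_ref)) < threshold
        for j in range(n_gen)
    )
    return covered_refs / n_ref, matched_gens / n_gen


# Hypothetical RMSD values: 2 reference conformers x 3 generated conformers.
rmsd_matrix = [
    [0.3, 1.5, 2.0],
    [1.8, 1.6, 2.2],
]
cov_r, cov_p = coverage(rmsd_matrix, threshold=1.0)
print(f"COV-R: {cov_r:.2f}, COV-P: {cov_p:.2f}")
```

In practice the RMSD matrix would come from aligning each generated conformer to each reference conformer with a cheminformatics toolkit; that step is left out here to keep the sketch dependency-free.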

Subjects

Artificial Intelligence; Drug Design; Models, Molecular; Ligands; Molecular Conformation

12.

COVID-19 impact on ED boarding likely related to increased timeframe for patient disposition to admission, discharge, and transfer.

Toy, Stanley; Chiu, Wen-Ta; Chon, John; Lin, Wan-Yi; Aflakian, Kaveh; Pan, Pei-Chen; Jiang, Ting-Yun; Yeh, Chia-Hsing; Wu, Su-Yen; Wu, Jonathan.

Am J Emerg Med ; 73: 212-216, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37730453

13.

ML-PLIC: a web platform for characterizing protein-ligand interactions and developing machine learning-based scoring functions.

Zhang, Xujun; Shen, Chao; Wang, Tianyue; Deng, Yafeng; Kang, Yu; Li, Dan; Hou, Tingjun; Pan, Peichen.

Brief Bioinform ; 24(5), 2023 Sep 20.

Article in English | MEDLINE | ID: mdl-37738401

ABSTRACT

Cracking the entangling code of protein-ligand interaction (PLI) is of great importance to structure-based drug design and discovery. Different physical and biochemical representations can be used to describe PLI such as energy terms and interaction fingerprints, which can be analyzed by machine learning (ML) algorithms to create ML-based scoring functions (MLSFs). Here, we propose the ML-based PLI capturer (ML-PLIC), a web platform that automatically characterizes PLI and generates MLSFs to identify the potential binders of a specific protein target through virtual screening (VS). ML-PLIC comprises five modules, including Docking for ligand docking, Descriptors for PLI generation, Modeling for MLSF training, Screening for VS and Pipeline for the integration of the aforementioned functions. We validated the MLSFs constructed by ML-PLIC in three benchmark datasets (Directory of Useful Decoys-Enhanced, Active as Decoys and TocoDecoy), demonstrating accuracy outperforming traditional docking tools and competitive performance to the deep learning-based SF, and provided a case study of the Serine/threonine-protein kinase WEE1 in which MLSFs were developed by using the ML-based VS pipeline in ML-PLIC. Underpinning the latest version of ML-PLIC is a powerful platform that incorporates physical and biological knowledge about PLI, leveraging PLI characterization and MLSF generation into the design of structure-based VS pipeline. The ML-PLIC web platform is now freely available at http://cadd.zju.edu.cn/plic/.

Subjects

Algorithms; Benchmarking; Ligands; Drug Design; Machine Learning

14.

A generalized protein-ligand scoring framework with balanced scoring, docking, ranking and screening powers.

Shen, Chao; Zhang, Xujun; Hsieh, Chang-Yu; Deng, Yafeng; Wang, Dong; Xu, Lei; Wu, Jian; Li, Dan; Kang, Yu; Hou, Tingjun; Pan, Peichen.

Chem Sci ; 14(30): 8129-8146, 2023 Aug 02.

Article in English | MEDLINE | ID: mdl-37538816

ABSTRACT

Applying machine learning algorithms to protein-ligand scoring functions has aroused widespread attention in recent years due to the high predictive accuracy and affordable computational cost. Nevertheless, most machine learning-based scoring functions are only applicable to a specific task, e.g., binding affinity prediction, binding pose prediction or virtual screening, suggesting that the development of a scoring function with balanced performance in all critical tasks remains a grand challenge. To this end, we propose a novel parameterization strategy by introducing an adjustable binding affinity term that represents the correlation between the predicted outcomes and experimental data into the training of mixture density network. The resulting residue-atom distance likelihood potential not only retains the superior docking and screening power over all the other state-of-the-art approaches, but also achieves a remarkable improvement in scoring and ranking performance. We emphatically explore the impacts of several key elements on prediction accuracy as well as the task preference, and demonstrate that the performance of scoring/ranking and docking/screening tasks of a certain model could be well balanced through an appropriate manner. Overall, our study highlights the potential utility of our innovative parameterization strategy as well as the resulting scoring framework in future structure-based drug design.

15.

FFLOM: A Flow-Based Autoregressive Model for Fragment-to-Lead Optimization.

Jin, Jieyu; Wang, Dong; Shi, Guqin; Bao, Jingxiao; Wang, Jike; Zhang, Haotian; Pan, Peichen; Li, Dan; Yao, Xiaojun; Liu, Huanxiang; Hou, Tingjun; Kang, Yu.

J Med Chem ; 66(15): 10808-10823, 2023 Aug 10.

Article in English | MEDLINE | ID: mdl-37471134

ABSTRACT

Recently, deep generative models have been regarded as promising tools in fragment-based drug design (FBDD). Despite the growing interest in these models, they still face challenges in generating molecules with desired properties in low data regimes. In this study, we propose a novel flow-based autoregressive model named FFLOM for linker and R-group design. In a large-scale benchmark evaluation on ZINC, CASF, and PDBbind test sets, FFLOM achieves state-of-the-art performance in terms of validity, uniqueness, novelty, and recovery of the generated molecules and can recover over 92% of the original molecules in the PDBbind test set (with at least five atoms). FFLOM also exhibits excellent potential applicability in several practical scenarios encompassing fragment linking, PROTAC design, R-group growing, and R-group optimization. In all four cases, FFLOM can perfectly reconstruct the ground-truth compounds and generate over 74% of molecules with novel fragments, some of which have higher binding affinity than the ground truth.

Subjects

Drug Design; Ligands; Thiazoles/chemistry

16.

TB-IECS: an accurate machine learning-based scoring function for virtual screening.

Zhang, Xujun; Shen, Chao; Jiang, Dejun; Zhang, Jintu; Ye, Qing; Xu, Lei; Hou, Tingjun; Pan, Peichen; Kang, Yu.

J Cheminform ; 15(1): 63, 2023 Jul 04.

Article in English | MEDLINE | ID: mdl-37403155

ABSTRACT

Machine learning-based scoring functions (MLSFs) have shown potential for improving virtual screening capabilities over classical scoring functions (SFs). Due to the high computational cost in the process of feature generation, the numbers of descriptors used in MLSFs and the characterization of protein-ligand interactions are always limited, which may affect the overall accuracy and efficiency. Here, we propose a new SF called TB-IECS (theory-based interaction energy component score), which combines energy terms from Smina and NNScore version 2, and utilizes the eXtreme Gradient Boosting (XGBoost) algorithm for model training. In this study, the energy terms decomposed from 15 traditional SFs were firstly categorized based on their formulas and physicochemical principles, and 324 feature combinations were generated accordingly. Five best feature combinations were selected for further evaluation of the model performance in regard to the selection of feature vectors with various length, interaction types and ML algorithms. The virtual screening power of TB-IECS was assessed on the datasets of DUD-E and LIT-PCBA, as well as seven target-specific datasets from the ChemDiv database. The results showed that TB-IECS outperformed classical SFs including Glide SP and Dock, and effectively balanced the efficiency and accuracy for practical virtual screening.

17.

Topology-Based and Conformation-Based Decoys Database: An Unbiased Online Database for Training and Benchmarking Machine-Learning Scoring Functions.

Zhang, Xujun; Shen, Chao; Wang, Tianyue; Kang, Yu; Li, Dan; Pan, Peichen; Wang, Jike; Wang, Gaoang; Deng, Yafeng; Xu, Lei; Cao, Dongsheng; Hou, Tingjun; Wang, Zhe.

J Med Chem ; 66(13): 9174-9183, 2023 Jul 13.

Article in English | MEDLINE | ID: mdl-37317043

ABSTRACT

Machine-learning-based scoring functions (MLSFs) have gained attention for their potential to improve accuracy in binding affinity prediction and structure-based virtual screening (SBVS) compared to classical SFs. Developing accurate MLSFs for SBVS requires a large and unbiased dataset that includes structurally diverse actives and decoys. Unfortunately, most datasets suffer from hidden biases and data insufficiency. Here, we developed topology-based and conformation-based decoys database (ToCoDDB). The biological targets and active ligands in ToCoDDB were collected from scientific literature and established datasets. The decoys were generated and debiased by using conditional recurrent neural networks and molecular docking. ToCoDDB is presently the largest unbiased database with 2.4 million decoys encompassing 155 targets. The detailed information and performance benchmark for each target are provided, which are beneficial for training and evaluating MLSFs. Moreover, the online decoys generation function of ToCoDDB further expands its application range to any target. ToCoDDB is freely available at http://cadd.zju.edu.cn/tocodecoy/.

Subjects

Benchmarking; Machine Learning; Molecular Docking Simulation; Molecular Conformation; Databases, Factual; Ligands; Protein Binding

18.

Chemistry-intuitive explanation of graph neural networks for molecular property prediction with substructure masking.

Wu, Zhenxing; Wang, Jike; Du, Hongyan; Jiang, Dejun; Kang, Yu; Li, Dan; Pan, Peichen; Deng, Yafeng; Cao, Dongsheng; Hsieh, Chang-Yu; Hou, Tingjun.

Nat Commun ; 14(1): 2585, 2023 May 04.

Article in English | MEDLINE | ID: mdl-37142585

ABSTRACT

Graph neural networks (GNNs) have been widely used in molecular property prediction, but explaining their black-box predictions is still a challenge. Most existing explanation methods for GNNs in chemistry focus on attributing model predictions to individual nodes, edges or fragments that are not necessarily derived from a chemically meaningful segmentation of molecules. To address this challenge, we propose a method named substructure mask explanation (SME). SME is based on well-established molecular segmentation methods and provides an interpretation that aligns with the understanding of chemists. We apply SME to elucidate how GNNs learn to predict aqueous solubility, genotoxicity, cardiotoxicity and blood-brain barrier permeation for small molecules. SME provides interpretation that is consistent with the understanding of chemists, alerts them to unreliable performance, and guides them in structural optimization for target properties. Hence, we believe that SME empowers chemists to confidently mine structure-activity relationship (SAR) from reliable GNNs through a transparent inspection on how GNNs pick up useful signals when learning from data.

Subjects

Blood-Brain Barrier; Cardiotoxicity; Humans; DNA Damage; Neural Networks, Computer; Records

19.

Learning on topological surface and geometric structure for 3D molecular generation.

Zhang, Odin; Wang, Tianyue; Weng, Gaoqi; Jiang, Dejun; Wang, Ning; Wang, Xiaorui; Zhao, Huifeng; Wu, Jialu; Wang, Ercheng; Chen, Guangyong; Deng, Yafeng; Pan, Peichen; Kang, Yu; Hsieh, Chang-Yu; Hou, Tingjun.

Nat Comput Sci ; 3(10): 849-859, 2023 Oct.

Article in English | MEDLINE | ID: mdl-38177756

ABSTRACT

Highly effective de novo design is a grand challenge of computer-aided drug discovery. Practical structure-specific three-dimensional molecule generation methods have started to emerge in recent years, but most approaches treat the target structure as a conditional input to bias the molecule generation and do not fully learn the detailed atomic interactions that govern the molecular conformation and stability of the binding complexes. Omitting these fine details leaves many models struggling to output reasonable molecules for a variety of therapeutic targets. Here, to address this challenge, we formulate a model, called SurfGen, that designs molecules in a fashion closely resembling the figurative key-and-lock principle. SurfGen comprises two equivariant neural networks, Geodesic-GNN and Geoatom-GNN, which capture the topological interactions on the pocket surface and the spatial interaction between ligand atoms and surface nodes, respectively. SurfGen outperforms other methods in a number of benchmarks, and its high sensitivity to the pocket structure enables an effective generative-model-based solution to the thorny issue of mutation-induced drug resistance.

Subjects

Drug Discovery; Neural Networks, Computer; Drug Discovery/methods; Molecular Conformation

20.

Efficient and accurate large library ligand docking with KarmaDock.

Zhang, Xujun; Zhang, Odin; Shen, Chao; Qu, Wanglin; Chen, Shicheng; Cao, Hanqun; Kang, Yu; Wang, Zhe; Wang, Ercheng; Zhang, Jintu; Deng, Yafeng; Liu, Furui; Wang, Tianyue; Du, Hongyan; Wang, Langcheng; Pan, Peichen; Chen, Guangyong; Hsieh, Chang-Yu; Hou, Tingjun.

Nat Comput Sci ; 3(9): 789-804, 2023 Sep.

Article in English | MEDLINE | ID: mdl-38177786

ABSTRACT

Ligand docking is one of the core technologies in structure-based virtual screening for drug discovery. However, conventional docking tools and existing deep learning tools may suffer from limited performance in terms of speed, pose quality and binding affinity accuracy. Here we propose KarmaDock, a deep learning approach for ligand docking that integrates the functions of docking acceleration, binding pose generation and correction, and binding strength estimation. The three-stage model consists of the following components: (1) encoders for the protein and ligand to learn the representations of intramolecular interactions; (2) E(n) equivariant graph neural networks with self-attention to update the ligand pose based on both protein-ligand and intramolecular interactions, followed by post-processing to ensure chemically plausible structures; (3) a mixture density network for scoring the binding strength. KarmaDock was validated on four benchmark datasets and tested in a real-world virtual screening project that successfully identified experiment-validated active inhibitors of leukocyte tyrosine kinase (LTK).

Subjects

Neural Networks, Computer; Proteins; Protein Binding; Ligands; Molecular Docking Simulation; Proteins/chemistry
