Search | VHL Regional Portal

1.

Highly Accurate and Efficient Deep Learning Paradigm for Full-Atom Protein Loop Modeling with KarmaLoop.

Wang, Tianyue; Zhang, Xujun; Zhang, Odin; Chen, Guangyong; Pan, Peichen; Wang, Ercheng; Wang, Jike; Wu, Jialu; Zhou, Donghao; Wang, Langcheng; Jin, Ruofan; Chen, Shicheng; Shen, Chao; Kang, Yu; Hsieh, Chang-Yu; Hou, Tingjun.

Research (Wash D C) ; 7: 0408, 2024.

Article in English | MEDLINE | ID: mdl-39055686

ABSTRACT

Protein loop modeling is a challenging yet highly nontrivial task in protein structure prediction. Despite recent progress, existing methods including knowledge-based, ab initio, hybrid, and deep learning (DL) methods fall substantially short of either atomic accuracy or computational efficiency. To overcome these limitations, we present KarmaLoop, a novel paradigm that distinguishes itself as the first DL method centered on full-atom (encompassing both backbone and side-chain heavy atoms) protein loop modeling. Our results demonstrate that KarmaLoop considerably outperforms conventional and DL-based methods of loop modeling in terms of both accuracy and efficiency, with the average RMSDs of 1.77 and 1.95 Å for the CASP13+14 and CASP15 benchmark datasets, respectively, and manifests at least 2 orders of magnitude speedup in general compared with other methods. Consequently, our comprehensive evaluations indicate that KarmaLoop provides a state-of-the-art DL solution for protein loop modeling, with the potential to hasten the advancement of protein engineering, antibody-antigen recognition, and drug design.
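The accuracy figures above are reported as root-mean-square deviations (RMSD) in Å. The following is a minimal sketch of the metric itself, assuming the predicted and reference loop atoms are already matched one-to-one and expressed in the same coordinate frame. The function name `rmsd` and the input layout are illustrative assumptions; the abstract does not specify the superposition protocol or atom selection used for the benchmarks.

```python
import math


def rmsd(predicted, reference):
    """Return the RMSD between two matched lists of (x, y, z) coordinates.

    Assumes both structures are already in the same frame; no
    superposition or alignment is performed here.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared per-atom distances over all matched atom pairs.
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(predicted, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    # Two toy atoms; the second is displaced by 2 Å along z.
    pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    print(f"RMSD = {rmsd(pred, ref):.3f} Å")
```

Averaging this value over every loop in a benchmark set gives a dataset-level figure of the kind quoted in the abstract.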

2.

DrugFlow: An AI-Driven One-Stop Platform for Innovative Drug Discovery.

Shen, Chao; Song, Jianfei; Hsieh, Chang-Yu; Cao, Dongsheng; Kang, Yu; Ye, Wenling; Wu, Zhenxing; Wang, Jike; Zhang, Odin; Zhang, Xujun; Zeng, Hao; Cai, Heng; Chen, Yu; Chen, Linkang; Luo, Hao; Zhao, Xinda; Jian, Tianye; Chen, Tong; Jiang, Dejun; Wang, Mingyang; Ye, Qing; Wu, Jialu; Du, Hongyan; Shi, Hui; Deng, Yafeng; Hou, Tingjun.

J Chem Inf Model ; 64(14): 5381-5391, 2024 Jul 22.

Article in English | MEDLINE | ID: mdl-38920405

ABSTRACT

Artificial intelligence (AI)-aided drug design has demonstrated unprecedented effects on modern drug discovery, but there is still an urgent need for user-friendly interfaces that bridge the gap between these sophisticated tools and scientists, particularly those who are less computer savvy. Herein, we present DrugFlow, an AI-driven one-stop platform that offers a clean, convenient, and cloud-based interface to streamline early drug discovery workflows. By seamlessly integrating a range of innovative AI algorithms, covering molecular docking, quantitative structure-activity relationship modeling, molecular generation, ADMET (absorption, distribution, metabolism, excretion and toxicity) prediction, and virtual screening, DrugFlow can offer effective AI solutions for almost all crucial stages in early drug discovery, including hit identification and hit/lead optimization. We hope that the platform can provide sufficiently valuable guidance to aid real-world drug design and discovery. The platform is available at https://drugflow.com.

Subject(s)

Artificial Intelligence; Drug Discovery; Drug Discovery/methods; Molecular Docking Simulation; Quantitative Structure-Activity Relationship; Algorithms; Drug Design; Software; Humans; Cloud Computing

3.

Combining Transition Path Sampling with Data-Driven Collective Variables through a Reactivity-Biased Shooting Algorithm.

Zhang, Jintu; Zhang, Odin; Bonati, Luigi; Hou, TingJun.

J Chem Theory Comput ; 20(11): 4523-4532, 2024 Jun 11.

Article in English | MEDLINE | ID: mdl-38801759

ABSTRACT

Rare event sampling is a central problem in modern computational chemistry research. Among the existing methods, transition path sampling (TPS) can generate unbiased representations of reaction processes. However, its efficiency depends on the ability to generate reactive trial paths, which in turn depends on the quality of the shooting algorithm used. We propose a new algorithm based on the shooting success rate, i.e., reactivity, measured as a function of a reduced set of collective variables (CVs). These variables are extracted with a machine learning approach directly from TPS simulations, using a multitask objective function. Iteratively, this workflow significantly improves the shooting efficiency without any prior knowledge of the process. In addition, the optimized CVs can be used with biased enhanced sampling methodologies to accurately reconstruct the free energy profiles. We tested the method on three different systems: a two-dimensional toy model, conformational transitions of alanine dipeptide, and hydrolysis of acetyl chloride in bulk water. In the latter, we integrated our workflow with an active learning scheme to learn a reactive machine learning-based potential, which allowed us to study the mechanism and free energy profile with an ab initio-like accuracy.

4.

A flexible data-free framework for structure-based de novo drug design with reinforcement learning.

Du, Hongyan; Jiang, Dejun; Zhang, Odin; Wu, Zhenxing; Gao, Junbo; Zhang, Xujun; Wang, Xiaorui; Deng, Yafeng; Kang, Yu; Li, Dan; Pan, Peichen; Hsieh, Chang-Yu; Hou, Tingjun.

Chem Sci ; 14(43): 12166-12181, 2023 Nov 08.

Article in English | MEDLINE | ID: mdl-37969589

ABSTRACT

Contemporary structure-based molecular generative methods have demonstrated their potential to model the geometric and energetic complementarity between ligands and receptors, thereby facilitating the design of molecules with favorable binding affinity and target specificity. Despite the introduction of deep generative models for molecular generation, the atom-wise generation paradigm that partially contradicts chemical intuition limits the validity and synthetic accessibility of the generated molecules. Additionally, the dependence of deep learning models on large-scale structural data has hindered their adaptability across different targets. To overcome these challenges, we present a novel search-based framework, 3D-MCTS, for structure-based de novo drug design. Distinct from prevailing atom-centric methods, 3D-MCTS employs a fragment-based molecular editing strategy. The fragments decomposed from small-molecule drugs are recombined under predefined retrosynthetic rules, offering improved drug-likeness and synthesizability, overcoming the inherent limitations of atom-based approaches. Leveraging multi-threaded parallel simulations combined with a real-time energy constraint-based pruning strategy, 3D-MCTS achieves remarkable efficiency. At a fixed computational cost, it outperforms other state-of-the-art (SOTA) methods by producing molecules with enhanced binding affinity. Furthermore, its fragment-based approach ensures the generation of more dependable binding conformations, exhibiting a success rate 43.6% higher than that of other SOTAs. This advantage becomes even more pronounced when handling targets that significantly deviate from the training dataset. 3D-MCTS is capable of achieving thirty times more hits with high binding affinity than traditional virtual screening methods, which demonstrates the superior ability of 3D-MCTS to explore chemical space. Moreover, the flexibility of our framework makes it easy to incorporate domain knowledge during the process, thereby enabling the generation of molecules with desirable pharmacophores and enhanced binding affinity. The adaptability of 3D-MCTS is further showcased in metalloprotein applications, highlighting its potential across various drug design scenarios.

5.

Learning on topological surface and geometric structure for 3D molecular generation.

Zhang, Odin; Wang, Tianyue; Weng, Gaoqi; Jiang, Dejun; Wang, Ning; Wang, Xiaorui; Zhao, Huifeng; Wu, Jialu; Wang, Ercheng; Chen, Guangyong; Deng, Yafeng; Pan, Peichen; Kang, Yu; Hsieh, Chang-Yu; Hou, Tingjun.

Nat Comput Sci ; 3(10): 849-859, 2023 Oct.

Article in English | MEDLINE | ID: mdl-38177756

ABSTRACT

Highly effective de novo design is a grand challenge of computer-aided drug discovery. Practical structure-specific three-dimensional molecule generations have started to emerge in recent years, but most approaches treat the target structure as a conditional input to bias the molecule generation and do not fully learn the detailed atomic interactions that govern the molecular conformation and stability of the binding complexes. The omission of these fine details leads to many models having difficulty in outputting reasonable molecules for a variety of therapeutic targets. Here, to address this challenge, we formulate a model, called SurfGen, that designs molecules in a fashion closely resembling the figurative key-and-lock principle. SurfGen comprises two equivariant neural networks, Geodesic-GNN and Geoatom-GNN, which capture the topological interactions on the pocket surface and the spatial interaction between ligand atoms and surface nodes, respectively. SurfGen outperforms other methods in a number of benchmarks, and its high sensitivity on the pocket structures enables an effective generative-model-based solution to the thorny issue of mutation-induced drug resistance.

Subject(s)

Drug Discovery; Neural Networks, Computer; Drug Discovery/methods; Molecular Conformation

6.

Efficient and accurate large library ligand docking with KarmaDock.

Zhang, Xujun; Zhang, Odin; Shen, Chao; Qu, Wanglin; Chen, Shicheng; Cao, Hanqun; Kang, Yu; Wang, Zhe; Wang, Ercheng; Zhang, Jintu; Deng, Yafeng; Liu, Furui; Wang, Tianyue; Du, Hongyan; Wang, Langcheng; Pan, Peichen; Chen, Guangyong; Hsieh, Chang-Yu; Hou, Tingjun.

Nat Comput Sci ; 3(9): 789-804, 2023 Sep.

Article in English | MEDLINE | ID: mdl-38177786

ABSTRACT

Ligand docking is one of the core technologies in structure-based virtual screening for drug discovery. However, conventional docking tools and existing deep learning tools may suffer from limited performance in terms of speed, pose quality and binding affinity accuracy. Here we propose KarmaDock, a deep learning approach for ligand docking that integrates the functions of docking acceleration, binding pose generation and correction, and binding strength estimation. The three-stage model consists of the following components: (1) encoders for the protein and ligand to learn the representations of intramolecular interactions; (2) E(n) equivariant graph neural networks with self-attention to update the ligand pose based on both protein-ligand and intramolecular interactions, followed by post-processing to ensure chemically plausible structures; (3) a mixture density network for scoring the binding strength. KarmaDock was validated on four benchmark datasets and tested in a real-world virtual screening project that successfully identified experiment-validated active inhibitors of leukocyte tyrosine kinase (LTK).

Subject(s)

Neural Networks, Computer; Proteins; Protein Binding; Ligands; Molecular Docking Simulation; Proteins/chemistry

7.

Comprehensive assessment of protein loop modeling programs on large-scale datasets: prediction accuracy and efficiency.

Wang, Tianyue; Wang, Langcheng; Zhang, Xujun; Shen, Chao; Zhang, Odin; Wang, Jike; Wu, Jialu; Jin, Ruofan; Zhou, Donghao; Chen, Shicheng; Liu, Liwei; Wang, Xiaorui; Hsieh, Chang-Yu; Chen, Guangyong; Pan, Peichen; Kang, Yu; Hou, Tingjun.

Brief Bioinform ; 25(1), 2023 Nov 22.

Article in English | MEDLINE | ID: mdl-38171930

ABSTRACT

Protein loops play a critical role in the dynamics of proteins and are essential for numerous biological functions, and various computational approaches to loop modeling have been proposed over the past decades. However, a comprehensive understanding of the strengths and weaknesses of each method is lacking. In this work, we constructed two high-quality datasets (i.e. the General dataset and the CASP dataset) and systematically evaluated the accuracy and efficiency of 13 commonly used loop modeling approaches from the perspective of loop lengths, protein classes and residue types. The results indicate that the knowledge-based method FREAD generally outperforms the other tested programs in most cases, but encountered challenges when predicting loops longer than 15 and 30 residues on the CASP and General datasets, respectively. The ab initio method Rosetta NGK demonstrated exceptional modeling accuracy for short loops with four to eight residues and achieved the highest success rate on the CASP dataset. The well-known AlphaFold2 and RoseTTAFold require more resources for better performance, but they exhibit promise for predicting loops longer than 16 and 30 residues in the CASP and General datasets. These observations can provide valuable insights for selecting suitable methods for specific loop modeling tasks and contribute to future advancements in the field.

Subject(s)

Proteins; Protein Conformation; Proteins/chemistry
