Search | BVS Ecuador Search Portal

1.

Structural basis of GABA_B receptor-G_i protein coupling.

Shen, Cangsong; Mao, Chunyou; Xu, Chanjuan; Jin, Nan; Zhang, Huibing; Shen, Dan-Dan; Shen, Qingya; Wang, Xiaomei; Hou, Tingjun; Chen, Zhong; Rondard, Philippe; Pin, Jean-Philippe; Zhang, Yan; Liu, Jianfeng.

Nature ; 594(7864): 594-598, 2021 06.

Article in English | MEDLINE | ID: mdl-33911284

ABSTRACT

G-protein-coupled receptors (GPCRs) have central roles in intercellular communication. Structural studies have revealed how GPCRs can activate G proteins. However, whether this mechanism is conserved among all classes of GPCR remains unknown. Here we report the structure of the class-C heterodimeric GABAB receptor, which is activated by the inhibitory transmitter GABA, in its active form complexed with Gi1 protein. We found that a single G protein interacts with the GB2 subunit of the GABAB receptor at a site that mainly involves intracellular loop 2 on the side of the transmembrane domain. This is in contrast to the G protein binding in a central cavity, as has been observed with other classes of GPCR. This binding mode results from the active form of the transmembrane domain of this GABAB receptor being different from that of other GPCRs, as it shows no outward movement of transmembrane helix 6. Our work also provides details of the inter- and intra-subunit changes that link agonist binding to G-protein activation in this heterodimeric complex.

Subject(s)

GTP-Binding Proteins/chemistry; Receptors, GABA-B/chemistry; Cryoelectron Microscopy; Humans; Protein Binding; Protein Domains; Protein Multimerization; Protein Structure, Tertiary

2.

Exploring the activation mechanism of metabotropic glutamate receptor 2.

Zhu, Xiaohong; Luo, Mengqi; An, Ke; Shi, Danfeng; Hou, Tingjun; Warshel, Arieh; Bai, Chen.

Proc Natl Acad Sci U S A ; 121(21): e2401079121, 2024 May 21.

Article in English | MEDLINE | ID: mdl-38739800

ABSTRACT

Homomeric dimerization of metabotropic glutamate receptors (mGlus) is essential for the modulation of their functions and represents a promising avenue for the development of novel therapeutic approaches to address central nervous system diseases. Yet, the scarcity of detailed molecular and energetic data on mGlu2 impedes an in-depth understanding of its activation process. Here, we employ computational simulation methods to elucidate the activation process and key events associated with mGlu2, including a detailed analysis of its conformational transitions, the binding of agonists, Gi protein coupling, and the guanosine diphosphate (GDP) release. Our results demonstrate that the activation of mGlu2 is a stepwise process in which several energy barriers need to be overcome. Moreover, we identify the rate-determining step of mGlu2's transition from the agonist-bound state to its active state. From the perspective of free-energy analysis, we find that the conformational dynamics of mGlu2's subunits follow coupled rather than discrete, independent actions. Asymmetric dimerization is critical for receptor activation. Our calculation results are consistent with the observations of cross-linking and fluorescent-labeled blot experiments, thus illustrating the reliability of our calculations. In addition, we identify potential key residues in the Gi protein binding position on mGlu2, the mGlu2 dimer's TM6-TM6 interface, and the Gi α5 helix by the change of energy barriers after mutation. The implications of our findings could lead to a more comprehensive grasp of class C G protein-coupled receptor activation.

Subject(s)

Receptors, Metabotropic Glutamate; Receptors, Metabotropic Glutamate/metabolism; Receptors, Metabotropic Glutamate/chemistry; Humans; Protein Multimerization; Molecular Dynamics Simulation; Protein Conformation; Protein Binding

3.

Multiscale topology in interactomic network: from transcriptome to antiaddiction drug repurposing.

Du, Hongyan; Wei, Guo-Wei; Hou, Tingjun.

Brief Bioinform ; 25(2)2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38499497

ABSTRACT

The escalating drug addiction crisis in the United States underscores the urgent need for innovative therapeutic strategies. This study embarked on an innovative and rigorous strategy to unearth potential drug repurposing candidates for opioid and cocaine addiction treatment, bridging the gap between transcriptomic data analysis and drug discovery. We initiated our approach by conducting differential gene expression analysis on addiction-related transcriptomic data to identify key genes. We propose a novel topological differentiation to identify key genes from a protein-protein interaction network derived from DEGs. This method utilizes persistent Laplacians to accurately single out pivotal nodes within the network, conducting this analysis in a multiscale manner to ensure high reliability. Through rigorous literature validation, pathway analysis and data-availability scrutiny, we identified three pivotal molecular targets, mTOR, mGluR5 and NMDAR, for drug repurposing from DrugBank. We crafted machine learning models employing two natural language processing (NLP)-based embeddings and a traditional 2D fingerprint, which demonstrated robust predictive ability in gauging binding affinities of DrugBank compounds to selected targets. Furthermore, we elucidated the interactions of promising drugs with the targets and evaluated their drug-likeness. This study delineates a multi-faceted and comprehensive analytical framework, amalgamating bioinformatics, topological data analysis and machine learning, for drug repurposing in addiction treatment, setting the stage for subsequent experimental validation. The versatility of the methods we developed allows for applications across a range of diseases and transcriptomic datasets.

Subject(s)

Drug Repositioning; Transcriptome; United States; Drug Repositioning/methods; Reproducibility of Results; Gene Expression Profiling; Computational Biology/methods

4.

AttABseq: an attention-based deep learning prediction method for antigen-antibody binding affinity changes based on protein sequences.

Jin, Ruofan; Ye, Qing; Wang, Jike; Cao, Zheng; Jiang, Dejun; Wang, Tianyue; Kang, Yu; Xu, Wanting; Hsieh, Chang-Yu; Hou, Tingjun.

Brief Bioinform ; 25(4)2024 May 23.

Article in English | MEDLINE | ID: mdl-38960407

ABSTRACT

The optimization of therapeutic antibodies through traditional techniques, such as candidate screening via hybridoma or phage display, is resource-intensive and time-consuming. In recent years, computational and artificial intelligence-based methods have been actively developed to accelerate and improve the development of therapeutic antibodies. In this study, we developed an end-to-end sequence-based deep learning model, termed AttABseq, for the predictions of the antigen-antibody binding affinity changes connected with antibody mutations. AttABseq is a highly efficient and generic attention-based model by utilizing diverse antigen-antibody complex sequences as the input to predict the binding affinity changes of residue mutations. The assessment on the three benchmark datasets illustrates that AttABseq is 120% more accurate than other sequence-based models in terms of the Pearson correlation coefficient between the predicted and experimental binding affinity changes. Moreover, AttABseq also either outperforms or competes favorably with the structure-based approaches. Furthermore, AttABseq consistently demonstrates robust predictive capabilities across a diverse array of conditions, underscoring its remarkable capacity for generalization across a wide spectrum of antigen-antibody complexes. It imposes no constraints on the quantity of altered residues, rendering it particularly applicable in scenarios where crystallographic structures remain unavailable. The attention-based interpretability analysis indicates that the causal effects of point mutations on antibody-antigen binding affinity changes can be visualized at the residue level, which might assist automated antibody sequence optimization. We believe that AttABseq provides a fiercely competitive answer to therapeutic antibody optimization.
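The evaluation metric named in this abstract, the Pearson correlation coefficient between predicted and experimental binding affinity changes, can be sketched as follows. This is a minimal illustration, not the AttABseq evaluation code; the function name `pearson_r` and the sample values are assumptions made for the example.

```python
import math


def pearson_r(predicted, experimental):
    """Pearson correlation between two equal-length sequences of numbers.

    Hypothetical helper: the inputs stand in for predicted and measured
    binding affinity changes of antibody mutations.
    """
    if len(predicted) != len(experimental) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_e)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    print(pearson_r([0.5, 1.2, -0.3, 2.0], [0.4, 1.0, 0.1, 1.7]))
```

A value near 1 means the predicted ranking and magnitude of affinity changes track the measurements closely; a value near 0 means no linear relationship.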

Subject(s)

Antigen-Antibody Complex; Deep Learning; Antigen-Antibody Complex/chemistry; Antigens/chemistry; Antigens/genetics; Antigens/metabolism; Antigens/immunology; Antibody Affinity; Amino Acid Sequence; Computational Biology/methods; Humans; Mutation; Antibodies/chemistry; Antibodies/immunology; Antibodies/genetics; Antibodies/metabolism

5.

ChemFH: an integrated tool for screening frequent false positives in chemical biology and drug discovery.

Shi, Shaohua; Fu, Li; Yi, Jiacai; Yang, Ziyi; Zhang, Xiaochen; Deng, Youchao; Wang, Wenxuan; Wu, Chengkun; Zhao, Wentao; Hou, Tingjun; Zeng, Xiangxiang; Lyu, Aiping; Cao, Dongsheng.

Nucleic Acids Res ; 52(W1): W439-W449, 2024 Jul 05.

Article in English | MEDLINE | ID: mdl-38783035

ABSTRACT

High-throughput screening rapidly tests an extensive array of chemical compounds to identify hit compounds for specific biological targets in drug discovery. However, false-positive results disrupt hit compound screening, leading to wastage of time and resources. To address this, we propose ChemFH, an integrated online platform facilitating rapid virtual evaluation of potential false positives, including colloidal aggregators, spectroscopic interference compounds, firefly luciferase inhibitors, chemical reactive compounds, promiscuous compounds, and other assay interferences. By leveraging a dataset containing 823 391 compounds, we constructed high-quality prediction models using multi-task directed message-passing network (DMPNN) architectures combining uncertainty estimation, yielding an average AUC value of 0.91. Furthermore, ChemFH incorporated 1441 representative alert substructures derived from the collected data and ten commonly used frequent hitter screening rules. ChemFH was validated with an external set of 75 compounds. Subsequently, the virtual screening capability of ChemFH was successfully confirmed through its application to five virtual screening libraries. Furthermore, ChemFH underwent additional validation on two natural products and FDA-approved drugs, yielding reliable and accurate results. ChemFH is a comprehensive, reliable, and computationally efficient screening pipeline that facilitates the identification of true positive results in assays, contributing to enhanced efficiency and success rates in drug discovery. ChemFH is freely available via https://chemfh.scbdd.com/.
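The AUC reported above is the area under the ROC curve, which equals the probability that a randomly chosen positive (here, a false-positive-prone compound) is scored higher than a randomly chosen negative. The sketch below computes it by direct pairwise comparison; it is an illustration of the metric only, and the function name `roc_auc` and the sample labels and scores are assumptions, not part of ChemFH.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class, 0 for the negative class.
    scores: model outputs, higher meaning "more likely positive".
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Illustrative values only.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

The pairwise form is quadratic in the number of compounds, which is fine for a small check; a rank-based implementation would be the usual choice for a dataset of the size mentioned in the abstract.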

Subject(s)

Drug Discovery; High-Throughput Screening Assays; Software; Drug Discovery/methods; High-Throughput Screening Assays/methods; Drug Evaluation, Preclinical/methods; False Positive Reactions; Small Molecule Libraries/pharmacology; Small Molecule Libraries/chemistry; Humans

6.

ADMETlab 3.0: an updated comprehensive online ADMET prediction platform enhanced with broader coverage, improved performance, API functionality and decision support.

Fu, Li; Shi, Shaohua; Yi, Jiacai; Wang, Ningning; He, Yuanhang; Wu, Zhenxing; Peng, Jinfu; Deng, Youchao; Wang, Wenxuan; Wu, Chengkun; Lyu, Aiping; Zeng, Xiangxiang; Zhao, Wentao; Hou, Tingjun; Cao, Dongsheng.

Nucleic Acids Res ; 52(W1): W422-W431, 2024 Jul 05.

Article in English | MEDLINE | ID: mdl-38572755

ABSTRACT

ADMETlab 3.0 is the second updated version of the web server that provides a comprehensive and efficient platform for evaluating ADMET-related parameters as well as physicochemical properties and medicinal chemistry characteristics involved in the drug discovery process. This new release addresses the limitations of the previous version and offers broader coverage, improved performance, API functionality, and decision support. For supporting data and endpoints, this version includes 119 features, an increase of 31 compared to the previous version. The updated number of entries is 1.5 times larger than the previous version with over 400 000 entries. ADMETlab 3.0 incorporates a multi-task DMPNN architecture coupled with molecular descriptors, a method that not only guaranteed calculation speed for each endpoint simultaneously, but also achieved superior performance in terms of accuracy and robustness. In addition, an API has been introduced to meet the growing demand for programmatic access to large amounts of data in ADMETlab 3.0. Moreover, this version includes uncertainty estimates in the prediction results, aiding in the confident selection of candidate compounds for further studies and experiments. ADMETlab 3.0 is publicly accessible without registration at: https://admetlab3.scbdd.com.

Subject(s)

Drug Discovery; Internet; Software; Drug Discovery/methods; Humans; Pharmaceutical Preparations/chemistry; Pharmaceutical Preparations/metabolism

7.

Cooperation of structural motifs controls drug selectivity in cyclin-dependent kinases: an advanced theoretical analysis.

Wang, Lingling; Xu, Lei; Wang, Zhe; Hou, Tingjun; Hao, Haiping; Sun, Huiyong.

Brief Bioinform ; 24(1)2023 01 19.

Article in English | MEDLINE | ID: mdl-36578163

ABSTRACT

Understanding the mechanism of drug selectivity is a long-standing challenge in designing drugs with high specificity. Designing drugs targeting cyclin-dependent kinases (CDKs) with high selectivity is challenging because of their highly conserved binding pockets. To reveal the underlying general selectivity mechanism, we carried out comprehensive analyses from both the thermodynamics and kinetics points of view on a representative CDK12 inhibitor. To fully capture the binding features of the drug-target recognition process, we proposed to use kinetic residue energy analysis (KREA) in conjunction with community network analysis (CNA) to reveal the underlying cooperation effect of individual residues/protein motifs on the binding/dissociating process of the ligand. The general mechanism of drug selectivity in CDKs can be summarized as follows: differences in structural cooperation between the ligand and the protein motifs lead to differences in the energetic contribution of the key residues to the ligand. The proposed mechanisms may be prevalent in drug selectivity issues, and the insights may help design new strategies to overcome or attenuate problems associated with drug selectivity.

Subject(s)

Cyclin-Dependent Kinases; Molecular Dynamics Simulation; Cyclin-Dependent Kinases/metabolism; Ligands; Protein Binding; Thermodynamics

8.

Comprehensive assessment of nine target prediction web services: which should we choose for target fishing?

Ji, Kai-Yue; Liu, Chong; Liu, Zhao-Qian; Deng, Ya-Feng; Hou, Ting-Jun; Cao, Dong-Sheng.

Brief Bioinform ; 24(2)2023 03 19.

Article in English | MEDLINE | ID: mdl-36681902

ABSTRACT

Identification of potential targets for known bioactive compounds and novel synthetic analogs is of considerable significance. In silico target fishing (TF) has become an alternative strategy because of the expensive and laborious wet-lab experiments, explosive growth of bioactivity data and rapid development of high-throughput technologies. However, these TF methods are based on different algorithms, molecular representations and training datasets, which may lead to different results when predicting the same query molecules. This can be confusing for practitioners in practical applications. Therefore, this study systematically evaluated nine popular ligand-based TF methods based on target and ligand-target pair statistical strategies, which will help practitioners make choices among multiple TF methods. The evaluation results showed that SwissTargetPrediction was the best method to produce the most reliable predictions while enriching more targets. High-recall similarity ensemble approach (SEA) was able to find real targets for more compounds compared with other TF methods. Therefore, SwissTargetPrediction and SEA can be considered as primary selection methods in future studies. In addition, the results showed that k = 5 was the optimal number of experimental candidate targets. Finally, a novel ensemble TF method based on consensus voting is proposed to improve the prediction performance. The precision of the ensemble TF method outperforms the individual TF method, indicating that the ensemble TF method can more effectively identify real targets within a given top-k threshold. The results of this study can be used as a reference to guide practitioners in selecting the most effective methods in computational drug discovery.
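The consensus-voting idea mentioned above can be sketched as follows. The abstract does not describe the voting rule, so everything here is an assumption made for illustration: each method contributes one vote to every target in its top-k list, and targets are kept when they reach a minimum vote count. The function name `consensus_targets`, the `min_votes` parameter, and the method and target labels are hypothetical.

```python
from collections import Counter


def consensus_targets(predictions, k=5, min_votes=2):
    """Combine ranked target lists from several methods by simple voting.

    predictions: mapping of method name -> ranked list of target identifiers
                 (best first). Only the top-k entries of each list vote.
    Returns targets with at least `min_votes` votes, most-voted first;
    ties are broken alphabetically so the output is deterministic.
    """
    votes = Counter()
    for ranked in predictions.values():
        # dict.fromkeys removes duplicates while keeping rank order,
        # so one method cannot vote twice for the same target.
        for target in dict.fromkeys(ranked[:k]):
            votes[target] += 1
    kept = [(t, v) for t, v in votes.items() if v >= min_votes]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [t for t, _ in kept]


if __name__ == "__main__":
    # Hypothetical method names and target labels.
    example = {
        "method_a": ["T1", "T2", "T3"],
        "method_b": ["T2", "T1", "T4"],
        "method_c": ["T2", "T5", "T6"],
    }
    print(consensus_targets(example, k=2, min_votes=2))
```

The default `k=5` mirrors the number of candidate targets the abstract reports as optimal; the vote threshold trades recall for precision and would need tuning against known compound-target pairs.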

Subject(s)

Algorithms; Ligands

9.

ML-PLIC: a web platform for characterizing protein-ligand interactions and developing machine learning-based scoring functions.

Zhang, Xujun; Shen, Chao; Wang, Tianyue; Deng, Yafeng; Kang, Yu; Li, Dan; Hou, Tingjun; Pan, Peichen.

Brief Bioinform ; 24(5)2023 09 20.

Article in English | MEDLINE | ID: mdl-37738401

ABSTRACT

Cracking the entangling code of protein-ligand interaction (PLI) is of great importance to structure-based drug design and discovery. Different physical and biochemical representations can be used to describe PLI such as energy terms and interaction fingerprints, which can be analyzed by machine learning (ML) algorithms to create ML-based scoring functions (MLSFs). Here, we propose the ML-based PLI capturer (ML-PLIC), a web platform that automatically characterizes PLI and generates MLSFs to identify the potential binders of a specific protein target through virtual screening (VS). ML-PLIC comprises five modules, including Docking for ligand docking, Descriptors for PLI generation, Modeling for MLSF training, Screening for VS and Pipeline for the integration of the aforementioned functions. We validated the MLSFs constructed by ML-PLIC in three benchmark datasets (Directory of Useful Decoys-Enhanced, Active as Decoys and TocoDecoy), demonstrating accuracy outperforming traditional docking tools and competitive performance to the deep learning-based SF, and provided a case study of the Serine/threonine-protein kinase WEE1 in which MLSFs were developed by using the ML-based VS pipeline in ML-PLIC. Underpinning the latest version of ML-PLIC is a powerful platform that incorporates physical and biological knowledge about PLI, leveraging PLI characterization and MLSF generation into the design of structure-based VS pipeline. The ML-PLIC web platform is now freely available at http://cadd.zju.edu.cn/plic/.

Subject(s)

Algorithms; Benchmarking; Ligands; Drug Design; Machine Learning

10.

Learning with uncertainty to accelerate the discovery of histone lysine-specific demethylase 1A (KDM1A/LSD1) inhibitors.

Wang, Dong; Wu, Zhenxing; Shen, Chao; Bao, Lingjie; Luo, Hao; Wang, Zhe; Yao, Hucheng; Kong, De-Xin; Luo, Cheng; Hou, Tingjun.

Brief Bioinform ; 24(1)2023 01 19.

Article in English | MEDLINE | ID: mdl-36573494

ABSTRACT

Machine learning, including modern deep learning models, has been extensively used in drug design and screening. However, reliable prediction of molecular properties is still challenging when exploring out-of-domain regimes, even for deep neural networks. Therefore, it is important to understand the uncertainty of model predictions, especially when the predictions are used to guide further experiments. In this study, we explored the utility and effectiveness of evidential uncertainty in compound screening. The evidential Graphormer model was proposed for uncertainty-guided discovery of KDM1A/LSD1 inhibitors. The benchmarking results illustrated that (i) Graphormer exhibited predictive power comparable to that of state-of-the-art models, and (ii) evidential regression enabled well-ranked uncertainty estimates and calibrated predictions. Subsequently, we leveraged time-splitting on the curated KDM1A/LSD1 dataset to simulate out-of-distribution predictions. The retrospective virtual screening showed that the evidential uncertainties helped reduce false positives among the top-acquired compounds and thus enabled higher experimental validation rates. The trained model was then used to virtually screen an independent in-house compound set. The top 50 compounds from each of two different ranking strategies were experimentally validated. In general, our study highlighted the importance of understanding prediction uncertainty, which can be regarded as an interpretable dimension of model predictions.
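To make the notion of evidential uncertainty concrete, the sketch below uses the widely cited deep evidential regression formulation, in which a network outputs the four parameters of a Normal-Inverse-Gamma distribution (gamma, nu, alpha, beta). Whether the paper uses exactly this parameterization is an assumption; the abstract does not say. The names `Evidential`, `uncertainties`, and `select_confident`, and the compound labels, are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Evidential(NamedTuple):
    gamma: float  # predicted mean (e.g. a predicted activity value)
    nu: float     # virtual evidence for the mean, nu > 0
    alpha: float  # shape parameter, alpha > 1 for finite variance
    beta: float   # scale parameter, beta > 0


def uncertainties(e: Evidential) -> Tuple[float, float]:
    """Return (aleatoric, epistemic) uncertainty under the assumed formulation.

    aleatoric = E[sigma^2] = beta / (alpha - 1)
    epistemic = Var[mu]    = beta / (nu * (alpha - 1))
    """
    if e.alpha <= 1 or e.nu <= 0 or e.beta <= 0:
        raise ValueError("requires alpha > 1, nu > 0, beta > 0")
    aleatoric = e.beta / (e.alpha - 1)
    epistemic = e.beta / (e.nu * (e.alpha - 1))
    return aleatoric, epistemic


def select_confident(compounds: Dict[str, Evidential], max_epistemic: float) -> List[str]:
    """Rank compounds by predicted value, dropping those the model is unsure about."""
    kept = [
        (name, e.gamma)
        for name, e in compounds.items()
        if uncertainties(e)[1] <= max_epistemic
    ]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in kept]


if __name__ == "__main__":
    # Illustrative values only.
    library = {
        "cmpd_1": Evidential(7.5, 0.2, 2.0, 1.0),  # high prediction, low evidence
        "cmpd_2": Evidential(7.0, 2.0, 3.0, 1.0),
        "cmpd_3": Evidential(6.0, 4.0, 3.0, 1.0),
    }
    print(select_confident(library, max_epistemic=1.0))
```

Filtering on epistemic uncertainty before ranking is one simple way to use such estimates; the two ranking strategies evaluated in the paper are not described in the abstract and may differ.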

Subject(s)

Histones; Lysine; Retrospective Studies; Uncertainty; Histone Demethylases/metabolism

11.

Can molecular dynamics simulations improve predictions of protein-ligand binding affinity with machine learning?

Gu, Shukai; Shen, Chao; Yu, Jiahui; Zhao, Hong; Liu, Huanxiang; Liu, Liwei; Sheng, Rong; Xu, Lei; Wang, Zhe; Hou, Tingjun; Kang, Yu.

Brief Bioinform ; 24(2)2023 03 19.

Article in English | MEDLINE | ID: mdl-36681903

ABSTRACT

Binding affinity prediction largely determines the discovery efficiency of lead compounds in drug discovery. Recently, machine learning (ML)-based approaches have attracted much attention in hopes of enhancing the predictive performance of traditional physics-based approaches. In this study, we evaluated the impact of structural dynamic information on the binding affinity prediction by comparing the models trained on different dimensional descriptors, using three targets (i.e. JAK1, TAF1-BD2 and DDR1) and their corresponding ligands as the examples. Here, 2D descriptors are traditional ECFP4 fingerprints, 3D descriptors are the energy terms of the Smina and NNscore scoring functions and 4D descriptors contain the structural dynamic information derived from the trajectories based on molecular dynamics (MD) simulations. We systematically investigate the MD-refined binding affinity prediction performance of three classical ML algorithms (i.e. RF, SVR and XGB) as well as two common virtual screening methods, namely Glide docking and MM/PBSA. The outcomes of the ML models built using various dimensional descriptors and their combinations reveal that the MD refinement with the optimized protocol can improve the predictive performance on the TAF1-BD2 target with considerable structural flexibility, but not for the less flexible JAK1 and DDR1 targets, when taking docking poses as the initial structure instead of the crystal structures. The results highlight the importance of the initial structures to the final performance of the model through conformational analysis on the three targets with different flexibility.

Subject(s)

Molecular Dynamics Simulation; Proteins; Ligands; Proteins/chemistry; Protein Binding; Machine Learning; Molecular Docking Simulation

12.

TransFoxMol: predicting molecular property with focused attention.

Gao, Jian; Shen, Zheyuan; Xie, Yufeng; Lu, Jialiang; Lu, Yang; Chen, Sikang; Bian, Qingyu; Guo, Yue; Shen, Liteng; Wu, Jian; Zhou, Binbin; Hou, Tingjun; He, Qiaojun; Che, Jinxin; Dong, Xiaowu.

Brief Bioinform ; 24(5)2023 09 20.

Article in English | MEDLINE | ID: mdl-37605947

ABSTRACT

Predicting the biological properties of molecules is crucial in computer-aided drug development, yet it is often impeded by data scarcity and imbalance in many practical applications. Existing approaches are based on self-supervised learning or 3D data and use an increasing number of parameters to improve performance. These approaches may not take full advantage of established chemical knowledge and could inadvertently introduce noise into the respective model. In this study, we introduce a more elegant transformer-based framework with focused attention for molecular representation (TransFoxMol) to improve the understanding of molecular structure-property relationships by artificial intelligence (AI). TransFoxMol incorporates a multi-scale 2D molecular environment into a graph neural network + Transformer module and uses prior chemical maps to obtain a more focused attention landscape compared to that obtained using existing approaches. Experimental results show that TransFoxMol achieves state-of-the-art performance on MoleculeNet benchmarks and surpasses the performance of baselines that use self-supervised learning or geometry-enhanced strategies on small-scale datasets. Subsequent analyses indicate that TransFoxMol's predictions are highly interpretable and the clever use of chemical knowledge enables AI to perceive molecules in a simple but rational way, enhancing performance.

Subject(s)

Artificial Intelligence; Benchmarking; Neural Networks, Computer

13.

Reducing false positive rate of docking-based virtual screening by active learning.

Wang, Lei; Shi, Shao-Hua; Li, Hui; Zeng, Xiang-Xiang; Liu, Su-You; Liu, Zhao-Qian; Deng, Ya-Feng; Lu, Ai-Ping; Hou, Ting-Jun; Cao, Dong-Sheng.

Brief Bioinform ; 24(1)2023 01 19.

Article in English | MEDLINE | ID: mdl-36642412

ABSTRACT

Machine learning-based scoring functions (MLSFs) have become a very favorable alternative to classical scoring functions because of their potential superior screening performance. However, the information of negative data used to construct MLSFs was rarely reported in the literature, and meanwhile the putative inactive molecules recorded in existing databases usually have obvious bias from active molecules. Here we proposed an easy-to-use method named AMLSF that combines active learning using negative molecular selection strategies with MLSF, which can iteratively improve the quality of inactive sets and thus reduce the false positive rate of virtual screening. We chose energy auxiliary terms learning as the MLSF and validated our method on eight targets in the diverse subset of DUD-E. For each target, we screened the IterBioScreen database by AMLSF and compared the screening results with those of the four control models. The results illustrate that the number of active molecules in the top 1000 molecules identified by AMLSF was significantly higher than those identified by the control models. In addition, the free energy calculation results for the top 10 molecules screened out by the AMLSF, null model and control models based on DUD-E also proved that more active molecules can be identified, and the false positive rate can be reduced by AMLSF.

Subject(s)

Proteins; Proteins/metabolism; Databases, Factual; Ligands; Molecular Docking Simulation; Protein Binding

14.

Comprehensive assessment of protein loop modeling programs on large-scale datasets: prediction accuracy and efficiency.

Wang, Tianyue; Wang, Langcheng; Zhang, Xujun; Shen, Chao; Zhang, Odin; Wang, Jike; Wu, Jialu; Jin, Ruofan; Zhou, Donghao; Chen, Shicheng; Liu, Liwei; Wang, Xiaorui; Hsieh, Chang-Yu; Chen, Guangyong; Pan, Peichen; Kang, Yu; Hou, Tingjun.

Brief Bioinform ; 25(1)2023 11 22.

Article in English | MEDLINE | ID: mdl-38171930

ABSTRACT

Protein loops play a critical role in the dynamics of proteins and are essential for numerous biological functions, and various computational approaches to loop modeling have been proposed over the past decades. However, a comprehensive understanding of the strengths and weaknesses of each method is lacking. In this work, we constructed two high-quality datasets (i.e. the General dataset and the CASP dataset) and systematically evaluated the accuracy and efficiency of 13 commonly used loop modeling approaches from the perspective of loop lengths, protein classes and residue types. The results indicate that the knowledge-based method FREAD generally outperforms the other tested programs in most cases, but encountered challenges when predicting loops longer than 15 and 30 residues on the CASP and General datasets, respectively. The ab initio method Rosetta NGK demonstrated exceptional modeling accuracy for short loops with four to eight residues and achieved the highest success rate on the CASP dataset. The well-known AlphaFold2 and RoseTTAFold require more resources for better performance, but they exhibit promise for predicting loops longer than 16 and 30 residues in the CASP and General datasets. These observations can provide valuable insights for selecting suitable methods for specific loop modeling tasks and contribute to future advancements in the field.

Subject(s)

Proteins; Protein Conformation; Proteins/chemistry

15.

Advancing Ligand Docking through Deep Learning: Challenges and Prospects in Virtual Screening.

Zhang, Xujun; Shen, Chao; Zhang, Haotian; Kang, Yu; Hsieh, Chang-Yu; Hou, Tingjun.

Acc Chem Res ; 57(10): 1500-1509, 2024 05 21.

Article in English | MEDLINE | ID: mdl-38577892

ABSTRACT

Molecular docking, also termed ligand docking (LD), is a pivotal element of structure-based virtual screening (SBVS) used to predict the binding conformations and affinities of protein-ligand complexes. Traditional LD methodologies rely on a search and scoring framework, utilizing heuristic algorithms to explore binding conformations and scoring functions to evaluate binding strengths. However, to meet the efficiency demands of SBVS, these algorithms and functions are often simplified, prioritizing speed over accuracy. The emergence of deep learning (DL) has exerted a profound impact on diverse fields, ranging from natural language processing to computer vision and drug discovery. DeepMind's AlphaFold2 has impressively exhibited its ability to accurately predict protein structures solely from amino acid sequences, highlighting the remarkable potential of DL in conformation prediction. This groundbreaking advancement circumvents the traditional search-scoring frameworks in LD, enhancing both accuracy and processing speed and thereby catalyzing a broader adoption of DL algorithms in binding pose prediction. Nevertheless, a consensus on certain aspects remains elusive. In this Account, we delineate the current status of employing DL to augment LD within the VS paradigm, highlighting our contributions to this domain. Furthermore, we discuss the challenges and future prospects, drawing insights from our scholarly investigations. Initially, we present an overview of VS and LD, followed by an introduction to DL paradigms, which deviate significantly from traditional search-scoring frameworks. Subsequently, we delve into the challenges associated with the development of DL-based LD (DLLD), encompassing evaluation metrics, application scenarios, and physical plausibility of the predicted conformations. In the evaluation of LD algorithms, it is essential to recognize the multifaceted nature of the metrics. While the accuracy of binding pose prediction, often measured by the success rate, is a pivotal aspect, the scoring/screening power and computational speed of these algorithms are equally important given the pivotal role of LD tools in VS. Regarding application scenarios, early methods focused on blind docking, where the binding site is unknown. However, recent studies suggest a shift toward identifying binding sites rather than solely predicting binding poses within these models. In contrast, LD with a known pocket in VS has been shown to be more practical. Physical plausibility poses another significant challenge. Although DLLD models often achieve higher success rates compared to traditional methods, they may generate poses with implausible local structures, such as incorrect bond angles or lengths, which are disadvantageous for postprocessing tasks like visualization. Finally, we discuss the future perspectives for DLLD, emphasizing the need to improve generalization ability, strike a balance between speed and accuracy, account for protein conformation flexibility, and enhance physical plausibility. Additionally, we delve into the comparison between generative and regression algorithms in this context, exploring their respective strengths and potential.
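The success rate mentioned above is commonly computed as the fraction of complexes whose predicted ligand pose lies within a root-mean-square deviation (RMSD) cutoff of the reference pose. The sketch below illustrates that calculation under stated assumptions: a 2.0 Å cutoff (a common convention, not a value taken from this Account), identical atom ordering in both poses, and no symmetry correction. The function names `rmsd` and `success_rate` are hypothetical.

```python
import math


def rmsd(pose_a, pose_b):
    """RMSD between two poses given as lists of (x, y, z) tuples in angstroms.

    Assumes both lists describe the same atoms in the same order; no
    alignment or symmetry handling is performed.
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


def success_rate(pose_pairs, cutoff=2.0):
    """Fraction of (predicted, reference) pose pairs with RMSD <= cutoff."""
    if not pose_pairs:
        raise ValueError("need at least one pose pair")
    hits = sum(1 for predicted, reference in pose_pairs if rmsd(predicted, reference) <= cutoff)
    return hits / len(pose_pairs)


if __name__ == "__main__":
    # Toy two-atom ligand; coordinates are illustrative only.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    close_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # shifted 1 angstrom
    far_pose = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]    # shifted 3 angstroms
    print(success_rate([(close_pose, reference), (far_pose, reference)]))
```

A pose can pass this RMSD test and still contain distorted bond lengths or angles, which is why the Account treats physical plausibility as a separate criterion.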

Subject(s)

Deep Learning , Molecular Docking Simulation , Ligands , Proteins/chemistry , Proteins/metabolism , Algorithms , Drug Discovery

16.

PROTAC-DB 2.0: an updated database of PROTACs.

Weng, Gaoqi; Cai, Xuanyan; Cao, Dongsheng; Du, Hongyan; Shen, Chao; Deng, Yafeng; He, Qiaojun; Yang, Bo; Li, Dan; Hou, Tingjun.

Nucleic Acids Res ; 51(D1): D1367-D1372, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36300631

ABSTRACT

Proteolysis targeting chimeras (PROTACs), which harness the ubiquitin-proteasome system to selectively induce targeted protein degradation, represent an emerging therapeutic technology with the potential to modulate traditionally undruggable targets. Over the past few years, this technology has moved from academia to industry and more than 10 PROTACs have been advanced into clinical trials. However, designing potent PROTACs with desirable drug-like properties remains a great challenge. Here, we report an updated online database, PROTAC-DB 2.0, which is a repository of structural and experimental data about PROTACs. In this second release, we expanded the number of PROTACs to 3270, which corresponds to a 96% expansion over the first version. Meanwhile, the numbers of warheads (small molecules targeting the proteins of interest), linkers, and E3 ligands (small molecules recruiting E3 ligases) have increased to over 360, 1500 and 80, respectively. In addition, given the importance and the limited number of the crystal target-PROTAC-E3 ternary complex structures, we provide the predicted ternary complex structures for PROTACs with good degradation capability using our PROTAC-Model method. To further facilitate the analysis of PROTAC data, a new filtering strategy based on the E3 ligases is also added. PROTAC-DB 2.0 is available online at http://cadd.zju.edu.cn/protacdb/.

Subject(s)

Databases, Protein , Proteasome Endopeptidase Complex , Proteolysis , Proteasome Endopeptidase Complex/metabolism , Proteins/metabolism , Ubiquitin/metabolism , Ubiquitin-Protein Ligases/metabolism

17.

RNAincoder: a deep learning-based encoder for RNA and RNA-associated interaction.

Wang, Yunxia; Chen, Zhen; Pan, Ziqi; Huang, Shijie; Liu, Jin; Xia, Weiqi; Zhang, Hongning; Zheng, Mingyue; Li, Honglin; Hou, Tingjun; Zhu, Feng.

Nucleic Acids Res ; 51(W1): W509-W519, 2023 07 05.

Article in English | MEDLINE | ID: mdl-37166951

ABSTRACT

Ribonucleic acids (RNAs) are involved in various physiological/pathological processes through their interactions with proteins, compounds, and other RNAs. A variety of powerful computational methods have been developed to predict such valuable interactions. However, all these methods rely heavily on the 'digitalization' (also known as 'encoding') of RNA-associated interacting pairs into a computer-recognizable descriptor. In other words, there is an urgent need for a powerful tool that can not only represent each interacting partner but also integrate both partners into a computer-recognizable interaction. Herein, RNAincoder (deep learning-based encoder for RNA-associated interactions) was therefore proposed to (a) provide a comprehensive collection of RNA encoding features, (b) realize the representation of any RNA-associated interaction based on a well-established deep learning-based embedding strategy and (c) enable large-scale scanning of all possible feature combinations to identify the one with optimal performance in RNA-associated interaction prediction. The effectiveness of RNAincoder was extensively validated by case studies on benchmark datasets. All in all, RNAincoder is distinguished by its capability to provide a more accurate representation of RNA-associated interactions, which makes it an indispensable complement to other available tools. RNAincoder can be accessed at https://idrblab.org/rnaincoder/.

Subject(s)

Computational Biology , RNA , Computational Biology/methods , Deep Learning , Proteins/metabolism , RNA/genetics , RNA/metabolism , Internet

18.

DrugMAP: molecular atlas and pharma-information of all drugs.

Li, Fengcheng; Yin, Jiayi; Lu, Mingkun; Mou, Minjie; Li, Zhaorong; Zeng, Zhenyu; Tan, Ying; Wang, Shanshan; Chu, Xinyi; Dai, Haibin; Hou, Tingjun; Zeng, Su; Chen, Yuzong; Zhu, Feng.

Nucleic Acids Res ; 51(D1): D1288-D1299, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36243961

ABSTRACT

The efficacy and safety of drugs are widely known to be determined by their interactions with multiple molecules of pharmacological importance, and it is therefore essential to systematically depict the molecular atlas and pharma-information of studied drugs. However, our understanding of such information is neither comprehensive nor precise, which necessitates the construction of a new database providing a network containing a large number of drugs and their interacting molecules. Here, a new database describing the molecular atlas and pharma-information of drugs (DrugMAP) was therefore constructed. It provides a comprehensive list of interacting molecules for >30 000 drugs/drug candidates, gives the differential expression patterns for >5000 interacting molecules among different disease sites, ADME (absorption, distribution, metabolism and excretion)-relevant organs and physiological tissues, and weaves a comprehensive and precise network containing >200 000 interactions among drugs and molecules. With the great efforts made to clarify the complex mechanism underlying drug pharmacokinetics and pharmacodynamics and rapidly emerging interests in artificial intelligence (AI)-based network analyses, DrugMAP is expected to become an indispensable supplement to existing databases to facilitate drug discovery. It is now fully and freely accessible at: https://idrblab.org/drugmap/.

Subject(s)

Artificial Intelligence , Drug Discovery , Databases, Factual , Pharmaceutical Preparations , Atlases as Topic

19.

CovInter: interaction data between coronavirus RNAs and host proteins.

Amahong, Kuerbannisha; Zhang, Wei; Zhou, Ying; Zhang, Song; Yin, Jiayi; Li, Fengcheng; Xu, Hongquan; Yan, Tianci; Yue, Zixuan; Liu, Yuhong; Hou, Tingjun; Qiu, Yunqing; Tao, Lin; Han, Lianyi; Zhu, Feng.

Nucleic Acids Res ; 51(D1): D546-D556, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36200814

ABSTRACT

Coronaviruses have caused three massive outbreaks in the past two decades. Each step of the viral life cycle invariably depends on the interactions among virus and host molecules. The interaction between virus RNA and host protein (IVRHP) is unique compared to other virus-host molecular interactions and represents not only an attempt by viruses to promote their translation/replication, but also the host's endeavor to combat viral pathogenicity. There is therefore an urgent need to develop a database providing such IVRHP data. In this study, a new database was constructed to describe the interactions between coronavirus RNAs and host proteins (CovInter). This database is unique in (a) unambiguously characterizing the interactions between virus RNA and host protein, (b) comprehensively providing experimentally validated biological function for hundreds of host proteins key in viral infection and (c) systematically quantifying the differential expression patterns (before and after infection) of these key proteins. Given the devastating and persistent threat of coronaviruses, CovInter is expected to fill the gap in the whole process of the 'molecular arms race' between viruses and their hosts, which will then aid in the discovery of new antiviral therapies. It is now freely and publicly accessible at: https://idrblab.org/covinter/.

Subject(s)

Coronavirus , Host-Pathogen Interactions , RNA, Viral , Humans , Coronavirus/genetics , Coronavirus/metabolism , Coronavirus Infections/metabolism , Host-Pathogen Interactions/genetics , RNA, Viral/genetics , RNA, Viral/metabolism , Virus Replication , Databases, Genetic

20.

A task-specific encoding algorithm for RNAs and RNA-associated interactions based on convolutional autoencoder.

Wang, Yunxia; Pan, Ziqi; Mou, Minjie; Xia, Weiqi; Zhang, Hongning; Zhang, Hanyu; Liu, Jin; Zheng, Lingyan; Luo, Yongchao; Zheng, Hanqi; Yu, Xinyuan; Lian, Xichen; Zeng, Zhenyu; Li, Zhaorong; Zhang, Bing; Zheng, Mingyue; Li, Honglin; Hou, Tingjun; Zhu, Feng.

Nucleic Acids Res ; 51(21): e110, 2023 Nov 27.

Article in English | MEDLINE | ID: mdl-37889083

ABSTRACT

RNAs play essential roles in diverse physiological and pathological processes by interacting with other molecules (RNA/protein/compound), and various computational methods are available for identifying these interactions. However, the encoding features provided by existing methods are limited and the existing tools do not offer an effective way to integrate the interacting partners. In this study, a task-specific encoding algorithm for RNAs and RNA-associated interactions was therefore developed. This new algorithm is unique in (a) realizing comprehensive RNA feature encoding by introducing a great many novel features and (b) enabling task-specific integration of interacting partners using convolutional autoencoder-directed feature embedding. Compared with existing methods/tools, this novel algorithm demonstrated superior performance in diverse benchmark testing studies. This algorithm, together with its source code, can be readily accessed by all users at: https://idrblab.org/corain/ and https://github.com/idrblab/corain/.

Subject(s)

Computational Biology , RNA , RNA/genetics , Computational Biology/methods , Algorithms , Software
