Search | BVS Regional Portal

MultiModRLBP: A Deep Learning Approach for Multi-Modal RNA-Small Molecule Ligand Binding Sites Prediction.

Wang, Junkai; Quan, Lijun; Jin, Zhi; Wu, Hongjie; Ma, Xuhao; Wang, Xuejiao; Xie, Jingxin; Pan, Deng; Chen, Taoning; Wu, Tingfang; Lyu, Qiang.

IEEE J Biomed Health Inform ; 28(8): 4995-5006, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38739505

ABSTRACT

This study aims to tackle the intricate challenge of predicting RNA-small molecule binding sites to explore the potential value in the field of RNA drug targets. To address this challenge, we propose the MultiModRLBP method, which integrates multi-modal features using deep learning algorithms. These features include 3D structural properties at the nucleotide base level of the RNA molecule, relational graphs based on overall RNA structure, and rich RNA semantic information. In our investigation, we gathered 851 interactions between RNA and small-molecule ligands from the RNAglib dataset and the RLBind training set. Unlike conventional training sets, this collection broadened its scope by including RNA complexes that have the same RNA sequence but change their respective binding sites due to structural differences or the presence of different ligands. This enhancement enables the MultiModRLBP model to more accurately capture subtle changes at the structural level, ultimately improving its ability to discern nuances among similar RNA conformations. Furthermore, we evaluated MultiModRLBP on two classic test sets, Test18 and Test3, highlighting its performance disparities on small molecules based on metal and non-metal ions. Additionally, we conducted a structural sensitivity analysis on specific complex categories, considering RNA instances with varying degrees of structural changes and whether they share the same ligands. The research results indicate that MultiModRLBP outperforms the current state-of-the-art methods on multiple classic test sets, particularly excelling in predicting binding sites for non-metal ions and instances where the binding sites are widely distributed along the sequence. MultiModRLBP can also be used as a potential tool when the RNA structure is perturbed or the RNA experimental tertiary structure is not available. Most importantly, MultiModRLBP exhibits the capability to distinguish binding characteristics of RNA that are structurally diverse yet exhibit sequence similarity. These advancements hold promise in reducing the costs associated with the development of RNA-targeted drugs.

Subjects

Deep Learning, RNA, Ligands, Binding Sites, RNA/chemistry, Computational Biology/methods, Algorithms, Nucleic Acid Conformation, Small Molecule Libraries/chemistry

Deep-Learning-Based End-to-End Predictions of CO₂ Capture in Metal-Organic Frameworks.

Lu, Cunxing; Wan, Xili; Ma, Xuhao; Guan, Xinjie; Zhu, Aichun.

J Chem Inf Model ; 62(14): 3281-3290, 2022 Jul 25.

Article in English | MEDLINE | ID: mdl-35574760

ABSTRACT

Metal-organic frameworks (MOFs) have become an active topic because of their excellent carbon capture and storage (CCS) properties. However, it is quite challenging to identify MOFs with superior performance within a massive combinatorial search space. To this end, we propose a deep-learning-based end-to-end prediction model to rapidly and accurately predict the CO₂ working capacity and CO₂/N₂ selectivity of a given MOF under low-pressure conditions. Different from previous methods, our prediction model relies only on the data from the Crystallographic Information File (CIF) rather than handcrafted geometric descriptors and chemical descriptors. The model was developed, trained, and tested on a dataset of 342,489 topologically diverse MOFs. Experimental results on the dataset show that the proposed model achieves high prediction performance, i.e., R² = 0.916 for predicting the CO₂ working capacity and R² = 0.911 for predicting the CO₂/N₂ selectivity. For the identification of potential high-performing MOFs, the provided pretrained model recovered 1020 of 1027 (top 3%) high-performance MOFs while screening only 12% of the entire dataset. Used to prescreen materials prior to computationally intensive grand canonical Monte Carlo (GCMC) simulations, the model thus reduces the computation time by nearly an order of magnitude while still capturing 99% of the high-performance MOFs. In the ab initio training task, the method can achieve R² = 0.85 with only 20% of the labeled data used for training and recover 995 of 1027 (top 3%) high-performance MOFs with only 12% of the entire dataset screened.
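The screening figures quoted in the abstract can be checked with simple arithmetic. The sketch below is not the authors' code; the function names `recall` and `screening_speedup` are hypothetical, and the speedup estimate assumes that every GCMC simulation costs about the same and that model inference time is negligible by comparison.

```python
def recall(recovered: int, total: int) -> float:
    """Fraction of the true high-performance MOFs that the prescreen kept."""
    return recovered / total


def screening_speedup(fraction_screened: float) -> float:
    """Estimated GCMC speedup when only a fraction of candidates is simulated.

    Assumption (not from the abstract): uniform cost per GCMC run and
    negligible model inference cost.
    """
    return 1.0 / fraction_screened


# Counts and the 12% screening fraction are taken from the abstract.
pretrained_recall = recall(1020, 1027)
ab_initio_recall = recall(995, 1027)
speedup = screening_speedup(0.12)

print(f"pretrained recall: {pretrained_recall:.1%}")
print(f"ab initio recall:  {ab_initio_recall:.1%}")
print(f"estimated speedup: {speedup:.1f}x")
```

Under these assumptions, simulating 12% of the candidates corresponds to roughly an 8-fold reduction in GCMC work, which is consistent with the abstract's "nearly an order of magnitude", and 1020 of 1027 rounds to the stated 99%.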
