1.
Adv Sci (Weinh) ; e2400829, 2024 May 05.
Article in English | MEDLINE | ID: mdl-38704695

ABSTRACT

Self-assembling peptides have numerous applications in medicine, food chemistry, and nanotechnology. However, their discovery has traditionally been serendipitous rather than driven by rational design. Here, HydrogelFinder, a foundation model, is developed for the rational design of self-assembling peptides from scratch. This model explores self-assembly properties through molecular structure, leveraging 1,377 self-assembling non-peptidal small molecules to navigate chemical space and improve structural diversity. Using HydrogelFinder, 111 peptide candidates were generated and 17 peptides were synthesized; the self-assembly and biophysical characteristics of nine peptides, ranging from 1 to 10 amino acids, were then validated experimentally, all within a 19-day workflow. Notably, the two de novo-designed self-assembling peptides demonstrated low cytotoxicity and good biocompatibility, as confirmed by live/dead assays. This work highlights the capacity of HydrogelFinder to diversify the design of self-assembling peptides through non-peptidal small molecules, offering a powerful toolkit and paradigm for future peptide discovery endeavors.

2.
J Med Chem ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38748846

ABSTRACT

Precisely predicting molecular properties is crucial in drug discovery, but the scarcity of labeled data poses a challenge for applying deep learning methods. While large-scale self-supervised pretraining has proven an effective solution, it often neglects domain-specific knowledge. To tackle this issue, we introduce Task-Oriented Multilevel Learning based on BERT (TOML-BERT), a dual-level pretraining framework that considers both structural patterns and domain knowledge of molecules. TOML-BERT achieved state-of-the-art prediction performance on 10 pharmaceutical datasets. It has the capability to mine contextual information within molecular structures and extract domain knowledge from massive pseudo-labeled data. The dual-level pretraining accomplished significant positive transfer, with its two components making complementary contributions. Interpretive analysis elucidated that the effectiveness of the dual-level pretraining lies in the prior learning of a task-related molecular representation. Overall, TOML-BERT demonstrates the potential of combining multiple pretraining tasks to extract task-oriented knowledge, advancing molecular property prediction in drug discovery.

3.
Nucleic Acids Res ; 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38572755

ABSTRACT

ADMETlab 3.0 is the second updated version of the web server that provides a comprehensive and efficient platform for evaluating ADMET-related parameters as well as physicochemical properties and medicinal chemistry characteristics involved in the drug discovery process. This new release addresses the limitations of the previous version and offers broader coverage, improved performance, API functionality, and decision support. For supporting data and endpoints, this version includes 119 features, an increase of 31 compared to the previous version. The number of entries is 1.5 times larger than in the previous version, with over 400 000 entries. ADMETlab 3.0 incorporates a multi-task DMPNN architecture coupled with molecular descriptors, a method that not only guarantees calculation speed for all endpoints simultaneously but also achieves superior performance in terms of accuracy and robustness. In addition, an API has been introduced to meet the growing demand for programmatic access to large amounts of data in ADMETlab 3.0. Moreover, this version includes uncertainty estimates in the prediction results, aiding in the confident selection of candidate compounds for further studies and experiments. ADMETlab 3.0 is publicly accessible without the need for registration at: https://admetlab3.scbdd.com.

4.
Drug Discov Today ; 29(6): 103985, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38642700

ABSTRACT

Active learning (AL) is an iterative feedback process that efficiently identifies valuable data within vast chemical space, even with limited labeled data. This characteristic makes it a valuable approach to the ongoing challenges of drug discovery, such as the ever-expanding exploration space and the scarcity of labeled data. Consequently, AL is gaining prominence in the field of drug development. In this paper, we comprehensively review the application of AL at all stages of drug discovery, including compound-target interaction prediction, virtual screening, molecular generation and optimization, and molecular property prediction. Additionally, we discuss the challenges and prospects associated with the current applications of AL in drug discovery.
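
The iterative feedback process described in this abstract can be illustrated with a minimal pool-based sketch. Everything here is an assumption made for illustration and is not taken from the reviewed paper: the `oracle` function stands in for an expensive experiment, compounds are reduced to integer positions on a single property axis, the model is a one-dimensional threshold classifier, and the query rule is plain uncertainty sampling (label the candidate closest to the current decision boundary).

```python
def oracle(x):
    """Hypothetical stand-in for an expensive assay: active above a hidden cutoff."""
    return x > 62


def fit_threshold(labeled):
    """Fit a 1-D threshold model: midpoint between the two labeled classes."""
    negatives = [x for x, active in labeled.items() if not active]
    positives = [x for x, active in labeled.items() if active]
    return (max(negatives) + min(positives)) / 2


def active_learning(pool, seed, budget):
    """Label `budget` extra points, always querying the most uncertain candidate.

    `seed` must contain at least one inactive and one active example,
    otherwise the threshold model cannot be fitted.
    """
    labeled = {x: oracle(x) for x in seed}
    unlabeled = [x for x in pool if x not in labeled]
    for _ in range(budget):
        if not unlabeled:
            break
        boundary = fit_threshold(labeled)
        # Uncertainty sampling: the point nearest the boundary is least certain.
        query = min(unlabeled, key=lambda x: abs(x - boundary))
        unlabeled.remove(query)
        labeled[query] = oracle(query)  # the "experiment" feeds back into the model
    return fit_threshold(labeled), len(labeled)


pool = list(range(101))  # 101 hypothetical candidates
threshold, n_labeled = active_learning(pool, seed=[0, 100], budget=10)
print(threshold, n_labeled)
```

In this toy setting the loop locates the decision boundary after labeling only a small fraction of the pool, which is the efficiency argument the abstract makes; real applications would replace the threshold model with a learned predictor and the distance rule with a model-based uncertainty estimate.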

6.
Nat Protoc ; 19(4): 1105-1121, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38263521

ABSTRACT

Lead optimization is a crucial step in the drug discovery process, which aims to design potential drug candidates from biologically active hits. During lead optimization, active hits undergo modifications to improve their absorption, distribution, metabolism, excretion and toxicity (ADMET) profiles. Medicinal chemists face key questions regarding which compound(s) should be synthesized next and how to balance multiple ADMET properties. Reliable transformation rules from multiple experimental analyses are critical to improve this decision-making process. We developed OptADMET ( https://cadd.nscc-tj.cn/deploy/optadmet/ ), an integrated web-based platform that provides chemical transformation rules for 32 ADMET properties and leverages prior experimental data for lead optimization. The multiproperty transformation rule database contains a total of 41,779 validated transformation rules generated from the analysis of 177,191 reliable experimental datasets. Additionally, 146,450 rules were generated by analyzing 239,194 molecular data predictions. OptADMET provides the ADMET profiles of all optimized molecules from the queried molecule and enables the prediction of desirable substructure transformations and subsequent validation of drug candidates. OptADMET is based on matched molecular pairs analysis derived from synthetic chemistry, thus providing improved practicality over other methods. OptADMET is designed for use by both experimental and computational scientists.


Subjects
Drug Discovery , Internet , Databases, Factual
7.
Article in English | MEDLINE | ID: mdl-38285569

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is widely used to study cellular heterogeneity in different samples. However, due to technical deficiencies, dropout events often result in zero gene expression values in the gene expression matrix. In this paper, we propose a new imputation method called scCAN, based on adaptive neighborhood clustering, to estimate the zero values caused by dropouts. Our method continuously updates cell-cell similarity information by simultaneously learning similarity relationships and clustering structures and imposing new rank constraints on the Laplacian matrix of the similarity matrix, improving the imputation of dropout zero values. To evaluate the performance of this method, we used four simulated and eight real scRNA-seq datasets for downstream analyses, including cell clustering, gene expression recovery, and cell trajectory reconstruction. Our method improves the performance of the downstream analyses and outperforms other imputation methods.


Subjects
Gene Expression Profiling , Single-Cell Gene Expression Analysis , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Cluster Analysis
8.
Methods ; 222: 133-141, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38242382

ABSTRACT

The versatility of ChatGPT in performing a diverse range of tasks has elicited considerable interest in its potential applications within professional fields. Taking drug discovery as a testbed, this paper provides a comprehensive evaluation of ChatGPT's ability in molecule property prediction. The study focuses on three aspects: 1) effects of different prompt settings, where we investigate the impact of varying prompts on the prediction outcomes of ChatGPT; 2) comprehensive evaluation of molecule property prediction, where we evaluate 53 ADMET-related endpoints; 3) analysis of ChatGPT's potential and limitations, where we make comparisons with models tailored for molecule property prediction, thus gaining a more accurate understanding of ChatGPT's capabilities and limitations in this area. Through this evaluation, we find that 1) with appropriate prompt settings, ChatGPT can attain satisfactory prediction outcomes that are competitive with specialized models designed for those tasks; 2) prompt settings significantly affect ChatGPT's performance: among all prompt settings, the strategy for selecting few-shot examples has the greatest impact on results, and scaffold sampling greatly outperforms random sampling; 3) the capacity of ChatGPT to accomplish high-precision predictions is significantly influenced by the quality of the examples provided, which may constrain its practical applicability in real-world scenarios. This work highlights ChatGPT's potential and limitations in molecule property prediction, which we hope can inspire future design and evaluation of Large Language Models within scientific domains.


Subjects
Drug Discovery , Research Design
9.
J Chem Inf Model ; 64(7): 2174-2194, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37934070

ABSTRACT

The discovery of new drugs has important implications for human health. Traditional methods for drug discovery rely on experiments to optimize the structure of lead molecules, a process that is time-consuming and costly. Recently, artificial intelligence has exhibited promising and efficient performance for drug-like molecule generation. In particular, deep generative models have achieved great success in de novo generation of drug-like molecules with desired properties, showing massive potential for novel drug discovery. In this study, we review the recent progress of molecule generation using deep generative models, mainly focusing on molecule representations, public databases, data processing tools, and advanced artificial intelligence-based molecule generation frameworks. In particular, we present a comprehensive comparison of state-of-the-art deep generative models for molecule generation and a summary of commonly used molecular design strategies. We identify research gaps and challenges of molecule generation such as the need for better databases, missing 3D information in molecular representation, and the lack of high-precision evaluation metrics. We suggest future directions for molecular generation and drug discovery.


Subjects
Artificial Intelligence , Benchmarking , Humans , Databases, Factual , Drug Discovery , Drug Design
10.
IEEE J Biomed Health Inform ; 28(1): 569-579, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37991904

ABSTRACT

Adverse drug-drug interactions (DDIs) pose potential risks in polypharmacy due to unknown physicochemical incompatibilities between co-administered drugs. Recent studies have utilized multi-layer graph neural network architectures to model hierarchical molecular substructures of drugs, achieving excellent DDI prediction performance. While existing substructural frameworks effectively encode interactions from atom-level features, they overlook valuable chemical bond representations within molecular graphs. More critically, although DDI prediction involves both known and novel drug combinations, previous methods lack tailored strategies to address these distinct scenarios. The resulting lack of adaptability impedes further improvements to model performance. To tackle these challenges, we propose PEB-DDI, a DDI prediction learning framework with enhanced substructure extraction. First, chemical bond information is integrated and updated synchronously with the atomic nodes. Then, different dual-view strategies are selected based on whether novel drugs are present in the prediction task. Specifically, we construct a molecular fingerprint-molecular graph view for the transductive task and a bipartite graph-molecular graph view for the inductive task. Rigorous evaluations on benchmark datasets underscore PEB-DDI's superior performance. Notably, on DrugBank, it achieves an accuracy of 98.18% when predicting previously unknown interactions among approved drugs. Even when faced with novel drugs, PEB-DDI generalizes well, with an accuracy of 88.06%, which is attributed to the effective transfer of basic molecular structure learning.


Subjects
Neural Networks, Computer , Humans , Drug Interactions
11.
IEEE J Biomed Health Inform ; 28(3): 1564-1574, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38153823

ABSTRACT

The prediction of molecular properties remains a challenging task in the field of drug design and development. Recently, there has been a growing interest in the analysis of biological images. Molecular images, as a novel representation, have proven to be competitive, yet they lack explicit information and detailed semantic richness. Conversely, semantic information in SMILES sequences is explicit but lacks spatial structural details. Therefore, in this study, we explore the relationship between these two types of representations and propose a novel multimodal architecture named ISMol. ISMol relies on a cross-attention mechanism to extract molecular representations from both images and SMILES strings and uses them to predict molecular properties. Evaluation results on 14 small molecule ADMET datasets indicate that ISMol outperforms machine learning (ML) and deep learning (DL) models based on single-modal representations. In addition, we analyze our method through a large number of experiments to test its superiority, interpretability and generalizability. In summary, ISMol offers a powerful deep learning toolbox for predicting a variety of molecular properties in drug discovery.


Subjects
Drug Design , Drug Discovery , Humans , Machine Learning , Semantics
12.
J Chem Inf Model ; 64(1): 238-249, 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38103039

ABSTRACT

Drug repositioning plays a key role in disease treatment. As large-scale chemical data grow, many computational methods are being used for drug-disease association prediction. However, most existing models neglect the positive influence of non-Euclidean data and multisource information, and how to set the feature diffusion distance remains a critical issue for graph neural networks. To solve these problems, we propose SiSGC, which makes full use of biological knowledge as initial features and learns structural information from the constructed heterogeneous graph with adaptive selection of the information diffusion distance. Then, the structural features are fused with the denoised similarity information and fed to a CatBoost classifier to make predictions. Three different data sets are used to confirm the robustness and generalization of SiSGC under two splitting strategies. Experimental results demonstrate that the proposed model achieves superior performance compared with six leading methods and four variants. Our case study on breast neoplasms further indicates that SiSGC is trustworthy and robust yet simple. We also present four drugs for breast cancer treatment with high confidence and explain the rationale for them. SiSGC can serve as a beneficial supplement for drug repositioning.


Subjects
Drug Repositioning , Neural Networks, Computer
14.
Brief Bioinform ; 25(1), 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38145949

ABSTRACT

Prediction of drug-target interactions (DTIs) is essential in medicine, since it benefits the identification of molecular structures potentially interacting with drugs and facilitates the discovery and repositioning of drugs. Recently, network representation learning has attracted much attention as a way to learn rich information from heterogeneous data. Although network representation learning algorithms have achieved success in predicting DTIs, their reliance on a few manually designed meta-graphs limits their capability to extract complex semantic information. To address this problem, we introduce an adaptive meta-graph-based method, termed AMGDTI, for DTI prediction. In the proposed AMGDTI, the semantic information is automatically aggregated from a heterogeneous network by training an adaptive meta-graph, thereby achieving efficient information integration without requiring domain knowledge. The effectiveness of the proposed AMGDTI is verified on two benchmark datasets. Experimental results demonstrate that AMGDTI overall outperforms eight state-of-the-art methods in predicting DTIs and accurately identifies novel DTIs. It is also verified that the adaptive meta-graph exhibits flexibility and effectively captures complex fine-grained semantic information, enabling the learning of intricate heterogeneous network topology and the inference of potential drug-target relationships.


Subjects
Algorithms , Medicine , Benchmarking , Drug Delivery Systems , Semantics
15.
PLoS Comput Biol ; 19(11): e1011597, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37956212

ABSTRACT

The powerful combination of large-scale drug-related interaction networks and deep learning provides new opportunities for accelerating the process of drug discovery. However, chemical structures that play an important role in drug properties and high-order relations that involve a greater number of nodes are not addressed in current biomedical networks. In this study, we present a general hypergraph learning framework, which introduces Drug-Substructure relationships into Molecular interaction Networks to construct a micro-to-macro drug-centric heterogeneous network (DSMN), and develop a multi-branch HyperGraph learning model, called HGDrug, for Drug multi-task predictions. HGDrug achieves highly accurate and robust predictions on 4 benchmark tasks (drug-drug, drug-target, drug-disease, and drug-side-effect interactions), outperforming 8 state-of-the-art task-specific models and 6 general-purpose conventional models. Experimental analysis verifies the effectiveness and rationality of the HGDrug model architecture as well as the multi-branch setup, and demonstrates that HGDrug is able to capture the relations between drugs associated with the same functional groups. In addition, our proposed drug-substructure interaction networks can help improve the performance of existing network models for drug-related prediction tasks.


Subjects
Algorithms , Drug-Related Side Effects and Adverse Reactions , Humans , Benchmarking , Drug Delivery Systems , Drug Discovery
16.
J Transl Med ; 21(1): 823, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-37978379

ABSTRACT

BACKGROUND: Doxorubicin (DOX)-induced cardiotoxicity (DIC) is a major impediment to the clinical application of DOX. It is indispensable to explore alternative treatment molecules or drugs for mitigating DIC. WGX50, an organic extract derived from Zanthoxylum bungeanum Maxim, has anti-inflammatory and antioxidant biological activity; however, its function and mechanism in DIC remain unclear. METHODS: We established DOX-induced cardiotoxicity models both in vitro and in vivo. Echocardiography and histological analyses were used to determine the severity of cardiac injury in mice. The myocardial damage markers cTnT, CK-MB, ANP, and BNP and the ferroptosis-associated indicators Fe2+, MDA, and GPX4 were measured using ELISA, RT-qPCR, and western blot assays. The morphology of mitochondria was investigated with a transmission electron microscope. The levels of mitochondrial membrane potential, mitochondrial ROS, and lipid ROS were detected using JC-1, MitoSOX™, and C11-BODIPY 581/591 probes. RESULTS: Our findings demonstrate that WGX50 protects against DOX-induced cardiotoxicity by restraining mitochondrial ROS and ferroptosis. In vivo, WGX50 effectively relieves doxorubicin-induced cardiac dysfunction, cardiac injury, fibrosis, mitochondrial damage, and redox imbalance. In vitro, WGX50 preserves mitochondrial function by reducing the level of mitochondrial membrane potential and increasing mitochondrial ATP production. Furthermore, WGX50 reduces iron accumulation and mitochondrial ROS, increases GPX4 expression, and regulates lipid metabolism to inhibit DOX-induced ferroptosis. CONCLUSION: Taken together, WGX50 protects against DOX-induced cardiotoxicity via mitochondrial ROS and the ferroptosis pathway, which provides novel insights for WGX50 as a promising drug candidate for cardioprotection.


Subjects
Cardiotoxicity , Ferroptosis , Mice , Animals , Cardiotoxicity/drug therapy , Cardiotoxicity/metabolism , Cardiotoxicity/pathology , Reactive Oxygen Species/metabolism , Myocytes, Cardiac/pathology , Doxorubicin/adverse effects , Mitochondria/metabolism , Oxidative Stress , Antioxidants/metabolism , Apoptosis
17.
Brief Bioinform ; 24(6), 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37974508

ABSTRACT

Current methods of molecular image-based drug discovery face two major challenges: (1) working effectively in the absence of labels, and (2) capturing chemical structure from implicitly encoded images. Given that chemical structures (such as nitrogen, benzene rings and double bonds) are explicitly encoded by molecular graphs, we leverage self-supervised contrastive learning to transfer chemical knowledge from graphs to images. Specifically, we propose a novel Contrastive Graph-Image Pre-training (CGIP) framework for molecular representation learning, which learns explicit information in graphs and implicit information in images from large-scale unlabeled molecules via carefully designed intra- and inter-modal contrastive learning. We evaluate the performance of CGIP in multiple experimental settings (molecular property prediction, cross-modal retrieval and distribution similarity). The results show that CGIP achieves state-of-the-art performance on all 12 benchmark datasets and that it transfers chemical knowledge in graphs to molecular images, enabling the image encoder to perceive chemical structures in images. We hope this simple and effective framework will inspire people to think about the value of images for molecular representation learning.


Subjects
Benchmarking , Learning , Humans , Drug Discovery
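
The inter-modal contrastive idea in the abstract above (pull together the graph and image embeddings of the same molecule, push apart those of different molecules) can be sketched with a generic InfoNCE loss. This is an illustrative assumption, not the loss published for CGIP: the function names, the temperature value, and the toy two-dimensional embeddings are all hypothetical, and real encoders would produce the vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(graph_emb, image_emb, temperature=0.1):
    """Mean InfoNCE loss in the graph-to-image direction.

    Row i of `graph_emb` and row i of `image_emb` are assumed to describe
    the same molecule (the positive pair); all other rows act as negatives.
    """
    losses = []
    for i, g in enumerate(graph_emb):
        logits = [cosine(g, img) / temperature for img in image_emb]
        # Numerically stable log-sum-exp over all candidate images.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings for two molecules (hypothetical values).
graphs = [[1.0, 0.0], [0.0, 1.0]]
images_aligned = [[0.9, 0.1], [0.1, 0.9]]
images_swapped = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(graphs, images_aligned)
loss_swapped = info_nce(graphs, images_swapped)
print(loss_aligned, loss_swapped)
```

The loss is small when each graph embedding is closest to its own image embedding and large when the pairing is swapped, which is the signal a pre-training procedure of this kind would minimize.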
18.
Nat Commun ; 14(1): 6155, 2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37788995

ABSTRACT

Automating retrosynthesis with artificial intelligence expedites organic chemistry research in digital laboratories. However, most existing deep-learning approaches are hard to explain, like a "black box" with few insights. Here, we propose RetroExplainer, which formulates the retrosynthesis task as a molecular assembly process containing several retrosynthetic actions guided by deep learning. To guarantee robust performance of our model, we propose three units: a multi-sense and multi-scale Graph Transformer, structure-aware contrastive learning, and dynamic adaptive multi-task learning. The results on 12 large-scale benchmark datasets demonstrate the effectiveness of RetroExplainer, which outperforms the state-of-the-art single-step retrosynthesis approaches. In addition, the molecular assembly process gives our model good interpretability, allowing for transparent decision-making and quantitative attribution. When extended to multi-step retrosynthesis planning, RetroExplainer identified 101 pathways, in which 86.9% of the single reactions correspond to those already reported in the literature. As a result, RetroExplainer is expected to offer valuable insights for reliable, high-throughput, and high-quality organic synthesis in drug development.

19.
Brief Bioinform ; 24(6), 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37864294

ABSTRACT

Drug-gene interaction prediction occupies a crucial position in various areas of drug discovery, such as drug repurposing, lead discovery and off-target detection. Previous studies show good performance, but they are limited to exploring binding interactions and ignore other interaction relationships. Graph neural networks have emerged as promising approaches owing to their powerful capability of modeling correlations in drug-gene bipartite graphs. Despite the widespread adoption of graph neural network-based methods, many of them experience performance degradation in situations where high-quality and sufficient training data are unavailable. Unfortunately, in practical drug discovery scenarios, interaction data are often sparse and noisy, which may lead to unsatisfactory results. To address the above challenges, we propose a novel Dynamic hyperGraph Contrastive Learning (DGCL) framework that exploits local and global relationships between drugs and genes. Specifically, graph convolutions are adopted to extract explicit local relations among drugs and genes. Meanwhile, the cooperation of dynamic hypergraph structure learning and hypergraph message passing enables the model to aggregate information in a global region. With flexible global-level messages, a self-augmented contrastive learning component is designed to constrain hypergraph structure learning and enhance the discrimination of drug/gene representations. Experiments conducted on three datasets show that DGCL is superior to eight state-of-the-art methods and notably gains a 7.6% performance improvement on the DGIdb dataset. Further analyses verify the robustness of DGCL in alleviating data sparsity and over-smoothing issues.


Subjects
Drug Discovery , Learning , Drug Interactions , Drug Repositioning , Neural Networks, Computer
20.
Brief Bioinform ; 24(5), 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37668049

ABSTRACT

A Sequence Alignment/Map (SAM) file is a text file used to record alignment information. Alignment is the core of sequencing analysis, and downstream tasks take mapping results as input for further processing. Given the rapid development of the sequencing industry today, a comprehensive understanding of the SAM format and related tools is necessary to meet the challenges of data processing and analysis. This paper is devoted to retrieving knowledge in the broad field of SAM. First, the SAM format is introduced to give an understanding of the overall process of sequencing analysis. Then, existing work is systematically classified by generation, compression and application, and the SAM tools involved are surveyed in detail. Lastly, a summary and some thoughts on future directions are provided.


Subjects
Sequence Alignment
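
To make the "text file used to record alignment information" concrete, the following minimal sketch parses one SAM alignment line using only the Python standard library. It relies on the eleven mandatory tab-separated fields defined by the SAM specification (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL) followed by optional TAG:TYPE:VALUE fields. The helper names and the example record are illustrative assumptions, not something taken from the reviewed paper; production code would normally use an established library instead.

```python
import re

SAM_FIELDS = ("QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL")
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_sam_line(line):
    """Split one alignment line into its mandatory fields and optional tags."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(SAM_FIELDS):
        raise ValueError("a SAM alignment line needs 11 mandatory fields")
    record = {name: (int(value) if name in INT_FIELDS else value)
              for name, value in zip(SAM_FIELDS, cols)}
    # Optional fields have the form TAG:TYPE:VALUE; keep the value as text.
    record["TAGS"] = {}
    for field in cols[len(SAM_FIELDS):]:
        tag, _type, value = field.split(":", 2)
        record["TAGS"][tag] = value
    return record


def reference_span(cigar):
    """Number of reference bases covered: M, D, N, = and X consume the reference."""
    if cigar == "*":
        return 0
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


# Illustrative record (values are an example, not real data).
line = "r001\t99\tref\t7\t30\t8M2I4M1D3M\t=\t37\t39\tTTAGATAAAGGATACTG\t*\tNM:i:1"
rec = parse_sam_line(line)
is_unmapped = bool(rec["FLAG"] & 0x4)   # flag bit 0x4: segment unmapped
is_reverse = bool(rec["FLAG"] & 0x10)   # flag bit 0x10: reverse-complemented
span = reference_span(rec["CIGAR"])
print(rec["QNAME"], rec["POS"], span, is_unmapped, is_reverse)
```

Header lines, which start with `@`, are not handled here and would need to be skipped before calling `parse_sam_line`.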