Results 1 - 17 of 17
1.
Article in English | MEDLINE | ID: mdl-38739515

ABSTRACT

Inductive bias in machine learning (ML) is the set of assumptions describing how a model makes predictions. Different ML-based methods for protein-ligand binding affinity (PLA) prediction have different inductive biases, leading to different levels of generalization capability and interpretability. Intuitively, the inductive bias of an ML-based model for PLA prediction should fit in with biological mechanisms relevant for binding to achieve good predictions with meaningful reasons. To this end, we propose an interaction-based inductive bias to restrict neural networks to functions relevant for binding with two assumptions: (1) A protein-ligand complex can be naturally expressed as a heterogeneous graph with covalent and non-covalent interactions; (2) The predicted PLA is the sum of pairwise atom-atom affinities determined by non-covalent interactions. The interaction-based inductive bias is embodied by an explainable heterogeneous interaction graph neural network (EHIGN) for explicitly modeling pairwise atom-atom interactions to predict PLA from 3D structures. Extensive experiments demonstrate that EHIGN achieves better generalization capability than other state-of-the-art ML-based baselines in PLA prediction and structure-based virtual screening. More importantly, comprehensive analyses of distance-affinity, pose-affinity, and substructure-affinity relations suggest that the interaction-based inductive bias can guide the model to learn atomic interactions that are consistent with physical reality. As a case study to demonstrate practical usefulness, our method is tested for predicting the efficacy of Nirmatrelvir against SARS-CoV-2 variants. EHIGN successfully recognizes the changes in the efficacy of Nirmatrelvir for different SARS-CoV-2 variants with meaningful reasons.
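
The second assumption in the abstract above (the predicted PLA is a sum of pairwise atom-atom affinities) can be illustrated with a toy sketch. This is not EHIGN itself: the `pair_affinity` scoring rule, the 5.0 cutoff, and the coordinates are hypothetical stand-ins for what a trained network would learn. The point is only that an additive prediction can be traced back to individual atom pairs.

```python
import math

def pair_affinity(distance, cutoff=5.0):
    """Hypothetical per-pair score: closer contacts score higher, zero at or beyond the cutoff."""
    if distance >= cutoff:
        return 0.0
    return 1.0 - distance / cutoff

def predicted_affinity(ligand_atoms, protein_atoms, cutoff=5.0):
    """Sum pairwise scores over ligand-protein atom pairs and keep the per-pair terms."""
    contributions = {}
    for i, lig in enumerate(ligand_atoms):
        for j, prot in enumerate(protein_atoms):
            score = pair_affinity(math.dist(lig, prot), cutoff)
            if score > 0.0:
                contributions[(i, j)] = score
    return sum(contributions.values()), contributions

# Hypothetical coordinates: one ligand atom, two protein atoms.
total, terms = predicted_affinity([(0.0, 0.0, 0.0)],
                                  [(0.0, 0.0, 2.5), (0.0, 0.0, 10.0)])
```

Because the total is a plain sum, each entry of `terms` shows how much one atom pair contributes, which is the kind of explanation an additive inductive bias makes possible.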

2.
J Chem Theory Comput ; 19(22): 8446-8459, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-37938978

ABSTRACT

Flexible modeling of the protein-ligand complex structure is a fundamental challenge for in silico drug development. Recent studies have improved commonly used docking tools by incorporating extra deep learning-based steps. However, such strategies are limited in accuracy and efficiency because they retain massive sampling pressure and do not account for flexible biomolecular changes. In this study, we propose FlexPose, a geometric graph network capable of direct flexible modeling of complex structures in Euclidean space without following conventional sampling and scoring strategies. Our model adopts two key designs, a scalar-vector dual feature representation and an SE(3)-equivariant network, to manage dynamic structural changes, as well as two strategies, conformation-aware pretraining and weakly supervised learning, to boost model generalizability in unseen chemical space. Benefiting from these paradigms, our model dramatically outperforms all tested popular docking tools and recently advanced deep learning methods, especially in tasks involving protein conformation changes. We further investigate the impact of protein and ligand similarity on the model performance with two conformation-aware strategies. Moreover, FlexPose provides an affinity estimation and model confidence for postanalysis.


Subject(s)
Deep Learning; Ligands; Molecular Docking Simulation; Proteins/chemistry; Protein Conformation; Protein Binding
3.
Chem Sci ; 14(39): 10684-10701, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37829020

ABSTRACT

Traditional Chinese Medicine (TCM) has long been viewed as a precious source of modern drug discovery. AI-assisted drug discovery (AIDD) has been investigated extensively. However, there are still two challenges in applying AIDD to guide TCM drug discovery: the lack of a large amount of standardized TCM-related information, and the tendency of AIDD toward pathological failures on out-of-domain data. We released TCM Database@Taiwan in 2011, and it has been widely disseminated and used. We have now developed TCMBank, the largest systematic free TCM database, which is an extension of TCM Database@Taiwan. TCMBank contains 9192 herbs, 61 966 ingredients (unduplicated), 15 179 targets, 32 529 diseases, and their pairwise relationships. By integrating multiple data sources, TCMBank provides 3D structure information of ingredients and provides a standard list and detailed information on herbs, ingredients, targets and diseases. TCMBank has an intelligent document identification module that continuously adds TCM-related information retrieved from the literature in PubChem. In addition, driven by TCMBank big data, we developed an ensemble learning-based drug discovery protocol for identifying potential leads and drug repurposing. We take colorectal cancer and Alzheimer's disease as examples to demonstrate how to accelerate drug discovery by artificial intelligence. Using TCMBank, researchers can view literature-driven relationship mapping between herbs/ingredients and genes/diseases, allowing the understanding of molecular action mechanisms for ingredients and identification of new potentially effective treatments. TCMBank is available at https://TCMBank.CN/.

4.
Comput Biol Med ; 164: 107268, 2023 09.
Article in English | MEDLINE | ID: mdl-37494821

ABSTRACT

The transformer is primarily used in the field of natural language processing. Recently, it has been adopted and shows promise in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also greatly benefits from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structures of the transformer. After that, we depict the recent progress of the transformer in the field of MIA. We organize the applications in a sequence of different tasks, including classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. A large number of experiments studied in this review illustrate that the transformer-based method outperforms existing methods through comparisons with multiple evaluation metrics. Finally, we discuss the open challenges and future opportunities in this field. This task-modality review with the latest contents, detailed information, and comprehensive comparison may greatly benefit the broad MIA community.


Subject(s)
Benchmarking; Natural Language Processing; Image Processing, Computer-Assisted
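
The attention mechanism that the review above recaps is commonly written as scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. The following is a minimal pure-Python sketch of that standard formula, not code from the review; the function names and the toy inputs are illustrative assumptions.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of query, key and value vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights

# Toy example: two identical keys receive equal weight, so the values are averaged.
out, att = scaled_dot_product_attention([[1.0, 0.0]],
                                        [[1.0, 0.0], [1.0, 0.0]],
                                        [[2.0, 0.0], [4.0, 0.0]])
```

In image models the query, key and value vectors are typically derived from patch embeddings, but that projection step is omitted here.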
5.
Neural Netw ; 165: 94-105, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37276813

ABSTRACT

Understanding drug-drug interactions (DDI) of new drugs is critical for minimizing unexpected adverse drug reactions. The modeling of new drugs is called a cold start scenario. In this scenario, only limited structural or physicochemical information about the new drug is available. The 3D conformation of drug molecules usually plays a more crucial role in chemical properties than the 2D structure. A 3D graph network with few-shot learning is a promising solution. However, the 3D heterogeneity of drug molecules and the discretization of atomic distributions lead to spatial confusion in few-shot learning. Here, we propose a 3D graph neural network with few-shot learning, Meta3D-DDI, to predict DDI events in the cold start scenario. The 3DGNN ensures rotation and translation invariance by calculating atomic pairwise distances, and incorporates 3D structure and distance information in the information aggregation stage. The continuous filter interaction module can continuously simulate the filter to obtain the interaction between the target atom and other atoms. Meta3D-DDI further develops a few-shot learning (FSL) strategy based on bilevel optimization to transfer meta-knowledge for DDI prediction tasks from existing drugs to new drugs. In addition, the existing cold start setting may cause the scaffold structure information in the training set to leak into the test set. We design a scaffold-based cold start scenario to ensure that the drug scaffolds in the training set and test set do not overlap. Extensive experiments demonstrate that our architecture achieves state-of-the-art (SOTA) performance for DDI prediction under the scaffold-based cold start scenario on two real-world datasets. The visual experiment shows that Meta3D-DDI significantly improves the learning for DDI prediction of new drugs. We also demonstrate how Meta3D-DDI can reduce the amount of data required to make meaningful DDI predictions.


Subject(s)
Knowledge; Learning; Drug Interactions; Neural Networks, Computer; Rotation
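
The abstract above states that rotation and translation invariance comes from working with atomic pairwise distances. A small numerical check makes that concrete. This sketch is not the Meta3D-DDI code; the coordinates, the rotation angle and the shift are arbitrary assumptions, and only rotation about the z axis is exercised.

```python
import math

def pairwise_distances(coords):
    """Matrix of Euclidean distances between all pairs of 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def rotate_z_and_translate(coords, angle, shift):
    """Rigid motion: rotate about the z axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]

# Hypothetical atom positions for a four-atom molecule.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 1.0)]
moved = rotate_z_and_translate(atoms, 0.7, (3.0, -2.0, 5.0))

before = pairwise_distances(atoms)
after = pairwise_distances(moved)
max_change = max(abs(a - b) for r0, r1 in zip(before, after) for a, b in zip(r0, r1))
```

Here `max_change` stays at floating-point noise level even though every coordinate changed, so a network that only consumes distances cannot be affected by the pose of the input molecule.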
6.
Nat Commun ; 14(1): 3009, 2023 May 25.
Article in English | MEDLINE | ID: mdl-37230985

ABSTRACT

Retrosynthesis planning, the process of identifying a set of available reactions to synthesize the target molecules, remains a major challenge in organic synthesis. Recently, computer-aided synthesis planning has gained renewed interest and various retrosynthesis prediction algorithms based on deep learning have been proposed. However, most existing methods are limited in the applicability and interpretability of model predictions, and further improvement of predictive accuracy to a more practical level is still required. In this work, inspired by the arrow-pushing formalism in chemical reaction mechanisms, we present an end-to-end architecture for retrosynthesis prediction called Graph2Edits. Specifically, Graph2Edits is based on a graph neural network that predicts the edits of the product graph in an auto-regressive manner, and sequentially generates transformation intermediates and final reactants according to the predicted edit sequence. This strategy combines the two-stage processes of semi-template-based methods into one-pot learning, improving the applicability in some complicated reactions, and also making its predictions more interpretable. Evaluated on the standard benchmark dataset USPTO-50k, our model achieves state-of-the-art performance for semi-template-based retrosynthesis with a promising 55.1% top-1 accuracy.
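
The top-1 accuracy quoted in the abstract above is a top-k metric: a prediction counts as correct when the reference reactants appear among the first k ranked candidates. Below is a minimal sketch of that computation. It assumes candidates and references are already comparable strings (real evaluations usually canonicalize SMILES first), and the example values are hypothetical placeholders, not USPTO-50k data.

```python
def top_k_accuracy(ranked_predictions, references, k=1):
    """Fraction of examples whose reference appears in the first k ranked candidates."""
    if not references:
        raise ValueError("references must not be empty")
    hits = sum(1 for candidates, ref in zip(ranked_predictions, references)
               if ref in candidates[:k])
    return hits / len(references)

# Hypothetical ranked candidate lists for four target molecules.
predictions = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
truth = ["a", "d", "x", "g"]

top1 = top_k_accuracy(predictions, truth, k=1)
top2 = top_k_accuracy(predictions, truth, k=2)
```

Top-k accuracy can never decrease as k grows, which is why retrosynthesis papers often report several values of k side by side.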

7.
Article in English | MEDLINE | ID: mdl-37028032

ABSTRACT

Finding candidate molecules with favorable pharmacological activity, low toxicity, and proper pharmacokinetic properties is an important task in drug discovery. Deep neural networks have made impressive progress in accelerating and improving drug discovery. However, these techniques rely on a large amount of labeled data to form accurate predictions of molecular properties. At each stage of the drug discovery pipeline, usually only limited biological data on candidate molecules and derivatives are available, indicating that the application of deep neural networks for low-data drug discovery is still a formidable challenge. Here, we propose a meta learning architecture with a graph attention network, Meta-GAT, to predict molecular properties in low-data drug discovery. The GAT captures the local effects of atomic groups at the atom level through the triple attentional mechanism and implicitly captures the interactions between different atomic groups at the molecular level. GAT is used to perceive the molecular chemical environment and connectivity, thereby effectively reducing sample complexity. Meta-GAT further develops a meta learning strategy based on bilevel optimization, which transfers meta knowledge from other attribute prediction tasks to low-data target tasks. In summary, our work demonstrates how meta learning can reduce the amount of data required to make meaningful predictions of molecules in low-data scenarios. Meta learning is likely to become the new learning paradigm in low-data drug discovery. The source code is publicly available at: https://github.com/lol88/Meta-GAT.

9.
J Phys Chem Lett ; 14(8): 2020-2033, 2023 Mar 02.
Article in English | MEDLINE | ID: mdl-36794930

ABSTRACT

Predicting protein-ligand binding affinities (PLAs) is a core problem in drug discovery. Recent advances have shown great potential in applying machine learning (ML) for PLA prediction. However, most of them omit the 3D structures of complexes and physical interactions between proteins and ligands, which are considered essential to understanding the binding mechanism. This paper proposes a geometric interaction graph neural network (GIGN) that incorporates 3D structures and physical interactions for predicting protein-ligand binding affinities. Specifically, we design a heterogeneous interaction layer that unifies covalent and noncovalent interactions into the message passing phase to learn node representations more effectively. The heterogeneous interaction layer also follows fundamental biological laws, including invariance to translations and rotations of the complexes, thus avoiding expensive data augmentation strategies. GIGN achieves state-of-the-art performance on three external test sets. Moreover, by visualizing learned representations of protein-ligand complexes, we show that the predictions of GIGN are biologically meaningful.


Subject(s)
Neural Networks, Computer; Proteins; Ligands; Protein Binding; Proteins/chemistry; Machine Learning
10.
Chem Sci ; 13(29): 8693-8703, 2022 Jul 29.
Article in English | MEDLINE | ID: mdl-35974769

ABSTRACT

Drug-drug interactions (DDIs) can trigger unexpected pharmacological effects on the body, and the causal mechanisms are often unknown. Graph neural networks (GNNs) have been developed to better understand DDIs. However, identifying key substructures that contribute most to the DDI prediction is a challenge for GNNs. In this study, we presented a substructure-aware graph neural network, a message passing neural network equipped with a novel substructure attention mechanism and a substructure-substructure interaction module (SSIM) for DDI prediction (SA-DDI). Specifically, the substructure attention was designed to capture size- and shape-adaptive substructures based on the chemical intuition that the sizes and shapes are often irregular for functional groups in molecules. DDIs are fundamentally caused by chemical substructure interactions. Thus, the SSIM was used to model the substructure-substructure interactions by highlighting important substructures while de-emphasizing the minor ones for DDI prediction. We evaluated our approach in two real-world datasets and compared the proposed method with the state-of-the-art DDI prediction models. The SA-DDI surpassed other approaches on the two datasets. Moreover, the visual interpretation results showed that the SA-DDI was sensitive to the structure information of drugs and was able to detect the key substructures for DDIs. These advantages demonstrated that the proposed method improved the generalization and interpretation capability of DDI prediction modeling.

11.
Chem Sci ; 13(3): 816-833, 2022 Jan 19.
Article in English | MEDLINE | ID: mdl-35173947

ABSTRACT

Predicting drug-target affinity (DTA) is beneficial for accelerating drug discovery. Graph neural networks (GNNs) have been widely used in DTA prediction. However, existing shallow GNNs are insufficient to capture the global structure of compounds. Besides, the interpretability of graph-based DTA models relies heavily on the graph attention mechanism, which cannot reveal the global relationship between each atom of a molecule. In this study, we proposed a deep multiscale graph neural network based on chemical intuition for DTA prediction (MGraphDTA). We introduced a dense connection into the GNN and built a super-deep GNN with 27 graph convolutional layers to capture the local and global structure of the compound simultaneously. We also developed a novel visual explanation method, gradient-weighted affinity activation mapping (Grad-AAM), to analyze a deep learning model from the chemical perspective. We evaluated our approach using seven benchmark datasets and compared the proposed method to the state-of-the-art deep learning (DL) models. MGraphDTA outperforms other DL-based approaches significantly on various datasets. Moreover, we show that Grad-AAM creates explanations that are consistent with those of pharmacologists, which may help us gain chemical insights directly from data beyond human perception. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of DTA prediction modeling.

12.
Phys Chem Chem Phys ; 24(9): 5383-5393, 2022 Mar 02.
Article in English | MEDLINE | ID: mdl-35169821

ABSTRACT

Predicting quantum mechanical properties (QMPs) is very important for the innovation of material and chemistry science. Multitask deep learning models have been widely used in QMPs prediction. However, existing multitask learning models often train multiple QMPs prediction tasks simultaneously without considering the internal relationships and differences between tasks, which may cause the model to overfit easy tasks. In this study, we first proposed a multiscale dynamic attention graph neural network (MDGNN) for molecular representation learning. The MDGNN was designed in a multitask learning fashion that can solve multiple learning tasks at the same time. We then introduced a dynamic task balancing (DTB) strategy combining task differences and difficulties to reduce overfitting across multiple tasks. Finally, we adopted gradient-weighted class activation mapping (Grad-CAM) to analyze a deep learning model for frontier molecular orbital, highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) energy level predictions. We evaluated our approach using two large QMPs datasets and compared the proposed method to the state-of-the-art multitask learning models. The MDGNN outperforms other multitask learning approaches on two datasets. The DTB strategy can further improve the performance of MDGNN significantly. Moreover, we show that Grad-CAM creates explanations that are consistent with the molecular orbitals theory. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of QMPs prediction modeling.


Subject(s)
Deep Learning; Machine Learning; Neural Networks, Computer
13.
Med Phys ; 48(11): 7127-7140, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34528263

ABSTRACT

PURPOSE: Coronavirus disease 2019 (COVID-19) has caused a serious global health crisis. Deep learning methods have shown great potential to assist doctors in diagnosing COVID-19 by automatically segmenting the lesions in computed tomography (CT) slices. However, several challenges still restrict the application of these methods, including high variation in lesion characteristics and low contrast between lesion areas and healthy tissues. Moreover, the lack of high-quality labeled samples and the large number of patients make it urgent to develop a high-accuracy model that performs well not only under full supervision but also with semi-supervised methods. METHODS: We propose a content-aware lung infection segmentation deep residual network (content-aware residual UNet (CARes-UNet)) to segment the lesion areas of COVID-19 from chest CT slices. In our CARes-UNet, the residual connection was used in the convolutional block, which alleviated the degradation problem during training. Then, content-aware upsampling modules were introduced to improve the performance of the model while reducing the computation cost. Moreover, to achieve faster convergence, an advanced optimizer named Ranger was utilized to update the model's parameters during training. Finally, we employed a semi-supervised segmentation framework to deal with the lack of pixel-level labeled data. RESULTS: We evaluated our approach using three public datasets with multiple metrics and compared its performance to several models. Our method outperforms other models on multiple indicators; for instance, on the COVID-SemiSeg dataset, CARes-UNet achieved a Dice coefficient of 0.731, and semi-CARes-UNet further raised it to 0.776. Additional ablation studies validated the effectiveness of each key component of our proposed model.
CONCLUSIONS: Compared with the existing neural network methods applied to COVID-19 lesion segmentation tasks, our CARes-UNet obtains more accurate segmentation results, and semi-CARes-UNet further improves them using semi-supervised learning while presenting a possible way to address the lack of high-quality annotated samples. Our CARes-UNet and semi-CARes-UNet can be used in artificial intelligence-empowered computer-aided diagnosis systems to improve diagnostic accuracy in the ongoing COVID-19 pandemic.


Subject(s)
COVID-19; Pandemics; Artificial Intelligence; Humans; Image Processing, Computer-Assisted; SARS-CoV-2; Tomography, X-Ray Computed
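
The Dice coefficient reported in the CARes-UNet abstract above measures overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|). The sketch below computes it for flattened binary masks. The mask values are made up for illustration, and returning 1.0 when both masks are empty is a convention assumed here, not something the abstract specifies.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice overlap between two equally sized flat binary masks (sequences of 0/1)."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    size_sum = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / size_sum

# Hypothetical 2x2 masks flattened row by row.
predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
score = dice_coefficient(predicted, reference)  # one shared pixel, two pixels per mask
```

A score of 1.0 means the masks coincide and 0.0 means they share no foreground pixel, so the reported rise from 0.731 to 0.776 corresponds to a larger overlap with the annotated lesions.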
14.
J Mol Graph Model ; 107: 107965, 2021 09.
Article in English | MEDLINE | ID: mdl-34167067

ABSTRACT

Since Limk1 is a promising drug target and few inhibitors with good Limk1/ROCK2 selectivity have been reported, discovering potential and selective Limk1 inhibitors with novel scaffolds is becoming an urgent need for developing new treatments for the related diseases. Here, we utilized molecular docking to screen potential Limk1 compounds from a Traditional Chinese Medicine (TCM) database. Meanwhile, we applied a three-dimensional graph convolutional network (3DGCN), based on the 3D molecular graph, to predict the inhibitory activity against Limk1 and ROCK2. Compared with the baseline models (RF, GCN and Weave), the 3DGCN achieved higher accuracy, and the averaged RMSE values on the test sets for Limk1 and ROCK2 were 0.721 and 0.852, respectively. With 3DGCN, above 80% of the test-set molecules from both datasets were predicted within an absolute error of 1.0, and the feature visualization suggested that it could automatically learn relevant structural features, including 3D molecular information, from a specific task for prediction. Furthermore, molecular dynamics (MD) simulations within 100 ns were employed to verify the stability of ligand-protein complexes and reveal the binding modes of the potential selective lead compounds of Limk1. Finally, integrating the docking results, the values predicted by the 3DGCN and the MD analysis, we found that 7549 and 2007_15649 might be potential and selective inhibitors for the Limk1 receptor.


Subject(s)
Molecular Dynamics Simulation; Ligands; Molecular Docking Simulation
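
The 3DGCN abstract above reports two regression metrics: RMSE on the test sets and the share of molecules predicted within an absolute error of 1.0. Both can be computed as in the sketch below. The activity values are invented for illustration and are not taken from the Limk1 or ROCK2 data; treating an error of exactly 1.0 as "within" is an assumption.

```python
import math

def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences of numbers."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

def fraction_within(predicted, observed, tolerance=1.0):
    """Share of predictions whose absolute error is at most `tolerance`."""
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= tolerance)
    return hits / len(observed)

# Hypothetical predicted and measured activity values.
pred = [5.0, 6.0, 7.0, 8.0]
obs = [5.0, 6.0, 7.0, 10.0]

error = rmse(pred, obs)
share = fraction_within(pred, obs)
```

The two numbers answer different questions: RMSE is dominated by the largest misses, while the within-tolerance share says how many individual molecules are predicted usefully.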
15.
J Phys Chem Lett ; 12(17): 4247-4261, 2021 May 06.
Article in English | MEDLINE | ID: mdl-33904745

ABSTRACT

Deep learning (DL) provides opportunities for the identification of drug-target interactions (DTIs). The challenges of applying DL lie primarily with the lack of interpretability. Also, most of the existing DL-based methods formulate the drug and target encoder as two independent modules without considering the relationship between them. In this study, we propose a mutual learning mechanism to bridge the gap between the two encoders. We formulated the DTI problem from a global perspective by inserting mutual learning layers between the two encoders. The mutual learning layer was achieved by multihead attention and position-aware attention. The neural attention mechanism also provides effective visualization, which makes it easier to analyze a model. We evaluated our approach using three benchmark kinase data sets under different experimental settings and compared the proposed method to three baseline models. We found that the four methods yielded similar results in the random split setting (training and test sets share common drugs and targets), while the proposed method increases the predictive performance significantly in the orphan-target and orphan-drug split setting (training and test sets share only targets or drugs). The experimental results demonstrated that the proposed method improved the generalization and interpretation capability of DTI modeling.


Subject(s)
Deep Learning; Organic Chemicals/metabolism; Pharmaceutical Preparations/metabolism; Proteins/metabolism; Organic Chemicals/chemistry; Pharmaceutical Preparations/chemistry; Protein Binding; Proteins/chemistry
16.
IEEE J Biomed Health Inform ; 25(6): 1864-1872, 2021 06.
Article in English | MEDLINE | ID: mdl-33739926

ABSTRACT

Chest computed tomography (CT) image data is necessary for early diagnosis, treatment, and prognosis of Coronavirus Disease 2019 (COVID-19). Artificial intelligence has been applied to help clinicians improve the diagnostic accuracy and working efficiency of CT. However, existing supervised approaches for CT images of COVID-19 pneumonia require voxel-based annotations for training, which take a lot of time and effort. This paper proposes a weakly-supervised method for COVID-19 lesion localization based on a generative adversarial network (GAN) with image-level labels only. We first introduced a GAN-based framework to generate normal-looking CT slices from CT slices with COVID-19 lesions. We then developed a novel feature match strategy to improve the realism of generated images by guiding the generator to capture the complex texture of chest CT images. Finally, the localization map of lesions can be easily obtained by subtracting the output image from its corresponding input image. By adding a classifier branch to the GAN-based framework to classify localization maps, we can further develop a diagnosis system with improved classification accuracy. Three COVID-19 CT datasets from hospitals of Sao Paulo, the Italian Society of Medical and Interventional Radiology, and China Medical University were collected for evaluation. Our weakly supervised learning method obtained an AUC of 0.883, Dice coefficient of 0.575, accuracy of 0.884, sensitivity of 0.647, specificity of 0.929, and F1-score of 0.640, which exceeded other widely used weakly supervised object localization methods by a significant margin. We also compared the proposed method with fully supervised learning methods on the COVID-19 lesion segmentation task; the proposed weakly supervised method still achieves a competitive result with a Dice coefficient of 0.575.
Furthermore, we analyzed the association between illness severity and visual score. We found that the common severity cohort had the largest sample size as well as the highest visual score, which suggests our method can help rapid diagnosis of COVID-19 patients, especially in the massive common severity cohort. In conclusion, the proposed method can serve as an accurate and efficient tool to alleviate the bottleneck of expert annotation cost and advance the progress of computer-aided COVID-19 diagnosis.


Subject(s)
COVID-19/diagnostic imaging; Lung/diagnostic imaging; Supervised Machine Learning; Tomography, X-Ray Computed/methods; COVID-19/virology; Datasets as Topic; Humans; Reproducibility of Results; SARS-CoV-2/isolation & purification
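
The abstract above obtains the lesion localization map by subtracting the generated normal-looking slice from the input slice. A toy version of that subtraction step is sketched here. The pixel intensities are invented, and the thresholding in `binarize` is a hypothetical post-processing step added for illustration; the abstract does not say how the map is converted into a mask.

```python
def localization_map(input_slice, generated_slice):
    """Pixel-wise difference: input slice minus the generated normal-looking slice."""
    return [[pixel_in - pixel_gen for pixel_in, pixel_gen in zip(row_in, row_gen)]
            for row_in, row_gen in zip(input_slice, generated_slice)]

def binarize(difference_map, threshold):
    """Assumed post-processing: flag pixels whose absolute difference exceeds the threshold."""
    return [[1 if abs(value) > threshold else 0 for value in row]
            for row in difference_map]

# Hypothetical 2x2 intensity patches; only the top-right pixel differs strongly.
original = [[0.2, 0.9], [0.5, 0.5]]
generated = [[0.2, 0.3], [0.5, 0.4]]

difference = localization_map(original, generated)
mask = binarize(difference, threshold=0.3)
```

The approach relies on the generator reproducing healthy tissue faithfully, so that large differences concentrate where lesions were removed.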
17.
Med Phys ; 48(4): 1771-1780, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33555048

ABSTRACT

PURPOSE: This study aimed to improve the accuracy of the hippocampus segmentation through multitask edge-aware learning. METHOD: We developed a multitask framework for computerized hippocampus segmentation. We used three-dimensional (3D) U-net as our backbone model with two training objectives: (a) to minimize the difference between the targeted binary mask and the model prediction; and (b) to optimize an auxiliary edge-prediction task which is designed to guide the model detection of the weak boundary of the hippocampus in model optimization. To balance the multiple task objectives, we proposed an improved gradient normalization by adaptively adjusting the weight of losses from different tasks. A total of 247 T1-weighted MRIs including 131 without contrast and 116 with contrast were collected from 247 patients to train and validate the proposed method. Segmentation was quantitatively evaluated with the dice coefficient (Dice), Hausdorff distance (HD), and average Hausdorff distance (AVD). The 3D U-net was used for baseline comparison. We used a Wilcoxon signed-rank test to compare repeated measurements (Dice, HD, and AVD) by different segmentations. RESULTS: Through fivefold cross-validation, our multitask edge-aware learning achieved Dice of 0.8483 ± 0.0036, HD of 7.5706 ± 1.2330 mm, and AVD of 0.1522 ± 0.0165 mm, respectively. Conversely, the baseline results were 0.8340 ± 0.0072, 10.4631 ± 2.3736 mm, and 0.1884 ± 0.0286 mm, respectively. With a Wilcoxon signed-rank test, we found that the differences between our method and the baseline were statistically significant (P < 0.05). CONCLUSION: Our results demonstrated the efficiency of multitask edge-aware learning in hippocampus segmentation for hippocampal sparing whole-brain radiotherapy. The proposed framework may also be useful for other low-contrast small organ segmentations on medical imaging modalities.


Subject(s)
Hippocampus; Magnetic Resonance Imaging; Hippocampus/diagnostic imaging; Humans; Learning
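
The hippocampus segmentation abstract above evaluates boundaries with the Hausdorff distance (HD) and an average Hausdorff distance (AVD). The sketch below computes both for small point sets. The HD follows the usual max-min definition; the averaged variant shown is one common definition (mean of the two directed mean-minimum distances) and is an assumption, since the abstract does not give its formula. The points are illustrative only.

```python
import math

def directed_min_distances(source, target):
    """For each point in `source`, the distance to its nearest point in `target`."""
    return [min(math.dist(p, q) for q in target) for p in source]

def hausdorff_distance(set_a, set_b):
    """Symmetric Hausdorff distance: the worst nearest-neighbor distance in either direction."""
    return max(max(directed_min_distances(set_a, set_b)),
               max(directed_min_distances(set_b, set_a)))

def average_hausdorff_distance(set_a, set_b):
    """Assumed variant: mean of the two directed average nearest-neighbor distances."""
    a_to_b = directed_min_distances(set_a, set_b)
    b_to_a = directed_min_distances(set_b, set_a)
    return 0.5 * (sum(a_to_b) / len(a_to_b) + sum(b_to_a) / len(b_to_a))

# Hypothetical boundary points of a predicted and a reference contour.
predicted = [(0.0, 0.0), (1.0, 0.0)]
reference = [(0.0, 0.0), (4.0, 0.0)]

hd = hausdorff_distance(predicted, reference)
avd = average_hausdorff_distance(predicted, reference)
```

HD reacts to a single outlying boundary point, while the averaged form reflects typical boundary error, which is consistent with the abstract reporting HD values in millimeters and AVD values well below one millimeter.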