Results 1 - 15 of 15
1.
Article in English | MEDLINE | ID: mdl-38739515

ABSTRACT

Inductive bias in machine learning (ML) is the set of assumptions describing how a model makes predictions. Different ML-based methods for protein-ligand binding affinity (PLA) prediction have different inductive biases, leading to different levels of generalization capability and interpretability. Intuitively, the inductive bias of an ML-based model for PLA prediction should fit in with biological mechanisms relevant for binding to achieve good predictions with meaningful reasons. To this end, we propose an interaction-based inductive bias to restrict neural networks to functions relevant for binding with two assumptions: (1) A protein-ligand complex can be naturally expressed as a heterogeneous graph with covalent and non-covalent interactions; (2) The predicted PLA is the sum of pairwise atom-atom affinities determined by non-covalent interactions. The interaction-based inductive bias is embodied by an explainable heterogeneous interaction graph neural network (EHIGN) for explicitly modeling pairwise atom-atom interactions to predict PLA from 3D structures. Extensive experiments demonstrate that EHIGN achieves better generalization capability than other state-of-the-art ML-based baselines in PLA prediction and structure-based virtual screening. More importantly, comprehensive analyses of distance-affinity, pose-affinity, and substructure-affinity relations suggest that the interaction-based inductive bias can guide the model to learn atomic interactions that are consistent with physical reality. As a case study to demonstrate practical usefulness, our method is tested for predicting the efficacy of Nirmatrelvir against SARS-CoV-2 variants. EHIGN successfully recognizes the changes in the efficacy of Nirmatrelvir for different SARS-CoV-2 variants with meaningful reasons.
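
Note: assumption (2) in the abstract above, that the predicted PLA is a sum of pairwise atom-atom affinities from non-covalent interactions, can be illustrated with a minimal sketch. This is not EHIGN itself: the 5.0 angstrom cutoff, the function names, and the toy scoring function are all hypothetical stand-ins for what the network would learn.

```python
import math

# Hypothetical distance cutoff (angstroms) for counting a protein-ligand
# atom pair as a non-covalent interaction; not taken from the paper.
CUTOFF = 5.0


def toy_pair_affinity(distance):
    """Illustrative placeholder for a learned pairwise score."""
    return -1.0 / (1.0 + distance)


def predicted_affinity(protein_atoms, ligand_atoms,
                       pair_fn=toy_pair_affinity, cutoff=CUTOFF):
    """Sum pairwise scores over protein-ligand atom pairs within the cutoff.

    Returns the total and the per-pair contributions, so every part of the
    prediction can be traced back to a specific atom pair.
    """
    total = 0.0
    contributions = []
    for i, p in enumerate(protein_atoms):
        for j, l in enumerate(ligand_atoms):
            d = math.dist(p, l)
            if d <= cutoff:
                score = pair_fn(d)
                contributions.append((i, j, d, score))
                total += score
    return total, contributions


protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
total, pairs = predicted_affinity(protein, ligand)
```

Because the total is an explicit sum, the per-pair list doubles as an explanation of the prediction, which is the kind of interpretability the abstract attributes to this inductive bias.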

2.
Neural Netw ; 177: 106367, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38754215

ABSTRACT

While computer vision has proven valuable for medical image segmentation, its application faces challenges such as limited dataset sizes and the complexity of effectively leveraging unlabeled images. To address these challenges, we present a novel semi-supervised, consistency-based approach termed the data-efficient medical segmenter (DEMS). DEMS features an encoder-decoder architecture and incorporates the newly developed online automatic augmenter (OAA) and residual robustness enhancement (RRE) blocks. The OAA augments input data with various image transformations, thereby diversifying the dataset to improve generalization ability. The RRE enriches feature diversity and introduces perturbations to create varied inputs for different decoders, thereby providing enhanced variability. Moreover, we introduce a sensitive loss to further enhance consistency across different decoders and stabilize the training process. Extensive experimental results on our own dataset and three public datasets affirm the effectiveness of DEMS. Under extreme data shortage scenarios, DEMS achieves improvements of 16.85% and 10.37% in Dice score over U-Net and the top-performing state-of-the-art method, respectively. Given its superior data efficiency, DEMS could present significant advancements in medical segmentation under small data regimes. The project homepage can be accessed at https://github.com/NUS-Tim/DEMS.


Subject(s)
Image Processing, Computer-Assisted; Humans; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Algorithms; Databases, Factual
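
Note: the Dice score reported in the abstract above is a standard overlap metric for segmentation masks. The sketch below shows the usual definition for binary masks, 2|A ∩ B| / (|A| + |B|). It is not code from the DEMS repository, and returning 1.0 for two empty masks is a convention chosen here.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat lists of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    size_sum = sum(pred) + sum(target)
    if size_sum == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / size_sum


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)  # 2 overlapping pixels, 3 + 3 foreground pixels
```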
3.
Chem Sci ; 14(39): 10684-10701, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37829020

ABSTRACT

Traditional Chinese Medicine (TCM) has long been viewed as a precious source for modern drug discovery. AI-assisted drug discovery (AIDD) has been investigated extensively. However, two challenges remain in applying AIDD to guide TCM drug discovery: the lack of a large amount of standardized TCM-related information, and the tendency of AIDD to fail pathologically on out-of-domain data. We released TCM Database@Taiwan in 2011, and it has been widely disseminated and used. We have now developed TCMBank, the largest systematic free TCM database, which is an extension of TCM Database@Taiwan. TCMBank contains 9192 herbs, 61 966 ingredients (unduplicated), 15 179 targets, 32 529 diseases, and their pairwise relationships. By integrating multiple data sources, TCMBank provides 3D structure information for ingredients, together with a standard list and detailed information on herbs, ingredients, targets, and diseases. TCMBank has an intelligent document identification module that continuously adds TCM-related information retrieved from the literature in PubChem. In addition, driven by TCMBank big data, we developed an ensemble learning-based drug discovery protocol for identifying potential leads and for drug repurposing. We take colorectal cancer and Alzheimer's disease as examples to demonstrate how artificial intelligence can accelerate drug discovery. Using TCMBank, researchers can view literature-driven relationship mapping between herbs/ingredients and genes/diseases, allowing them to understand the molecular mechanisms of action of ingredients and to identify new, potentially effective treatments. TCMBank is available at https://TCMBank.CN/.

4.
Heliyon ; 9(9): e19585, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37809802

ABSTRACT

Medical Ultrasound (US) is one of the most widely used imaging modalities in clinical practice, but its usage presents unique challenges such as variable imaging quality. Deep Learning (DL) models can serve as advanced medical US image analysis tools, but their performance is greatly limited by the scarcity of large datasets. To address this common data shortage, we develop GSDA, a Generative Adversarial Network (GAN)-based semi-supervised data augmentation method. GSDA consists of a GAN and a Convolutional Neural Network (CNN). The GAN synthesizes and pseudo-labels high-resolution, high-quality US images, and both real and synthesized images are then leveraged to train the CNN. To address the training challenges of both the GAN and the CNN with limited data, we employ transfer learning techniques during their training. We also introduce a novel evaluation standard that balances classification accuracy with computational time. We evaluate our method on the BUSI dataset, where GSDA outperforms existing state-of-the-art methods. With the high-resolution, high-quality synthesized images, GSDA achieves 97.9% accuracy using merely 780 images. Given these promising results, we believe that GSDA holds potential as an auxiliary tool for medical US analysis.

5.
Comput Biol Med ; 164: 107268, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37494821

ABSTRACT

The transformer is primarily used in the field of natural language processing. Recently, it has been adopted and shows promise in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also greatly benefits from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structures of the transformer. After that, we depict the recent progress of the transformer in the field of MIA. We organize the applications in a sequence of different tasks, including classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. A large number of experiments studied in this review illustrate that the transformer-based method outperforms existing methods through comparisons with multiple evaluation metrics. Finally, we discuss the open challenges and future opportunities in this field. This task-modality review with the latest contents, detailed information, and comprehensive comparison may greatly benefit the broad MIA community.


Subject(s)
Benchmarking; Natural Language Processing; Image Processing, Computer-Assisted
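
Note: the attention mechanism that the abstract above names as the core component of the transformer is commonly written as softmax(QK^T / sqrt(d_k))V. The sketch below implements that standard scaled dot-product form in plain Python for a single head, with no masking or learned projections. It illustrates the general technique rather than any particular model covered by the review.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
out, w = attention(Q, K, V)
```

In this toy input the query matches the first key, so the first value vector receives the larger weight.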
6.
Neural Netw ; 165: 94-105, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37276813

ABSTRACT

Understanding drug-drug interactions (DDIs) of new drugs is critical for minimizing unexpected adverse drug reactions. The modeling of new drugs is called a cold start scenario. In this scenario, only limited structural or physicochemical information about the new drug is available. The 3D conformation of drug molecules usually plays a more crucial role in chemical properties than the 2D structure. A 3D graph network with few-shot learning is a promising solution. However, the 3D heterogeneity of drug molecules and the discretization of atomic distributions lead to spatial confusion in few-shot learning. Here, we propose a 3D graph neural network with few-shot learning, Meta3D-DDI, to predict DDI events in the cold start scenario. The 3DGNN ensures rotation and translation invariance by calculating atomic pairwise distances, and incorporates 3D structure and distance information in the information aggregation stage. The continuous filter interaction module can continuously simulate the filter to obtain the interaction between the target atom and other atoms. Meta3D-DDI further develops a few-shot learning (FSL) strategy based on bilevel optimization to transfer meta-knowledge for DDI prediction tasks from existing drugs to new drugs. In addition, the existing cold start setting may cause the scaffold structure information in the training set to leak into the test set. We design a scaffold-based cold start scenario to ensure that the drug scaffolds in the training set and test set do not overlap. Extensive experiments demonstrate that our architecture achieves state-of-the-art (SOTA) performance for DDI prediction under the scaffold-based cold start scenario on two real-world datasets. The visual experiment shows that Meta3D-DDI significantly improves learning for DDI prediction of new drugs. We also demonstrate how Meta3D-DDI can reduce the amount of data required to make meaningful DDI predictions.


Subject(s)
Knowledge; Learning; Drug Interactions; Neural Networks, Computer; Rotation
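
Note: the abstract above states that rotation and translation invariance is obtained by working with atomic pairwise distances. The sketch below checks that property numerically for a rigid motion (a rotation about the z axis plus a shift). The coordinates, angle, and function names are made up for illustration and are not taken from Meta3D-DDI.

```python
import math


def pairwise_distances(coords):
    """Matrix of Euclidean distances between all pairs of 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def rotate_z_and_translate(coords, angle, shift):
    """Apply a rotation about the z axis followed by a translation."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]


atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.3)]
moved = rotate_z_and_translate(atoms, angle=0.7, shift=(4.0, -2.0, 1.0))

before = pairwise_distances(atoms)
after = pairwise_distances(moved)

# Largest change in any pairwise distance; zero up to floating-point error.
max_diff = max(abs(b - a)
               for row_b, row_a in zip(before, after)
               for b, a in zip(row_b, row_a))
```

The coordinates change but the distance matrix does not, so any model that consumes only distances gives the same output for every pose of the molecule.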
7.
Article in English | MEDLINE | ID: mdl-37028032

ABSTRACT

Finding candidate molecules with favorable pharmacological activity, low toxicity, and proper pharmacokinetic properties is an important task in drug discovery. Deep neural networks have made impressive progress in accelerating and improving drug discovery. However, these techniques rely on a large amount of labeled data to form accurate predictions of molecular properties. At each stage of the drug discovery pipeline, usually only limited biological data on candidate molecules and their derivatives are available, so applying deep neural networks to low-data drug discovery remains a formidable challenge. Here, we propose a meta learning architecture with a graph attention network, Meta-GAT, to predict molecular properties in low-data drug discovery. The GAT captures the local effects of atomic groups at the atom level through the triple attentional mechanism and implicitly captures the interactions between different atomic groups at the molecular level. The GAT is used to perceive the molecular chemical environment and connectivity, thereby effectively reducing sample complexity. Meta-GAT further develops a meta learning strategy based on bilevel optimization, which transfers meta knowledge from other attribute prediction tasks to low-data target tasks. In summary, our work demonstrates how meta learning can reduce the amount of data required to make meaningful predictions of molecules in low-data scenarios. Meta learning is likely to become the new learning paradigm in low-data drug discovery. The source code is publicly available at: https://github.com/lol88/Meta-GAT.
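
Note: the bilevel optimization behind the meta learning strategy in the abstract above has two nested loops. An inner loop adapts a shared initialization to one task using a few examples, and an outer loop updates that initialization so the adaptation works well. The sketch below uses a first-order approximation on a one-parameter regression model y = w * x. It is not the Meta-GAT implementation: the model, learning rates, and toy tasks are all assumptions made for illustration.

```python
def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def grad(w, data):
    """Derivative of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def adapt(w, support, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on one task's small support set."""
    for _ in range(steps):
        w = w - inner_lr * grad(w, support)
    return w


def meta_train(tasks, w=0.0, inner_lr=0.1, outer_lr=0.05, epochs=200):
    """Outer loop, first-order approximation: move the shared initialization
    along the query-set gradient evaluated at each task's adapted weight."""
    for _ in range(epochs):
        meta_grad = 0.0
        for support, query in tasks:
            w_task = adapt(w, support, inner_lr)
            meta_grad += grad(w_task, query)
        w = w - outer_lr * meta_grad / len(tasks)
    return w


def make_task(slope):
    """Toy task: two support points and one query point on y = slope * x."""
    support = [(1.0, slope * 1.0), (2.0, slope * 2.0)]
    query = [(3.0, slope * 3.0)]
    return support, query


tasks = [make_task(s) for s in (1.8, 2.0, 2.2)]
w_meta = meta_train(tasks)

# A new, unseen task: compare one-step adaptation from scratch and from w_meta.
new_support, new_query = make_task(2.1)
loss_scratch = mse(adapt(0.0, new_support), new_query)
loss_meta = mse(adapt(w_meta, new_support), new_query)
```

With only two support points, the meta-learned starting point ends up far closer to the new task's solution than a zero initialization does, which is the low-data benefit the abstract describes.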

9.
J Phys Chem Lett ; 14(8): 2020-2033, 2023 Mar 02.
Article in English | MEDLINE | ID: mdl-36794930

ABSTRACT

Predicting protein-ligand binding affinities (PLAs) is a core problem in drug discovery. Recent advances have shown great potential in applying machine learning (ML) for PLA prediction. However, most of them omit the 3D structures of complexes and physical interactions between proteins and ligands, which are considered essential to understanding the binding mechanism. This paper proposes a geometric interaction graph neural network (GIGN) that incorporates 3D structures and physical interactions for predicting protein-ligand binding affinities. Specifically, we design a heterogeneous interaction layer that unifies covalent and noncovalent interactions into the message passing phase to learn node representations more effectively. The heterogeneous interaction layer also follows fundamental biological laws, including invariance to translations and rotations of the complexes, thus avoiding expensive data augmentation strategies. GIGN achieves state-of-the-art performance on three external test sets. Moreover, by visualizing learned representations of protein-ligand complexes, we show that the predictions of GIGN are biologically meaningful.


Subject(s)
Neural Networks, Computer; Proteins; Ligands; Protein Binding; Proteins/chemistry; Machine Learning
10.
Chem Sci ; 13(29): 8693-8703, 2022 Jul 29.
Article in English | MEDLINE | ID: mdl-35974769

ABSTRACT

Drug-drug interactions (DDIs) can trigger unexpected pharmacological effects on the body, and the causal mechanisms are often unknown. Graph neural networks (GNNs) have been developed to better understand DDIs. However, identifying key substructures that contribute most to the DDI prediction is a challenge for GNNs. In this study, we presented a substructure-aware graph neural network, a message passing neural network equipped with a novel substructure attention mechanism and a substructure-substructure interaction module (SSIM) for DDI prediction (SA-DDI). Specifically, the substructure attention was designed to capture size- and shape-adaptive substructures based on the chemical intuition that the sizes and shapes are often irregular for functional groups in molecules. DDIs are fundamentally caused by chemical substructure interactions. Thus, the SSIM was used to model the substructure-substructure interactions by highlighting important substructures while de-emphasizing the minor ones for DDI prediction. We evaluated our approach in two real-world datasets and compared the proposed method with the state-of-the-art DDI prediction models. The SA-DDI surpassed other approaches on the two datasets. Moreover, the visual interpretation results showed that the SA-DDI was sensitive to the structure information of drugs and was able to detect the key substructures for DDIs. These advantages demonstrated that the proposed method improved the generalization and interpretation capability of DDI prediction modeling.

11.
Phys Chem Chem Phys ; 24(9): 5383-5393, 2022 Mar 02.
Article in English | MEDLINE | ID: mdl-35169821

ABSTRACT

Predicting quantum mechanical properties (QMPs) is very important for the innovation of material and chemistry science. Multitask deep learning models have been widely used in QMPs prediction. However, existing multitask learning models often train multiple QMPs prediction tasks simultaneously without considering the internal relationships and differences between tasks, which may cause the model to overfit easy tasks. In this study, we first proposed a multiscale dynamic attention graph neural network (MDGNN) for molecular representation learning. The MDGNN was designed in a multitask learning fashion that can solve multiple learning tasks at the same time. We then introduced a dynamic task balancing (DTB) strategy combining task differences and difficulties to reduce overfitting across multiple tasks. Finally, we adopted gradient-weighted class activation mapping (Grad-CAM) to analyze a deep learning model for frontier molecular orbital, highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) energy level predictions. We evaluated our approach using two large QMPs datasets and compared the proposed method to the state-of-the-art multitask learning models. The MDGNN outperforms other multitask learning approaches on two datasets. The DTB strategy can further improve the performance of MDGNN significantly. Moreover, we show that Grad-CAM creates explanations that are consistent with the molecular orbitals theory. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of QMPs prediction modeling.


Subject(s)
Deep Learning; Machine Learning; Neural Networks, Computer
12.
Brief Bioinform ; 22(6), 2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34428290

ABSTRACT

With the rapid development of proteomics and the rapid increase in target molecules for drug action, computer-aided drug design (CADD) has become a basic task in drug discovery. One of the key challenges in CADD is molecular representation. A high-quality molecular representation with chemical intuition helps to advance many frontier problems in drug discovery. At present, molecular representation still faces several urgent problems, such as the polysemy of substructures and unsmooth information flow between atomic groups. In this research, we propose a deep contextualized Bi-LSTM architecture, Mol2Context-vec, which can integrate different levels of internal states to produce dynamic representations of molecular substructures. The resulting molecular context representation can capture the interactions between any atomic groups, especially pairs of atomic groups that are topologically distant. Experiments show that Mol2Context-vec achieves state-of-the-art performance on multiple benchmark datasets. In addition, the visual interpretation of Mol2Context-vec is very close to the structural properties of chemical molecules as understood by humans. These advantages indicate that Mol2Context-vec can be used as a reliable and effective tool for molecular representation. Availability: The source code is available for download at https://github.com/lol88/Mol2Context-vec.


Subject(s)
Cheminformatics/methods; Deep Learning; Drug Design/methods; Drug Discovery/methods; Algorithms; Humans; Models, Molecular; Quantum Theory; Structure-Activity Relationship
13.
IEEE J Transl Eng Health Med ; 8: 1900111, 2020.
Article in English | MEDLINE | ID: mdl-32082952

ABSTRACT

BACKGROUND: Cardiovascular diseases (CVDs) are the leading cause of death globally. Electrocardiogram (ECG) analysis can efficiently provide a thorough assessment of different CVDs. We propose a multi-task group bidirectional long short-term memory (MTGBi-LSTM) framework to intelligently recognize multiple CVDs based on multi-lead ECG signals. METHODS: This model employs a Group Bi-LSTM (GBi-LSTM) and a Residual Group Convolutional Neural Network (Res-GCNN) to learn the dual feature representation of ECG space and time series. GBi-LSTM is divided into Global Bi-LSTM and Intra-Group Bi-LSTM, which can learn the features of each ECG lead and the relationships between leads. Then, through an attention mechanism, the information from the different ECG leads is integrated to give the model strong feature discriminability. Through multi-task learning, the model can fully mine the association information between diseases and obtain more accurate diagnostic results. In addition, we propose a dynamic weighted loss function to better quantify the loss and overcome the imbalance between classes. RESULTS: In an analysis of more than 170,000 clinical 12-lead ECGs, the MTGBi-LSTM method achieved accuracy, precision, recall, and F1 of 88.86%, 90.67%, 94.19%, and 92.39%, respectively. The experimental results show that the proposed MTGBi-LSTM method can reliably perform ECG analysis and provide an effective tool for computer-aided diagnosis of CVD.
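
Note: the four metrics reported in the RESULTS above follow the usual definitions from true/false positive and negative counts, with F1 the harmonic mean of precision and recall. The abstract does not say how the scores were aggregated across diseases, so the sketch below covers only the binary case. The counts in the example are invented, while the last line plugs in the reported precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Invented counts, purely to exercise the function.
acc, prec, rec, f1 = classification_metrics(tp=8, fp=2, fn=2, tn=8)

# Harmonic mean of the precision and recall reported in the abstract.
f1_check = f1_from(0.9067, 0.9419)
```

The harmonic mean of 90.67% and 94.19% agrees with the reported F1 of 92.39% to within rounding.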

14.
J Phys Chem Lett ; 10(17): 4947-4961, 2019 Sep 05.
Article in English | MEDLINE | ID: mdl-31411476

ABSTRACT

Longevity is a very important and interesting topic, and Klotho has been demonstrated to be related to longevity. We combined network pharmacology, machine learning, deep learning, and molecular dynamics (MD) simulation to investigate potent lead drugs. The related proteins insulin-like growth factor 1 receptor (IGF1R) and insulin receptor (IR) were docked against the traditional Chinese medicine (TCM) database to screen out several novel candidates. In addition, nine different machine learning algorithms were used to build reliable and accurate predictive models. Moreover, we used a novel deep learning algorithm to build predictive models. All of these models obtained significant R2 values, greater than 0.87 on the training set and greater than 0.88 on the test set. A long 500 ns molecular dynamics simulation was also performed to verify protein-ligand properties and stability. Finally, we obtained Antifebrile Dichroa, Holarrhena antidysenterica, and Gelsemium sempervirens, which might be potent TCMs for the two targets.


Subject(s)
Artificial Intelligence; Drug Discovery; Receptor, IGF Type 1/antagonists & inhibitors; Receptor, Insulin/antagonists & inhibitors; Algorithms; Antigens, CD/metabolism; Binding Sites; Databases, Factual; Glucuronidase/antagonists & inhibitors; Glucuronidase/metabolism; Inhibitory Concentration 50; Klotho Proteins; Ligands; Medicine, Chinese Traditional; Molecular Docking Simulation; Protein Binding; Protein Interaction Maps; Receptor, IGF Type 1/metabolism; Receptor, Insulin/metabolism; Signal Transduction; Thermodynamics
15.
J Phys Chem Lett ; 10(15): 4382-4400, 2019 Aug 01.
Article in English | MEDLINE | ID: mdl-31304749

ABSTRACT

It has been demonstrated that the MMP13 enzyme is related to most cancer cell tumors. The world's largest traditional Chinese medicine (TCM) database was used for screening in structure-based and ligand-based drug design. To predict drug activity, machine learning models (Random Forest (RF), AdaBoost Regressor (ABR), and Gradient Boosting Regressor (GBR)) and deep learning models were used to validate the docking results; the RF algorithm obtained an R2 of 0.922 on the training set and 0.804 on the test set. For the deep learning algorithm, R2 was 0.90 on the training set and 0.810 on the test set. However, these TCM compounds drifted away during the molecular dynamics (MD) simulation. We therefore sought another method: peptide design. All peptide databases were screened by the docking process. The interaction modes of the modified peptides were optimized, and the affinities were assessed with the ZDOCK protocol and the Refine Docked Protein protocol. A 300 ns MD simulation evaluated the stability of the receptor-peptide complexes. The double-site effect appeared for S2, a peptide designed from a known inhibitor, when complexed with BCL2. S3, a peptide designed from the endogenous inhibitor P16, competed against cyclin for binding to CDK6. The MDM2 inhibitors S5 and S6 were derived from the P53 structure and bound stably to MDM2. A flexible region of peptides S5 and S6 may enhance binding ability by changing its own conformation, which was unforeseen. These peptides (S2, S3, S5, and S6) are potentially interesting for cancer treatment; however, these findings need to be confirmed by biological testing, which will be conducted in the near future.


Subject(s)
Antineoplastic Agents/chemistry; Deep Learning; Machine Learning; Models, Molecular; Peptides/chemistry; Proteins/chemistry; Algorithms; Binding Sites; Cyclin-Dependent Kinase 6/chemistry; Cyclin-Dependent Kinase Inhibitor p16/chemistry; Databases, Pharmaceutical; Databases, Protein; Drug Design; Ligands; Matrix Metalloproteinase 13/chemistry; Mutation; Proto-Oncogene Proteins c-bcl-2/chemistry; Proto-Oncogene Proteins c-mdm2/chemistry; Tumor Suppressor Protein p53/chemistry; Tumor Suppressor Protein p53/genetics
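
Note: the training-set and test-set R2 values quoted in the abstract above are most commonly the coefficient of determination, 1 - SS_res / SS_tot. The abstract does not say which definition was used (some studies report the squared correlation instead), so the sketch below is the common definition rather than the authors' exact computation, and the example numbers are invented.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented activities and predictions, purely for illustration.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y_true, y_pred)
```

Comparing this quantity on training and test data, as the abstract does (0.922 against 0.804 for RF), is a quick way to see how much fit is lost on unseen compounds.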