Results 1 - 20 of 26
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38555479

ABSTRACT

MOTIVATION: Accurately predicting molecular metabolic stability is of great significance to drug research and development, ensuring drug safety and effectiveness. Existing deep learning methods, especially graph neural networks, can reveal the molecular structure of drugs and thus efficiently predict the metabolic stability of molecules. However, most of these methods focus on the message passing between adjacent atoms in the molecular graph, ignoring the relationship between bonds. This makes it difficult for these methods to estimate accurate molecular representations, thereby being limited in molecular metabolic stability prediction tasks. RESULTS: We propose the MS-BACL model based on bond graph augmentation technology and contrastive learning strategy, which can efficiently and reliably predict the metabolic stability of molecules. To our knowledge, this is the first time that bond-to-bond relationships in molecular graph structures have been considered in the task of metabolic stability prediction. We build a bond graph based on 'atom-bond-atom', and the model can simultaneously capture the information of atoms and bonds during the message propagation process. This enhances the model's ability to reveal the internal structure of the molecule, thereby improving the structural representation of the molecule. Furthermore, we perform contrastive learning training based on the molecular graph and its bond graph to learn the final molecular representation. Multiple sets of experimental results on public datasets show that the proposed MS-BACL model outperforms the state-of-the-art model. AVAILABILITY AND IMPLEMENTATION: The code and data are publicly available at https://github.com/taowang11/MS.
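The bond-graph idea in this abstract can be illustrated with a small sketch. It assumes one common reading of the construction, the line graph of the molecule, in which every bond becomes a node and two bonds are linked when they share an atom. The function name `build_bond_graph` and the toy molecule are hypothetical and are not taken from the MS-BACL code.

```python
from itertools import combinations


def build_bond_graph(bonds):
    """Return bond-graph edges for a molecule given as a list of (atom, atom) bonds.

    Assumption: two bonds are connected when they share an atom, so the
    result is the line graph of the molecular graph.
    """
    # Index every bond by the atoms it touches.
    bonds_at_atom = {}
    for bond_id, (a, b) in enumerate(bonds):
        bonds_at_atom.setdefault(a, []).append(bond_id)
        bonds_at_atom.setdefault(b, []).append(bond_id)

    # Every pair of bonds meeting at the same atom becomes one bond-graph edge.
    edges = set()
    for incident in bonds_at_atom.values():
        for i, j in combinations(sorted(incident), 2):
            edges.add((i, j))
    return sorted(edges)


# Heavy atoms of ethanol: C0-C1-O2, i.e. bond 0 = (0, 1) and bond 1 = (1, 2).
print(build_bond_graph([(0, 1), (1, 2)]))  # [(0, 1)]
```

Message passing on this second graph moves information between bonds, while message passing on the original graph moves it between atoms; the two views are what a contrastive objective would then compare.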


Subject(s)
Neural Networks, Computer
2.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38446739

ABSTRACT

Antimicrobial peptides (AMPs), short peptides with diverse functions, effectively target and combat various organisms. The widespread misuse of chemical antibiotics has led to increasing microbial resistance. Due to their low drug resistance and toxicity, AMPs are considered promising substitutes for traditional antibiotics. While existing deep learning technology enhances AMP generation, it also presents certain challenges. Firstly, AMP generation overlooks the complex interdependencies among amino acids. Secondly, current models fail to integrate crucial tasks like screening, attribute prediction and iterative optimization. Consequently, we develop an integrated deep learning framework, Diff-AMP, that automates AMP generation, identification, attribute prediction and iterative optimization. We innovatively integrate kinetic diffusion and attention mechanisms into the reinforcement learning framework for efficient AMP generation. Additionally, our prediction module incorporates pre-training and transfer learning strategies for precise AMP identification and screening. We employ a convolutional neural network for multi-attribute prediction and a reinforcement learning-based iterative optimization strategy to produce diverse AMPs. This framework automates molecule generation, screening, attribute prediction and optimization, thereby advancing AMP research. We have also deployed Diff-AMP on a web server, with code, data and server details available in the Data Availability section.


Subject(s)
Amino Acids , Antimicrobial Peptides , Anti-Bacterial Agents , Diffusion , Kinetics
3.
Brief Bioinform ; 25(1)2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38171927

ABSTRACT

Exploring microbial stress responses to drugs is crucial for the advancement of new therapeutic methods. While current artificial intelligence methodologies have expedited our understanding of potential microbial responses to drugs, the models are constrained by the imprecise representation of microbes and drugs. To this end, we combine deep autoencoder and subgraph augmentation technology for the first time to propose a model called JDASA-MRD, which can identify the potential indistinguishable responses of microbes to drugs. In the JDASA-MRD model, we begin by feeding the established similarity matrices of microbes and drugs into the deep autoencoder, which extracts robust initial features of both. Subsequently, we employ the MinHash and HyperLogLog algorithms to estimate intersection and cardinality information between microbe and drug subgraphs, thereby capturing the multi-hop neighborhood information of nodes. Finally, by integrating the initial node features with subgraph topological information, we leverage graph neural network technology to predict the microbes' responses to drugs, offering a more effective solution to the 'over-smoothing' challenge. Comparative analyses on multiple public datasets confirm that the JDASA-MRD model's performance surpasses that of current state-of-the-art models. This research aims to offer a more profound insight into the adaptability of microbes to drugs and to furnish pivotal guidance for drug treatment strategies. Our data and code are publicly available at: https://github.com/ZZCrazy00/JDASA-MRD.
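The MinHash step mentioned in this abstract can be sketched as follows. This is a minimal, standard-library illustration of how overlap between two multi-hop neighborhoods might be estimated; the helper names (`k_hop_neighbors`, `minhash_signature`, `estimate_jaccard`) and the toy graph are assumptions for illustration and are not taken from the JDASA-MRD repository.

```python
import hashlib


def _hash(value, seed):
    # Seeded 64-bit hash; each seed plays the role of one random permutation.
    data = f"{seed}:{value}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def k_hop_neighbors(adj, start, k):
    """Collect every node within k hops of `start` (including `start`)."""
    frontier, seen = {start}, {start}
    for _ in range(k):
        frontier = {n for u in frontier for n in adj.get(u, ())} - seen
        seen |= frontier
    return seen


def minhash_signature(items, num_perm=256):
    """Keep the minimum hash of a non-empty set under each seeded hash."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_perm)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates |A ∩ B| / |A ∪ B|."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Hypothetical microbe-drug graph stored as an undirected adjacency dict.
adj = {
    "m1": {"d1", "d2"}, "m2": {"d2", "d3"},
    "d1": {"m1"}, "d2": {"m1", "m2"}, "d3": {"m2"},
}
sig_m1 = minhash_signature(k_hop_neighbors(adj, "m1", 2))
sig_d3 = minhash_signature(k_hop_neighbors(adj, "d3", 2))
print(estimate_jaccard(sig_m1, sig_d3))
```

The signatures have a fixed length regardless of subgraph size, which is why this kind of sketch scales to large neighborhoods where exact set intersection would be costly.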


Subject(s)
Algorithms , Artificial Intelligence , Neural Networks, Computer
4.
Brief Bioinform ; 24(4)2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37427977

ABSTRACT

Studies have shown that the mechanism of action of many drugs is related to miRNA. In-depth research on the relationship between miRNA and drugs can provide theoretical foundations and practical approaches for various areas, such as drug target discovery, drug repositioning and biomarker research. Traditional biological experiments to test miRNA-drug susceptibility are costly and time-consuming. Thus, sequence- or topology-based deep learning methods are recognized in this field for their efficiency and accuracy. However, these methods have limitations in dealing with sparse topologies and higher-order information of miRNA (drug) features. In this work, we propose GCFMCL, a model for multi-view contrastive learning based on graph collaborative filtering. To the best of our knowledge, this is the first attempt to incorporate a contrastive learning strategy into the graph collaborative filtering framework to predict the sensitivity relationships between miRNAs and drugs. The proposed multi-view contrastive learning method is divided into a topological contrastive objective and a feature contrastive objective: (1) For the homogeneous neighbors of the topological graph, we propose a novel topological contrastive learning method that constructs the contrastive target from the topological neighborhood information of nodes. (2) The proposed model obtains feature contrastive targets from high-order feature information according to the correlation of node features, and mines potential neighborhood relationships in the feature space. The proposed multi-view contrastive learning effectively alleviates the impact of heterogeneous node noise and graph data sparsity in graph collaborative filtering, and significantly enhances the performance of the model. Our study employs a dataset derived from the NoncoRNA and ncDR databases, encompassing 2049 experimentally validated miRNA-drug sensitivity associations. Five-fold cross-validation shows that the Area Under the Curve (AUC), Area Under the Precision-Recall Curve (AUPR) and F1-score (F1) of GCFMCL reach 95.28%, 95.66% and 89.77%, outperforming the state-of-the-art (SOTA) method by margins of 2.73%, 3.42% and 4.96%, respectively. Our code and data can be accessed at https://github.com/kkkayle/GCFMCL.
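A contrastive objective of the kind this abstract describes is often written as an InfoNCE loss. The sketch below is a generic pure-Python version under stated assumptions: cosine similarity, a temperature of 0.5, and the convention that row i of one view is the positive for row i of the other. It is not the loss used in the GCFMCL repository, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchors, positives, temperature=0.5):
    """Mean InfoNCE loss over a batch.

    positives[i] is the positive sample for anchors[i]; every other row of
    `positives` is treated as a negative for that anchor.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        # Negative log-softmax of the positive pair.
        total += log_denominator - logits[i]
    return total / len(anchors)


view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(view_a, [[1.0, 0.0], [0.0, 1.0]])
misaligned = info_nce(view_a, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < misaligned)  # True
```

The loss falls when each node's embedding in one view is closer to its own counterpart in the other view than to any other node, which is the behaviour a multi-view scheme relies on.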


Subject(s)
Drug Delivery Systems , MicroRNAs , Area Under Curve , Databases, Factual , Drug Discovery , MicroRNAs/genetics
5.
Bioinformatics ; 40(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38648052

ABSTRACT

MOTIVATION: Accurate inference of potential drug-protein interactions (DPIs) aids in understanding drug mechanisms and developing novel treatments. Existing deep learning models, however, struggle with accurate node representation in DPI prediction, limiting their performance. RESULTS: We propose a new computational framework that integrates global and local features of nodes in the drug-protein bipartite graph for efficient DPI inference. Initially, we employ pre-trained models to acquire fundamental knowledge of drugs and proteins and to determine their initial features. Subsequently, the MinHash and HyperLogLog algorithms are utilized to estimate the similarity and set cardinality between drug and protein subgraphs, serving as their local features. Then, an energy-constrained diffusion mechanism is integrated into the transformer architecture, capturing interdependencies between nodes in the drug-protein bipartite graph and extracting their global features. Finally, we fuse the local and global features of nodes and employ multilayer perceptrons to predict the likelihood of potential DPIs. A comprehensive and precise node representation guarantees efficient prediction of unknown DPIs by the model. Various experiments validate the accuracy and reliability of our model, with molecular docking results revealing its capability to identify potential DPIs not present in existing databases. This approach is expected to offer valuable insights for furthering drug repurposing and personalized medicine research. AVAILABILITY AND IMPLEMENTATION: Our code and data are accessible at: https://github.com/ZZCrazy00/DPI.
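The HyperLogLog step named in this abstract estimates how many distinct nodes a subgraph contains using a small fixed-size sketch. The class below is a compact textbook version written for illustration; the register count (2^10), the hash function, and all names are assumptions and are not taken from the authors' DPI code.

```python
import hashlib
import math


class HyperLogLog:
    """Textbook HyperLogLog with 2**p registers (p >= 7 for this alpha formula)."""

    def __init__(self, p=10):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.blake2b(str(item).encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        index = h >> (64 - self.p)                # first p bits choose a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other):
        """Return the sketch of the union of both underlying sets."""
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


drug_side = HyperLogLog()
for i in range(10000):
    drug_side.add(f"node-{i}")
print(round(drug_side.count()))
```

Because two sketches merge by taking register-wise maxima, the size of a union comes for free, and an intersection size can be approximated by inclusion-exclusion: |A| + |B| - |A ∪ B|.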


Subject(s)
Algorithms , Molecular Docking Simulation , Proteins , Proteins/chemistry , Proteins/metabolism , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , Computational Biology/methods , Deep Learning
6.
Bioinformatics ; 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38967119

ABSTRACT

MOTIVATION: Accurate prediction of acute dermal toxicity (ADT) is essential for the safe and effective development of contact drugs. Currently, graph neural networks (GNNs), a form of deep learning technology, accurately model the structure of compound molecules, enhancing predictions of their ADT. However, many existing methods emphasize atom-level information transfer and overlook crucial data conveyed by molecular bonds and their interrelationships. Additionally, these methods often generate "equal" node representations across the entire graph, failing to accentuate "important" substructures like functional groups, pharmacophores, and toxicophores, thereby reducing interpretability. RESULTS: We introduce a novel model, GraphADT, utilizing structure remapping and multi-view graph pooling technologies to accurately predict compound ADT. Initially, our model applies structure remapping to better delineate bonds, transforming "bonds" into new nodes and "bond-atom-bond" interactions into new edges, thereby reconstructing the compound molecular graph. Subsequently, we employ multi-view graph pooling to amalgamate data from various perspectives, minimizing biases inherent to single-view analyses. Following this, the model generates a robust node ranking collaboratively, emphasizing critical nodes or substructures to enhance model interpretability. Lastly, we apply a graph comparison learning strategy to train both the original and structure remapped molecular graphs, deriving the final molecular representation. Experimental results on public datasets indicate that the GraphADT model outperforms existing state-of-the-art models. The GraphADT model has been demonstrated to effectively predict compound ADT, offering potential guidance for the development of contact drugs and related treatments. AVAILABILITY AND IMPLEMENTATION: Our code and data are accessible at: https://github.com/mxqmxqmxq/GraphADT.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

7.
Methods ; 221: 73-81, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38123109

ABSTRACT

Research indicates that miRNAs present in herbal medicines are crucial for identifying disease markers, advancing gene therapy and facilitating drug delivery, among other uses. These miRNAs maintain stability in the extracellular environment, making them viable tools for disease diagnosis. They can withstand the digestive processes in the gastrointestinal tract, positioning them as potential carriers for specific oral drug delivery. By engineering plants to generate effective, non-toxic miRNA interference sequences, it is possible to broaden their applicability, including the treatment of diseases such as hepatitis C. Consequently, delving into the miRNA-disease associations (MDAs) within herbal medicines holds immense promise for diagnosing and addressing miRNA-related diseases. In our research, we propose the SGAE-MDA model, which harnesses the strengths of a graph autoencoder (GAE) combined with a semi-supervised approach to uncover potential MDAs in herbal medicines more effectively. Leveraging the GAE framework, the SGAE-MDA model precisely integrates the inherent feature vectors of miRNA and disease nodes with the regulatory data in the miRNA-disease network. Additionally, the proposed semi-supervised learning approach randomly hides part of the structure of the miRNA-disease network and subsequently reconstructs it within the GAE framework. This technique effectively minimizes network noise interference. In comparisons against other leading deep learning models, the results consistently highlighted the superior performance of the proposed SGAE-MDA model. Our code and dataset are available at: https://github.com/22n9n23/SGAE-MDA.
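The "randomly hide, then reconstruct" step described in this abstract can be sketched as a simple edge-masking routine. The function below only performs the split into visible and hidden edges; the encoder and decoder that would reconstruct the hidden part are omitted. The name `mask_edges`, the 30% ratio, and the example associations are illustrative assumptions, not details from the SGAE-MDA code.

```python
import random


def mask_edges(edges, mask_ratio=0.3, seed=0):
    """Split an edge list into (visible, masked) parts.

    A graph autoencoder would encode only the visible edges and be trained
    to reconstruct the masked ones.
    """
    rng = random.Random(seed)      # local generator keeps the split reproducible
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_masked = int(len(shuffled) * mask_ratio)
    return shuffled[n_masked:], shuffled[:n_masked]


# Hypothetical miRNA-disease associations.
associations = [("miR-1", "d1"), ("miR-1", "d2"), ("miR-2", "d2"),
                ("miR-3", "d1"), ("miR-3", "d3"), ("miR-4", "d3"),
                ("miR-4", "d4"), ("miR-5", "d4"), ("miR-5", "d5"),
                ("miR-6", "d5")]
visible, masked = mask_edges(associations)
print(len(visible), len(masked))  # 7 3
```

Drawing a fresh mask each training epoch (for example by varying `seed`) exposes the model to different partial views of the same network, which is one way such masking can dampen the effect of individual noisy associations.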


Subject(s)
MicroRNAs , MicroRNAs/genetics , Algorithms , Computational Biology/methods , Supervised Machine Learning , Plant Extracts
8.
Brief Bioinform ; 23(5)2022 Sep 20.
Article in English | MEDLINE | ID: mdl-36063562

ABSTRACT

Noncoding RNAs (ncRNAs) have recently attracted considerable attention due to their key roles in biology. ncRNA-protein interactions (NPIs) are often explored to reveal biological activities that ncRNAs may affect, such as biological traits and diseases. Traditional experimental methods can accomplish this work but are often labor-intensive and expensive. Machine learning and deep learning methods have achieved great success by exploiting sufficient sequence or structure information. Graph Neural Network (GNN)-based methods consider the topology in ncRNA-protein graphs and perform well on tasks like NPI prediction. Some GNN-based pairwise constraint methods have been developed for homogeneous networks, but they have not been used for NPI prediction on heterogeneous networks. In this paper, we construct a pairwise constrained NPI predictor based on a dual Graph Convolutional Network (GCN), called NPI-DGCN. To our knowledge, our method is the first to train a heterogeneous graph-based model using a pairwise learning strategy. Instead of binary classification, we use a rank layer to calculate the score of an ncRNA-protein pair. Moreover, our model is the first to predict NPIs on the ncRNA-protein bipartite graph rather than the homogeneous graph. We transform the original ncRNA-protein bipartite graph into two homogeneous graphs on which to explore second-order implicit relationships. At the same time, we model direct interactions between the two homogeneous graphs to explore explicit relationships. Experimental results on the four standard datasets indicate that our method achieves competitive performance with other state-of-the-art methods. The model is available at https://github.com/zhuoninnin1992/NPIPredict.
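The transformation of a bipartite graph into two homogeneous graphs can be sketched with a shared-neighbour projection. This assumes the simplest rule, namely that two ncRNAs are linked when they interact with at least one common protein (and vice versa); the abstract does not state the exact rule NPI-DGCN uses, and `project_bipartite` is a hypothetical name.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(pairs):
    """Project (ncRNA, protein) pairs onto ncRNA-ncRNA and protein-protein edges.

    Assumption: two nodes of the same type are connected when they share at
    least one partner of the other type (a second-order relationship).
    """
    proteins_of = defaultdict(set)
    rnas_of = defaultdict(set)
    for rna, protein in pairs:
        proteins_of[rna].add(protein)
        rnas_of[protein].add(rna)

    # ncRNAs that bind the same protein become neighbours, and vice versa.
    rna_edges = {pair for rnas in rnas_of.values()
                 for pair in combinations(sorted(rnas), 2)}
    protein_edges = {pair for proteins in proteins_of.values()
                     for pair in combinations(sorted(proteins), 2)}
    return rna_edges, protein_edges


pairs = [("r1", "p1"), ("r2", "p1"), ("r2", "p2")]
print(project_bipartite(pairs))  # ({('r1', 'r2')}, {('p1', 'p2')})
```

The original pairs remain the explicit, first-order links between the two projected graphs, while the projected edges carry the implicit, second-order ones.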


Subject(s)
Neural Networks, Computer , RNA, Untranslated , Machine Learning , Proteins/chemistry , RNA, Untranslated/genetics
9.
J Chem Inf Model ; 64(7): 2798-2806, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37643082

ABSTRACT

Plant small secretory peptides (SSPs) play an important role in the regulation of biological processes in plants. Accurately predicting SSPs enables efficient exploration of their functions. Traditional experimental verification methods are very reliable and accurate, but they require expensive equipment and a lot of time. Machine learning speeds up the prediction of SSPs, but unstable feature extraction limits this type of method. Therefore, this paper proposes a new feature-correction-based model for SSP recognition in plants, abbreviated as SE-SSP. The model has three main advantages. First, the use of transformer encoders can better reveal implicit features. Second, a feature correction module suitable for sequences, named 2-D SENET, adaptively adjusts the features to obtain a more robust feature representation. Third, multiple stacked linear modules further extract the deep information in each sample. At the same time, training based on a contrastive learning strategy can alleviate the problem of sparse samples. We conduct experiments on publicly available data sets, and the results verify that our model shows excellent performance. The proposed model can be used as a convenient and effective SSP prediction tool in the future. Our data and code are publicly available at https://github.com/wrab12/SE-SSP/.


Subject(s)
Electric Power Supplies , Machine Learning , Biological Transport , Peptides , Research Design
10.
J Chem Inf Model ; 64(7): 2912-2920, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37920888

ABSTRACT

Deep learning methods can accurately study noncoding RNA-protein interactions (NPIs), which are of great significance in gene regulation, human disease, and other fields. However, computational methods for predicting NPIs in large-scale dynamic ncRNA-protein bipartite graphs, which is an online modeling and prediction problem, are rarely discussed. In addition, the results published by researchers on websites cannot meet real-time needs due to the large amount of basic data and long update cycles. Therefore, we propose a real-time method based on the dynamic ncRNA-protein bipartite graph learning framework, termed ML-GNN, which can model and predict NPIs in real time. Our proposed method has the following advantages: first, the meta-learning strategy can alleviate the problem of large prediction errors in sparse neighborhood samples; second, dynamic modeling of newly added data can reduce computational pressure and predict NPIs in real time. In the experiment, we built a dynamic bipartite graph based on 300,000 NPIs from the NPInterv4.0 database. The experimental results indicate that our model achieved excellent performance in multiple experiments. The code for the model is available at https://github.com/taowang11/ML-NPI, and the data can be downloaded freely at http://bigdata.ibp.ac.cn/npinter4.


Subject(s)
RNA, Untranslated , Research Personnel , Humans , Databases, Factual , RNA, Untranslated/genetics
11.
BMC Genomics ; 24(1): 742, 2023 Dec 05.
Article in English | MEDLINE | ID: mdl-38053026

ABSTRACT

BACKGROUND: DNA methylation, instrumental in numerous life processes, underscores the paramount importance of its accurate prediction. Recent studies suggest that deep learning, due to its capacity to extract profound insights, provides a more precise DNA methylation prediction. However, issues related to the stability and generalization performance of these models persist. RESULTS: In this study, we introduce an efficient and stable DNA methylation prediction model. This model incorporates a feature fusion approach, adaptive feature correction technology, and a contrastive learning strategy. The proposed model presents several advantages. First, DNA sequences are encoded at four levels to comprehensively capture intricate information across multi-scale and low-span features. Second, we design a sequence-specific feature correction module that adaptively adjusts the weights of sequence features. This improvement enhances the model's stability and scalability, or its generality. Third, our contrastive learning strategy mitigates the instability issues resulting from sparse data. To validate our model, we conducted multiple sets of experiments on commonly used datasets, demonstrating the model's robustness and stability. Simultaneously, we amalgamate various datasets into a single, unified dataset. The experimental outcomes from this combined dataset substantiate the model's robust adaptability. CONCLUSIONS: Our research findings affirm that the StableDNAm model is a general, stable, and effective instrument for DNA methylation prediction. It holds substantial promise for providing invaluable assistance in future methylation-related research and analyses.


Subject(s)
DNA Methylation , Protein Processing, Post-Translational
12.
Methods ; 207: 97-102, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36155251

ABSTRACT

Research on miRNA-lncRNA interactions (MLIs) has received great attention recently due to their vital roles in microbiology and profound significance in diseases. Currently, most related studies focus on animals, and the link prediction problem in plants is rarely discussed comprehensively. Motivated by this, we address the link prediction task using a bipartite graph formulation and report encouraging performance in experiments on plant datasets. In this work, we first extract attribute information and structure information as base features and then process this information for network embedding. Intra-partition and inter-partition proximity modelling are used to construct the loss function, which facilitates the training of parameters. Finally, the superiority of the presented approach is shown by experiments on four plant datasets, which reflects the significance of this work for research on microbiology and disease.


Subject(s)
MicroRNAs , RNA, Long Noncoding , RNA, Long Noncoding/genetics , MicroRNAs/genetics , Computational Biology/methods , Algorithms
13.
Methods ; 207: 74-80, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36108992

ABSTRACT

Non-coding RNAs (ncRNAs) play a considerable role in current biological sciences, for example in gene transcription and gene expression. Exploring ncRNA-protein interactions (NPIs) is of great significance, but some experimental techniques are very expensive in terms of time and labor. This has prompted the development of computational algorithms based on traditional statistics and artificial intelligence. However, these algorithms usually require the sequence or structural feature vector of the molecule. Although graph neural networks (GNNs) have been widely used in recent academic and industrial research, their potential remains unexplored in the field of NPI detection. Hence, we present a novel GNN-based model to detect NPIs in this paper, where the NPI detection problem is transformed into a graph link prediction problem. Specifically, the proposed method utilizes two groups of labels to distinguish two different types of nodes, ncRNA and protein, which alleviates the problem of over-coupling in the graph network. Subsequently, the ncRNA and protein embeddings are initially optimized based on the cluster ownership relationship of nodes in the graph. Moreover, the model applies a self-attention mechanism to preserve the graph topology and reduce information loss during pooling. The experimental results indicate that the proposed model indeed has superior performance.


Subject(s)
Artificial Intelligence , Neural Networks, Computer , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , Algorithms , Proteins
14.
Article in English | MEDLINE | ID: mdl-38386576

ABSTRACT

Improving the drug development process can expedite the introduction of more novel drugs that cater to the demands of precision medicine. Accurately predicting molecular properties remains a fundamental challenge in drug discovery and development. Currently, a plethora of computer-aided drug discovery (CADD) methods have been widely employed in the field of molecular prediction. However, most of these methods primarily analyze molecules using low-dimensional representations such as SMILES notations, molecular fingerprints, and molecular graph-based descriptors. Only a few approaches have focused on incorporating and utilizing high-dimensional spatial structural representations of molecules. In light of the advancements in artificial intelligence, we introduce a 3D graph-spatial co-representation model called AEGNN-M, which combines two graph neural networks, GAT and EGNN. AEGNN-M learns from both molecular graph representations and 3D spatial structural representations to predict molecular properties accurately. We conducted experiments on seven public datasets, three regression datasets and 14 breast cancer cell line phenotype screening datasets, comparing the performance of AEGNN-M with state-of-the-art deep learning methods. Extensive experimental results demonstrate the satisfactory performance of the AEGNN-M model. Furthermore, we analyzed the performance impact of different modules within AEGNN-M and the influence of spatial structural representations on the model's performance. The interpretability analysis also revealed the significance of specific atoms in determining particular molecular properties.

15.
Mol Ther Nucleic Acids ; 35(1): 102103, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38261851

ABSTRACT

Inferring small molecule-miRNA associations (MMAs) is crucial for revealing the intricacies of biological processes and disease mechanisms. Deep learning, renowned for its exceptional speed and accuracy, is extensively used for predicting MMAs. However, given their heavy reliance on data, inaccuracies during data collection can make these methods susceptible to noise interference. To address this challenge, we introduce the joint masking and self-supervised (JMSS)-MMA model. This model synergizes graph autoencoders with a probability distribution-based masking strategy, effectively countering the impact of noisy data and enabling precise predictions of unknown MMAs. Operating in a self-supervised manner, it deeply encodes the relationship data of small molecules and miRNA through the graph autoencoder, delving into its latent information. Our masking strategy has successfully reduced data noise, enhancing prediction accuracy. To our knowledge, this is the pioneering integration of a masking strategy with graph autoencoders for MMA prediction. Furthermore, the JMSS-MMA model incorporates a node-degree-based decoder, deepening the understanding of the network's structure. Experiments on two mainstream datasets confirm the model's efficiency and precision, and ablation studies further attest to its robustness. We firmly believe that this model will revolutionize drug development, personalized medicine, and biomedical research.

16.
Mol Ther Nucleic Acids ; 35(2): 102187, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38706631

ABSTRACT

Long non-coding RNAs (lncRNAs) are important factors involved in biological regulatory networks. Accurately predicting lncRNA-protein interactions (LPIs) is vital for clarifying the functions and pathogenic mechanisms of lncRNAs. Existing deep learning models have yet to yield satisfactory results in LPI prediction. Recently, graph autoencoders (GAEs) have seen rapid development, excelling in tasks like link prediction and node classification. We employed GAE technology for LPI prediction, devising the FMSRT-LPI model based on path masking and degree regression strategies and thereby achieving satisfactory outcomes. This represents the first known integration of path masking and degree regression strategies into the GAE framework for potential LPI inference. The effectiveness of our FMSRT-LPI model primarily relies on four key aspects. First, within the GAE framework, our model integrates multi-source relationships of lncRNAs and proteins with the topological data of the lncRNA-protein network (LPN). Second, the implemented masking strategy efficiently identifies the LPN's key paths, reconstructs the network, and reduces the impact of redundant or incorrect data. Third, the integrated degree decoder balances degree and structural information, enhancing node representation. Fourth, the PolyLoss function we introduced is more appropriate for LPI prediction tasks. The results on multiple public datasets further demonstrate our model's potential in LPI prediction.
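The abstract does not say which PolyLoss variant is used. As an illustration only, the sketch below implements the simplest published form, Poly-1, for a binary link label: ordinary cross-entropy plus a weighted (1 - p_t) term. The function name `poly1_bce` and the default `epsilon=1.0` are assumptions.

```python
import math


def poly1_bce(prob, label, epsilon=1.0):
    """Poly-1 loss for one binary prediction.

    prob    : predicted probability that the link exists (strictly between 0 and 1)
    label   : 1 for a known interaction, 0 otherwise
    epsilon : weight of the first polynomial term; epsilon=0 recovers plain
              binary cross-entropy
    """
    p_t = prob if label == 1 else 1.0 - prob   # probability assigned to the true class
    cross_entropy = -math.log(p_t)
    return cross_entropy + epsilon * (1.0 - p_t)


print(poly1_bce(0.9, 1))  # confident and correct: small loss
print(poly1_bce(0.9, 0))  # confident and wrong: large loss
```

With a positive `epsilon`, the extra term raises the penalty on predictions that assign low probability to the true class, which is the knob that makes this family tunable per task.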

17.
Brief Funct Genomics ; 2024 Feb 22.
Article in English | MEDLINE | ID: mdl-38391194

ABSTRACT

MicroRNAs (miRNAs) are found ubiquitously in biological cells and play a pivotal role in regulating the expression of numerous target genes. Therapies centered around miRNAs are emerging as a promising strategy for disease treatment, aiming to intervene in disease progression by modulating abnormal miRNA expression. The accurate prediction of miRNA-drug resistance (MDR) is crucial for the success of miRNA therapies. Computational models based on deep learning have demonstrated exceptional performance in predicting potential MDRs. However, their effectiveness can be compromised by errors in the data acquisition process, leading to inaccurate node representations. To address this challenge, we introduce the GAM-MDR model, which combines the graph autoencoder (GAE) with random path masking techniques to precisely predict potential MDRs. The reliability and effectiveness of the GAM-MDR model are mainly reflected in two aspects. Firstly, it efficiently extracts the representations of miRNA and drug nodes in the miRNA-drug network. Secondly, our designed random path masking strategy efficiently reconstructs critical paths in the network, thereby reducing the adverse impact of noisy data. To our knowledge, this is the first time that a random path masking strategy has been integrated into a GAE to infer MDRs. Our method was subjected to multiple validations on public datasets and yielded promising results. We are optimistic that our model could offer valuable insights for miRNA therapeutic strategies and deepen the understanding of the regulatory mechanisms of miRNAs. Our data and code are publicly available on GitHub: https://github.com/ZZCrazy00/GAM-MDR.
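Random path masking differs from masking independent edges in that whole walks are hidden at once. A minimal sketch, assuming the paths are sampled as short random walks over an undirected adjacency dict, is shown below; the function name, the walk parameters, and the toy network are hypothetical and are not drawn from the GAM-MDR repository.

```python
import random


def mask_random_paths(adj, num_walks=2, walk_length=3, seed=0):
    """Hide the edges along a few random walks; return (visible, masked) edge sets.

    adj must be a symmetric adjacency dict: node -> set of neighbours.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    masked = set()
    for _ in range(num_walks):
        current = rng.choice(nodes)
        for _ in range(walk_length):
            neighbours = sorted(adj.get(current, ()))
            if not neighbours:
                break                      # isolated node: the walk stops here
            nxt = rng.choice(neighbours)
            masked.add(tuple(sorted((current, nxt))))
            current = nxt
    all_edges = {tuple(sorted((u, v))) for u in adj for v in adj[u]}
    return all_edges - masked, masked


# Hypothetical miRNA-drug network.
adj = {
    "miR-1": {"drugA", "drugB"}, "miR-2": {"drugB", "drugC"},
    "drugA": {"miR-1"}, "drugB": {"miR-1", "miR-2"}, "drugC": {"miR-2"},
}
visible, masked = mask_random_paths(adj)
print(sorted(masked))
```

Because consecutive edges on a walk are removed together, the reconstruction task cannot be solved by copying an adjacent visible edge, which pushes the encoder toward longer-range structure.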

18.
Comput Biol Med ; 171: 108104, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38335821

ABSTRACT

Drug-food interactions (DFIs) crucially impact patient safety and drug efficacy by modifying absorption, distribution, metabolism, and excretion. The application of deep learning for predicting DFIs is promising, yet the development of computational models remains in its early stages. This is mainly due to the complexity of food compounds, which makes it hard for dataset developers to acquire comprehensive ingredient data and often results in incomplete or vague food component descriptions. DFI-MS tackles this issue by employing an accurate feature representation method alongside a refined computational model. It innovatively achieves a more precise characterization of food features, a previously daunting task in DFI research. This is accomplished through modules designed for perturbation interactions, feature alignment and domain separation, and inference feedback. These modules extract essential information from features, using a perturbation module and a feature interaction encoder to establish robust representations. The feature alignment and domain separation modules are particularly effective in managing data with diverse frequencies and characteristics. DFI-MS stands out as the first in its field to combine data augmentation, feature alignment, domain separation, and contrastive learning. The flexibility of the inference feedback module allows its application in various downstream tasks. Demonstrating exceptional performance across multiple datasets, DFI-MS represents a significant advancement in food representation technology. Our code and data are available at https://github.com/kkkayle/DFI-MS.


Subject(s)
Food-Drug Interactions, Food, Humans, Supervised Machine Learning
19.
Comput Biol Med ; 174: 108484, 2024 May.
Article in English | MEDLINE | ID: mdl-38643595

ABSTRACT

Accurately identifying cancer driver genes (CDGs) is crucial for guiding cancer treatment and has recently received great attention from researchers. However, the high complexity and heterogeneity of cancer gene regulatory networks limit the prediction accuracy of existing deep learning models. To address this, we introduce a model called SCIS-CDG that utilizes Schur complement graph augmentation and independent subspace feature extraction techniques to effectively predict potential CDGs. Firstly, a random Schur complement strategy is adopted to generate two augmented views of the gene network within a graph contrastive learning framework. Rapid randomization of the random Schur complement strategy enhances the model's generalization and its ability to handle complex networks effectively. Upholding the Schur complement principle in expectation promotes the preservation of the original gene network's vital structure in the augmented views. Subsequently, we employ feature extraction technology using multiple independent subspaces, each trained with independent weights to reduce inter-subspace dependence and improve the model's expressiveness. Concurrently, we introduce a feature expansion component based on the structure of the gene network to address issues arising from the limited dimensionality of node features. Moreover, it can alleviate the challenges posed by the heterogeneity of cancer gene networks to some extent. Finally, we integrate a learnable attention weight mechanism into the graph neural network (GNN) encoder, utilizing feature expansion technology to optimize the significance of various feature levels in the prediction task. Following extensive experimental validation, the SCIS-CDG model has exhibited high efficiency in identifying known CDGs and uncovering potential unknown CDGs in external datasets. In particular, its performance has improved significantly compared to previous conventional GNN models. The code and data are publicly available at: https://github.com/mxqmxqmxq/SCIS-CDG.
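For readers unfamiliar with the Schur complement of a graph, the sketch below computes the exact Schur complement of a graph Laplacian by eliminating nodes one at a time. The abstract describes a randomized strategy that matches this quantity in expectation; the code shows only the exact, deterministic version on a toy path graph and is not the SCIS-CDG implementation. The function names and the example graph are assumptions made for illustration.

```python
def graph_laplacian(n, edges):
    """Dense Laplacian (list of lists) of an unweighted undirected graph."""
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1.0
        lap[v][v] += 1.0
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0
    return lap


def schur_complement(lap, eliminate):
    """Eliminate the given nodes one by one.

    Returns (kept_nodes, reduced_laplacian). Each elimination step applies
    S[i][j] = L[i][j] - L[i][k] * L[k][j] / L[k][k] to the remaining nodes.
    """
    mat = [row[:] for row in lap]  # work on a copy
    kept = list(range(len(lap)))
    for k in eliminate:
        pivot = mat[k][k]
        if pivot == 0:
            raise ValueError(f"node {k} has zero pivot and cannot be eliminated")
        kept.remove(k)
        for i in kept:
            for j in kept:
                mat[i][j] -= mat[i][k] * mat[k][j] / pivot
    reduced = [[mat[i][j] for j in kept] for i in kept]
    return kept, reduced


# Toy example: path graph 0 - 1 - 2 - 3, eliminating the two inner nodes.
path_lap = graph_laplacian(4, [(0, 1), (1, 2), (2, 3)])
kept_nodes, reduced_lap = schur_complement(path_lap, eliminate=[1, 2])
```

The reduced matrix is again a Laplacian on the kept nodes: its rows sum to zero, and the eliminated path is replaced by a single weighted edge between nodes 0 and 3.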


Subject(s)
Gene Regulatory Networks, Neoplasms, Humans, Neoplasms/genetics, Computational Biology/methods, Deep Learning, Algorithms
20.
Comput Biol Med ; 171: 108177, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38422957

ABSTRACT

With the increasing number of microRNAs (miRNAs), identifying essential miRNAs has become an important task that needs to be solved urgently. However, there are few computational methods for essential miRNA identification. Here, we proposed a novel framework called Rotation Forest for Essential MicroRNA identification (RFEM) to predict the essentiality of miRNAs in mice. We first constructed 1,264 miRNA features for all miRNA samples by fusing 38 miRNA features obtained from the PESM paper with 1,226 miRNA functional features calculated from miRNA-target gene interactions. Then, we employed 182 training samples with 1,264 features to train the rotation forest model, which was applied to compute the essentiality scores of the candidate samples. The main innovations of RFEM were as follows: 1) miRNA functional features were introduced to enrich the diversity of miRNA features; 2) the rotation forest model used decision trees as base classifiers and could increase the difference among base classifiers through feature transformation to achieve better ensemble results. Experimental results show that RFEM significantly outperformed two previous models, with an AUC (AUPR) of 0.942 (0.944) in three comparison experiments under 5-fold cross validation, which demonstrated the model's reliable performance. Moreover, an ablation study was conducted to demonstrate the effectiveness of the novel miRNA functional features. Additionally, in the case studies assessing the essentiality of unlabeled miRNAs, experimental literature confirmed that 7 of the top 10 predicted miRNAs have crucial biological functions in mice. Therefore, RFEM would be a reliable tool for identifying essential miRNAs.
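The AUC reported in this abstract can be read as the probability that a randomly chosen essential miRNA receives a higher score than a randomly chosen non-essential one, with ties counted as one half. The sketch below computes that quantity directly from its definition. The labels and scores are made-up placeholders, not values from the paper, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Placeholder labels (1 = essential, 0 = non-essential) and model scores.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Under k-fold cross validation, this score would typically be computed on each held-out fold and then averaged.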


Subject(s)
MicroRNAs, Mice, Animals, MicroRNAs/genetics, Rotation, Computational Biology/methods, Algorithms, Genetic Predisposition to Disease